Fix assembly directives for SHA data generation.

With ".align 8" placed after the symbolic label, the label is bound to the
unaligned address while the data is emitted at the next aligned address,
so reads through the label return 0 (uninitialized) values.
Consider the following assembly code:

foo:
.align 8
.byte 0x1
bar:
.align 8
.byte 0x2
baz:
.align 8
.byte 0x3

This will result in the following sample memory layout:
foo -> 0x400 (with value 1 there)
bar -> 0x401 (with value 0 - uninitialized memory)
       0x408 (with value 2)
baz -> 0x409 (with value 0 - uninitialized memory)
       0x410 (with value 3)

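This patch removes the ".align 8" from print_asm_data() and instead emits
the alignment directive (".align 5") before each label in
print_asm_symbol_data(), so that a label and its data share the same
aligned address.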
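
The effect of the directive order can be checked with a small simulation
of an assembler's location counter. This is only an illustrative sketch:
layout() and program() are hypothetical helpers, not part of
gen-sha1-stamp.py, and it assumes ".align 8" means "round up to an 8-byte
boundary" as in the sample layout above (on some targets the operand is a
power-of-two exponent instead).

```python
def layout(directives, origin=0x400, boundary=8):
    """Walk (kind, arg) directives and track a location counter.

    Returns (labels, data): label name -> address, address -> byte value.
    Assumption: "align" rounds up to a multiple of `boundary` bytes.
    """
    addr = origin
    labels, data = {}, {}
    for kind, arg in directives:
        if kind == "label":
            labels[arg] = addr          # label binds to the current address
        elif kind == "align":
            addr = -(-addr // boundary) * boundary  # round up to boundary
        elif kind == "byte":
            data[addr] = arg
            addr += 1
    return labels, data


def program(align_first):
    """Build the foo/bar/baz example with either directive order."""
    out = []
    for name, value in (("foo", 1), ("bar", 2), ("baz", 3)):
        head = [("align", None), ("label", name)]
        if not align_first:
            head.reverse()              # old order: label, then align
        out += head + [("byte", value)]
    return out


for align_first in (False, True):
    labels, data = layout(program(align_first))
    for name in ("foo", "bar", "baz"):
        addr = labels[name]
        # Addresses with no emitted byte are shown as 0 (padding).
        print(align_first, name, hex(addr), data.get(addr, 0))
```

With the old order, bar and baz point at padding; with the alignment
emitted first, every label points at its own byte.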

Change-Id: I0e0a2ea4b4d4a48a9fe660e86f8f0106f2ec7723
diff --git a/tools/build/gen-sha1-stamp.py b/tools/build/gen-sha1-stamp.py
index 239d040..a4c53ef 100755
--- a/tools/build/gen-sha1-stamp.py
+++ b/tools/build/gen-sha1-stamp.py
@@ -76,7 +76,6 @@
 
 def print_asm_data(data, size):
     col = 0
-    sys.stdout.write(".align 8\n")
     for i in xrange(size):
         c = data[i]
         if col == 0:
@@ -95,6 +94,7 @@
 
 def print_asm_symbol_data(sym, h):
     sys.stdout.write("""
+.align 5
 #ifdef __APPLE_CC__
 _%s:\n\
 #else\n\