[CRYPTO] padlock: Fix alignment fault in aes_crypt_copy

The previous patch fixed spurious read faults from occuring by copying
the data if we happen to have a single block at the end of a page.  It
appears that gcc cannot guarantee 16-byte alignment in the kernel with
__attribute__.  The following report from Torben Viets shows a buffer
that's only 8-byte aligned:

> eneral protection fault: 0000 [#1]
> Modules linked in: xt_TCPMSS xt_tcpmss iptable_mangle ipt_MASQUERADE
> xt_tcpudp xt_mark xt_state iptable_nat nf_nat nf_conntrack_ipv4
> iptable_filter ip_tables x_tables pppoe pppox af_packet ppp_generic slhc
> aes_i586
> CPU:    0
> EIP:    0060:[<c035b828>]    Not tainted VLI
> EFLAGS: 00010292   (2.6.23.12 #7)
> EIP is at aes_crypt_copy+0x28/0x40
> eax: f7639ff0   ebx: f6c24050   ecx: 00000001   edx: f6c24030
> esi: f7e89dc8   edi: f7639ff0   ebp: 00010000   esp: f7e89dc8

Since the hardware must have 16-byte alignment, the following patch fixes
this by open coding the alignment adjustment.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/drivers/crypto/padlock-aes.c b/drivers/crypto/padlock-aes.c
index a337b69..5f7e718 100644
--- a/drivers/crypto/padlock-aes.c
+++ b/drivers/crypto/padlock-aes.c
@@ -429,8 +429,8 @@
 
 static void aes_crypt_copy(const u8 *in, u8 *out, u32 *key, struct cword *cword)
 {
-	u8 tmp[AES_BLOCK_SIZE * 2]
-		__attribute__ ((__aligned__(PADLOCK_ALIGNMENT)));
+	u8 buf[AES_BLOCK_SIZE * 2 + PADLOCK_ALIGNMENT - 1];
+	u8 *tmp = PTR_ALIGN(&buf[0], PADLOCK_ALIGNMENT);
 
 	memcpy(tmp, in, AES_BLOCK_SIZE);
 	padlock_xcrypt(tmp, out, key, cword);