crypto: des_3des - add x86-64 assembly implementation

Patch adds x86_64 assembly implementation of Triple DES EDE cipher algorithm.
Two assembly implementations are provided. First is regular 'one-block at
time' encrypt/decrypt function. Second is 'three-blocks at time' function that
gains performance increase on out-of-order CPUs.

tcrypt test results:

Intel Core i5-4570:

des3_ede-asm vs des3_ede-generic:
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec
16B     1.21x   1.22x   1.27x   1.36x   1.25x   1.25x
64B     1.98x   1.96x   1.23x   2.04x   2.01x   2.00x
256B    2.34x   2.37x   1.21x   2.40x   2.38x   2.39x
1024B   2.50x   2.47x   1.22x   2.51x   2.52x   2.51x
8192B   2.51x   2.53x   1.21x   2.56x   2.54x   2.55x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
diff --git a/include/crypto/des.h b/include/crypto/des.h
index 2971c63..fc6274c 100644
--- a/include/crypto/des.h
+++ b/include/crypto/des.h
@@ -16,4 +16,7 @@
 
 extern unsigned long des_ekey(u32 *pe, const u8 *k);
 
+extern int __des3_ede_setkey(u32 *expkey, u32 *flags, const u8 *key,
+			     unsigned int keylen);
+
 #endif /* __CRYPTO_DES_H */