x86: drop -funroll-loops for csum-partial_64.c

Impact: performance optimization

I rebenchmarked with modern compilers, and dropping -funroll-loops
makes the checksum function consistently faster by a few percent.
So drop that flag.
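
For anyone who wants to repeat this kind of comparison in userspace,
a minimal sketch follows. It is not the benchmark behind this patch
and not the code in csum-partial_64.c: csum_sketch() and time_csum()
are hypothetical helpers, and the empty asm barrier assumes GCC or
Clang.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical stand-in for the kernel routine: folds a buffer into a
 * 16-bit ones'-complement sum, reading 16-bit words little-endian.
 */
static uint16_t csum_sketch(const uint8_t *buf, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)buf[i] | ((uint32_t)buf[i + 1] << 8);
	if (len & 1)
		sum += buf[len - 1];	/* trailing odd byte */
	while (sum >> 16)		/* fold carries back into 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Hypothetical timer: CPU seconds spent on 'iters' checksum calls. */
static double time_csum(const uint8_t *buf, size_t len, long iters)
{
	volatile uint16_t sink = 0;	/* keeps each result live */
	clock_t start = clock();
	long i;

	for (i = 0; i < iters; i++) {
		/* Assumption: GCC/Clang. Stops the call being hoisted. */
		__asm__ volatile("" : : "r"(buf) : "memory");
		sink ^= csum_sketch(buf, len);
	}
	(void)sink;
	return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Build it twice, once with "gcc -O2" and once with
"gcc -O2 -funroll-loops", call time_csum() over a packet-sized buffer
with a large iteration count, and compare the two timings. Results
will depend on the compiler version and CPU.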

Thanks to Richard Guenther for a hint.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index aa3fa41..55e11aa 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -17,9 +17,6 @@
         lib-$(CONFIG_X86_USE_3DNOW) += mmx_32.o
 else
         obj-y += io_64.o iomap_copy_64.o
-
-        CFLAGS_csum-partial_64.o := -funroll-loops
-
         lib-y += csum-partial_64.o csum-copy_64.o csum-wrappers_64.o
         lib-y += thunk_64.o clear_page_64.o copy_page_64.o
         lib-y += memmove_64.o memset_64.o