m68knommu: add optimize memmove() function

Add an m68k/coldfire optimized memmove() function for the m68knommu arch.
This is the same function as used by m68k. Simple speed tests show this
is faster once buffers are larger than 4 bytes, and significantly faster
on much larger buffers (4 times faster above about 100 bytes).

This also goes part of the way to fixing a regression caused by commit
ea61bc461d09e8d331a307916530aaae808c72a2 ("m68k/m68knommu: merge MMU and
non-MMU string.h"), which breaks non-coldfire non-mmu builds (which is
the 68x328 and 68360 families). They currently have no memmove() fucntion
defined, since there was none in the m68knommu/lib functions.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
diff --git a/arch/m68k/include/asm/string.h b/arch/m68k/include/asm/string.h
index ffc3c3f..3219845 100644
--- a/arch/m68k/include/asm/string.h
+++ b/arch/m68k/include/asm/string.h
@@ -99,10 +99,10 @@
 		: "+a" (cs), "+a" (ct), "=d" (res));
 	return res;
 }
+#endif /* CONFIG_COLDFIRE */
 
 #define __HAVE_ARCH_MEMMOVE
 extern void *memmove(void *, const void *, __kernel_size_t);
-#endif /* CONFIG_COLDFIRE */
 
 #define memcmp(d, s, n) __builtin_memcmp(d, s, n)