make fast unaligned memory accesses implicit with SSE4.2 or SSE4a

This is a follow-on from the discussion in http://reviews.llvm.org/D12154.

This change allows memset/memcpy to use SSE or AVX memory accesses for any chip that has
generally fast unaligned memory ops.

A motivating use case for this change is a clang invocation that doesn't explicitly set
the CPU, but does target a feature that we know only exists on a CPU that supports fast
unaligned memops. For example:
$ clang -O1 foo.c -mavx
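
For illustration, here is a minimal foo.c of the kind that benefits; the function and the
32-byte copy size are hypothetical, not taken from the patch:

  #include <string.h>

  /* With -mavx (which only exists on CPUs with fast unaligned memops) and no
     explicit -mcpu, this copy may now be lowered with unaligned vector
     loads/stores instead of scalar ops. */
  void copy32(char *dst, const char *src) {
    memcpy(dst, src, 32);
  }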

This resolves a difference in lowering noted in PR24449:
https://llvm.org/bugs/show_bug.cgi?id=24449

Before this patch, we used different store types depending on whether the example could be
lowered as a memset or not.
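
Roughly, the difference was between code shapes like these (a hypothetical sketch; the
exact test case is in the bug report):

  #include <string.h>

  void zero16(char *p)                { memset(p, 0, 16); }  /* memset path */
  void copy16(char *p, const char *q) { memcpy(p, q, 16); }  /* memcpy path */

With a feature flag such as -mavx but no explicit CPU, the two paths could previously pick
different store types; after this patch both take the vector path on chips with fast
unaligned accesses.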

Differential Revision: http://reviews.llvm.org/D12288

llvm-svn: 245950
diff --git a/llvm/lib/Target/X86/X86Subtarget.cpp b/llvm/lib/Target/X86/X86Subtarget.cpp
index 565ba1d..b23b3c0 100644
--- a/llvm/lib/Target/X86/X86Subtarget.cpp
+++ b/llvm/lib/Target/X86/X86Subtarget.cpp
@@ -192,6 +192,13 @@
   // Parse features string and set the CPU.
   ParseSubtargetFeatures(CPUName, FullFS);
 
+  // All CPUs that implement SSE4.2 or SSE4A support unaligned accesses of
+  // 16 bytes and under that are reasonably fast. These features were
+  // introduced with Intel's Nehalem/Silvermont and AMD's Family 10h
+  // micro-architectures respectively.
+  if (hasSSE42() || hasSSE4A())
+    IsUAMemUnder32Slow = false;
+
   InstrItins = getInstrItineraryForCPU(CPUName);
 
   // It's important to keep the MCSubtargetInfo feature bits in sync with