[X86] Emit 11-byte or 15-byte NOPs on recent AMD targets, else default to 10-byte NOPs (PR22965)
We currently emit up to 15-byte NOPs on all targets (apart from Silvermont), which stalls performance on some targets with decoders that struggle with 2 or 3 more '66' prefixes.
This patch flags recent AMD targets (btver1/znver1) to still emit 15-byte NOPs and bdver* targets to emit 11-byte NOPs. All other targets now emit 10-byte NOPs apart from SilverMont CPUs which still emit 7-byte NOPS.
Differential Revision: https://reviews.llvm.org/D42616
llvm-svn: 323693
diff --git a/llvm/lib/Target/X86/X86Subtarget.h b/llvm/lib/Target/X86/X86Subtarget.h
index e34735b..b8adb63 100644
--- a/llvm/lib/Target/X86/X86Subtarget.h
+++ b/llvm/lib/Target/X86/X86Subtarget.h
@@ -246,6 +246,14 @@
/// of a YMM or ZMM register without clearing the upper part.
bool HasFastPartialYMMorZMMWrite;
+ /// True if there is no performance penalty for writing NOPs with up to
+ /// 11 bytes.
+ bool HasFast11ByteNOP;
+
+ /// True if there is no performance penalty for writing NOPs with up to
+ /// 15 bytes.
+ bool HasFast15ByteNOP;
+
/// True if gather is reasonably fast. This is true for Skylake client and
/// all AVX-512 CPUs.
bool HasFastGather;