[X86] Add SchedRW for PMULLD

Summary:
Many CPUs don't seem to implement this instruction as well as the other vector multiplies, often using a multi-uop flow. Silvermont in particular has a 7 uop flow with 11 cycle throughput. Sandy Bridge implements it as a single uop with 5 cycle latency and 1 cycle throughput. But Haswell and later use 2 uops with 10 cycle latency and 2 cycle throughput.

This patch adds a new X86SchedWritePair we can use to tag this instruction separately. I've provided correct information for Silvermont, Btver2, and Sandy Bridge. I've removed the InstRWs for SandyBridge. I've left the Haswell/Broadwell/Skylake InstRWs in place because I wasn't sure how to account for the different load latency between 128 and 256 bits. I also left the Znver1 InstRWs in place because the existing values don't match Agner's spreadsheet.

I also left a FIXME in the SandyBridge model because using it as the "generic" model is too optimistic for the 256/512-bit versions, since those are multiple uops on all known CPUs.

Reviewers: RKSimon, GGanesh, courbet

Reviewed By: RKSimon

Subscribers: gchatelet, gbedwell, andreadb, llvm-commits

Differential Revision: https://reviews.llvm.org/D44972

llvm-svn: 328914
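
To make the per-CPU differences quoted in the summary concrete, here is a rough back-of-the-envelope sketch in Python. It is not part of the patch and does not reflect how LLVM's scheduler computes costs. It only plugs the uop, latency, and throughput figures from the summary into two simplified bounds. The names `PmulldCost`, `COSTS`, `independent_cycles`, and `dependent_cycles` are hypothetical, and the model assumes no port contention, no memory operands, and no other instructions in flight.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PmulldCost:
    """Figures for PMULLD as quoted in the commit summary (hypothetical container)."""
    uops: int
    latency: Optional[int]   # cycles; None where the summary gives no figure
    recip_throughput: int    # cycles between starts of independent instructions


# Numbers copied from the summary; nothing here is measured.
COSTS = {
    "Silvermont": PmulldCost(uops=7, latency=None, recip_throughput=11),
    "SandyBridge": PmulldCost(uops=1, latency=5, recip_throughput=1),
    "Haswell": PmulldCost(uops=2, latency=10, recip_throughput=2),
}


def independent_cycles(cost: PmulldCost, n: int) -> int:
    """Throughput-bound estimate for n independent PMULLDs (simplified)."""
    return n * cost.recip_throughput


def dependent_cycles(cost: PmulldCost, n: int) -> Optional[int]:
    """Latency-bound estimate for a chain of n dependent PMULLDs (simplified)."""
    if cost.latency is None:
        return None  # the summary does not state a latency for this CPU
    return n * cost.latency


if __name__ == "__main__":
    for cpu, cost in COSTS.items():
        print(cpu, independent_cycles(cost, 8), dependent_cycles(cost, 8))
```

Under these assumptions, a model tuned to the Sandy Bridge figures would underestimate the cost of the same instruction stream on Haswell by a factor of two in both bounds, which is the kind of mismatch a separate scheduling class lets each CPU model correct.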

commit: 13a0f83a05ff46341b722d9e6fabe3f32443a3e1
author: Craig Topper <craig.topper@intel.com> Sat Mar 31 04:54:32 2018 +0000
committer: Craig Topper <craig.topper@intel.com> Sat Mar 31 04:54:32 2018 +0000
tree: 82eb11f2d69f927cabb2fd5d38dea9111cbfa2e2
parent: 96871864d2433f98b643a687b8981beba19d3bc3
diff --git a/llvm/lib/Target/X86/X86InstrAVX512.td b/llvm/lib/Target/X86/X86InstrAVX512.td
index 663f2d1..188a167 100644
--- a/llvm/lib/Target/X86/X86InstrAVX512.td
+++ b/llvm/lib/Target/X86/X86InstrAVX512.td

@@ -4505,7 +4505,7 @@
 defm VPSUBUS : avx512_binop_rm_vl_bw<0xD8, 0xD9, "vpsubus", X86subus,
                                      SSE_INTALU_ITINS_P, HasBWI, 0>;
 defm VPMULLD : avx512_binop_rm_vl_d<0x40, "vpmulld", mul,
-                                    SSE_INTMUL_ITINS_P, HasAVX512, 1>, T8PD;
+                                    SSE_PMULLD_ITINS, HasAVX512, 1>, T8PD;
 defm VPMULLW : avx512_binop_rm_vl_w<0xD5, "vpmullw", mul,
                                     SSE_INTMUL_ITINS_P, HasBWI, 1>;
 defm VPMULLQ : avx512_binop_rm_vl_q<0x40, "vpmullq", mul,