[X86][SSE] Add PACKUS support to LowerTruncate
Similar to the existing code to lower to PACKSS, we can use PACKUS if the input vector's leading zero bits extend all the way to the packed/truncated value.
We have to account for pre-SSE41 targets not supporting PACKUSDW
llvm-svn: 317128
diff --git a/llvm/test/CodeGen/X86/combine-srl.ll b/llvm/test/CodeGen/X86/combine-srl.ll
index 1d064b3..9f7f8a97 100644
--- a/llvm/test/CodeGen/X86/combine-srl.ll
+++ b/llvm/test/CodeGen/X86/combine-srl.ll
@@ -181,9 +181,8 @@
; AVX-LABEL: combine_vec_lshr_trunc_lshr0:
; AVX: # BB#0:
; AVX-NEXT: vpsrlq $48, %ymm0, %ymm0
-; AVX-NEXT: vpshufd {{.*#+}} ymm0 = ymm0[0,2,2,3,4,6,6,7]
-; AVX-NEXT: vpermq {{.*#+}} ymm0 = ymm0[0,2,2,3]
-; AVX-NEXT: # kill: %XMM0<def> %XMM0<kill> %YMM0<kill>
+; AVX-NEXT: vextracti128 $1, %ymm0, %xmm1
+; AVX-NEXT: vpackusdw %xmm1, %xmm0, %xmm0
; AVX-NEXT: vzeroupper
; AVX-NEXT: retq
%1 = lshr <4 x i64> %x, <i64 32, i64 32, i64 32, i64 32>