R600/SI: Enable a lot of existing tests for VI (squashed commits)
This is a union of these commits:
* R600/SI: Enable more tests for VI which need no changes
* R600/SI: Enable V_BCNT tests for VI
Differences:
- v_bcnt_..._e32 -> _e64
- s_load_dword* inline offset is in bytes instead of dwords
* R600/SI: Enable all tests for VI which use S_LOAD_DWORD
The inline offset is changed from dwords to bytes.
* R600/SI: Enable LDS tests for VI
Differences:
- the s_load_dword inline offset changed from dwords to bytes
- the tests checked very little on CI, so they have been fixed to check all
instructions that "SI" checked
* R600/SI: Enable lshr tests for VI
* R600/SI: Fix divrem64 tests
- "v_lshl_64" was missing "b" before "64"
- added VI-NOT checks
* R600/SI: Enable the SI.tid test for VI
* R600/SI: Enable the frem test for VI
Also, the frem_f64 checking is added for CI-VI.
* R600/SI: Add VI tests for rsq.clamped
llvm-svn: 228830
diff --git a/llvm/test/CodeGen/R600/llvm.AMDGPU.rsq.clamped.ll b/llvm/test/CodeGen/R600/llvm.AMDGPU.rsq.clamped.ll
index 9336baf..eeff253 100644
--- a/llvm/test/CodeGen/R600/llvm.AMDGPU.rsq.clamped.ll
+++ b/llvm/test/CodeGen/R600/llvm.AMDGPU.rsq.clamped.ll
@@ -1,4 +1,5 @@
; RUN: llc -march=amdgcn -mcpu=SI -verify-machineinstrs < %s | FileCheck -check-prefix=SI -check-prefix=FUNC %s
+; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck -check-prefix=VI -check-prefix=FUNC %s
; RUN: llc -march=r600 -mcpu=cypress -verify-machineinstrs < %s | FileCheck -check-prefix=EG -check-prefix=FUNC %s
@@ -6,7 +7,15 @@
; FUNC-LABEL: {{^}}rsq_clamped_f32:
; SI: v_rsq_clamp_f32_e32
+
+; VI: v_rsq_f32_e32 [[RSQ:v[0-9]+]], {{s[0-9]+}}
+; VI: v_min_f32_e32 [[MIN:v[0-9]+]], 0x7f7fffff, [[RSQ]]
+; TODO: this constant should be folded:
+; VI: v_mov_b32_e32 [[MINFLT:v[0-9]+]], 0xff7fffff
+; VI: v_max_f32_e32 {{v[0-9]+}}, [[MIN]], [[MINFLT]]
+
; EG: RECIPSQRT_CLAMPED
+
define void @rsq_clamped_f32(float addrspace(1)* %out, float %src) nounwind {
%rsq_clamped = call float @llvm.AMDGPU.rsq.clamped.f32(float %src) nounwind readnone
store float %rsq_clamped, float addrspace(1)* %out, align 4