[NVPTX] Move NVPTXPeephole after NVPTXPrologEpilogPass
Summary:
Frame index offsets are calculated by NVPTXPrologEpilogPass. Before
that pass runs, the correct offsets of stack objects cannot be obtained,
which leads to wrong offsets if there are more than 2 frame objects.
This patch moves NVPTXPeephole after NVPTXPrologEpilogPass. Because the
frame index has already been replaced by %VRFrame in
NVPTXPrologEpilogPass, we check the VRFrame register instead, and try
to remove VRFrame if it has no uses after the NVPTXPeephole pass.
Patched by Xuetian Weng.
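Illustration (not part of the patch): a minimal, self-contained sketch
of why offsets are only meaningful once frame layout has run. The
FrameObject struct and layoutFrame function are hypothetical stand-ins
and do not correspond to LLVM types or APIs. It assumes objects are
laid out in order from offset 0, with each offset rounded up to the
object's alignment.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a stack object; not an LLVM type.
struct FrameObject {
  uint64_t Size;   // size in bytes
  uint64_t Align;  // alignment in bytes, must be a power of two
  uint64_t Offset; // placeholder until layoutFrame() assigns it
};

// Assigns each object an aligned offset from the frame base, in order,
// and returns the total frame size. Any consumer that reads Offset
// before this runs sees only the placeholder value.
uint64_t layoutFrame(std::vector<FrameObject> &Objects) {
  uint64_t Next = 0;
  for (FrameObject &Obj : Objects) {
    assert(Obj.Align != 0 && (Obj.Align & (Obj.Align - 1)) == 0 &&
           "alignment must be a power of two");
    // Round Next up to the object's alignment.
    Next = (Next + Obj.Align - 1) & ~(Obj.Align - 1);
    Obj.Offset = Next;
    Next += Obj.Size;
  }
  return Next;
}
```

For two 4-byte, 4-aligned objects this sketch assigns offsets 0 and 4,
which is the same pair of offsets the strengthened test checks against
%SP and %SPL.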
Test Plan:
Strengthened test/CodeGen/NVPTX/local-stack-frame.ll to check the
offset calculation based on SP and SPL.
Reviewers: jholewinski, jingyue
Reviewed By: jingyue
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D10853
llvm-svn: 241185
diff --git a/llvm/test/CodeGen/NVPTX/local-stack-frame.ll b/llvm/test/CodeGen/NVPTX/local-stack-frame.ll
index fba5dd8..ef1b7da 100644
--- a/llvm/test/CodeGen/NVPTX/local-stack-frame.ll
+++ b/llvm/test/CodeGen/NVPTX/local-stack-frame.ll
@@ -59,10 +59,16 @@
; PTX32: cvta.local.u32 %SP, %SPL;
; PTX32: add.u32 {{%r[0-9]+}}, %SP, 0;
+; PTX32: add.u32 {{%r[0-9]+}}, %SPL, 0;
+; PTX32: add.u32 {{%r[0-9]+}}, %SP, 4;
+; PTX32: add.u32 {{%r[0-9]+}}, %SPL, 4;
; PTX32: st.local.u32 [{{%r[0-9]+}}], {{%r[0-9]+}}
; PTX32: st.local.u32 [{{%r[0-9]+}}], {{%r[0-9]+}}
; PTX64: cvta.local.u64 %SP, %SPL;
; PTX64: add.u64 {{%rd[0-9]+}}, %SP, 0;
+; PTX64: add.u64 {{%rd[0-9]+}}, %SPL, 0;
+; PTX64: add.u64 {{%rd[0-9]+}}, %SP, 4;
+; PTX64: add.u64 {{%rd[0-9]+}}, %SPL, 4;
; PTX64: st.local.u32 [{{%rd[0-9]+}}], {{%r[0-9]+}}
; PTX64: st.local.u32 [{{%rd[0-9]+}}], {{%r[0-9]+}}
define void @foo4() {