[ARM] Fix incorrect conversion of a tail call to an ordinary call
When we emit a tail call for Armv8-M, but then discover that the caller needs to
save/restore `LR`, we convert the tail call to an ordinary one, since restoring
`LR` takes extra instructions, which may negate the benefits of the tail
call. If the callee, however, takes stack arguments, this conversion is
incorrect, since nothing has been done to pass the stack arguments.
Thus the patch reverts https://reviews.llvm.org/rL294000
Also, we improve the instruction sequence for popping `LR` in the case when we
couldn't immediately find a scratch low register, but we can use as a temporary
one of the callee-saved low registers and restore `LR` before popping other
callee-saves.
Differential Revision: https://reviews.llvm.org/D39599
llvm-svn: 318143
diff --git a/llvm/lib/Target/ARM/ARMSubtarget.cpp b/llvm/lib/Target/ARM/ARMSubtarget.cpp
index 7e107e4..e3855cc 100644
--- a/llvm/lib/Target/ARM/ARMSubtarget.cpp
+++ b/llvm/lib/Target/ARM/ARMSubtarget.cpp
@@ -221,8 +221,8 @@
// baseline, since the LDM/POP instruction on Thumb doesn't take LR. This
// means if we need to reload LR, it takes extra instructions, which outweighs
// the value of the tail call; but here we don't know yet whether LR is going
- // to be used. We generate the tail call here and turn it back into CALL/RET
- // in emitEpilogue if LR is used.
+ // to be used. We take the optimistic approach of generating the tail call and
+ // perhaps taking a hit if we need to restore the LR.
// Thumb1 PIC calls to external symbols use BX, so they can be tail calls,
// but we need to make sure there are enough registers; the only valid