[AArch64] Combine callee-save and local stack SP adjustment instructions.

Summary:
If a function needs to allocate both callee-save stack memory and local
stack memory, we currently decrement/increment the SP in two steps:
first for the callee-save area, and then for the local stack area.  This
changes the code to allocate them both at once at the very beginning/end
of the function.  This has two benefits:

1) there is one fewer sub/add micro-op in the prologue/epilogue

2) the stack adjustment instructions act as a scheduling barrier, so
moving them to the very beginning/end of the function increases the
post-RA scheduler's ability to move instructions (that only depend on
argument registers) before any of the callee-save stores
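
As an illustration, here is roughly how the prologue and epilogue change
for a function with a 16-byte callee-save area and 16 bytes of locals.
The prologue lines and the ldp lines mirror the updated CHECK lines for
novla_nodynamicrealign_call; the epilogue "add sp" lines are not checked
by that test and are shown here as an assumption.

```
// Before: SP is adjusted in two steps.
stp	x19, x30, [sp, #-16]!	// allocate callee-save area (pre-index writeback)
sub	sp, sp, #16		// allocate local stack area
// ... function body ...
add	sp, sp, #16		// assumed: free local stack area
ldp	x19, x30, [sp], #16	// restore and free callee-save area (post-index)
ret

// After: SP is adjusted once at each end of the function.
sub	sp, sp, #32		// allocate callee-save and local areas together
stp	x19, x30, [sp, #16]	// plain store, no writeback
// ... function body ...
ldp	x19, x30, [sp, #16]	// plain load, no writeback
add	sp, sp, #32		// assumed: free both areas together
ret
```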

This change can increase the instruction count if the original local
stack SP decrement could be folded into the first store to the stack.
This occurs when the first local stack store is to stack offset 0.  In
this case we are trading off one more sub instruction for one fewer sub
micro-op (along with benefit (2) above).
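
A hypothetical sketch of that trade-off follows.  The register choice
(x0) and the frame sizes are made up for illustration, and the
micro-op counts assume a core that splits a writeback store into a store
micro-op plus an SP-update micro-op, which may not hold everywhere.

```
// Before: the local SP decrement is folded into the first local store.
stp	x19, x30, [sp, #-16]!	// store + SP update
str	x0, [sp, #-16]!		// store + SP update (folded sub)
// 2 instructions, 2 SP-update micro-ops

// After: one explicit sub, then plain stores.
sub	sp, sp, #32		// single SP update
stp	x19, x30, [sp, #16]
str	x0, [sp]		// first local store at offset 0
// 3 instructions, 1 SP-update micro-op
```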

Reviewers: t.p.northover

Subscribers: aemerson, rengolin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18619

llvm-svn: 268746
diff --git a/llvm/test/CodeGen/AArch64/aarch64-dynamic-stack-layout.ll b/llvm/test/CodeGen/AArch64/aarch64-dynamic-stack-layout.ll
index 73b6801..9429e87 100644
--- a/llvm/test/CodeGen/AArch64/aarch64-dynamic-stack-layout.ll
+++ b/llvm/test/CodeGen/AArch64/aarch64-dynamic-stack-layout.ll
@@ -98,8 +98,8 @@
 ; CHECK-LABEL: novla_nodynamicrealign_call
 ; CHECK: .cfi_startproc
 ;   Check that used callee-saved registers are saved
-; CHECK: stp	x19, x30, [sp, #-16]!
-; CHECK: sub	sp, sp, #16
+; CHECK: sub	sp, sp, #32
+; CHECK: stp	x19, x30, [sp, #16]
 ;   Check correctness of cfi pseudo-instructions
 ; CHECK: .cfi_def_cfa_offset 32
 ; CHECK: .cfi_offset w30, -8
@@ -110,17 +110,18 @@
 ;   Check correct access to local variable on the stack, through stack pointer
 ; CHECK: ldr	w[[ILOC:[0-9]+]], [sp, #12]
 ;   Check epilogue:
-; CHECK: ldp	x19, x30, [sp], #16
+; CHECK: ldp	x19, x30, [sp, #16]
 ; CHECK: ret
 ; CHECK: .cfi_endproc
 
 ; CHECK-MACHO-LABEL: _novla_nodynamicrealign_call:
 ; CHECK-MACHO: .cfi_startproc
 ;   Check that used callee-saved registers are saved
-; CHECK-MACHO: stp	x20, x19, [sp, #-32]!
+; CHECK-MACHO: sub	sp, sp, #48
+; CHECK-MACHO: stp	x20, x19, [sp, #16]
 ;   Check that the frame pointer is created:
-; CHECK-MACHO: stp	x29, x30, [sp, #16]
-; CHECK-MACHO: add	x29, sp, #16
+; CHECK-MACHO: stp	x29, x30, [sp, #32]
+; CHECK-MACHO: add	x29, sp, #32
 ;   Check correctness of cfi pseudo-instructions
 ; CHECK-MACHO: .cfi_def_cfa w29, 16
 ; CHECK-MACHO: .cfi_offset w30, -8
@@ -133,8 +134,8 @@
 ;   Check correct access to local variable on the stack, through stack pointer
 ; CHECK-MACHO: ldr	w[[ILOC:[0-9]+]], [sp, #12]
 ;   Check epilogue:
-; CHECK-MACHO: ldp	x29, x30, [sp, #16]
-; CHECK-MACHO: ldp	x20, x19, [sp], #32
+; CHECK-MACHO: ldp	x29, x30, [sp, #32]
+; CHECK-MACHO: ldp	x20, x19, [sp, #16]
 ; CHECK-MACHO: ret
 ; CHECK-MACHO: .cfi_endproc