x86: Use -maccumulate-outgoing-args for sane mcount prologues

commit 746357d (x86: Prevent GCC 4.4.x (pentium-mmx et al) function
prologue wreckage) uses -mtune=generic to work around the function
prologue problem with mcount on -march=pentium-mmx and others.

Jakub pointed out that we can use -maccumulate-outgoing-args instead
which is selected by -mtune=generic and prevents the problem without
losing the -march specific optimizations.

Pointed-out-by: Jakub Jelinek <jakub@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@kernel.org
diff --git a/arch/x86/Makefile_32.cpu b/arch/x86/Makefile_32.cpu
index df7fdf8..1937226 100644
--- a/arch/x86/Makefile_32.cpu
+++ b/arch/x86/Makefile_32.cpu
@@ -49,8 +49,9 @@
 # Work around the pentium-mmx code generator madness of gcc4.4.x which
 # does stack alignment by generating horrible code _before_ the mcount
 # prologue (push %ebp, mov %esp, %ebp) which breaks the function graph
-# tracer assumptions 
-cflags-$(CONFIG_FUNCTION_GRAPH_TRACER) += $(call cc-option,-mtune=generic)
+# tracer assumptions. For i686, generic, core2 this is set by the
+# compiler anyway
+cflags-$(CONFIG_FUNCTION_GRAPH_TRACER) += $(call cc-option,-maccumulate-outgoing-args)
 
 # Bug fix for binutils: this option is required in order to keep
 # binutils from generating NOPL instructions against our will.