powerpc: Fix RCU idle and hcall tracing
Tracepoints should not be called inside an rcu_idle_enter/rcu_idle_exit
region. Since pSeries calls H_CEDE in the idle loop, we were violating
this rule.
commit a7b152d5342c (powerpc: Tell RCU about idle after hcall tracing)
tried to work around it by delaying the rcu_idle_enter until after we
called the hcall tracepoint, but there are a number of issues with it.
The hcall tracepoint trampoline code is called conditionally when the
tracepoint is enabled. If the tracepoint is not enabled we never call
rcu_idle_enter. The idle_uses_rcu check was also done at compile time
which breaks multiplatform builds.
The simple fix is to avoid tracing H_CEDE and rely on other tracepoints
and the hypervisor dispatch trace log to work out if we called H_CEDE.
This fixes a hang during boot on pSeries.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 948e0e3..7bc73af 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -546,6 +546,13 @@
unsigned long flags;
unsigned int *depth;
+ /*
+ * We cannot call tracepoints inside RCU idle regions which
+ * means we must not trace H_CEDE.
+ */
+ if (opcode == H_CEDE)
+ return;
+
local_irq_save(flags);
depth = &__get_cpu_var(hcall_trace_depth);
@@ -556,8 +563,6 @@
(*depth)++;
preempt_disable();
trace_hcall_entry(opcode, args);
- if (opcode == H_CEDE)
- rcu_idle_enter();
(*depth)--;
out:
@@ -570,6 +575,9 @@
unsigned long flags;
unsigned int *depth;
+ if (opcode == H_CEDE)
+ return;
+
local_irq_save(flags);
depth = &__get_cpu_var(hcall_trace_depth);
@@ -578,8 +586,6 @@
goto out;
(*depth)++;
- if (opcode == H_CEDE)
- rcu_idle_exit();
trace_hcall_exit(opcode, retval, retbuf);
preempt_enable();
(*depth)--;