kthread: Ensure task isn't preempted before dequeue in kthread_parkme

kthread_park waits for the target thread to park itself with
kthread_parkme using a completion variable. kthread_parkme -
which is invoked by the target thread - sets the completion
variable before calling schedule to get itself off of the
runqueue.

This causes an interesting race in the hotplug path. takedown_cpu
invoked for CPU X attempts to park the cpuhp/X thread before
running the stopper thread on CPU X. There is a guarantee that
the task state of cpuhp/X is set to TASK_PARKED, but there is no
guarantee that it's actually off of the runqueue when kthread_park
returns. takedown_cpu proceeds to run the stopper thread on CPU X,
which promptly migrates the still-on-rq cpuhp/X thread to another
CPU, CPU Y.

All of this is actually OK - cpuhp/X may finally get itself off
of CPU Y's runqueue at some later point. However, let's assume
CPU Y has a rather long running RT task, and cpuhp/X doesn't
actually get to run. Now for whatever reason CPU X is brought online
again, and an attempt is made to unpark cpuhp/X in cpuhp_online_idle
with preemption disabled. kthread_unpark calls kthread_bind_mask,
which finds that the task is still active, leading to a schedule()
call in wait_task_inactive, causing a "scheduling while atomic"
BUG.

We could force the hotplug thread to actually wait for smpboot
threads to get off of the runqueue - but this sort of defeats the
lightweight nature of parking for everyone else. Let's simply
ensure that setting the completion variable and calling schedule()
are atomic with respect to preemption. This completely fixes the
hotplug versus kthread_parkme race.
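
The ordering issue can be illustrated with a small userspace model.
This is only a sketch under stated assumptions, not the kernel code
touched by this change: preempt_count, complete_parked and
schedule_model are hypothetical stand-ins for the per-task preempt
counter, complete(&self->parked) and schedule(), and the model only
records whether the waiter is woken while the parking task could
still be preempted.

```c
#include <assert.h>

/* Userspace model only; these stubs are NOT the kernel primitives. */
static int preempt_count;    /* > 0 means the task cannot be preempted */
static int woke_preemptible; /* waiter woken while parker was preemptible */

static void preempt_disable(void)
{
	preempt_count++;
}

static void preempt_enable(void)
{
	assert(preempt_count > 0);
	preempt_count--;
}

/* Models complete(&self->parked): kthread_park may return from here on. */
static void complete_parked(void)
{
	woke_preemptible = (preempt_count == 0);
}

/* Models the task taking itself off the runqueue (hypothetical no-op). */
static void schedule_model(void)
{
}

/* Racy ordering: the task can be preempted after signalling completion. */
static int parkme_unprotected(void)
{
	woke_preemptible = 0;
	complete_parked();
	schedule_model();
	return woke_preemptible;
}

/* Completion and schedule inside one non-preemptible section. */
static int parkme_protected(void)
{
	woke_preemptible = 0;
	preempt_disable();
	complete_parked();
	schedule_model();
	preempt_enable();
	return woke_preemptible;
}
```

In this model parkme_unprotected() reports that the waiter was woken
while the parker was still preemptible, and parkme_protected() reports
that it was not. In the kernel, one plausible shape for the protected
variant is preempt_disable(), complete(), schedule_preempt_disabled(),
preempt_enable(); whether this change uses exactly that sequence is
an assumption.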

Change-Id: Ia624b07119462911a9d4d367100408f4426cb6f6
Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>