hotplugcpu: Avoid deadlocks by waking active_writer
Commit b2c4623dcd07 ("rcu: More on deadlock between CPU hotplug and expedited
grace periods") introduced another problem that can easily be reproduced by
starting/stopping cpus in a loop.
E.g.:
for i in `seq 5000`; do
echo 1 > /sys/devices/system/cpu/cpu1/online
echo 0 > /sys/devices/system/cpu/cpu1/online
done
Will result in:
INFO: task /cpu_start_stop:1 blocked for more than 120 seconds.
Call Trace:
([<00000000006a028e>] __schedule+0x406/0x91c)
[<0000000000130f60>] cpu_hotplug_begin+0xd0/0xd4
[<0000000000130ff6>] _cpu_up+0x3e/0x1c4
[<0000000000131232>] cpu_up+0xb6/0xd4
[<00000000004a5720>] device_online+0x80/0xc0
[<00000000004a57f0>] online_store+0x90/0xb0
...
And a deadlock.
Problem is that if the last ref in put_online_cpus() can't get the
cpu_hotplug.lock the puts_pending count is incremented, but a sleeping
active_writer might never be woken up, therefore never exiting the loop in
cpu_hotplug_begin().
This fix removes puts_pending and turns refcount into an atomic variable. We
also introduce a wait queue for the active_writer, to avoid possible races and
use-after-free. There is no need to take the lock in put_online_cpus() anymore.
Can't reproduce it with this fix.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
1 file changed