===========================================================================
Proper Locking Under a Preemptible Kernel: Keeping Kernel Code Preempt-Safe
===========================================================================

:Author: Robert Love <rml@tech9.net>
:Last Updated: 28 Aug 2002


Introduction
============


A preemptible kernel creates new locking issues.  The issues are the same as
those under SMP: concurrency and reentrancy.  Thankfully, the Linux preemptible
kernel model leverages existing SMP locking mechanisms.  Thus, the kernel
requires explicit additional locking in only a few additional situations.

This document is for all kernel hackers.  Anyone developing code in the kernel
must protect the situations described in the rules that follow.


RULE #1: Per-CPU data structures need explicit protection
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


Two similar problems arise. An example code snippet::

	struct this_needs_locking tux[NR_CPUS];
	tux[smp_processor_id()] = some_value;
	/* task is preempted here... */
	something = tux[smp_processor_id()];

First, since the data is per-CPU, it may not have explicit SMP locking, yet it
still needs protection against preemption.  Second, when a preempted task is
finally rescheduled, the value returned by smp_processor_id() may no longer
equal the previous one, because the task may resume on a different CPU.  You
must protect these situations by disabling preemption around them.

You can also use get_cpu() and put_cpu(): get_cpu() disables preemption and
returns the current CPU number, and put_cpu() re-enables preemption.
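
The second problem can be illustrated with a small userspace model.  This is
only a sketch, not kernel code: ``model_get_cpu()``, ``model_put_cpu()`` and
``model_try_migrate()`` are hypothetical stand-ins for get_cpu(), put_cpu() and
the scheduler, and the model assumes a task can change CPUs only while its
preempt count is zero.

```c
#include <assert.h>

#define NR_CPUS 4		/* small value chosen for the model */

int model_preempt_count;	/* stand-in for the preempt counter */
int model_cpu;			/* CPU the modelled task runs on */
int tux[NR_CPUS];		/* per-CPU data, one slot per CPU */

/* Stand-in for get_cpu(): disable preemption, then report the CPU. */
int model_get_cpu(void)
{
	model_preempt_count++;
	return model_cpu;
}

/* Stand-in for put_cpu(): re-enable preemption. */
void model_put_cpu(void)
{
	assert(model_preempt_count > 0);
	model_preempt_count--;
}

/* Hypothetical scheduler hook: move the task only if it is preemptible. */
void model_try_migrate(void)
{
	if (model_preempt_count == 0)
		model_cpu = (model_cpu + 1) % NR_CPUS;
}

/* Write a per-CPU slot and read the same slot back. */
int store_and_load(int some_value)
{
	int cpu = model_get_cpu();	/* CPU number is read once and pinned */
	int something;

	tux[cpu] = some_value;
	model_try_migrate();		/* preemption attempt: has no effect */
	something = tux[cpu];
	model_put_cpu();

	return something;
}
```

Because the CPU number is captured once while the count is raised, the store
and the load are guaranteed to hit the same slot in this model.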


RULE #2: CPU state must be protected.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


Under preemption, the state of the CPU must be protected.  This is
arch-dependent, but includes CPU structures and state not preserved over a
context switch.  For example, on x86, entering and exiting FPU mode is now a
critical section that must occur while preemption is disabled.  Think what
would happen if the kernel is executing a floating-point instruction and is
then preempted.  Remember, the kernel does not save FPU state except for user
tasks.  Therefore, upon preemption, the FPU registers will be sold to the
lowest bidder: whichever task runs next may overwrite them.  Thus, preemption
must be disabled around such regions.

Note that some FPU functions are already explicitly preempt-safe.  For example,
kernel_fpu_begin() and kernel_fpu_end() disable and enable preemption,
respectively.  However, fpu__restore() must be called with preemption disabled.


RULE #3: Lock acquire and release must be performed by same task
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


A lock acquired in one task must be released by the same task.  This
means you can't do oddball things like acquire a lock and go off to
play while another task releases it.  If you want to do something
like this, acquire and release the lock in the same code path and
have the caller wait on an event signalled by the other task.
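
The shape of that pattern can be sketched in userspace with POSIX threads.
This is an analogue only, not kernel code (the kernel has its own primitives
for waiting on events, such as completions), and every name in it is
hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* All names below are hypothetical and exist only for this sketch. */
pthread_mutex_t data_lock = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t event_lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t event_cond = PTHREAD_COND_INITIALIZER;
bool event_done;
int shared_value;

/* The other task: takes and drops data_lock itself, then signals. */
void *helper(void *unused)
{
	(void)unused;

	pthread_mutex_lock(&data_lock);
	shared_value++;
	pthread_mutex_unlock(&data_lock);

	pthread_mutex_lock(&event_lock);
	event_done = true;
	pthread_cond_signal(&event_cond);
	pthread_mutex_unlock(&event_lock);
	return NULL;
}

/* The caller: never leaves data_lock held for someone else to release. */
int run_with_helper(int start)
{
	pthread_t tid;
	int rc, result;

	event_done = false;	/* no helper exists yet, so this is unshared */

	pthread_mutex_lock(&data_lock);
	shared_value = start;
	pthread_mutex_unlock(&data_lock);	/* same thread that locked it */

	if (pthread_create(&tid, NULL, helper, NULL) != 0)
		return -1;

	/* Wait on the event instead of handing the lock over. */
	pthread_mutex_lock(&event_lock);
	while (!event_done)
		pthread_cond_wait(&event_cond, &event_lock);
	pthread_mutex_unlock(&event_lock);

	rc = pthread_join(tid, NULL);
	assert(rc == 0);
	(void)rc;

	pthread_mutex_lock(&data_lock);
	result = shared_value;
	pthread_mutex_unlock(&data_lock);
	return result;
}
```

Each lock/unlock pair sits in a single function executed by a single thread;
the only thing that crosses between the two threads is the event.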


Solution
========


Data protection under preemption is achieved by disabling preemption for the
duration of the critical region.

::

  preempt_enable()              decrement the preempt counter
  preempt_disable()             increment the preempt counter
  preempt_enable_no_resched()   decrement, but do not immediately preempt
  preempt_check_resched()       if needed, reschedule
  preempt_count()               return the preempt counter

The functions are nestable.  In other words, you can call preempt_disable()
n times in a code path, and preemption will not be re-enabled until the n-th
call to preempt_enable().  The preempt statements are defined to nothing if
preemption is not configured into the kernel.
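
The nesting behaviour can be modelled with a plain counter.  The sketch below
is a userspace model, not the kernel implementation: the ``model_*`` names are
hypothetical, and the model leaves out the reschedule check that the real
preempt_enable() performs.

```c
#include <assert.h>
#include <stdbool.h>

int model_count;	/* stand-in for the preempt counter */

void model_preempt_disable(void)
{
	model_count++;
}

void model_preempt_enable(void)
{
	assert(model_count > 0);	/* every enable pairs with a disable */
	model_count--;
}

/* Preemption is possible only when the counter is back at zero. */
bool model_preemptible(void)
{
	return model_count == 0;
}

/* A callee that protects its own critical region. */
void inner(void)
{
	model_preempt_disable();
	/* ... touch per-CPU data ... */
	model_preempt_enable();	/* drops back to the caller's level */
}

/* Returns true if the caller's region is still protected after inner(). */
bool outer_still_protected(void)
{
	bool still_protected;

	model_preempt_disable();
	inner();
	still_protected = !model_preemptible();
	model_preempt_enable();
	return still_protected;
}
```

In this model the enable inside ``inner()`` does not open a window in the
caller's critical region, which is why callees can protect themselves without
knowing their callers.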

Note that you do not need to explicitly prevent preemption if you are holding
any locks or if interrupts are disabled, since preemption is implicitly
disabled in those cases.

But keep in mind that 'irqs disabled' is a fundamentally unsafe way of
disabling preemption: any spin_unlock() that decreases the preemption count
to 0 might trigger a reschedule.  Even a simple printk() might trigger a
reschedule.  So use this implicit preemption-disabling property only if you
know that the affected code path does none of this.  The best policy is to use
it only for small, atomic code that you wrote and which calls no complex
functions.

Example::

	cpucache_t *cc; /* this is per-CPU */
	preempt_disable();
	cc = cc_data(searchp);
	if (cc && cc->avail) {
		__free_block(searchp, cc_entry(cc), cc->avail);
		cc->avail = 0;
	}
	preempt_enable();
	return 0;

Notice how the preemption statements must encompass every reference to the
critical variables.  Another example::

	int buf[NR_CPUS];
	set_cpu_val(buf);
	if (buf[smp_processor_id()] == -1) printk(KERN_INFO "wee!\n");
	spin_lock(&buf_lock);
	/* ... */
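
This code is not preempt-safe, because buf is indexed by CPU before the lock
is taken.  See how easily we can fix it by simply moving the spin_lock() up two
lines, so that the per-CPU accesses run with preemption implicitly disabled.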


Preventing preemption using interrupt disabling
===============================================


It is possible to prevent a preemption event using local_irq_disable() and
local_irq_save().  When doing so, you must be very careful not to cause
an event that would set need_resched and result in a preemption check.  When
in doubt, rely on locking or explicit preemption disabling.

Note that in 2.5, interrupt disabling is only per-CPU (i.e. local).

An additional concern is proper usage of local_irq_disable() and
local_irq_save().  These may be used to protect from preemption; however, on
exit, if preemption may be enabled, a test to see if preemption is required
should be done.  If these are called from the spin_lock and read/write lock
macros, the right thing is done.  They may also be called within a spin-lock
protected region; however, if they are ever called outside of this context, a
test for preemption should be made.  Do note that calls from interrupt context
or bottom halves/tasklets are also protected by preemption locks and so may
use the versions which do not check preemption.
145not check preemption.