lglock - local/global locks for mostly local access patterns
------------------------------------------------------------

Origin: Nick Piggin's VFS scalability series introduced during
        2.6.35++ [1] [2]
Location: kernel/locking/lglock.c
        include/linux/lglock.h
Users: currently only the VFS and stop_machine related code

Design Goal:
------------

Improve scalability of globally used large data sets that are
distributed over all CPUs as per_cpu elements.

lglocks manage global data structures that are partitioned over all
CPUs as per_cpu elements but can mostly be handled by CPU-local
actions. They are intended for cases where the majority of accesses
are CPU-local reads and occasional CPU-local writes, with very
infrequent global write access.


* deal with things locally whenever possible
        - very fast access to the local per_cpu data
        - reasonably fast access to specific per_cpu data on a different
          CPU
* while making global action possible when needed
        - by expensive access to all CPUs' locks - effectively
          resulting in a globally visible critical section.

Design:
-------

Basically it is an array of per_cpu spinlocks, with
lg_local_lock/unlock accessing the local CPU's lock object and
lg_local_lock_cpu/unlock_cpu accessing a remote CPU's lock object.
lg_local_lock has to disable preemption as migration protection so
that the reference to the local CPU's lock does not go out of scope.
Because lg_local_lock/unlock only touch CPU-local resources, they
are fast. Taking the local lock of a different CPU is more
expensive but still relatively cheap.

One can relax the migration constraints by acquiring the current
CPU's lock with lg_local_lock_cpu, remembering the cpu, and releasing
that lock at the end of the critical section even if migrated. This
should give most of the performance benefits without inhibiting
migration, though it needs careful consideration of lglock nesting
and of deadlocks with lg_global_lock.

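The relaxed pattern above can be sketched as a small userspace model.
This is only an illustration under stated assumptions, not kernel code:
C11 atomic_flag spinlocks stand in for the per_cpu locks, MODEL_NR_CPUS
is an arbitrary CPU count, model_current_cpu simulates the CPU the task
runs on, and every model_* name is hypothetical. The point it shows is
that the CPU is sampled once and the same remembered index is used for
the unlock, even though the task has "migrated" in between.

```c
#include <assert.h>
#include <stdatomic.h>

#define MODEL_NR_CPUS 4            /* assumption: number of possible CPUs */

/* Hypothetical per-CPU spinlocks and the per-CPU data they protect. */
static atomic_flag model_lock[MODEL_NR_CPUS] = {
    ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT, ATOMIC_FLAG_INIT
};
static long model_data[MODEL_NR_CPUS];

/* Simulated "CPU this task currently runs on"; it may change at any time. */
static int model_current_cpu;

void model_lock_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    while (atomic_flag_test_and_set_explicit(&model_lock[cpu],
                                             memory_order_acquire))
        ;   /* spin until the holder releases this CPU's lock */
}

void model_unlock_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    atomic_flag_clear_explicit(&model_lock[cpu], memory_order_release);
}

/*
 * Relaxed pattern: sample the CPU once, take that CPU's lock, and release
 * the *remembered* CPU's lock even if the task has moved in between.
 * Returns the CPU whose data was updated.
 */
int model_update_relaxed(long delta, int migrate_to)
{
    int cpu = model_current_cpu;    /* remember which lock we take */

    model_lock_cpu(cpu);
    model_current_cpu = migrate_to; /* simulate migration inside the section */
    model_data[cpu] += delta;       /* still protected by the remembered lock */
    model_unlock_cpu(cpu);          /* the remembered cpu, not the current one */
    return cpu;
}
```

In kernel terms the remembered index would be the cpu argument passed to
both lg_local_lock_cpu and lg_local_unlock_cpu. Unlocking with the current
CPU instead of the remembered one would release a lock that was never taken.
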
lg_global_lock/unlock locks all underlying spinlocks of all
possible CPUs (including those off-line). The preemption disable/enable
is needed in non-RT kernels to prevent deadlocks like:

                     on cpu 1

              task A          task B
         lg_global_lock
           got cpu 0 lock
                 <<<< preempt <<<<
                         lg_local_lock_cpu for cpu 0
                           spin on cpu 0 lock

On -RT this deadlock scenario is resolved by the arch_spin_locks in the
lglocks being replaced by rt_mutexes, which resolve the above deadlock
by boosting the lock-holder.

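The overall design can be modelled in userspace as follows. This is a
minimal sketch under stated assumptions, not the kernel implementation:
C11 atomic_flag spinlocks stand in for the per_cpu arch_spin_locks,
MODEL_NR_CPUS is an arbitrary fixed count of possible CPUs, and every
model_* name is hypothetical. Preemption and migration control have no
userspace equivalent and are left out.

```c
#include <assert.h>
#include <stdatomic.h>

#define MODEL_NR_CPUS 4   /* assumption: fixed number of possible CPUs */

/* Hypothetical userspace stand-in for an lglock: one spinlock per CPU. */
struct model_lglock {
    atomic_flag lock[MODEL_NR_CPUS];
};

void model_lg_init(struct model_lglock *lg)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        atomic_flag_clear(&lg->lock[cpu]);
}

/* Local path: only the lock of one specific CPU is touched. */
void model_lg_local_lock_cpu(struct model_lglock *lg, int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    while (atomic_flag_test_and_set_explicit(&lg->lock[cpu],
                                             memory_order_acquire))
        ;   /* spin until the current holder releases the lock */
}

void model_lg_local_unlock_cpu(struct model_lglock *lg, int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    atomic_flag_clear_explicit(&lg->lock[cpu], memory_order_release);
}

/* Global path: every per-CPU lock is taken, in ascending CPU order. */
void model_lg_global_lock(struct model_lglock *lg)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        model_lg_local_lock_cpu(lg, cpu);
}

void model_lg_global_unlock(struct model_lglock *lg)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        model_lg_local_unlock_cpu(lg, cpu);
}
```

The model makes the cost asymmetry visible: the local path performs one
atomic operation, while the global path performs one per possible CPU and
excludes every local user until it is released.
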

Implementation:
---------------

The initial lglock implementation from Nick Piggin used some complex
macros to generate the lglock/brlock in lglock.h - they were later
turned into a set of functions by Andi Kleen [7]. The change to functions
was motivated by the presence of multiple lock users and also by them
being easier to maintain than the generating macros. This change to
functions is also the basis for eliminating the restriction of not
being initializable in kernel modules (the remaining problem is that
locks are not explicitly initialized - see lockdep-design.txt)

Declaration and initialization:
-------------------------------

  #include <linux/lglock.h>

  DEFINE_LGLOCK(name);
  or:
  DEFINE_STATIC_LGLOCK(name);

  lg_lock_init(&name, "lockdep_name_string");

  On UP this is mapped to DEFINE_SPINLOCK(name) in both cases. Note
  also that as of 3.18-rc6 all declarations in use are of the _STATIC_
  variant (and it seems that the non-static variant was never in use).
  lg_lock_init initializes the lockdep map only.

Usage:
------

From the locking semantics it is a spinlock. It could be called a
locality aware spinlock. lg_local_* behaves like a per_cpu
spinlock and lg_global_* like a global spinlock.
No surprises in the API.

  lg_local_lock(*lglock);
     access to protected per_cpu object on this CPU
  lg_local_unlock(*lglock);

  lg_local_lock_cpu(*lglock, cpu);
     access to protected per_cpu object on other CPU cpu
  lg_local_unlock_cpu(*lglock, cpu);

  lg_global_lock(*lglock);
     access all protected per_cpu objects on all CPUs
  lg_global_unlock(*lglock);

  There are no _trylock variants of the lglocks.

Note that lg_global_lock/unlock has to iterate over all possible
CPUs rather than only the CPUs actually present, or a CPU could go
off-line with a held lock [4]; this makes it very expensive. A
discussion of these issues can be found at [5].

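A typical access pattern for this API is a per-CPU counter that is
updated locally and only rarely summed over all CPUs. The following is a
userspace sketch of that pattern under stated assumptions, not kernel
code: atomic_flag spinlocks play the role of the per_cpu locks,
MODEL_NR_CPUS is an arbitrary CPU count, and the model_counter_* names
are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

#define MODEL_NR_CPUS 4   /* assumption: fixed number of possible CPUs */

/* Hypothetical per-CPU counter: one lock and one count per CPU. */
struct model_counter {
    atomic_flag lock[MODEL_NR_CPUS];
    long count[MODEL_NR_CPUS];
};

void model_counter_init(struct model_counter *c)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
        atomic_flag_clear(&c->lock[cpu]);
        c->count[cpu] = 0;
    }
}

/* Common path, in the role of lg_local_lock_cpu: one lock is taken. */
void model_counter_add(struct model_counter *c, int cpu, long delta)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    while (atomic_flag_test_and_set_explicit(&c->lock[cpu],
                                             memory_order_acquire))
        ;   /* spin until this CPU's lock is free */
    c->count[cpu] += delta;
    atomic_flag_clear_explicit(&c->lock[cpu], memory_order_release);
}

/* Rare path, in the role of lg_global_lock: every lock is taken. */
long model_counter_sum(struct model_counter *c)
{
    long sum = 0;
    int cpu;

    for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        while (atomic_flag_test_and_set_explicit(&c->lock[cpu],
                                                 memory_order_acquire))
            ;   /* spin until this CPU's lock is free */
    for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        sum += c->count[cpu];   /* consistent snapshot: all locks held */
    for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        atomic_flag_clear_explicit(&c->lock[cpu], memory_order_release);
    return sum;
}
```

The sum is consistent only because all per-CPU locks are held at the same
time, which is also why the global path costs one lock operation per
possible CPU and should stay rare.
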
Constraints:
------------

  * currently the declaration of lglocks in kernel modules is not
    possible, though this should be doable with little change.
  * lglocks are not recursive.
  * suitable for code that can do most operations on the CPU local
    data and will very rarely need the global lock
  * lg_global_lock/unlock is *very* expensive and does not scale
  * on UP systems all lg_* primitives are simply spinlocks
  * in PREEMPT_RT the spinlock becomes an rt-mutex and can sleep but
    does not change the task's state while sleeping [6].
  * in PREEMPT_RT the preempt_disable/enable in lg_local_lock/unlock
    is downgraded to a migrate_disable/enable, the other
    preempt_disable/enable are downgraded to barriers [6].
    The deadlock noted for non-RT above is resolved due to rt_mutexes
    boosting the lock-holder in this case, which arch_spin_locks do
    not do.

lglocks were designed for very specific problems in the VFS and are
probably the right answer only in these corner cases. Any new user who
looks at lglocks probably wants to look at the seqlock and RCU
alternatives as her first choice. There are also efforts to resolve the
RCU issues that currently prevent using RCU in place of the few
remaining lglocks.

Note on brlock history:
-----------------------

The 'Big Reader' read-write spinlocks were originally introduced by
Ingo Molnar in 2000 (2.4/2.5 kernel series) and removed in 2003. They
were later reintroduced by the VFS scalability patch set in the 2.6
series as the "big reader lock" brlock [2] variant of lglock, which has
been replaced by seqlock primitives or by RCU based primitives in the
3.13 kernel series, as was suggested in [3] in 2003. The brlock was
entirely removed in the 3.13 kernel series.

Link: 1 http://lkml.org/lkml/2010/8/2/81
Link: 2 http://lwn.net/Articles/401738/
Link: 3 http://lkml.org/lkml/2003/3/9/205
Link: 4 https://lkml.org/lkml/2011/8/24/185
Link: 5 http://lkml.org/lkml/2011/12/18/189
Link: 6 https://www.kernel.org/pub/linux/kernel/projects/rt/
        patch series - lglocks-rt.patch.patch
Link: 7 http://lkml.org/lkml/2012/3/5/26