123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167 |
- lglock - local/global locks for mostly local access patterns
- ------------------------------------------------------------
- Origin: Nick Piggin's VFS scalability series introduced during
- 2.6.35++ [1] [2]
- Location: kernel/locking/lglock.c
- include/linux/lglock.h
- Users: currently only the VFS and stop_machine related code
- Design Goal:
- ------------
- Improve scalability of globally used large data sets that are
- distributed over all CPUs as per_cpu elements.
- To manage global data structures that are partitioned over all CPUs
- as per_cpu elements but can be mostly handled by CPU local actions
- lglock will be used where the majority of accesses are cpu local
- reading and occasional cpu local writing with very infrequent
- global write access.
- * deal with things locally whenever possible
- - very fast access to the local per_cpu data
- - reasonably fast access to specific per_cpu data on a different
- CPU
- * while making global action possible when needed
- - by expensive access to all CPUs locks - effectively
- resulting in a globally visible critical section.
- Design:
- -------
- Basically it is an array of per_cpu spinlocks with the
- lg_local_lock/unlock accessing the local CPUs lock object and the
- lg_local_lock_cpu/unlock_cpu accessing a remote CPUs lock object
- the lg_local_lock has to disable preemption as migration protection so
- that the reference to the local CPUs lock does not go out of scope.
- Due to the lg_local_lock/unlock only touching cpu-local resources it
- is fast. Taking the local lock on a different CPU will be more
- expensive but still relatively cheap.
- One can relax the migration constraints by acquiring the current
- CPUs lock with lg_local_lock_cpu, remember the cpu, and release that
- lock at the end of the critical section even if migrated. This should
- give most of the performance benefits without inhibiting migration
- though needs careful considerations for nesting of lglocks and
- consideration of deadlocks with lg_global_lock.
- The lg_global_lock/unlock locks all underlying spinlocks of all
- possible CPUs (including those off-line). The preemption disable/enable
- are needed in the non-RT kernels to prevent deadlocks like:
- on cpu 1
- task A task B
- lg_global_lock
- got cpu 0 lock
- <<<< preempt <<<<
- lg_local_lock_cpu for cpu 0
- spin on cpu 0 lock
- On -RT this deadlock scenario is resolved by the arch_spin_locks in the
- lglocks being replaced by rt_mutexes which resolve the above deadlock
- by boosting the lock-holder.
- Implementation:
- ---------------
- The initial lglock implementation from Nick Piggin used some complex
- macros to generate the lglock/brlock in lglock.h - they were later
- turned into a set of functions by Andi Kleen [7]. The change to functions
- was motivated by the presence of multiple lock users and also by them
- being easier to maintain than the generating macros. This change to
- functions is also the basis to eliminated the restriction of not
- being initializeable in kernel modules (the remaining problem is that
- locks are not explicitly initialized - see lockdep-design.txt)
- Declaration and initialization:
- -------------------------------
- #include <linux/lglock.h>
- DEFINE_LGLOCK(name)
- or:
- DEFINE_STATIC_LGLOCK(name);
- lg_lock_init(&name, "lockdep_name_string");
- on UP this is mapped to DEFINE_SPINLOCK(name) in both cases, note
- also that as of 3.18-rc6 all declaration in use are of the _STATIC_
- variant (and it seems that the non-static was never in use).
- lg_lock_init is initializing the lockdep map only.
- Usage:
- ------
- From the locking semantics it is a spinlock. It could be called a
- locality aware spinlock. lg_local_* behaves like a per_cpu
- spinlock and lg_global_* like a global spinlock.
- No surprises in the API.
- lg_local_lock(*lglock);
- access to protected per_cpu object on this CPU
- lg_local_unlock(*lglock);
- lg_local_lock_cpu(*lglock, cpu);
- access to protected per_cpu object on other CPU cpu
- lg_local_unlock_cpu(*lglock, cpu);
- lg_global_lock(*lglock);
- access all protected per_cpu objects on all CPUs
- lg_global_unlock(*lglock);
- There are no _trylock variants of the lglocks.
- Note that the lg_global_lock/unlock has to iterate over all possible
- CPUs rather than the actually present CPUs or a CPU could go off-line
- with a held lock [4] and that makes it very expensive. A discussion on
- these issues can be found at [5]
- Constraints:
- ------------
- * currently the declaration of lglocks in kernel modules is not
- possible, though this should be doable with little change.
- * lglocks are not recursive.
- * suitable for code that can do most operations on the CPU local
- data and will very rarely need the global lock
- * lg_global_lock/unlock is *very* expensive and does not scale
- * on UP systems all lg_* primitives are simply spinlocks
- * in PREEMPT_RT the spinlock becomes an rt-mutex and can sleep but
- does not change the tasks state while sleeping [6].
- * in PREEMPT_RT the preempt_disable/enable in lg_local_lock/unlock
- is downgraded to a migrate_disable/enable, the other
- preempt_disable/enable are downgraded to barriers [6].
- The deadlock noted for non-RT above is resolved due to rt_mutexes
- boosting the lock-holder in this case which arch_spin_locks do
- not do.
- lglocks were designed for very specific problems in the VFS and probably
- only are the right answer in these corner cases. Any new user that looks
- at lglocks probably wants to look at the seqlock and RCU alternatives as
- her first choice. There are also efforts to resolve the RCU issues that
- currently prevent using RCU in place of view remaining lglocks.
- Note on brlock history:
- -----------------------
- The 'Big Reader' read-write spinlocks were originally introduced by
- Ingo Molnar in 2000 (2.4/2.5 kernel series) and removed in 2003. They
- later were introduced by the VFS scalability patch set in 2.6 series
- again as the "big reader lock" brlock [2] variant of lglock which has
- been replaced by seqlock primitives or by RCU based primitives in the
- 3.13 kernel series as was suggested in [3] in 2003. The brlock was
- entirely removed in the 3.13 kernel series.
- Link: 1 http://lkml.org/lkml/2010/8/2/81
- Link: 2 http://lwn.net/Articles/401738/
- Link: 3 http://lkml.org/lkml/2003/3/9/205
- Link: 4 https://lkml.org/lkml/2011/8/24/185
- Link: 5 http://lkml.org/lkml/2011/12/18/189
- Link: 6 https://www.kernel.org/pub/linux/kernel/projects/rt/
- patch series - lglocks-rt.patch.patch
- Link: 7 http://lkml.org/lkml/2012/3/5/26
|