Blame - Documentation/mutex-design.txt - kernel/msm-4.9

blob: 38c10fd7f4110448facd7089b985c4776d264d85 [file] [log] [blame]

Ingo Molnar	f3f54ff	2006-01-09 15:59:20 -0800	[diff] [blame]	1	Generic Mutex Subsystem
				2
				3	started by Ingo Molnar <mingo@redhat.com>
				4
				5	"Why on earth do we need a new mutex subsystem, and what's wrong
				6	with semaphores?"
				7
				8	firstly, there's nothing wrong with semaphores. But if the simpler
				9	mutex semantics are sufficient for your code, then there are a couple
				10	of advantages of mutexes:
				11
Randy Dunlap	ef5dc12	2010-09-02 15:48:16 -0700	[diff] [blame]	12	- 'struct mutex' is smaller on most architectures: E.g. on x86,
Ingo Molnar	f3f54ff	2006-01-09 15:59:20 -0800	[diff] [blame]	13	'struct semaphore' is 20 bytes, 'struct mutex' is 16 bytes.
				14	A smaller structure size means less RAM footprint, and better
				15	CPU-cache utilization.
				16
				17	- tighter code. On x86 i get the following .text sizes when
				18	switching all mutex-alike semaphores in the kernel to the mutex
				19	subsystem:
				20
				21	text data bss dec hex filename
				22	3280380 868188 396860 4545428 455b94 vmlinux-semaphore
				23	3255329 865296 396732 4517357 44eded vmlinux-mutex
				24
				25	that's 25051 bytes of code saved, or a 0.76% win - off the hottest
				26	codepaths of the kernel. (The .data savings are 2892 bytes, or 0.33%)
				27	Smaller code means better icache footprint, which is one of the
				28	major optimization goals in the Linux kernel currently.
				29
				30	- the mutex subsystem is slightly faster and has better scalability for
				31	contended workloads. On an 8-way x86 system, running a mutex-based
				32	kernel and testing creat+unlink+close (of separate, per-task files)
				33	in /tmp with 16 parallel tasks, the average number of ops/sec is:
				34
				35	Semaphores: Mutexes:
				36
				37	$ ./test-mutex V 16 10 $ ./test-mutex V 16 10
				38	8 CPUs, running 16 tasks. 8 CPUs, running 16 tasks.
				39	checking VFS performance. checking VFS performance.
				40	avg loops/sec: 34713 avg loops/sec: 84153
				41	CPU utilization: 63% CPU utilization: 22%
				42
				43	i.e. in this workload, the mutex based kernel was 2.4 times faster
				44	than the semaphore based kernel, _and_ it also had 2.8 times less CPU
				45	utilization. (In terms of 'ops per CPU cycle', the semaphore kernel
				46	performed 551 ops/sec per 1% of CPU time used, while the mutex kernel
				47	performed 3825 ops/sec per 1% of CPU time used - it was 6.9 times
				48	more efficient.)
				49
				50	the scalability difference is visible even on a 2-way P4 HT box:
				51
				52	Semaphores: Mutexes:
				53
				54	$ ./test-mutex V 16 10 $ ./test-mutex V 16 10
				55	4 CPUs, running 16 tasks. 8 CPUs, running 16 tasks.
				56	checking VFS performance. checking VFS performance.
				57	avg loops/sec: 127659 avg loops/sec: 181082
				58	CPU utilization: 100% CPU utilization: 34%
				59
				60	(the straight performance advantage of mutexes is 41%, the per-cycle
				61	efficiency of mutexes is 4.1 times better.)
				62
				63	- there are no fastpath tradeoffs, the mutex fastpath is just as tight
				64	as the semaphore fastpath. On x86, the locking fastpath is 2
				65	instructions:
				66
				67	c0377ccb <mutex_lock>:
				68	c0377ccb: f0 ff 08 lock decl (%eax)
Denys Vlasenko	75ddb0e	2010-02-20 01:03:48 +0100	[diff] [blame]	69	c0377cce: 78 0e js c0377cde <.text..lock.mutex>
Ingo Molnar	f3f54ff	2006-01-09 15:59:20 -0800	[diff] [blame]	70	c0377cd0: c3 ret
				71
				72	the unlocking fastpath is equally tight:
				73
				74	c0377cd1 <mutex_unlock>:
				75	c0377cd1: f0 ff 00 lock incl (%eax)
Denys Vlasenko	75ddb0e	2010-02-20 01:03:48 +0100	[diff] [blame]	76	c0377cd4: 7e 0f jle c0377ce5 <.text..lock.mutex+0x7>
Ingo Molnar	f3f54ff	2006-01-09 15:59:20 -0800	[diff] [blame]	77	c0377cd6: c3 ret
				78
				79	- 'struct mutex' semantics are well-defined and are enforced if
				80	CONFIG_DEBUG_MUTEXES is turned on. Semaphores on the other hand have
				81	virtually no debugging code or instrumentation. The mutex subsystem
				82	checks and enforces the following rules:
				83
				84	* - only one task can hold the mutex at a time
				85	* - only the owner can unlock the mutex
				86	* - multiple unlocks are not permitted
				87	* - recursive locking is not permitted
				88	* - a mutex object must be initialized via the API
				89	* - a mutex object must not be initialized via memset or copying
				90	* - task may not exit with mutex held
				91	* - memory areas where held locks reside must not be freed
				92	* - held mutexes must not be reinitialized
Matti Linnanvuori	f20fda4	2007-10-16 23:29:41 -0700	[diff] [blame]	93	* - mutexes may not be used in hardware or software interrupt
				94	* contexts such as tasklets and timers
Ingo Molnar	f3f54ff	2006-01-09 15:59:20 -0800	[diff] [blame]	95
				96	furthermore, there are also convenience features in the debugging
				97	code:
				98
				99	* - uses symbolic names of mutexes, whenever they are printed in debug output
				100	* - point-of-acquire tracking, symbolic lookup of function names
				101	* - list of all locks held in the system, printout of them
				102	* - owner tracking
				103	* - detects self-recursing locks and prints out all relevant info
				104	* - detects multi-task circular deadlocks and prints out all affected
				105	* locks and tasks (and only those tasks)
				106
				107	Disadvantages
				108	-------------
				109
				110	The stricter mutex API means you cannot use mutexes the same way you
				111	can use semaphores: e.g. they cannot be used from an interrupt context,
				112	nor can they be unlocked from a different context that which acquired
				113	it. [ I'm not aware of any other (e.g. performance) disadvantages from
				114	using mutexes at the moment, please let me know if you find any. ]
				115
				116	Implementation of mutexes
				117	-------------------------
				118
				119	'struct mutex' is the new mutex type, defined in include/linux/mutex.h
				120	and implemented in kernel/mutex.c. It is a counter-based mutex with a
				121	spinlock and a wait-list. The counter has 3 states: 1 for "unlocked",
				122	0 for "locked" and negative numbers (usually -1) for "locked, potential
				123	waiters queued".
				124
				125	the APIs of 'struct mutex' have been streamlined:
				126
				127	DEFINE_MUTEX(name);
				128
				129	mutex_init(mutex);
				130
				131	void mutex_lock(struct mutex *lock);
				132	int mutex_lock_interruptible(struct mutex *lock);
				133	int mutex_trylock(struct mutex *lock);
				134	void mutex_unlock(struct mutex *lock);
				135	int mutex_is_locked(struct mutex *lock);
Robert P. J. Day	343b901	2007-10-20 00:15:26 +0200	[diff] [blame]	136	void mutex_lock_nested(struct mutex *lock, unsigned int subclass);
				137	int mutex_lock_interruptible_nested(struct mutex *lock,
				138	unsigned int subclass);
Randy Dunlap	ef5dc12	2010-09-02 15:48:16 -0700	[diff] [blame]	139	int atomic_dec_and_mutex_lock(atomic_t cnt, struct mutex lock);