Blame - Documentation/prio_tree.txt - kernel/msm-4.9

The prio_tree.c code indexes vmas using 3 different indexes:
	* heap_index  = vm_pgoff + vm_size_in_pages : end_vm_pgoff
	* radix_index = vm_pgoff : start_vm_pgoff
	* size_index = vm_size_in_pages
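
As a rough illustration, the three indices can be written as small C
helpers. This is only a sketch: struct toy_vma and its field names are
made up for this note and are not the kernel's vm_area_struct, and the
size term simply follows the formula above, so that
heap_index = radix_index + size_index.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for a vma; not the kernel's vm_area_struct. */
struct toy_vma {
	unsigned long vm_pgoff;	/* start offset, in pages */
	unsigned long size;	/* the "vm_size_in_pages" term used above */
};

static unsigned long radix_index(const struct toy_vma *v)
{
	return v->vm_pgoff;
}

static unsigned long size_index(const struct toy_vma *v)
{
	return v->size;
}

static unsigned long heap_index(const struct toy_vma *v)
{
	/* The sum must not wrap around. */
	assert(v->size <= ULONG_MAX - v->vm_pgoff);
	return v->vm_pgoff + v->size;
}
```

For a toy_vma of { 4, 3 } these helpers give radix_index 4, size_index 3
and heap_index 7.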

A regular radix-priority-search-tree indexes vmas using only heap_index and
radix_index. The conditions for indexing are:
	* ->heap_index >= ->left->heap_index &&
		->heap_index >= ->right->heap_index
	* if (->heap_index == ->left->heap_index)
		then ->radix_index < ->left->radix_index;
	* if (->heap_index == ->right->heap_index)
		then ->radix_index < ->right->radix_index;
	* nodes are hashed to left or right subtree using radix_index
	  similar to a pure binary radix tree.
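
The first three conditions can be checked mechanically. The sketch below
is an illustration only: struct toy_node is a hypothetical node type, and
using NULL for a missing child is an assumption of this sketch, not a
statement about how prio_tree.c represents empty links.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node type for this sketch. */
struct toy_node {
	unsigned long radix_index;
	unsigned long heap_index;
	struct toy_node *left, *right;	/* NULL when absent (assumption) */
};

/* Heap conditions for one parent/child pair. */
static bool child_ok(const struct toy_node *parent,
		     const struct toy_node *child)
{
	assert(parent != NULL);

	if (child == NULL)
		return true;
	/* The parent's heap_index must not be smaller than the child's. */
	if (parent->heap_index < child->heap_index)
		return false;
	/* On a heap_index tie, the parent has the smaller radix_index. */
	if (parent->heap_index == child->heap_index &&
	    parent->radix_index >= child->radix_index)
		return false;
	return true;
}

/* Apply the pairwise check to every node of a subtree. */
static bool subtree_ok(const struct toy_node *n)
{
	if (n == NULL)
		return true;
	return child_ok(n, n->left) && child_ok(n, n->right) &&
	       subtree_ok(n->left) && subtree_ok(n->right);
}
```

The fourth condition (placement by radix_index bits) is not checked here;
it depends on the depth of the node in the tree.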

A regular radix-priority-search-tree helps to store and query
intervals (vmas). However, a regular radix-priority-search-tree is only
suitable for storing vmas with different radix indices (vm_pgoff).

Therefore, prio_tree.c extends the regular radix-priority-search-tree
to handle many vmas with the same vm_pgoff. Such vmas are handled in
2 different ways: 1) All vmas with the same radix _and_ heap indices are
linked using vm_set.list, 2) if there are many vmas with the same radix
index, but different heap indices and if the regular radix-priority-search
tree cannot index them all, we build an overflow-sub-tree that indexes such
vmas using heap and size indices instead of heap and radix indices. For
example, in the figure below some vmas with vm_pgoff = 0 (zero) are
indexed by the regular radix-priority-search-tree whereas others are pushed
into an overflow-sub-tree. Note that all vmas in an overflow-sub-tree have
the same vm_pgoff (radix_index) and if necessary we build different
overflow-sub-trees to handle each possible radix_index. For example,
in the figure we have 3 overflow-sub-trees corresponding to radix indices
0, 2, and 4.

In the final tree the first few (prio_tree_root->index_bits) levels
are indexed using heap and radix indices whereas the overflow-sub-trees below
those levels (i.e. levels prio_tree_root->index_bits + 1 and higher) are
indexed using heap and size indices. In overflow-sub-trees the size_index
is used for hashing the nodes to appropriate places.
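
One way to picture the switch from radix_index to size_index is a helper
that decides, for a node at a given level, whether a new vma descends to
the left or to the right. The sketch below is an illustration under stated
assumptions, not the code in prio_tree.c: go_right and TOY_BITS_PER_LONG
are hypothetical names, the root is taken to be level 0, and bits are
assumed to be consumed from the most significant one downwards.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Stand-in for BITS_PER_LONG in this sketch. */
#define TOY_BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Returns true if a vma should descend to the right child of a node
 * sitting at `level`, false if it should descend to the left child.
 */
static bool go_right(unsigned int level, unsigned int index_bits,
		     unsigned long radix_index, unsigned long size_index)
{
	unsigned int depth;

	assert(index_bits >= 1 && index_bits <= TOY_BITS_PER_LONG);

	/* Heap-and-radix part: one radix_index bit per level. */
	if (level < index_bits)
		return (radix_index >> (index_bits - 1 - level)) & 1UL;

	/* Overflow-sub-tree: full-width size_index, top bit first. */
	depth = level - index_bits;
	if (depth >= TOY_BITS_PER_LONG)
		return false;	/* no size_index bits left */
	return (size_index >> (TOY_BITS_PER_LONG - 1 - depth)) & 1UL;
}
```

With index_bits = 3, a vma with radix_index 4 goes right at level 0 and a
vma with radix_index 1 goes left. A vma with a small size_index goes left
at level 3, because the top bit of its size_index is 0.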

Now, an example prio_tree:

  vmas are represented [radix_index, size_index, heap_index]
                 i.e., [start_vm_pgoff, vm_size_in_pages, end_vm_pgoff]

level  prio_tree_root->index_bits = 3
-----
                                                                                                 _
  0                                                     [0,7,7]                                  |
                                                        /     \                                  |
                                      ------------------       ------------                      |     Regular
                                     /                                     \                     |  radix priority
  1                             [1,6,7]                                   [4,3,7]                |   search tree
                                /     \                                   /     \                |
                         -------       ------                       ------       -----           |  heap-and-radix
                        /                    \                     /                  \          |      indexed
  2                 [0,6,6]                 [2,5,7]             [5,2,7]             [6,1,7]      |
                    /     \                 /     \             /     \             /     \      |
  3             [0,5,5] [1,5,6]         [2,4,6] [3,4,7]     [4,2,6] [5,1,6]     [6,0,6] [7,0,7]  |
                   /                       /                   /                                 _
                  /                       /                   /                                  _
  4           [0,4,4]                 [2,3,5]             [4,1,5]                                |
                 /                       /                   /                                   |
  5          [0,3,3]                 [2,2,4]             [4,0,4]                                 |  Overflow-sub-trees
                /                       /                                                        |
  6         [0,2,2]                 [2,1,3]                                                      |    heap-and-size
               /                       /                                                         |      indexed
  7        [0,1,1]                 [2,0,2]                                                       |
              /                                                                                  |
  8       [0,0,0]                                                                                |
                                                                                                 _

Note that we use prio_tree_root->index_bits to optimize the height
of the heap-and-radix indexed tree. Since prio_tree_root->index_bits is
set according to the maximum end_vm_pgoff mapped, we are sure that all
bits (in vm_pgoff) above prio_tree_root->index_bits are 0 (zero). Therefore,
we only use the first prio_tree_root->index_bits bits as radix_index.
Whenever index_bits is increased in prio_tree_expand, we shuffle the tree
to make sure that the first prio_tree_root->index_bits levels of the tree
are indexed properly using heap and radix indices.
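
The relation between the largest end_vm_pgoff and index_bits can be
sketched as below. This is an illustration under stated assumptions, not
the code in prio_tree_expand: toy_index_bits is a hypothetical helper that
returns the smallest bit count able to hold the given value, and the floor
of one bit is a choice made for this sketch.

```c
#include <assert.h>
#include <limits.h>

/* Stand-in for BITS_PER_LONG in this sketch. */
#define TOY_BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Smallest number of bits such that max_heap_index < 2^bits. */
static unsigned int toy_index_bits(unsigned long max_heap_index)
{
	unsigned int bits = 1;	/* assumption: never fewer than one bit */

	/* Stop before the shift count reaches the width of the type. */
	while (bits < TOY_BITS_PER_LONG && (max_heap_index >> bits) != 0)
		bits++;

	assert(bits <= TOY_BITS_PER_LONG);
	return bits;
}
```

With the example's largest heap_index of 7 this gives 3, matching the
index_bits = 3 used there; a heap_index of 8 would need 4 bits, which is
the point at which the heap-and-radix levels have to be reshuffled.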

We do not optimize the height of overflow-sub-trees using index_bits.
The reason is: there can be many such overflow-sub-trees and all of
them have to be shuffled whenever the index_bits increases. This may involve
walking the whole prio_tree in prio_tree_insert->prio_tree_expand code
path which is not desirable. Hence, we do not optimize the height of the
heap-and-size indexed overflow-sub-trees using prio_tree->index_bits.
Instead, the overflow-sub-trees are indexed using full BITS_PER_LONG bits
of size_index. This may lead to skewed sub-trees because most of the
higher significant bits of the size_index are likely to be 0 (zero). In
the example above, all 3 overflow-sub-trees are skewed. This may marginally
affect the performance. However, processes rarely map many vmas with the
same start_vm_pgoff but different end_vm_pgoffs. Therefore, we normally
do not require overflow-sub-trees to index all vmas.

From the above discussion it is clear that the maximum height of
a prio_tree can be prio_tree_root->index_bits + BITS_PER_LONG.
However, in most of the common cases we do not need overflow-sub-trees,
so the tree height in the common cases will be prio_tree_root->index_bits.

It is fair to mention here that prio_tree_root->index_bits
is increased on demand; however, index_bits is not decreased when
vmas are removed from the prio_tree. That's tricky to do. Hence, it's
left as a homework problem.