Blame - docs/MergeFunctions.rst - fp2-dev/platform/external/llvm

blob: b2f6030edc1cc8fa141403ce5f682b0cd32385c3 [file] [log] [blame]

Stephen Hines	ebe69fe	2015-03-23 12:10:34 -0700	[diff] [blame]	1	=================================
				2	MergeFunctions pass, how it works
				3	=================================
				4
				5	.. contents::
				6	:local:
				7
				8	Introduction
				9	============
				10	Sometimes code contains equal functions, or functions that does exactly the same
				11	thing even though they are non-equal on the IR level (e.g.: multiplication on 2
				12	and 'shl 1'). It could happen due to several reasons: mainly, the usage of
				13	templates and automatic code generators. Though, sometimes user itself could
				14	write the same thing twice :-)
				15
				16	The main purpose of this pass is to recognize such functions and merge them.
				17
				18	Why would I want to read this document?
				19	---------------------------------------
				20	Document is the extension to pass comments and describes the pass logic. It
				21	describes algorithm that is used in order to compare functions, it also
				22	explains how we could combine equal functions correctly, keeping module valid.
				23
				24	Material is brought in top-down form, so reader could start learn pass from
				25	ideas and end up with low-level algorithm details, thus preparing him for
				26	reading the sources.
				27
				28	So main goal is do describe algorithm and logic here; the concept. This document
				29	is good for you, if you don't want to read the source code, but want to
				30	understand pass algorithms. Author tried not to repeat the source-code and
				31	cover only common cases, and thus avoid cases when after minor code changes we
				32	need to update this document.
				33
				34
				35	What should I know to be able to follow along with this document?
				36	-----------------------------------------------------------------
				37
				38	Reader should be familiar with common compile-engineering principles and LLVM
				39	code fundamentals. In this article we suppose reader is familiar with
				40	`Single Static Assingment <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_
				41	concepts. Understanding of
				42	`IR structure <http://llvm.org/docs/LangRef.html#high-level-structure>`_ is
				43	also important.
				44
				45	We will use such terms as
				46	"`module <http://llvm.org/docs/LangRef.html#high-level-structure>`_",
				47	"`function <http://llvm.org/docs/ProgrammersManual.html#the-function-class>`_",
				48	"`basic block <http://en.wikipedia.org/wiki/Basic_block>`_",
				49	"`user <http://llvm.org/docs/ProgrammersManual.html#the-user-class>`_",
				50	"`value <http://llvm.org/docs/ProgrammersManual.html#the-value-class>`_",
				51	"`instruction <http://llvm.org/docs/ProgrammersManual.html#the-instruction-class>`_".
				52
				53	As a good start point, Kaleidoscope tutorial could be used:
				54
				55	:doc:`tutorial/index`
				56
				57	Especially it's important to understand chapter 3 of tutorial:
				58
				59	:doc:`tutorial/LangImpl3`
				60
Pirama Arumuga Nainar	4c5e43d	2015-04-08 08:55:49 -0700	[diff] [blame]	61	Reader also should know how passes work in LLVM, they could use next article as
				62	a reference and start point here:
Stephen Hines	ebe69fe	2015-03-23 12:10:34 -0700	[diff] [blame]	63
				64	:doc:`WritingAnLLVMPass`
				65
				66	What else? Well perhaps reader also should have some experience in LLVM pass
				67	debugging and bug-fixing.
				68
				69	What I gain by reading this document?
				70	-------------------------------------
				71	Main purpose is to provide reader with comfortable form of algorithms
				72	description, namely the human reading text. Since it could be hard to
				73	understand algorithm straight from the source code: pass uses some principles
				74	that have to be explained first.
				75
				76	Author wishes to everybody to avoid case, when you read code from top to bottom
				77	again and again, and yet you don't understand why we implemented it that way.
				78
				79	We hope that after this article reader could easily debug and improve
				80	MergeFunctions pass and thus help LLVM project.
				81
				82	Narrative structure
				83	-------------------
				84	Article consists of three parts. First part explains pass functionality on the
				85	top-level. Second part describes the comparison procedure itself. The third
				86	part describes the merging process.
				87
				88	In every part author also tried to put the contents into the top-down form.
				89	First, the top-level methods will be described, while the terminal ones will be
				90	at the end, in the tail of each part. If reader will see the reference to the
Pirama Arumuga Nainar	4c5e43d	2015-04-08 08:55:49 -0700	[diff] [blame]	91	method that wasn't described yet, they will find its description a bit below.
Stephen Hines	ebe69fe	2015-03-23 12:10:34 -0700	[diff] [blame]	92
				93	Basics
				94	======
				95
				96	How to do it?
				97	-------------
				98	Do we need to merge functions? Obvious thing is: yes that's a quite possible
				99	case, since usually we do have duplicates. And it would be good to get rid of
				100	them. But how to detect such a duplicates? The idea is next: we split functions
				101	onto small bricks (parts), then we compare "bricks" amount, and if it equal,
				102	compare "bricks" themselves, and then do our conclusions about functions
				103	themselves.
				104
				105	What the difference it could be? For example, on machine with 64-bit pointers
				106	(let's assume we have only one address space), one function stores 64-bit
				107	integer, while another one stores a pointer. So if the target is a machine
				108	mentioned above, and if functions are identical, except the parameter type (we
				109	could consider it as a part of function type), then we can treat ``uint64_t``
				110	and``void*`` as equal.
				111
				112	It was just an example; possible details are described a bit below.
				113
				114	As another example reader may imagine two more functions. First function
				115	performs multiplication on 2, while the second one performs arithmetic right
				116	shift on 1.
				117
				118	Possible solutions
				119	^^^^^^^^^^^^^^^^^^
				120	Let's briefly consider possible options about how and what we have to implement
				121	in order to create full-featured functions merging, and also what it would
				122	meant for us.
				123
				124	Equal functions detection, obviously supposes "detector" method to be
				125	implemented, latter should answer the question "whether functions are equal".
				126	This "detector" method consists of tiny "sub-detectors", each of them answers
				127	exactly the same question, but for function parts.
				128
				129	As the second step, we should merge equal functions. So it should be a "merger"
				130	method. "Merger" accepts two functions F1 and F2, and produces F1F2
				131	function, the result of merging.
				132
				133	Having such a routines in our hands, we can process whole module, and merge all
				134	equal functions.
				135
				136	In this case, we have to compare every function with every another function. As
				137	reader could notice, this way seems to be quite expensive. Of course we could
				138	introduce hashing and other helpers, but it is still just an optimization, and
				139	thus the level of O(N*N) complexity.
				140
				141	Can we reach another level? Could we introduce logarithmical search, or random
				142	access lookup? The answer is: "yes".
				143
				144	Random-access
				145	"""""""""""""
				146	How it could be done? Just convert each function to number, and gather all of
				147	them in special hash-table. Functions with equal hash are equal. Good hashing
				148	means, that every function part must be taken into account. That means we have
				149	to convert every function part into some number, and then add it into hash.
				150	Lookup-up time would be small, but such approach adds some delay due to hashing
				151	routine.
				152
				153	Logarithmical search
				154	""""""""""""""""""""
				155	We could introduce total ordering among the functions set, once we had it we
				156	could then implement a logarithmical search. Lookup time still depends on N,
				157	but adds a little of delay (log(N)).
				158
				159	Present state
				160	"""""""""""""
				161	Both of approaches (random-access and logarithmical) has been implemented and
				162	tested. And both of them gave a very good improvement. And what was most
				163	surprising, logarithmical search was faster; sometimes up to 15%. Hashing needs
				164	some extra CPU time, and it is the main reason why it works slower; in most of
				165	cases total "hashing" time was greater than total "logarithmical-search" time.
				166
				167	So, preference has been granted to the "logarithmical search".
				168
				169	Though in the case of need, logarithmical-search (read "total-ordering") could
				170	be used as a milestone on our way to the random-access implementation.
				171
				172	Every comparison is based either on the numbers or on flags comparison. In
				173	random-access approach we could use the same comparison algorithm. During
				174	comparison we exit once we find the difference, but here we might have to scan
				175	whole function body every time (note, it could be slower). Like in
				176	"total-ordering", we will track every numbers and flags, but instead of
				177	comparison, we should get numbers sequence and then create the hash number. So,
				178	once again, total-ordering could be considered as a milestone for even faster
				179	(in theory) random-access approach.
				180
				181	MergeFunctions, main fields and runOnModule
				182	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				183	There are two most important fields in class:
				184
				185	``FnTree`` – the set of all unique functions. It keeps items that couldn't be
				186	merged with each other. It is defined as:
				187
				188	``std::set<FunctionNode> FnTree;``
				189
				190	Here ``FunctionNode`` is a wrapper for ``llvm::Function`` class, with
				191	implemented “<” operator among the functions set (below we explain how it works
				192	exactly; this is a key point in fast functions comparison).
				193
				194	``Deferred`` – merging process can affect bodies of functions that are in
				195	``FnTree`` already. Obviously such functions should be rechecked again. In this
				196	case we remove them from ``FnTree``, and mark them as to be rescanned, namely
				197	put them into ``Deferred`` list.
				198
				199	runOnModule
				200	"""""""""""
				201	The algorithm is pretty simple:
				202
				203	1. Put all module's functions into the worklist.
				204
				205	2. Scan worklist's functions twice: first enumerate only strong functions and
				206	then only weak ones:
				207
				208	2.1. Loop body: take function from worklist (call it FCur) and try to
				209	insert it into FnTree: check whether FCur is equal to one of functions
				210	in FnTree. If there is equal function in FnTree (call it FExists):
				211	merge function FCur with FExists. Otherwise add function from worklist
				212	to FnTree.
				213
				214	3. Once worklist scanning and merging operations is complete, check Deferred
				215	list. If it is not empty: refill worklist contents with Deferred list and
				216	do step 2 again, if Deferred is empty, then exit from method.
				217
				218	Comparison and logarithmical search
				219	"""""""""""""""""""""""""""""""""""
				220	Let's recall our task: for every function F from module M, we have to find
				221	equal functions F` in shortest time, and merge them into the single function.
				222
				223	Defining total ordering among the functions set allows to organize functions
				224	into the binary tree. The lookup procedure complexity would be estimated as
				225	O(log(N)) in this case. But how to define total-ordering?
				226
				227	We have to introduce a single rule applicable to every pair of functions, and
				228	following this rule then evaluate which of them is greater. What kind of rule
				229	it could be? Let's declare it as "compare" method, that returns one of 3
				230	possible values:
				231
				232	-1, left is less than right,
				233
				234	0, left and right are equal,
				235
				236	1, left is greater than right.
				237
				238	Of course it means, that we have to maintain
				239	strict and non-strict order relation properties:
				240
				241	* reflexivity (``a <= a``, ``a == a``, ``a >= a``),
				242	* antisymmetry (if ``a <= b`` and ``b <= a`` then ``a == b``),
				243	* transitivity (``a <= b`` and ``b <= c``, then ``a <= c``)
				244	* asymmetry (if ``a < b``, then ``a > b`` or ``a == b``).
				245
				246	As it was mentioned before, comparison routine consists of
				247	"sub-comparison-routines", each of them also consists
				248	"sub-comparison-routines", and so on, finally it ends up with a primitives
				249	comparison.
				250
				251	Below, we will use the next operations:
				252
				253	#. ``cmpNumbers(number1, number2)`` is method that returns -1 if left is less
				254	than right; 0, if left and right are equal; and 1 otherwise.
				255
				256	#. ``cmpFlags(flag1, flag2)`` is hypothetical method that compares two flags.
				257	The logic is the same as in ``cmpNumbers``, where ``true`` is 1, and
				258	``false`` is 0.
				259
				260	The rest of article is based on MergeFunctions.cpp source code
				261	(<llvm_dir>/lib/Transforms/IPO/MergeFunctions.cpp). We would like to ask
				262	reader to keep this file open nearby, so we could use it as a reference for
				263	further explanations.
				264
				265	Now we're ready to proceed to the next chapter and see how it works.
				266
				267	Functions comparison
				268	====================
				269	At first, let's define how exactly we compare complex objects.
				270
				271	Complex objects comparison (function, basic-block, etc) is mostly based on its
				272	sub-objects comparison results. So it is similar to the next "tree" objects
				273	comparison:
				274
				275	#. For two trees T1 and T2 we perform depth-first-traversal and have
				276	two sequences as a product: "T1Items" and "T2Items".
				277
				278	#. Then compare chains "T1Items" and "T2Items" in
				279	most-significant-item-first order. Result of items comparison would be the
				280	result of T1 and T2 comparison itself.
				281
				282	FunctionComparator::compare(void)
				283	---------------------------------
				284	Brief look at the source code tells us, that comparison starts in
				285	“``int FunctionComparator::compare(void)``” method.
				286
				287	1. First parts to be compared are function's attributes and some properties that
				288	outsides “attributes” term, but still could make function different without
				289	changing its body. This part of comparison is usually done within simple
				290	cmpNumbers or cmpFlags operations (e.g.
				291	``cmpFlags(F1->hasGC(), F2->hasGC())``). Below is full list of function's
				292	properties to be compared on this stage:
				293
				294	* Attributes (those are returned by ``Function::getAttributes()``
				295	method).
				296
				297	* GC, for equivalence, RHS and LHS should be both either without
				298	GC or with the same one.
				299
				300	* Section, just like a GC: RHS and LHS should be defined in the
				301	same section.
				302
				303	* Variable arguments. LHS and RHS should be both either with or
				304	without var-args.
				305
				306	* Calling convention should be the same.
				307
				308	2. Function type. Checked by ``FunctionComparator::cmpType(Type, Type)``
				309	method. It checks return type and parameters type; the method itself will be
				310	described later.
				311
				312	3. Associate function formal parameters with each other. Then comparing function
				313	bodies, if we see the usage of LHS's i-th argument in LHS's body, then,
				314	we want to see usage of RHS's i-th argument at the same place in RHS's
				315	body, otherwise functions are different. On this stage we grant the preference
				316	to those we met later in function body (value we met first would be less).
				317	This is done by “``FunctionComparator::cmpValues(const Value, const Value)``”
				318	method (will be described a bit later).
				319
				320	4. Function body comparison. As it written in method comments:
				321
				322	“We do a CFG-ordered walk since the actual ordering of the blocks in the linked
				323	list is immaterial. Our walk starts at the entry block for both functions, then
				324	takes each block from each terminator in order. As an artifact, this also means
				325	that unreachable blocks are ignored.”
				326
				327	So, using this walk we get BBs from left and right in the same order, and
				328	compare them by “``FunctionComparator::compare(const BasicBlock*, const
				329	BasicBlock*)``” method.
				330
				331	We also associate BBs with each other, like we did it with function formal
				332	arguments (see ``cmpValues`` method below).
				333
				334	FunctionComparator::cmpType
				335	---------------------------
				336	Consider how types comparison works.
				337
				338	1. Coerce pointer to integer. If left type is a pointer, try to coerce it to the
				339	integer type. It could be done if its address space is 0, or if address spaces
				340	are ignored at all. Do the same thing for the right type.
				341
				342	2. If left and right types are equal, return 0. Otherwise we need to give
				343	preference to one of them. So proceed to the next step.
				344
				345	3. If types are of different kind (different type IDs). Return result of type
				346	IDs comparison, treating them as a numbers (use ``cmpNumbers`` operation).
				347
				348	4. If types are vectors or integers, return result of their pointers comparison,
				349	comparing them as numbers.
				350
				351	5. Check whether type ID belongs to the next group (call it equivalent-group):
				352
				353	* Void
				354
				355	* Float
				356
				357	* Double
				358
				359	* X86_FP80
				360
				361	* FP128
				362
				363	* PPC_FP128
				364
				365	* Label
				366
				367	* Metadata.
				368
				369	If ID belongs to group above, return 0. Since it's enough to see that
				370	types has the same ``TypeID``. No additional information is required.
				371
				372	6. Left and right are pointers. Return result of address space comparison
				373	(numbers comparison).
				374
				375	7. Complex types (structures, arrays, etc.). Follow complex objects comparison
				376	technique (see the very first paragraph of this chapter). Both left and
				377	right are to be expanded and their element types will be checked the same
				378	way. If we get -1 or 1 on some stage, return it. Otherwise return 0.
				379
				380	8. Steps 1-6 describe all the possible cases, if we passed steps 1-6 and didn't
				381	get any conclusions, then invoke ``llvm_unreachable``, since it's quite
				382	unexpectable case.
				383
				384	cmpValues(const Value, const Value)
				385	-------------------------------------
				386	Method that compares local values.
				387
				388	This method gives us an answer on a very curious quesion: whether we could treat
				389	local values as equal, and which value is greater otherwise. It's better to
				390	start from example:
				391
				392	Consider situation when we're looking at the same place in left function "FL"
				393	and in right function "FR". And every part of left place is equal to the
				394	corresponding part of right place, and (!) both parts use Value instances,
				395	for example:
				396
				397	.. code-block:: llvm
				398
				399	instr0 i32 %LV ; left side, function FL
				400	instr0 i32 %RV ; right side, function FR
				401
				402	So, now our conclusion depends on Value instances comparison.
				403
				404	Main purpose of this method is to determine relation between such values.
				405
				406	What we expect from equal functions? At the same place, in functions "FL" and
				407	"FR" we expect to see equal values, or values defined at the same place
				408	in "FL" and "FR".
				409
				410	Consider small example here:
				411
				412	.. code-block:: llvm
				413
				414	define void %f(i32 %pf0, i32 %pf1) {
				415	instr0 i32 %pf0 instr1 i32 %pf1 instr2 i32 123
				416	}
				417
				418	.. code-block:: llvm
				419
				420	define void %g(i32 %pg0, i32 %pg1) {
				421	instr0 i32 %pg0 instr1 i32 %pg0 instr2 i32 123
				422	}
				423
				424	In this example, pf0 is associated with pg0, pf1 is associated with pg1,
				425	and we also declare that pf0 < pf1, and thus pg0 < pf1.
				426
				427	Instructions with opcode "instr0" would be equal, since their types and
				428	opcodes are equal, and values are associated.
				429
				430	Instruction with opcode "instr1" from f is greater than instruction with
				431	opcode "instr1" from g; here we have equal types and opcodes, but "pf1 is
				432	greater than "pg0".
				433
				434	And instructions with opcode "instr2" are equal, because their opcodes and
				435	types are equal, and the same constant is used as a value.
				436
				437	What we assiciate in cmpValues?
				438	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				439	* Function arguments. i-th argument from left function associated with
				440	i-th argument from right function.
				441	* BasicBlock instances. In basic-block enumeration loop we associate i-th
				442	BasicBlock from the left function with i-th BasicBlock from the right
				443	function.
				444	* Instructions.
				445	* Instruction operands. Note, we can meet Value here we have never seen
				446	before. In this case it is not a function argument, nor BasicBlock, nor
				447	Instruction. It is global value. It is constant, since its the only
				448	supposed global here. Method also compares:
				449	* Constants that are of the same type.
				450	* If right constant could be losslessly bit-casted to the left one, then we
				451	also compare them.
				452
				453	How to implement cmpValues?
				454	^^^^^^^^^^^^^^^^^^^^^^^^^^^
				455	Association is a case of equality for us. We just treat such values as equal.
				456	But, in general, we need to implement antisymmetric relation. As it was
				457	mentioned above, to understand what is less, we can use order in which we
				458	meet values. If both of values has the same order in function (met at the same
				459	time), then treat values as associated. Otherwise – it depends on who was
				460	first.
				461
				462	Every time we run top-level compare method, we initialize two identical maps
				463	(one for the left side, another one for the right side):
				464
				465	``map<Value, int> sn_mapL, sn_mapR;``
				466
				467	The key of the map is the Value itself, the value – is its order (call it
				468	serial number).
				469
				470	To add value V we need to perform the next procedure:
				471
				472	``sn_map.insert(std::make_pair(V, sn_map.size()));``
				473
				474	For the first Value, map will return 0, for second Value map will return
				475	1, and so on.
				476
				477	Then we can check whether left and right values met at the same time with simple
				478	comparison:
				479
				480	``cmpNumbers(sn_mapL[Left], sn_mapR[Right]);``
				481
				482	Of course, we can combine insertion and comparison:
				483
				484	.. code-block:: c++
				485
				486	std::pair<iterator, bool>
				487	LeftRes = sn_mapL.insert(std::make_pair(Left, sn_mapL.size())), RightRes
				488	= sn_mapR.insert(std::make_pair(Right, sn_mapR.size()));
				489	return cmpNumbers(LeftRes.first->second, RightRes.first->second);
				490
				491	Let's look, how whole method could be implemented.
				492
				493	1. we have to start from the bad news. Consider function self and
				494	cross-referencing cases:
				495
				496	.. code-block:: c++
				497
				498	// self-reference unsigned fact0(unsigned n) { return n > 1 ? n
				499	* fact0(n-1) : 1; } unsigned fact1(unsigned n) { return n > 1 ? n *
				500	fact1(n-1) : 1; }
				501
				502	// cross-reference unsigned ping(unsigned n) { return n!= 0 ? pong(n-1) : 0;
				503	} unsigned pong(unsigned n) { return n!= 0 ? ping(n-1) : 0; }
				504
				505	..
				506
				507	This comparison has been implemented in initial MergeFunctions pass
				508	version. But, unfortunately, it is not transitive. And this is the only case
				509	we can't convert to less-equal-greater comparison. It is a seldom case, 4-5
				510	functions of 10000 (checked on test-suite), and, we hope, reader would
				511	forgive us for such a sacrifice in order to get the O(log(N)) pass time.
				512
				513	2. If left/right Value is a constant, we have to compare them. Return 0 if it
				514	is the same constant, or use ``cmpConstants`` method otherwise.
				515
				516	3. If left/right is InlineAsm instance. Return result of Value pointers
				517	comparison.
				518
				519	4. Explicit association of L (left value) and R (right value). We need to
				520	find out whether values met at the same time, and thus are associated. Or we
				521	need to put the rule: when we treat L < R. Now it is easy: just return
				522	result of numbers comparison:
				523
				524	.. code-block:: c++
				525
				526	std::pair<iterator, bool>
				527	LeftRes = sn_mapL.insert(std::make_pair(Left, sn_mapL.size())),
				528	RightRes = sn_mapR.insert(std::make_pair(Right, sn_mapR.size()));
				529	if (LeftRes.first->second == RightRes.first->second) return 0;
				530	if (LeftRes.first->second < RightRes.first->second) return -1;
				531	return 1;
				532
				533	Now when cmpValues returns 0, we can proceed comparison procedure. Otherwise,
				534	if we get (-1 or 1), we need to pass this result to the top level, and finish
				535	comparison procedure.
				536
				537	cmpConstants
				538	------------
				539	Performs constants comparison as follows:
				540
				541	1. Compare constant types using ``cmpType`` method. If result is -1 or 1, goto
				542	step 2, otherwise proceed to step 3.
				543
				544	2. If types are different, we still can check whether constants could be
				545	losslessly bitcasted to each other. The further explanation is modification of
				546	``canLosslesslyBitCastTo`` method.
				547
				548	2.1 Check whether constants are of the first class types
				549	(``isFirstClassType`` check):
				550
				551	2.1.1. If both constants are not of the first class type: return result
				552	of ``cmpType``.
				553
				554	2.1.2. Otherwise, if left type is not of the first class, return -1. If
				555	right type is not of the first class, return 1.
				556
				557	2.1.3. If both types are of the first class type, proceed to the next step
				558	(2.1.3.1).
				559
				560	2.1.3.1. If types are vectors, compare their bitwidth using the
				561	cmpNumbers. If result is not 0, return it.
				562
				563	2.1.3.2. Different types, but not a vectors:
				564
				565	* if both of them are pointers, good for us, we can proceed to step 3.
				566	* if one of types is pointer, return result of isPointer flags
				567	comparison (cmpFlags operation).
				568	* otherwise we have no methods to prove bitcastability, and thus return
				569	result of types comparison (-1 or 1).
				570
				571	Steps below are for the case when types are equal, or case when constants are
				572	bitcastable:
				573
				574	3. One of constants is a "null" value. Return the result of
				575	``cmpFlags(L->isNullValue, R->isNullValue)`` comparison.
				576
				577	4. Compare value IDs, and return result if it is not 0:
				578
				579	.. code-block:: c++
				580
				581	if (int Res = cmpNumbers(L->getValueID(), R->getValueID()))
				582	return Res;
				583
				584	5. Compare the contents of constants. The comparison depends on kind of
				585	constants, but on this stage it is just a lexicographical comparison. Just see
				586	how it was described in the beginning of "Functions comparison" paragraph.
				587	Mathematically it is equal to the next case: we encode left constant and right
				588	constant (with similar way bitcode-writer does). Then compare left code
				589	sequence and right code sequence.
				590
				591	compare(const BasicBlock, const BasicBlock)
				592	---------------------------------------------
				593	Compares two BasicBlock instances.
				594
				595	It enumerates instructions from left BB and right BB.
				596
				597	1. It assigns serial numbers to the left and right instructions, using
				598	``cmpValues`` method.
				599
				600	2. If one of left or right is GEP (``GetElementPtr``), then treat GEP as
				601	greater than other instructions, if both instructions are GEPs use ``cmpGEP``
				602	method for comparison. If result is -1 or 1, pass it to the top-level
				603	comparison (return it).
				604
				605	3.1. Compare operations. Call ``cmpOperation`` method. If result is -1 or
				606	1, return it.
				607
				608	3.2. Compare number of operands, if result is -1 or 1, return it.
				609
				610	3.3. Compare operands themselves, use ``cmpValues`` method. Return result
				611	if it is -1 or 1.
				612
				613	3.4. Compare type of operands, using ``cmpType`` method. Return result if
				614	it is -1 or 1.
				615
				616	3.5. Proceed to the next instruction.
				617
				618	4. We can finish instruction enumeration in 3 cases:
				619
				620	4.1. We reached the end of both left and right basic-blocks. We didn't
				621	exit on steps 1-3, so contents is equal, return 0.
				622
				623	4.2. We have reached the end of the left basic-block. Return -1.
				624
				625	4.3. Return 1 (the end of the right basic block).
				626
				627	cmpGEP
				628	------
				629	Compares two GEPs (``getelementptr`` instructions).
				630
				631	It differs from regular operations comparison with the only thing: possibility
				632	to use ``accumulateConstantOffset`` method.
				633
				634	So, if we get constant offset for both left and right GEPs, then compare it as
				635	numbers, and return comparison result.
				636
				637	Otherwise treat it like a regular operation (see previous paragraph).
				638
				639	cmpOperation
				640	------------
				641	Compares instruction opcodes and some important operation properties.
				642
				643	1. Compare opcodes, if it differs return the result.
				644
				645	2. Compare number of operands. If it differs – return the result.
				646
				647	3. Compare operation types, use cmpType. All the same – if types are
				648	different, return result.
				649
				650	4. Compare subclassOptionalData, get it with ``getRawSubclassOptionalData``
				651	method, and compare it like a numbers.
				652
				653	5. Compare operand types.
				654
				655	6. For some particular instructions check equivalence (relation in our case) of
				656	some significant attributes. For example we have to compare alignment for
				657	``load`` instructions.
				658
				659	O(log(N))
				660	---------
				661	Methods described above implement order relationship. And latter, could be used
				662	for nodes comparison in a binary tree. So we can organize functions set into
				663	the binary tree and reduce the cost of lookup procedure from
				664	O(N*N) to O(log(N)).
				665
				666	Merging process, mergeTwoFunctions
				667	==================================
				668	Once MergeFunctions detected that current function (G) is equal to one that
				669	were analyzed before (function F) it calls ``mergeTwoFunctions(Function*,
				670	Function*)``.
				671
				672	Operation affects ``FnTree`` contents with next way: F will stay in
				673	``FnTree``. G being equal to F will not be added to ``FnTree``. Calls of
				674	G would be replaced with something else. It changes bodies of callers. So,
				675	functions that calls G would be put into ``Deferred`` set and removed from
				676	``FnTree``, and analyzed again.
				677
				678	The approach is next:
				679
				680	1. Most wished case: when we can use alias and both of F and G are weak. We
				681	make both of them with aliases to the third strong function H. Actually H
				682	is F. See below how it's made (but it's better to look straight into the
				683	source code). Well, this is a case when we can just replace G with F
				684	everywhere, we use ``replaceAllUsesWith`` operation here (RAUW).
				685
				686	2. F could not be overridden, while G could. It would be good to do the
				687	next: after merging the places where overridable function were used, still use
				688	overridable stub. So try to make G alias to F, or create overridable tail
				689	call wrapper around F and replace G with that call.
				690
				691	3. Neither F nor G could be overridden. We can't use RAUW. We can just
				692	change the callers: call F instead of G. That's what
				693	``replaceDirectCallers`` does.
				694
				695	Below is detailed body description.
				696
				697	If “F” may be overridden
				698	------------------------
				699	As follows from ``mayBeOverridden`` comments: “whether the definition of this
				700	global may be replaced by something non-equivalent at link time”. If so, thats
				701	ok: we can use alias to F instead of G or change call instructions itself.
				702
				703	HasGlobalAliases, removeUsers
				704	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				705	First consider the case when we have global aliases of one function name to
				706	another. Our purpose is make both of them with aliases to the third strong
				707	function. Though if we keep F alive and without major changes we can leave it
				708	in ``FnTree``. Try to combine these two goals.
				709
				710	Do stub replacement of F itself with an alias to F.
				711
				712	1. Create stub function H, with the same name and attributes like function
				713	F. It takes maximum alignment of F and G.
				714
				715	2. Replace all uses of function F with uses of function H. It is the two
				716	steps procedure instead. First of all, we must take into account, all functions
				717	from whom F is called would be changed: since we change the call argument
				718	(from F to H). If so we must to review these caller functions again after
				719	this procedure. We remove callers from ``FnTree``, method with name
				720	``removeUsers(F)`` does that (don't confuse with ``replaceAllUsesWith``):
				721
				722	2.1. ``Inside removeUsers(Value*
				723	V)`` we go through the all values that use value V (or F in our context).
				724	If value is instruction, we go to function that holds this instruction and
				725	mark it as to-be-analyzed-again (put to ``Deferred`` set), we also remove
				726	caller from ``FnTree``.
				727
				728	2.2. Now we can do the replacement: call ``F->replaceAllUsesWith(H)``.
				729
				730	3. H (that now "officially" plays F's role) is replaced with alias to F.
				731	Do the same with G: replace it with alias to F. So finally everywhere F
				732	was used, we use H and it is alias to F, and everywhere G was used we
				733	also have alias to F.
				734
				735	4. Set F linkage to private. Make it strong :-)
				736
				737	No global aliases, replaceDirectCallers
				738	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				739	If global aliases are not supported. We call ``replaceDirectCallers`` then. Just
				740	go through all calls of G and replace it with calls of F. If you look into
				741	method you will see that it scans all uses of G too, and if use is callee (if
				742	user is call instruction and G is used as what to be called), we replace it
				743	with use of F.
				744
				745	If “F” could not be overridden, fix it!
				746	"""""""""""""""""""""""""""""""""""""""
				747
				748	We call ``writeThunkOrAlias(Function F, Function G)``. Here we try to replace
				749	G with alias to F first. Next conditions are essential:
				750
				751	* target should support global aliases,
				752	* the address itself of G should be not significant, not named and not
				753	referenced anywhere,
				754	* function should come with external, local or weak linkage.
				755
				756	Otherwise we write thunk: some wrapper that has G's interface and calls F,
				757	so G could be replaced with this wrapper.
				758
				759	writeAlias
				760
				761	As follows from llvm reference:
				762
				763	“Aliases act as second name for the aliasee value”. So we just want to create
				764	second name for F and use it instead of G:
				765
				766	1. create global alias itself (GA),
				767
				768	2. adjust alignment of F so it must be maximum of current and G's alignment;
				769
				770	3. replace uses of G:
				771
				772	3.1. first mark all callers of G as to-be-analyzed-again, using
				773	``removeUsers`` method (see chapter above),
				774
				775	3.2. call ``G->replaceAllUsesWith(GA)``.
				776
				777	4. Get rid of G.
				778
				779	writeThunk
				780
				781	As it written in method comments:
				782
				783	“Replace G with a simple tail call to bitcast(F). Also replace direct uses of G
				784	with bitcast(F). Deletes G.”
				785
				786	In general it does the same as usual when we want to replace callee, except the
				787	first point:
				788
				789	1. We generate tail call wrapper around F, but with interface that allows use
				790	it instead of G.
				791
				792	2. “As-usual”: ``removeUsers`` and ``replaceAllUsesWith`` then.
				793
				794	3. Get rid of G.
				795
				796	That's it.
				797	==========
				798	We have described how to detect equal functions, and how to merge them, and in
				799	first chapter we have described how it works all-together. Author hopes, reader
				800	have some picture from now, and it helps him improve and debug this pass.
				801
				802	Reader is welcomed to send us any questions and proposals ;-)