=========
MemorySSA
=========

.. contents::
   :local:

Introduction
============

``MemorySSA`` is an analysis that allows us to cheaply reason about the
interactions between various memory operations. Its goal is to replace
``MemoryDependenceAnalysis`` for most (if not all) use-cases. This is because,
unless you're very careful, use of ``MemoryDependenceAnalysis`` can easily
result in quadratic-time algorithms in LLVM. Additionally, ``MemorySSA`` doesn't
have as many arbitrary limits as ``MemoryDependenceAnalysis``, so you should get
better results, too.

At a high level, one of the goals of ``MemorySSA`` is to provide an SSA-based
form for memory, complete with def-use and use-def chains, which
enables users to quickly find may-defs and may-uses of memory operations.
It can also be thought of as a way to cheaply give versions to the complete
state of heap memory, and associate memory operations with those versions.

This document goes over how ``MemorySSA`` is structured, and some basic
intuition on how ``MemorySSA`` works.

A paper on MemorySSA (with notes about how it's implemented in GCC) `can be
found here <http://www.airs.com/dnovillo/Papers/mem-ssa.pdf>`_. It's relatively
out-of-date, though; the paper references multiple heap partitions, but GCC
eventually switched to using just one, like we now have in LLVM. Like
GCC's, LLVM's MemorySSA is intraprocedural.


MemorySSA Structure
===================

MemorySSA is a virtual IR. After it's built, ``MemorySSA`` will contain a
structure that maps ``Instruction`` s to ``MemoryAccess`` es, which are
``MemorySSA``'s parallel to LLVM ``Instruction`` s.

Each ``MemoryAccess`` can be one of three types:

- ``MemoryPhi``
- ``MemoryUse``
- ``MemoryDef``

``MemoryPhi`` s are ``PhiNode`` s, but for memory operations. If at any
point we have two (or more) ``MemoryDef`` s that could flow into a
``BasicBlock``, the block's top ``MemoryAccess`` will be a
``MemoryPhi``. As in LLVM IR, ``MemoryPhi`` s don't correspond to any
concrete operation. As such, you can't look up a ``MemoryPhi`` with an
``Instruction`` (though we do allow you to do so with a
``BasicBlock``).

Note also that in SSA, Phi nodes merge must-reach definitions (that
is, definite new versions of variables). In MemorySSA, Phi nodes merge
may-reach definitions (that is, until disambiguated, the versions that
reach a phi node may or may not clobber a given variable).

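As a small sketch of this, consider the hypothetical function below, whose
two pointer arguments may alias each other; the access numbering is
illustrative, but the annotations are what one would expect ``MemorySSA`` to
produce. The ``MemoryPhi`` merges a definition that definitely affects ``%p``
(the store in ``entry``) with one that only *may* affect it (the store to
``%q``, which may or may not alias ``%p``):

.. code-block:: llvm

  define void @f(i1 %c, i8* %p, i8* %q) {
  entry:
    ; 1 = MemoryDef(liveOnEntry)
    store i8 0, i8* %p
    br i1 %c, label %if.then, label %if.end

  if.then:
    ; 2 = MemoryDef(1)
    store i8 0, i8* %q
    br label %if.end

  if.end:
    ; 3 = MemoryPhi({entry,1},{if.then,2})
    ; MemoryUse(3)
    %v = load i8, i8* %p
    ret void
  }

Until alias analysis disambiguates ``%p`` and ``%q``, the load's clobber is
the phi itself, since either incoming definition may be the one that matters.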
``MemoryUse`` s are operations which use but don't modify memory. An example of
a ``MemoryUse`` is a ``load``, or a ``readonly`` function call.

``MemoryDef`` s are operations which may either modify memory, or which
otherwise clobber memory in unquantifiable ways. Examples of ``MemoryDef`` s
include ``store`` s, function calls, ``load`` s with ``acquire`` (or higher)
ordering, volatile operations, memory fences, etc.

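For instance, in the hypothetical snippet below (annotations are what one
would expect ``MemorySSA`` to emit; the numbering is illustrative), the fence
and the opaque call each get a ``MemoryDef`` even though neither writes to a
location we can name:

.. code-block:: llvm

  declare void @g()

  define void @f(i8* %p) {
    ; 1 = MemoryDef(liveOnEntry)
    store i8 0, i8* %p
    ; 2 = MemoryDef(1)
    fence acquire
    ; 3 = MemoryDef(2)
    call void @g()
    ; MemoryUse(3)
    %v = load i8, i8* %p
    ret void
  }

The final load is a ``MemoryUse`` whose clobber is the call, since ``@g`` may
write to ``%p``.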
Every function has a special ``MemoryDef`` called ``liveOnEntry``. It
dominates every other ``MemoryAccess`` in the function that ``MemorySSA`` is
being run on, and implies that we've reached the top of the function. It's the
only ``MemoryDef`` that maps to no ``Instruction`` in LLVM IR. Use of
``liveOnEntry`` implies that the memory being used is either undefined or
defined before the function begins.

An example of all of this overlaid on LLVM IR (obtained by running ``opt
-passes='print<memoryssa>' -disable-output`` on an ``.ll`` file) is below. When
viewing this example, it may be helpful to view it in terms of clobbers. The
operands of a given ``MemoryAccess`` are all (potential) clobbers of said
``MemoryAccess``, and the value produced by a ``MemoryAccess`` can act as a
clobber for other ``MemoryAccess`` es. Another useful way of looking at it is in
terms of heap versions. In that view, the operands of a given
``MemoryAccess`` are the version of the heap before the operation, and,
if the access produces a value, the value is the new version of the heap
after the operation.

.. code-block:: llvm

  define void @foo() {
  entry:
    %p1 = alloca i8
    %p2 = alloca i8
    %p3 = alloca i8
    ; 1 = MemoryDef(liveOnEntry)
    store i8 0, i8* %p3
    br label %while.cond

  while.cond:
    ; 6 = MemoryPhi({entry,1},{if.end,4})
    br i1 undef, label %if.then, label %if.else

  if.then:
    ; 2 = MemoryDef(6)
    store i8 0, i8* %p1
    br label %if.end

  if.else:
    ; 3 = MemoryDef(6)
    store i8 1, i8* %p2
    br label %if.end

  if.end:
    ; 5 = MemoryPhi({if.then,2},{if.else,3})
    ; MemoryUse(5)
    %1 = load i8, i8* %p1
    ; 4 = MemoryDef(5)
    store i8 2, i8* %p2
    ; MemoryUse(1)
    %2 = load i8, i8* %p3
    br label %while.cond
  }

The ``MemorySSA`` IR is shown in comments that precede the instructions they map
to (if such an instruction exists). For example, ``1 = MemoryDef(liveOnEntry)``
is a ``MemoryAccess`` (specifically, a ``MemoryDef``), and it describes the LLVM
instruction ``store i8 0, i8* %p3``. Other places in ``MemorySSA`` refer to this
particular ``MemoryDef`` as ``1`` (much like how one can refer to ``load i8, i8*
%p1`` in LLVM with ``%1``). Again, ``MemoryPhi`` s don't correspond to any LLVM
Instruction, so the line directly below a ``MemoryPhi`` isn't special.

Going from the top down:

- ``6 = MemoryPhi({entry,1},{if.end,4})`` notes that, when entering
  ``while.cond``, the reaching definition for it is either ``1`` or ``4``. This
  ``MemoryPhi`` is referred to in the textual IR by the number ``6``.
- ``2 = MemoryDef(6)`` notes that ``store i8 0, i8* %p1`` is a definition,
  and its reaching definition before it is ``6``, or the ``MemoryPhi`` after
  ``while.cond``.
- ``3 = MemoryDef(6)`` notes that ``store i8 1, i8* %p2`` is a definition; its
  reaching definition is also ``6``.
- ``5 = MemoryPhi({if.then,2},{if.else,3})`` notes that the clobber before
  this block could either be ``2`` or ``3``.
- ``MemoryUse(5)`` notes that ``load i8, i8* %p1`` is a use of memory, and that
  it's clobbered by ``5``.
- ``4 = MemoryDef(5)`` notes that ``store i8 2, i8* %p2`` is a definition; its
  reaching definition is ``5``.
- ``MemoryUse(1)`` notes that ``load i8, i8* %p3`` is just a user of memory,
  and the last thing that could clobber this use is above ``while.cond`` (i.e.
  the store to ``%p3``). In heap versioning parlance, it really
  only depends on heap version ``1``, and is unaffected by the new
  heap versions generated since then.

As an aside, ``MemoryAccess`` is a ``Value`` mostly for convenience; it's not
meant to interact with LLVM IR.

Design of MemorySSA
===================

``MemorySSA`` is an analysis that can be built for any arbitrary function. When
it's built, it does a pass over the function's IR in order to build up its
mapping of ``MemoryAccess`` es. You can then query ``MemorySSA`` for things like
the dominance relation between ``MemoryAccess`` es, and get the ``MemoryAccess``
for any given ``Instruction``.

When ``MemorySSA`` is done building, it also hands you a ``MemorySSAWalker``
that you can use (see below).


The walker
----------

A structure that helps ``MemorySSA`` do its job is the ``MemorySSAWalker``, or
the walker, for short. The goal of the walker is to provide answers to clobber
queries beyond what's represented directly by ``MemoryAccess`` es. For example,
given:

.. code-block:: llvm

  define void @foo() {
    %a = alloca i8
    %b = alloca i8

    ; 1 = MemoryDef(liveOnEntry)
    store i8 0, i8* %a
    ; 2 = MemoryDef(1)
    store i8 0, i8* %b
    ret void
  }

The store to ``%a`` is clearly not a clobber for the store to ``%b``. It would
be the walker's goal to figure this out, and return ``liveOnEntry`` when queried
for the clobber of ``MemoryAccess`` ``2``.

By default, ``MemorySSA`` provides a walker that can optimize ``MemoryDef`` s
and ``MemoryUse`` s by consulting alias analysis. Walkers were built to be
flexible, though, so it's entirely reasonable (and expected) to create more
specialized walkers (e.g. one that queries ``GlobalsAA``).


Locating clobbers yourself
^^^^^^^^^^^^^^^^^^^^^^^^^^

If you choose to make your own walker, you can find the clobber for a
``MemoryAccess`` by walking every ``MemoryDef`` that dominates said
``MemoryAccess``. The structure of ``MemoryDef`` s makes this relatively simple;
they ultimately form a linked list of every clobber that dominates the
``MemoryAccess`` that you're trying to optimize. In other words, the
``definingAccess`` of a ``MemoryDef`` is always the nearest dominating
``MemoryDef`` or ``MemoryPhi`` of said ``MemoryDef``.

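To sketch how such a walk proceeds, consider the hypothetical function below
(the annotations are what one would expect ``MemorySSA`` to print):

.. code-block:: llvm

  define i8 @walk() {
    %a = alloca i8
    %b = alloca i8
    ; 1 = MemoryDef(liveOnEntry)
    store i8 1, i8* %a
    ; 2 = MemoryDef(1)
    store i8 2, i8* %b
    ; MemoryUse(1)
    %v = load i8, i8* %a
    ret i8 %v
  }

To find the clobber of the load, a walker would start at the nearest
dominating def, ``2``, ask alias analysis whether ``store i8 2, i8* %b`` can
alias ``%a`` (it can't, since the two allocas are distinct), then follow
``2``'s ``definingAccess`` up to ``1``, which does clobber ``%a`` and is
therefore the answer.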

Use optimization
----------------

``MemorySSA`` will optimize some ``MemoryAccess`` es at build-time.
Specifically, we optimize the operand of every ``MemoryUse`` to point to the
actual clobber of said ``MemoryUse``. This can be seen in the above example; the
second ``MemoryUse`` in ``if.end`` has an operand of ``1``, which is a
``MemoryDef`` from the entry block. This is done to make walking,
value numbering, etc., faster and easier.
It is not possible to optimize ``MemoryDef`` s in the same way, as we
restrict ``MemorySSA`` to one heap variable and, thus, one Phi node
per block.


Invalidation and updating
-------------------------

Because ``MemorySSA`` keeps track of LLVM IR, it needs to be updated whenever
the IR is updated. "Update", in this case, includes the addition, deletion, and
motion of ``Instructions``. The update API is being developed on an as-needed
basis. If you'd like examples, ``GVNHoist`` is a user of ``MemorySSA``'s update
API.


Phi placement
^^^^^^^^^^^^^

``MemorySSA`` only places ``MemoryPhi`` s where they're actually
needed. That is, it is a pruned SSA form, like LLVM's SSA form. For
example, consider:

.. code-block:: llvm

  define void @foo() {
  entry:
    %p1 = alloca i8
    %p2 = alloca i8
    %p3 = alloca i8
    ; 1 = MemoryDef(liveOnEntry)
    store i8 0, i8* %p3
    br label %while.cond

  while.cond:
    ; 3 = MemoryPhi({entry,1},{if.end,2})
    br i1 undef, label %if.then, label %if.else

  if.then:
    br label %if.end

  if.else:
    br label %if.end

  if.end:
    ; MemoryUse(1)
    %1 = load i8, i8* %p1
    ; 2 = MemoryDef(3)
    store i8 2, i8* %p2
    ; MemoryUse(1)
    %2 = load i8, i8* %p3
    br label %while.cond
  }

Because we removed the stores from ``if.then`` and ``if.else``, a ``MemoryPhi``
for ``if.end`` would be pointless, so we don't place one. So, if you need to
place a ``MemoryDef`` in ``if.then`` or ``if.else``, you'll need to also create
a ``MemoryPhi`` for ``if.end``.

If it turns out that this is a large burden, we can just place ``MemoryPhi`` s
everywhere. Because we have walkers that are capable of optimizing above said
phis, doing so shouldn't prohibit optimizations.


Non-Goals
---------

``MemorySSA`` is meant to reason about the relation between memory
operations, and enable quicker querying.
It isn't meant to be the single source of truth for all potential memory-related
optimizations. Specifically, care must be taken when trying to use ``MemorySSA``
to reason about atomic or volatile operations, as in:

.. code-block:: llvm

  define i8 @foo(i8* %a) {
  entry:
    br i1 undef, label %if.then, label %if.end

  if.then:
    ; 1 = MemoryDef(liveOnEntry)
    %0 = load volatile i8, i8* %a
    br label %if.end

  if.end:
    %av = phi i8 [0, %entry], [%0, %if.then]
    ret i8 %av
  }

Going solely by ``MemorySSA``'s analysis, hoisting the ``load`` to ``entry`` may
seem legal. Because it's a volatile load, though, it's not.


Design tradeoffs
----------------

Precision
^^^^^^^^^

``MemorySSA`` in LLVM deliberately trades off precision for speed.
Let us think about memory variables as if they were disjoint partitions of the
heap (that is, if you have one variable, as above, it represents the entire
heap, and if you have multiple variables, each one represents some
disjoint portion of the heap).

First, because alias analysis results conflict with each other, and
each result may be what an analysis wants (i.e.,
TBAA may say no-alias, and something else may say must-alias), it is
not possible to partition the heap the way every optimization wants.
Second, some alias analysis results are not transitive (i.e., A noalias B,
and B noalias C, does not mean A noalias C), so it is not possible to
come up with a precise partitioning in all cases without variables to
represent every pair of possible aliases. Thus, partitioning
precisely may require introducing at least N^2 new virtual variables,
phi nodes, etc.

Each of these variables may be clobbered at multiple def sites.

To give an example, if you were to split up struct fields into
individual variables, all aliasing operations that may-def multiple struct
fields will may-def more than one of them. This is pretty common (calls,
copies, field stores, etc.).

Experience with SSA forms for memory in other compilers has shown that
it is simply not possible to do this precisely, and in fact, doing it
precisely is not worth it, because now all the optimizations have to
walk tons and tons of virtual variables and phi nodes.

So we partition. At the point at which you partition, again,
experience has shown us there is no point in partitioning to more than
one variable. It simply generates more IR, and optimizations still
have to query something to disambiguate further anyway.

As a result, LLVM partitions to one variable.


Use Optimization
^^^^^^^^^^^^^^^^

Unlike other partitioned forms, LLVM's ``MemorySSA`` does make one
useful guarantee: all loads are optimized to point at the thing that
actually clobbers them. This gives some nice properties. For example,
for a given store, you can find all loads actually clobbered by that
store by walking the immediate uses of the store.
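
As a sketch of this (hypothetical IR; the annotations are the expected
``MemorySSA`` output), both loads below have the access ``1`` as their operand
once use optimization has run, so walking the immediate uses of ``1`` finds
every load that store actually clobbers:

.. code-block:: llvm

  define void @f(i8* %p) {
    ; 1 = MemoryDef(liveOnEntry)
    store i8 0, i8* %p
    ; MemoryUse(1)
    %x = load i8, i8* %p
    ; MemoryUse(1)
    %y = load i8, i8* %p
    ret void
  }

Note that the second load also points directly at ``1`` rather than at the
first load, since loads don't create new heap versions.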