blob: e93cd3671ce2c0d24209110dde441b4a42a0fec7 [file] [log] [blame]
George Burgess IV3bbeb732016-08-17 00:17:29 +00001=========
2MemorySSA
3=========
4
5.. contents::
6 :local:
7
8Introduction
9============
10
11``MemorySSA`` is an analysis that allows us to cheaply reason about the
12interactions between various memory operations. Its goal is to replace
13``MemoryDependenceAnalysis`` for most (if not all) use-cases. This is because,
14unless you're very careful, use of ``MemoryDependenceAnalysis`` can easily
15result in quadratic-time algorithms in LLVM. Additionally, ``MemorySSA`` doesn't
16have as many arbitrary limits as ``MemoryDependenceAnalysis``, so you should get
17better results, too.
18
19At a high level, one of the goals of ``MemorySSA`` is to provide an SSA based
20form for memory, complete with def-use and use-def chains, which
21enables users to quickly find may-def and may-uses of memory operations.
22It can also be thought of as a way to cheaply give versions to the complete
23state of heap memory, and associate memory operations with those versions.
24
25This document goes over how ``MemorySSA`` is structured, and some basic
26intuition on how ``MemorySSA`` works.
27
28A paper on MemorySSA (with notes about how it's implemented in GCC) `can be
29found here <http://www.airs.com/dnovillo/Papers/mem-ssa.pdf>`_. Though, it's
30relatively out-of-date; the paper references multiple heap partitions, but GCC
31eventually swapped to just using one, like we now have in LLVM. Like
32GCC's, LLVM's MemorySSA is intraprocedural.
33
34
35MemorySSA Structure
36===================
37
38MemorySSA is a virtual IR. After it's built, ``MemorySSA`` will contain a
39structure that maps ``Instruction`` s to ``MemoryAccess`` es, which are
40``MemorySSA``'s parallel to LLVM ``Instruction`` s.
41
42Each ``MemoryAccess`` can be one of three types:
43
44- ``MemoryPhi``
45- ``MemoryUse``
46- ``MemoryDef``
47
48``MemoryPhi`` s are ``PhiNode`` , but for memory operations. If at any
49point we have two (or more) ``MemoryDef`` s that could flow into a
50``BasicBlock``, the block's top ``MemoryAccess`` will be a
51``MemoryPhi``. As in LLVM IR, ``MemoryPhi`` s don't correspond to any
52concrete operation. As such, you can't look up a ``MemoryPhi`` with an
53``Instruction`` (though we do allow you to do so with a
54``BasicBlock``).
55
56Note also that in SSA, Phi nodes merge must-reach definitions (that
57is, definite new versions of variables). In MemorySSA, PHI nodes merge
58may-reach definitions (that is, until disambiguated, the versions that
59reach a phi node may or may not clobber a given variable)
60
61``MemoryUse`` s are operations which use but don't modify memory. An example of
62a ``MemoryUse`` is a ``load``, or a ``readonly`` function call.
63
64``MemoryDef`` s are operations which may either modify memory, or which
65otherwise clobber memory in unquantifiable ways. Examples of ``MemoryDef`` s
66include ``store`` s, function calls, ``load`` s with ``acquire`` (or higher)
67ordering, volatile operations, memory fences, etc.
68
69Every function that exists has a special ``MemoryDef`` called ``liveOnEntry``.
70It dominates every ``MemoryAccess`` in the function that ``MemorySSA`` is being
71run on, and implies that we've hit the top of the function. It's the only
72``MemoryDef`` that maps to no ``Instruction`` in LLVM IR. Use of
73``liveOnEntry`` implies that the memory being used is either undefined or
74defined before the function begins.
75
76An example of all of this overlayed on LLVM IR (obtained by running ``opt
77-passes='print<memoryssa>' -disable-output`` on an ``.ll`` file) is below. When
78viewing this example, it may be helpful to view it in terms of clobbers. The
79operands of a given ``MemoryAccess`` are all (potential) clobbers of said
80MemoryAccess, and the value produced by a ``MemoryAccess`` can act as a clobber
81for other ``MemoryAccess`` es. Another useful way of looking at it is in
82terms of heap versions. In that view, operands of of a given
83``MemoryAccess`` are the version of the heap before the operation, and
84if the access produces a value, the value is the new version of the heap
85after the operation.
86
87.. code-block:: llvm
88
89 define void @foo() {
90 entry:
91 %p1 = alloca i8
92 %p2 = alloca i8
93 %p3 = alloca i8
94 ; 1 = MemoryDef(liveOnEntry)
95 store i8 0, i8* %p3
96 br label %while.cond
97
98 while.cond:
99 ; 6 = MemoryPhi({%0,1},{if.end,4})
100 br i1 undef, label %if.then, label %if.else
101
102 if.then:
103 ; 2 = MemoryDef(6)
104 store i8 0, i8* %p1
105 br label %if.end
106
107 if.else:
108 ; 3 = MemoryDef(6)
109 store i8 1, i8* %p2
110 br label %if.end
111
112 if.end:
113 ; 5 = MemoryPhi({if.then,2},{if.then,3})
114 ; MemoryUse(5)
115 %1 = load i8, i8* %p1
116 ; 4 = MemoryDef(5)
117 store i8 2, i8* %p2
118 ; MemoryUse(1)
119 %2 = load i8, i8* %p3
120 br label %while.cond
121 }
122
123The ``MemorySSA`` IR is located comments that precede the instructions they map
124to (if such an instruction exists). For example, ``1 = MemoryDef(liveOnEntry)``
125is a ``MemoryAccess`` (specifically, a ``MemoryDef``), and it describes the LLVM
126instruction ``store i8 0, i8* %p3``. Other places in ``MemorySSA`` refer to this
127particular ``MemoryDef`` as ``1`` (much like how one can refer to ``load i8, i8*
128%p1`` in LLVM with ``%1``). Again, ``MemoryPhi`` s don't correspond to any LLVM
129Instruction, so the line directly below a ``MemoryPhi`` isn't special.
130
131Going from the top down:
132
133- ``6 = MemoryPhi({%0,1},{if.end,4})`` notes that, when entering ``while.cond``,
134 the reaching definition for it is either ``1`` or ``4``. This ``MemoryPhi`` is
135 referred to in the textual IR by the number ``6``.
136- ``2 = MemoryDef(6)`` notes that ``store i8 0, i8* %p1`` is a definition,
137 and its reaching definition before it is ``6``, or the ``MemoryPhi`` after
138 ``while.cond``.
139- ``3 = MemoryDef(6)`` notes that ``store i8 0, i8* %p2`` is a definition; its
140 reaching definition is also ``6``.
141- ``5 = MemoryPhi({if.then,2},{if.then,3})`` notes that the clobber before
142 this block could either be ``2`` or ``3``.
143- ``MemoryUse(5)`` notes that ``load i8, i8* %p1`` is a use of memory, and that
144 it's clobbered by ``5``.
145- ``4 = MemoryDef(5)`` notes that ``store i8 2, i8* %p2`` is a definition; it's
146 reaching definition is ``5``.
147- ``MemoryUse(1)`` notes that ``load i8, i8* %p3`` is just a user of memory,
148 and the last thing that could clobber this use is above ``while.cond`` (e.g.
149 the store to ``%p3``). In heap versioning parlance, it really
150 only depends on the heap version 1, and is unaffected by the new
151 heap versions generated since then.
152
153As an aside, ``MemoryAccess`` is a ``Value`` mostly for convenience; it's not
154meant to interact with LLVM IR.
155
156Design of MemorySSA
157===================
158
159``MemorySSA`` is an analysis that can be built for any arbitrary function. When
160it's built, it does a pass over the function's IR in order to build up its
161mapping of ``MemoryAccess`` es. You can then query ``MemorySSA`` for things like
162the dominance relation between ``MemoryAccess`` es, and get the ``MemoryAccess``
163for any given ``Instruction`` .
164
165When ``MemorySSA`` is done building, it also hands you a ``MemorySSAWalker``
166that you can use (see below).
167
168
169The walker
170----------
171
172A structure that helps ``MemorySSA`` do its job is the ``MemorySSAWalker``, or
173the walker, for short. The goal of the walker is to provide answers to clobber
174queries beyond what's represented directly by ``MemoryAccess`` es. For example,
175given:
176
177.. code-block:: llvm
178
179 define void @foo() {
180 %a = alloca i8
181 %b = alloca i8
182
183 ; 1 = MemoryDef(liveOnEntry)
184 store i8 0, i8* %a
185 ; 2 = MemoryDef(1)
186 store i8 0, i8* %b
187 }
188
189The store to ``%a`` is clearly not a clobber for the store to ``%b``. It would
190be the walker's goal to figure this out, and return ``liveOnEntry`` when queried
191for the clobber of ``MemoryAccess`` ``2``.
192
193By default, ``MemorySSA`` provides a walker that can optimize ``MemoryDef`` s
194and ``MemoryUse`` s by consulting alias analysis. Walkers were built to be
195flexible, though, so it's entirely reasonable (and expected) to create more
196specialized walkers (e.g. one that queries ``GlobalsAA``).
197
198
199Locating clobbers yourself
200^^^^^^^^^^^^^^^^^^^^^^^^^^
201
202If you choose to make your own walker, you can find the clobber for a
203``MemoryAccess`` by walking every ``MemoryDef`` that dominates said
204``MemoryAccess``. The structure of ``MemoryDef`` s makes this relatively simple;
205they ultimately form a linked list of every clobber that dominates the
206``MemoryAccess`` that you're trying to optimize. In other words, the
207``definingAccess`` of a ``MemoryDef`` is always the nearest dominating
208``MemoryDef`` or ``MemoryPhi`` of said ``MemoryDef``.
209
210
211Use optimization
212----------------
213
214``MemorySSA`` will optimize some ``MemoryAccess`` es at build-time.
215Specifically, we optimize the operand of every ``MemoryUse`` s to point to the
216actual clobber of said ``MemoryUse``. This can be seen in the above example; the
217second ``MemoryUse`` in ``if.end`` has an operand of ``1``, which is a
218``MemoryDef`` from the entry block. This is done to make walking,
219value numbering, etc, faster and easier.
220It is not possible to optimize ``MemoryDef`` in the same way, as we
221restrict ``MemorySSA`` to one heap variable and, thus, one Phi node
222per block.
223
224
225Invalidation and updating
226-------------------------
227
228Because ``MemorySSA`` keeps track of LLVM IR, it needs to be updated whenever
229the IR is updated. "Update", in this case, includes the addition, deletion, and
230motion of IR instructions. The update API is being made on an as-needed basis.
231
232
233Phi placement
234^^^^^^^^^^^^^
235
236``MemorySSA`` only places ``MemoryPhi`` s where they're actually
237needed. That is, it is a pruned SSA form, like LLVM's SSA form. For
238example, consider:
239
240.. code-block:: llvm
241
242 define void @foo() {
243 entry:
244 %p1 = alloca i8
245 %p2 = alloca i8
246 %p3 = alloca i8
247 ; 1 = MemoryDef(liveOnEntry)
248 store i8 0, i8* %p3
249 br label %while.cond
250
251 while.cond:
252 ; 3 = MemoryPhi({%0,1},{if.end,2})
253 br i1 undef, label %if.then, label %if.else
254
255 if.then:
256 br label %if.end
257
258 if.else:
259 br label %if.end
260
261 if.end:
262 ; MemoryUse(1)
263 %1 = load i8, i8* %p1
264 ; 2 = MemoryDef(3)
265 store i8 2, i8* %p2
266 ; MemoryUse(1)
267 %2 = load i8, i8* %p3
268 br label %while.cond
269 }
270
271Because we removed the stores from ``if.then`` and ``if.else``, a ``MemoryPhi``
272for ``if.end`` would be pointless, so we don't place one. So, if you need to
273place a ``MemoryDef`` in ``if.then`` or ``if.else``, you'll need to also create
274a ``MemoryPhi`` for ``if.end``.
275
276If it turns out that this is a large burden, we can just place ``MemoryPhi`` s
277everywhere. Because we have Walkers that are capable of optimizing above said
278phis, doing so shouldn't prohibit optimizations.
279
280
281Non-Goals
282---------
283
284``MemorySSA`` is meant to reason about the relation between memory
285operations, and enable quicker querying.
286It isn't meant to be the single source of truth for all potential memory-related
287optimizations. Specifically, care must be taken when trying to use ``MemorySSA``
288to reason about atomic or volatile operations, as in:
289
290.. code-block:: llvm
291
292 define i8 @foo(i8* %a) {
293 entry:
294 br i1 undef, label %if.then, label %if.end
295
296 if.then:
297 ; 1 = MemoryDef(liveOnEntry)
298 %0 = load volatile i8, i8* %a
299 br label %if.end
300
301 if.end:
302 %av = phi i8 [0, %entry], [%0, %if.then]
303 ret i8 %av
304 }
305
306Going solely by ``MemorySSA``'s analysis, hoisting the ``load`` to ``entry`` may
307seem legal. Because it's a volatile load, though, it's not.
308
309
310Design tradeoffs
311----------------
312
313Precision
314^^^^^^^^^
315``MemorySSA`` in LLVM deliberately trades off precision for speed.
316Let us think about memory variables as if they were disjoint partitions of the
317heap (that is, if you have one variable, as above, it represents the entire
318heap, and if you have multiple variables, each one represents some
319disjoint portion of the heap)
320
321First, because alias analysis results conflict with each other, and
322each result may be what an analysis wants (IE
323TBAA may say no-alias, and something else may say must-alias), it is
324not possible to partition the heap the way every optimization wants.
325Second, some alias analysis results are not transitive (IE A noalias B,
326and B noalias C, does not mean A noalias C), so it is not possible to
327come up with a precise partitioning in all cases without variables to
328represent every pair of possible aliases. Thus, partitioning
329precisely may require introducing at least N^2 new virtual variables,
330phi nodes, etc.
331
332Each of these variables may be clobbered at multiple def sites.
333
334To give an example, if you were to split up struct fields into
335individual variables, all aliasing operations that may-def multiple struct
336fields, will may-def more than one of them. This is pretty common (calls,
337copies, field stores, etc).
338
339Experience with SSA forms for memory in other compilers has shown that
340it is simply not possible to do this precisely, and in fact, doing it
341precisely is not worth it, because now all the optimizations have to
342walk tons and tons of virtual variables and phi nodes.
343
344So we partition. At the point at which you partition, again,
345experience has shown us there is no point in partitioning to more than
346one variable. It simply generates more IR, and optimizations still
347have to query something to disambiguate further anyway.
348
349As a result, LLVM partitions to one variable.
350
351Use Optimization
352^^^^^^^^^^^^^^^^
353
354Unlike other partitioned forms, LLVM's ``MemorySSA`` does make one
355useful guarantee - all loads are optimized to point at the thing that
356actually clobbers them. This gives some nice properties. For example,
357for a given store, you can find all loads actually clobbered by that
358store by walking the immediate uses of the store.