Target-specific lowering in ICE
===============================

This document discusses several issues around generating target-specific ICE
instructions from high-level ICE instructions.

Meeting register address mode constraints
-----------------------------------------

Target-specific instructions often require specific operands to be in physical
registers. Sometimes one specific register is required, but usually any
register in a particular register class will suffice, and that register class is
defined by the instruction/operand type.

The challenge is that ``Variable`` represents an operand that is either a stack
location in the current frame, or a physical register. Register allocation
happens after target-specific lowering, so during lowering we generally don't
know whether a ``Variable`` operand will meet a target instruction's physical
register requirement.

To this end, ICE allows certain hints/directives:

  * ``Variable::setWeightInfinite()`` forces a ``Variable`` to get some
    physical register (without specifying which particular one) from a
    register class.

  * ``Variable::setRegNum()`` forces a ``Variable`` to be assigned a specific
    physical register.

  * ``Variable::setPreferredRegister()`` registers a preference for a physical
    register based on another ``Variable``'s physical register assignment.

These hints/directives are described below in more detail. In most cases,
though, they don't need to be used explicitly, as the routines that create
lowered instructions have reasonable defaults and simple options that control
these hints/directives.

The recommended ICE lowering strategy is to generate extra assignment
instructions involving extra ``Variable`` temporaries, using the
hints/directives to force suitable register assignments for the temporaries, and
then let the global register allocator clean things up.

Note: There is a spectrum of *implementation complexity* versus *translation
speed* versus *code quality*. This recommended strategy picks a point on the
spectrum representing very low complexity ("splat-isel"), pretty good code
quality in terms of frame size and register shuffling/spilling, but perhaps not
the fastest translation speed since extra instructions and operands are created
up front and cleaned up at the end.

Ensuring some physical register
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The x86 instruction::

    mov dst, src

needs at least one of its operands in a physical register (ignoring the case
where ``src`` is a constant). This can be done as follows::

    mov reg, src
    mov dst, reg

so long as ``reg`` is guaranteed to have a physical register assignment. The
low-level lowering code that accomplishes this looks something like::

    Variable *Reg;
    Reg = Func->makeVariable(Dst->getType());
    Reg->setWeightInfinite();
    NewInst = InstX8632Mov::create(Func, Reg, Src);
    NewInst = InstX8632Mov::create(Func, Dst, Reg);

``Cfg::makeVariable()`` generates a new temporary, and
``Variable::setWeightInfinite()`` gives it infinite weight for the purpose of
register allocation, thus guaranteeing it a physical register.

The ``_mov(Dest, Src)`` method in the ``TargetX8632`` class is sufficiently
powerful to handle these details in most situations. Its ``Dest`` argument is
an in/out parameter. If its input value is ``NULL``, then a new temporary
variable is created, its type is set to the same type as the ``Src`` operand, it
is given infinite register weight, and the new ``Variable`` is returned through
the in/out parameter. (This is in addition to the new temporary being the dest
operand of the ``mov`` instruction.) The simpler version of the above example
is::

    Variable *Reg = NULL;
    _mov(Reg, Src);
    _mov(Dst, Reg);

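The operand constraint behind the two-instruction sequence can be checked
exhaustively with a small standalone model. This is only a sketch, not ICE
code: ``ToyLoc``, ``isLegalMov()``, and ``isLegalViaTemp()`` are hypothetical
names introduced here, and the model assumes a non-constant ``src``.

```cpp
// Toy model (not ICE code): where an operand ends up after register allocation.
enum class ToyLoc { Register, Stack };

// A mov with a non-constant source needs at least one operand in a register.
constexpr bool isLegalMov(ToyLoc Dst, ToyLoc Src) {
  return Dst == ToyLoc::Register || Src == ToyLoc::Register;
}

// The lowered sequence "mov reg, src; mov dst, reg", where reg is a temporary
// that is guaranteed a physical register (infinite weight).
constexpr bool isLegalViaTemp(ToyLoc Dst, ToyLoc Src) {
  return isLegalMov(ToyLoc::Register, Src) &&
         isLegalMov(Dst, ToyLoc::Register);
}

// The direct form fails only when both operands live on the stack.
static_assert(!isLegalMov(ToyLoc::Stack, ToyLoc::Stack),
              "memory-to-memory mov must be rejected");
// The sequence through the temporary is legal in that case too.
static_assert(isLegalViaTemp(ToyLoc::Stack, ToyLoc::Stack),
              "lowering through a register temporary must be legal");
```

Because the temporary is always in a register, the lowering never has to know
where ``dst`` and ``src`` will finally be placed.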
Preferring another ``Variable``'s physical register
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

One problem with this example is that the register allocator usually just
assigns the first available register to a live range. If this instruction ends
the live range of ``src``, this may lead to code like the following::

    mov reg:eax, src:esi
    mov dst:edi, reg:eax

Since the first instruction happens to end the live range of ``src:esi``, it
would be better to assign ``esi`` to ``reg``::

    mov reg:esi, src:esi
    mov dst:edi, reg:esi

The first instruction, ``mov esi, esi``, is a redundant assignment and will
ultimately be elided, leaving just ``mov edi, esi``.

We can tell the register allocator to prefer the register assigned to a
different ``Variable``, using ``Variable::setPreferredRegister()``::

    Variable *Reg;
    Reg = Func->makeVariable(Dst->getType());
    Reg->setWeightInfinite();
    Reg->setPreferredRegister(Src);
    NewInst = InstX8632Mov::create(Func, Reg, Src);
    NewInst = InstX8632Mov::create(Func, Dst, Reg);

Or more simply::

    Variable *Reg = NULL;
    _mov(Reg, Src);
    _mov(Dst, Reg);
    Reg->setPreferredRegister(llvm::dyn_cast<Variable>(Src));

The usefulness of ``setPreferredRegister()`` is tied into the implementation of
the register allocator. ICE uses linear-scan register allocation, which sorts
live ranges by starting point and assigns registers in that order. Using
``B->setPreferredRegister(A)`` only helps when ``A`` has already been assigned a
register by the time ``B`` is being considered. For an assignment ``B=A``, this
is usually a safe assumption because ``B``'s live range begins at this
instruction but ``A``'s live range must have started earlier. (There may be
exceptions for variables that are no longer in SSA form.) But
``A->setPreferredRegister(B)`` is unlikely to help unless ``B`` has been
precolored. In summary, generally the best practice is to use a pattern like::

    NewInst = InstX8632Mov::create(Func, Dst, Src);
    Dst->setPreferredRegister(Src);
    // Src->setPreferredRegister(Dst); -- unlikely to have any effect

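The ordering argument above can be illustrated with a deliberately simplified
linear scan. This is a toy model, not ICE's allocator: ``ToyLiveRange``,
``toyLinearScan()``, and ``NoRegister`` are hypothetical names, live ranges are
single half-open intervals, registers are plain integers, and there is no
spilling heuristic or infinite-weight handling.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

constexpr int NoRegister = -1;

// Hypothetical stand-in for a variable's live range; not an ICE type.
struct ToyLiveRange {
  int Start;  // instruction number where the range begins
  int End;    // one past the last instruction covered (half-open interval)
  int Prefer; // index of the variable whose register is preferred, or -1
};

// Returns the register chosen for each variable, indexed like Ranges.
std::vector<int> toyLinearScan(const std::vector<ToyLiveRange> &Ranges,
                               int NumRegs) {
  // Visit live ranges in order of increasing start point.
  std::vector<int> Order(Ranges.size());
  std::iota(Order.begin(), Order.end(), 0);
  std::stable_sort(Order.begin(), Order.end(), [&Ranges](int L, int R) {
    return Ranges[L].Start < Ranges[R].Start;
  });

  std::vector<int> Assigned(Ranges.size(), NoRegister);
  for (int Cur : Order) {
    // Mark registers held by already-assigned, overlapping live ranges.
    std::vector<bool> Busy(NumRegs, false);
    for (std::size_t I = 0; I < Ranges.size(); ++I) {
      if (Assigned[I] != NoRegister && Ranges[I].Start < Ranges[Cur].End &&
          Ranges[Cur].Start < Ranges[I].End)
        Busy[Assigned[I]] = true;
    }
    // The preference only helps if the preferred variable already has a
    // register and that register is free over the current range.
    int Pref = Ranges[Cur].Prefer;
    if (Pref >= 0 && Assigned[Pref] != NoRegister && !Busy[Assigned[Pref]]) {
      Assigned[Cur] = Assigned[Pref];
      continue;
    }
    // Otherwise fall back to the first available register, if any.
    for (int Reg = 0; Reg < NumRegs; ++Reg) {
      if (!Busy[Reg]) {
        Assigned[Cur] = Reg;
        break;
      }
    }
  }
  return Assigned;
}
```

With three ranges ``T=[0,2)``, ``Src=[1,3)``, ``Reg=[3,4)`` and two registers,
the model gives ``Reg`` register 0 by default and register 1 (the one held by
``Src``) when ``Reg`` prefers ``Src``. Making ``Src`` prefer ``Reg`` instead
changes nothing, because ``Reg`` has no register yet when ``Src`` is visited.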
Ensuring a specific physical register
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Some instructions require operands in specific physical registers, or produce
results in specific physical registers. For example, the 32-bit ``ret``
instruction needs its operand in ``eax``. This can be done with
``Variable::setRegNum()``::

    Variable *Reg;
    Reg = Func->makeVariable(Src->getType());
    Reg->setWeightInfinite();
    Reg->setRegNum(Reg_eax);
    NewInst = InstX8632Mov::create(Func, Reg, Src);
    NewInst = InstX8632Ret::create(Func, Reg);

Precoloring a ``Variable`` with ``Variable::setRegNum()`` effectively gives it
infinite weight for register allocation, so the call to
``Variable::setWeightInfinite()`` is technically unnecessary, but perhaps
documents the intention a bit more strongly.

The ``_mov(Dest, Src, RegNum)`` method in the ``TargetX8632`` class has an
optional ``RegNum`` argument to force a specific register assignment when the
input ``Dest`` is ``NULL``. As described above, passing in ``Dest=NULL`` causes
a new temporary variable to be created with infinite register weight, and in
addition the specific register is chosen. The simpler version of the above
example is::

    Variable *Reg = NULL;
    _mov(Reg, Src, Reg_eax);
    _ret(Reg);

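The in/out ``Dest`` convention and the optional ``RegNum`` argument can be
modeled in isolation. The following is a minimal sketch under stated
assumptions, not the ``TargetX8632`` implementation: ``ToyVariable``,
``ToyMov``, ``ToyLowering``, ``ToyReg``, and the register numbering are all
hypothetical, and operand types are omitted.

```cpp
#include <memory>
#include <vector>

constexpr int NoRegister = -1;

// Hypothetical register numbering, for illustration only.
enum ToyReg { Reg_eax, Reg_ecx, Reg_edx };

// Hypothetical stand-in for a Variable; it carries only the two hints.
struct ToyVariable {
  bool InfiniteWeight = false;
  int RegNum = NoRegister;
};

// Hypothetical stand-in for a lowered mov instruction.
struct ToyMov {
  ToyVariable *Dest;
  const ToyVariable *Src;
};

class ToyLowering {
public:
  // Dest is an in/out parameter: a null input requests a fresh temporary,
  // which is returned to the caller through the same parameter.
  void mov(ToyVariable *&Dest, const ToyVariable *Src,
           int RegNum = NoRegister) {
    if (Dest == nullptr) {
      Temps.push_back(std::make_unique<ToyVariable>());
      Dest = Temps.back().get();
      Dest->InfiniteWeight = true; // must end up in some physical register
      Dest->RegNum = RegNum;       // optionally a specific one
    }
    Insts.push_back(ToyMov{Dest, Src});
  }

  std::vector<ToyMov> Insts; // lowered instructions, in emission order

private:
  std::vector<std::unique_ptr<ToyVariable>> Temps; // owns the temporaries
};
```

In this model, ``RegNum`` is applied only when a new temporary is created, so
passing an existing ``Dest`` leaves that variable's hints untouched.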
Disabling live-range interference
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Another problem with the "``mov reg,src; mov dst,reg``" example happens when
the instructions do *not* end the live range of ``src``. In this case, the live
ranges of ``reg`` and ``src`` interfere, so they can't get the same physical
register despite the explicit preference. However, ``reg`` is meant to be an
alias of ``src`` so they needn't be considered to interfere with each other.
This can be expressed via the second (bool) argument of
``setPreferredRegister()``::

    Variable *Reg;
    Reg = Func->makeVariable(Dst->getType());
    Reg->setWeightInfinite();
    Reg->setPreferredRegister(Src, true);
    NewInst = InstX8632Mov::create(Func, Reg, Src);
    NewInst = InstX8632Mov::create(Func, Dst, Reg);

This should be used with caution and probably only for these short-live-range
temporaries; otherwise the classic "lost copy" or "lost swap" problem may be
encountered.

Instructions with register side effects
---------------------------------------

Some instructions produce unwanted results in other registers, or otherwise kill
preexisting values in other registers. For example, a ``call`` kills the
scratch registers. Also, the x86-32 ``idiv`` instruction produces the quotient
in ``eax`` and the remainder in ``edx``, but generally only one of those is
needed in the lowering. It's important that the register allocator doesn't
allocate such a killed register to a live range that spans the instruction.

ICE provides the ``InstFakeKill`` pseudo-instruction to mark such register
kills. For each of the instruction's source variables, a fake trivial live
range is created that begins and ends in that instruction. The ``InstFakeKill``
instruction is inserted after the ``call`` instruction. For example::

    CallInst = InstX8632Call::create(Func, ... );
    VarList KilledRegs;
    KilledRegs.push_back(eax);
    KilledRegs.push_back(ecx);
    KilledRegs.push_back(edx);
    NewInst = InstFakeKill::create(Func, KilledRegs, CallInst);

The last argument to the ``InstFakeKill`` constructor links it to the previous
call instruction, such that if its linked instruction is dead-code eliminated,
the ``InstFakeKill`` instruction is eliminated as well.

The killed register arguments need to be assigned a physical register via
``Variable::setRegNum()`` for this to be effective. To avoid a massive
proliferation of ``Variable`` temporaries, the ``TargetLowering`` object caches
one precolored ``Variable`` for each physical register::

    CallInst = InstX8632Call::create(Func, ... );
    VarList KilledRegs;
    Variable *eax = Func->getTarget()->getPhysicalRegister(Reg_eax);
    Variable *ecx = Func->getTarget()->getPhysicalRegister(Reg_ecx);
    Variable *edx = Func->getTarget()->getPhysicalRegister(Reg_edx);
    KilledRegs.push_back(eax);
    KilledRegs.push_back(ecx);
    KilledRegs.push_back(edx);
    NewInst = InstFakeKill::create(Func, KilledRegs, CallInst);

At first glance, it may seem unnecessary to explicitly kill the register that
returns the ``call`` return value. However, if for some reason the ``call``
result ends up being unused, dead-code elimination could remove dead assignments
and expose the return value register to a register allocation assignment
spanning the call, which would be incorrect.

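The effect of the trivial precolored live ranges can be sketched with a small
interference check. This is an illustrative model only, not ICE's register
allocator: ``ToyLiveRange``, ``toyFakeKill()``, ``toyCanAssign()``, ``ToyReg``,
and the register numbering are hypothetical, and live ranges are half-open
intervals over instruction numbers.

```cpp
#include <vector>

// Hypothetical register numbering, for illustration only.
enum ToyReg { Reg_eax, Reg_ecx, Reg_edx, Reg_ebx };

// Hypothetical precolored live range covering [Start, End).
struct ToyLiveRange {
  int Start;
  int End;
  int RegNum;
};

// Models a fake kill at instruction InstNum: one trivial live range per
// killed register, beginning and ending at that instruction.
std::vector<ToyLiveRange> toyFakeKill(int InstNum,
                                      const std::vector<int> &KilledRegs) {
  std::vector<ToyLiveRange> Ranges;
  for (int Reg : KilledRegs)
    Ranges.push_back(ToyLiveRange{InstNum, InstNum + 1, Reg});
  return Ranges;
}

// A candidate range [Start, End) may use RegNum only if no precolored range
// for the same register overlaps it.
bool toyCanAssign(int Start, int End, int RegNum,
                  const std::vector<ToyLiveRange> &Precolored) {
  for (const ToyLiveRange &P : Precolored) {
    if (P.RegNum == RegNum && P.Start < End && Start < P.End)
      return false;
  }
  return true;
}
```

With a kill of ``eax``, ``ecx``, and ``edx`` at instruction 5, a variable live
over ``[3,8)`` cannot take ``eax`` in this model but can take ``ebx``, while a
variable whose range starts after the kill can still use ``eax``.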
Instructions producing multiple values
--------------------------------------

ICE instructions allow at most one destination ``Variable``. Some machine
instructions produce more than one usable result. For example, the x86-32
``call`` ABI returns a 64-bit integer result in the ``edx:eax`` register pair.
Also, x86-32 has a version of the ``imul`` instruction that produces a 64-bit
result in the ``edx:eax`` register pair.

To support multi-dest instructions, ICE provides the ``InstFakeDef``
pseudo-instruction, whose destination can be precolored to the appropriate
physical register. For example, a ``call`` returning a 64-bit result in
``edx:eax``::

    CallInst = InstX8632Call::create(Func, RegLow, ... );
    ...
    NewInst = InstFakeKill::create(Func, KilledRegs, CallInst);
    Variable *RegHigh = Func->makeVariable(IceType_i32);
    RegHigh->setRegNum(Reg_edx);
    NewInst = InstFakeDef::create(Func, RegHigh);

``RegHigh`` is then assigned into the desired ``Variable``. If that assignment
ends up being dead-code eliminated, the ``InstFakeDef`` instruction may be
eliminated as well.

Preventing dead-code elimination
--------------------------------

ICE instructions with a non-NULL ``Dest`` are subject to dead-code elimination.
However, some instructions must not be eliminated in order to preserve side
effects. This applies to most function calls, volatile loads, and loads and
integer divisions where the underlying language and runtime are relying on
hardware exception handling.

ICE facilitates this with the ``InstFakeUse`` pseudo-instruction. This forces a
use of its source ``Variable`` to keep that variable's definition alive. Since
the ``InstFakeUse`` instruction has no ``Dest``, it will not be eliminated.

Here is the full example of the x86-32 ``call`` returning a 32-bit integer
result::

    Variable *Reg = Func->makeVariable(IceType_i32);
    Reg->setRegNum(Reg_eax);
    CallInst = InstX8632Call::create(Func, Reg, ... );
    VarList KilledRegs;
    Variable *eax = Func->getTarget()->getPhysicalRegister(Reg_eax);
    Variable *ecx = Func->getTarget()->getPhysicalRegister(Reg_ecx);
    Variable *edx = Func->getTarget()->getPhysicalRegister(Reg_edx);
    KilledRegs.push_back(eax);
    KilledRegs.push_back(ecx);
    KilledRegs.push_back(edx);
    NewInst = InstFakeKill::create(Func, KilledRegs, CallInst);
    NewInst = InstFakeUse::create(Func, Reg);
    NewInst = InstX8632Mov::create(Func, Result, Reg);

Without the ``InstFakeUse``, the entire call sequence could be dead-code
eliminated if its result were unused.

One more note on this topic. These tools can be used to allow a multi-dest
instruction to be dead-code eliminated only when none of its results is live.
The key is to use the optional source parameter of the ``InstFakeDef``
instruction. Using pseudocode::

    t1:eax = call foo(arg1, ...)
    InstFakeKill(eax, ecx, edx)
    t2:edx = InstFakeDef(t1)
    v_result_low = t1
    v_result_high = t2

If ``v_result_high`` is live but ``v_result_low`` is dead, adding ``t1`` as an
argument to ``InstFakeDef`` suffices to keep the ``call`` instruction live.
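The interaction between dead-code elimination, ``InstFakeUse``, and the source
argument of ``InstFakeDef`` can be sketched with a toy backward liveness pass.
This is not ICE's implementation: ``ToyInst`` and ``toyDeadCodeEliminate()``
are hypothetical, variables are plain integer ids, the model covers a single
straight-line block, and ``InstFakeKill`` is left out for brevity.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for an instruction; not an ICE type.
struct ToyInst {
  std::string Name;
  int Dest;              // variable id, or -1 when there is no Dest
  std::vector<int> Srcs; // variable ids read by the instruction
};

// Returns the names of the instructions that survive, in program order.
// An instruction with a Dest is dropped when that Dest is not live after it;
// an instruction without a Dest is always kept.
std::vector<std::string>
toyDeadCodeEliminate(const std::vector<ToyInst> &Block) {
  std::set<int> Live;
  std::vector<std::string> Kept;
  for (auto It = Block.rbegin(); It != Block.rend(); ++It) {
    const bool HasDest = It->Dest >= 0;
    if (HasDest && Live.count(It->Dest) == 0)
      continue; // dead: its sources do not become live
    if (HasDest)
      Live.erase(It->Dest);
    Live.insert(It->Srcs.begin(), It->Srcs.end());
    Kept.push_back(It->Name);
  }
  std::reverse(Kept.begin(), Kept.end());
  return Kept;
}
```

In this model, a block ``t1 = call; t2 = fakedef(t1); low = t1; high = t2``
followed by a use of ``high`` keeps the ``call`` only because the fake
definition reads ``t1``. If the fake definition has no source, the ``call`` is
dropped even though ``high`` is live. Likewise, a ``fakeuse`` of the call
result keeps the ``call`` alive when the final ``mov`` is dead.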