Code Generation Notes for MSA
=============================

Intrinsics are lowered to SelectionDAG nodes where possible in order to enable
optimisation, reduce the size of the ISel matcher, and reduce repetition in
the implementation. In a small number of cases, this can cause different
(semantically equivalent) instructions to be used in place of the requested
instruction, even when no optimisation has taken place.

Instructions
============

This section describes any quirks of instruction selection for MSA. For
example, two instructions might be equally valid for some given IR and one is
chosen in preference to the other.

bclri.b:
        It is not possible to emit bclri.b since andi.b covers exactly the
        same cases. andi.b should use fractionally less power than bclri.b in
        most hardware implementations so it is used in preference to bclri.b.
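
        The equivalence can be checked with a small model of a single 8-bit
        lane. This is only a sketch in Python: the helper names are
        hypothetical and are not part of LLVM or any MSA toolchain.

```python
# Model of one 8-bit lane. All names here are illustrative assumptions.

def bclri_b(lane: int, m: int) -> int:
    """Clear bit m (0..7) of an 8-bit lane, as bclri.b does per element."""
    return lane & ~(1 << m) & 0xFF

def andi_b(lane: int, imm8: int) -> int:
    """AND an 8-bit lane with an 8-bit immediate, as andi.b does per element."""
    return lane & imm8 & 0xFF

def andi_imm_for_bclri(m: int) -> int:
    """Immediate that makes andi.b clear exactly bit m."""
    return ~(1 << m) & 0xFF

def bclri_matches_andi() -> bool:
    """Exhaustively compare both forms over every lane value and bit index."""
    return all(
        bclri_b(lane, m) == andi_b(lane, andi_imm_for_bclri(m))
        for lane in range(256)
        for m in range(8)
    )
```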

vshf.w:
        It is not possible to emit vshf.w when the shuffle description is
        constant since shf.w covers exactly the same cases. shf.w is used
        instead. It is also impossible for the shuffle description to be
        unknown at compile-time due to the definition of shufflevector in
        LLVM IR.

vshf.[bhwd]:
        When the shuffle description describes a splat operation, splat.[bhwd]
        instructions will be selected instead of vshf.[bhwd]. Unlike the ilv*
        and pck* instructions, this is matched from MipsISD::VSHF instead of
        a special-case MipsISD node.

ilvl.d, pckev.d:
        It is not possible to emit ilvl.d or pckev.d since ilvev.d covers the
        same shuffle. ilvev.d will be emitted instead.

ilvr.d, ilvod.d, pckod.d:
        It is not possible to emit ilvr.d or pckod.d since ilvod.d covers the
        same shuffle. ilvod.d will be emitted instead.

splat.[bhwd]:
        The intrinsic will work as expected. However, unlike other intrinsics
        it lowers directly to MipsISD::VSHF instead of using common IR.

splati.w:
        It is not possible to emit splati.w since shf.w covers the same cases.
        shf.w will be emitted instead.
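
        A minimal Python sketch of why shf.w covers splati.w, assuming the
        usual reading of shf.w in which bits [2i+1:2i] of the 8-bit immediate
        select the source element for result element i. The helper names are
        hypothetical and the vectors are modelled as 4-element lists.

```python
# Model of a 4 x i32 vector as a Python list. Names are illustrative only.

def shf_w(ws, imm8):
    """shf.w: result element i is ws[(imm8 >> 2*i) & 3]."""
    return [ws[(imm8 >> (2 * i)) & 3] for i in range(4)]

def splati_w(ws, n):
    """splati.w: replicate element n of ws into all four lanes."""
    return [ws[n]] * 4

def shf_imm_for_splat(n):
    """Repeat the 2-bit index n in all four fields of the immediate."""
    return n * 0x55  # 0x55 == 0b01010101, so this is n | n<<2 | n<<4 | n<<6

def splat_matches_shf(ws):
    """Compare splati.w against the equivalent shf.w for every index."""
    return all(splati_w(ws, n) == shf_w(ws, shf_imm_for_splat(n))
               for n in range(4))
```

        For example, splatting element 2 corresponds to an immediate of 0xAA
        under this model.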

copy_s.w:
        On MIPS32, the copy_u.d intrinsic will emit this instruction instead of
        copy_u.w. This is semantically equivalent since the general-purpose
        register file is 32 bits wide.
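
        The equivalence of copy_s.w and copy_u.w on a 32-bit register file can
        be illustrated with a short Python sketch. It assumes copy_s.w
        sign-extends and copy_u.w zero-extends a 32-bit element to the width
        of the destination register; the function names and the gpr_bits
        parameter are hypothetical.

```python
# Model of copying a 32-bit element into a general-purpose register.

def copy_s_w(elem32, gpr_bits):
    """Sign-extend a 32-bit element into a gpr_bits-wide register."""
    elem32 &= 0xFFFFFFFF
    if elem32 & 0x80000000:
        elem32 -= 1 << 32  # reinterpret as a negative two's-complement value
    return elem32 & ((1 << gpr_bits) - 1)

def copy_u_w(elem32, gpr_bits):
    """Zero-extend a 32-bit element into a gpr_bits-wide register."""
    return (elem32 & 0xFFFFFFFF) & ((1 << gpr_bits) - 1)

def same_on_gpr(elem32, gpr_bits):
    """True when both forms leave the same value in the register."""
    return copy_s_w(elem32, gpr_bits) == copy_u_w(elem32, gpr_bits)
```

        With 32-bit registers there are no upper bits to extend into, so the
        two forms agree for every element value. With 64-bit registers they
        differ whenever bit 31 of the element is set.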

binsri.[bhwd], binsli.[bhwd]:
        These two operations are equivalent to each other with the operands
        swapped and condition inverted. The compiler may use either one as
        appropriate.
        Furthermore, the compiler may use bsel.[bhwd] for some masks that do
        not survive the legalization process (this is a bug and will be fixed).
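
        The relationship between the two can be sketched in Python for 8-bit
        lanes. The sketch assumes that binsli.b copies the m+1 most
        significant bits of ws into wd and that binsri.b copies the m+1 least
        significant bits; the helper names are hypothetical. Swapping the
        operands and using the complementary bit count gives the same lane
        value.

```python
# Model of one 8-bit lane. Names are illustrative, not LLVM APIs.
BITS = 8
MASK = (1 << BITS) - 1

def binsli_b(wd, ws, m):
    """Copy the m+1 most significant bits of ws into wd; keep the rest of wd."""
    left = (MASK << (BITS - (m + 1))) & MASK
    return (ws & left) | (wd & ~left & MASK)

def binsri_b(wd, ws, m):
    """Copy the m+1 least significant bits of ws into wd; keep the rest of wd."""
    right = (1 << (m + 1)) - 1
    return (ws & right) | (wd & ~right & MASK)

def binsli_matches_swapped_binsri():
    """binsli(wd, ws, m) equals binsri(ws, wd, BITS - 2 - m) for m in 0..6."""
    return all(
        binsli_b(wd, ws, m) == binsri_b(ws, wd, BITS - 2 - m)
        for wd in range(0, 256, 5)
        for ws in range(0, 256, 7)
        for m in range(BITS - 1)
    )
```

        The case m = 7 is excluded because binsli.b then copies the whole
        lane, which would require binsri.b to copy zero bits.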

bmnz.v, bmz.v, bsel.v:
        These three operations differ only in the operand that is tied to the
        result and the order of the operands.
        It is (currently) not possible to emit bmz.v or bsel.v since bmnz.v is
        the same operation and will be emitted instead.
        In future, the compiler may choose between these three instructions
        according to register allocation.
        These three operations can be very confusing so here is a mapping
        between the instructions and the vselect node in one place:
                bmz.v  wd, ws, wt/i8 -> (vselect wt/i8, wd, ws)
                bmnz.v wd, ws, wt/i8 -> (vselect wt/i8, ws, wd)
                bsel.v wd, ws, wt/i8 -> (vselect wd, wt/i8, ws)
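
        The mapping above can be checked against a bitwise model of the three
        instructions. The following Python sketch treats vselect as a bitwise
        select over a 128-bit value, as the mapping does, and models each
        instruction from its own description (bmnz.v moves ws bits where wt is
        1, bmz.v where wt is 0, bsel.v selects on the bits of wd). All names
        are hypothetical.

```python
import random

MASK128 = (1 << 128) - 1  # one 128-bit MSA vector register

def vselect(cond, if_set, if_clear):
    """Bitwise select: take if_set where cond is 1, if_clear where it is 0."""
    return ((cond & if_set) | (~cond & if_clear)) & MASK128

def bmnz_v(wd, ws, wt):
    """Copy ws bits into wd where the wt bit is 1."""
    return ((ws & wt) | (wd & ~wt)) & MASK128

def bmz_v(wd, ws, wt):
    """Copy ws bits into wd where the wt bit is 0."""
    return ((ws & ~wt) | (wd & wt)) & MASK128

def bsel_v(wd, ws, wt):
    """Take ws bits where the wd bit is 0 and wt bits where it is 1."""
    return ((ws & ~wd) | (wt & wd)) & MASK128

def check_mapping(trials=1000, seed=0):
    """Compare each instruction model with its vselect form on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        wd, ws, wt = (rng.getrandbits(128) for _ in range(3))
        assert bmz_v(wd, ws, wt) == vselect(wt, wd, ws)
        assert bmnz_v(wd, ws, wt) == vselect(wt, ws, wd)
        assert bsel_v(wd, ws, wt) == vselect(wd, wt, ws)
        # bmz.v and bsel.v are bmnz.v with the operands reordered.
        assert bmz_v(wd, ws, wt) == bmnz_v(ws, wd, wt)
        assert bsel_v(wd, ws, wt) == bmnz_v(ws, wt, wd)
    return True
```

        The last two assertions show the sense in which bmnz.v alone is enough
        to express all three: only the operand order and the operand tied to
        the result change.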

bmnzi.b, bmzi.b:
        Like their non-immediate counterparts, bmnzi.b and bmzi.b are the same
        operation with the operands swapped. bmnzi.b will (currently) be emitted
        for both cases.

bseli.b:
        Unlike the non-immediate versions, bseli.b is distinguishable from
        bmnzi.b and bmzi.b and can be emitted.