Missing support
===============

* The PNaCl LLVM backend expands shufflevector operations into sequences of
  insertelement and extractelement operations. For instance:

    define <4 x i32> @shuffle(<4 x i32> %arg1, <4 x i32> %arg2) {
    entry:
      %res = shufflevector <4 x i32> %arg1,
                           <4 x i32> %arg2,
                           <4 x i32> <i32 4, i32 5, i32 0, i32 1>
      ret <4 x i32> %res
    }

  gets expanded into:

    define <4 x i32> @shuffle(<4 x i32> %arg1, <4 x i32> %arg2) {
    entry:
      %0 = extractelement <4 x i32> %arg2, i32 0
      %1 = insertelement <4 x i32> undef, i32 %0, i32 0
      %2 = extractelement <4 x i32> %arg2, i32 1
      %3 = insertelement <4 x i32> %1, i32 %2, i32 1
      %4 = extractelement <4 x i32> %arg1, i32 0
      %5 = insertelement <4 x i32> %3, i32 %4, i32 2
      %6 = extractelement <4 x i32> %arg1, i32 1
      %7 = insertelement <4 x i32> %5, i32 %6, i32 3
      ret <4 x i32> %7
    }

  Subzero should recognize these sequences and recombine them into
  shuffle operations where appropriate.
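
  The recombination step can be sketched as a backward walk over the
  insertelement chain. The sketch below is illustrative only: Extract, Insert,
  Undef, NumLanes, and recoverShuffleMask are hypothetical names over a
  simplified IR model, not Subzero's classes, and it assumes a 4-lane result
  built from two source vectors.

  ```cpp
  #include <array>
  #include <cassert>
  #include <cstddef>
  #include <cstdint>
  #include <map>
  #include <vector>

  // Hypothetical, simplified IR model (not Subzero's classes). Values are
  // identified by integer ids.
  constexpr int Undef = -1;
  constexpr uint32_t NumLanes = 4;
  struct Extract { int Dest; int SrcVec; uint32_t Index; };
  struct Insert { int Dest; int SrcVec; int Scalar; uint32_t Index; };

  // Walks the insertelement chain backwards from Result. On success, Mask[I]
  // uses shufflevector numbering: 0..3 select from Arg1, 4..7 from Arg2.
  bool recoverShuffleMask(const std::vector<Extract> &Extracts,
                          const std::vector<Insert> &Inserts, int Result,
                          int Arg1, int Arg2,
                          std::array<uint32_t, NumLanes> &Mask) {
    std::map<int, Extract> ExtractOf;
    for (const Extract &E : Extracts)
      ExtractOf.emplace(E.Dest, E);
    std::map<int, Insert> InsertOf;
    for (const Insert &I : Inserts)
      InsertOf.emplace(I.Dest, I);

    std::array<bool, NumLanes> Filled{};
    uint32_t NumFilled = 0;
    int Cur = Result;
    // The step bound guards against malformed (cyclic) input.
    for (std::size_t Step = 0; Cur != Undef; ++Step) {
      auto InsIt = InsertOf.find(Cur);
      if (Step >= Inserts.size() || InsIt == InsertOf.end())
        return false;
      const Insert &I = InsIt->second;
      if (I.Index >= NumLanes)
        return false;
      // The insert closest to Result wins; earlier writes to the lane are dead.
      if (!Filled[I.Index]) {
        auto ExtIt = ExtractOf.find(I.Scalar);
        if (ExtIt == ExtractOf.end() || ExtIt->second.Index >= NumLanes)
          return false;
        const Extract &E = ExtIt->second;
        if (E.SrcVec == Arg1)
          Mask[I.Index] = E.Index;
        else if (E.SrcVec == Arg2)
          Mask[I.Index] = NumLanes + E.Index;
        else
          return false;
        Filled[I.Index] = true;
        ++NumFilled;
      }
      Cur = I.SrcVec;
    }
    return NumFilled == NumLanes;
  }

  // Encodes the expanded @shuffle above: ids 0..7 are %0..%7, 100 is %arg1,
  // and 101 is %arg2.
  void checkExampleFromText() {
    const std::vector<Extract> Ex = {
        {0, 101, 0}, {2, 101, 1}, {4, 100, 0}, {6, 100, 1}};
    const std::vector<Insert> In = {
        {1, Undef, 0, 0}, {3, 1, 2, 1}, {5, 3, 4, 2}, {7, 5, 6, 3}};
    std::array<uint32_t, NumLanes> Mask{};
    const bool Ok = recoverShuffleMask(Ex, In, 7, 100, 101, Mask);
    assert(Ok && Mask[0] == 4 && Mask[1] == 5 && Mask[2] == 0 && Mask[3] == 1);
    (void)Ok;
  }
  ```

  For the expanded function above the recovered mask is <4, 5, 0, 1>, matching
  the original shufflevector. A real pass would also have to check that the
  intermediate values have no other uses before deleting them.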

* Add support for vector constants in the backend. The current code
  materializes the vector constants it needs (e.g., for performing icmp on
  unsigned operands) using register operations, but they should instead be
  loaded from a constant pool when the register initialization is too
  complicated (such as in TargetX8632::makeVectorOfHighOrderBits()).
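
  One likely use of a high-order-bits constant: the SSE2 packed greater-than
  comparison (pcmpgtd) is signed, so an unsigned comparison is commonly done by
  flipping the sign bit of both operands first. The per-lane sketch below
  models that idea in portable C++. V4, HighOrderBits, signedGreaterThan, and
  unsignedGreaterThan are hypothetical names, and that Subzero lowers unsigned
  icmp this way is an assumption based on the item above.

  ```cpp
  #include <array>
  #include <cassert>
  #include <cstddef>
  #include <cstdint>

  using V4 = std::array<uint32_t, 4>;

  // Hypothetical stand-in for the constant: each lane holds only its
  // high-order bit.
  const V4 HighOrderBits = {{0x80000000u, 0x80000000u, 0x80000000u, 0x80000000u}};

  // Models a signed per-lane greater-than; a true lane is all ones. The cast
  // relies on two's complement conversion of out-of-range values.
  V4 signedGreaterThan(const V4 &A, const V4 &B) {
    V4 R{};
    for (std::size_t I = 0; I < 4; ++I)
      R[I] = static_cast<int32_t>(A[I]) > static_cast<int32_t>(B[I])
                 ? 0xFFFFFFFFu
                 : 0u;
    return R;
  }

  // Unsigned greater-than built from the signed one: XOR with the high-order
  // bit maps unsigned order onto signed order.
  V4 unsignedGreaterThan(const V4 &A, const V4 &B) {
    V4 FlippedA{}, FlippedB{};
    for (std::size_t I = 0; I < 4; ++I) {
      FlippedA[I] = A[I] ^ HighOrderBits[I];
      FlippedB[I] = B[I] ^ HighOrderBits[I];
    }
    return signedGreaterThan(FlippedA, FlippedB);
  }

  // Compares the result against the native unsigned operator on each lane.
  void checkUnsignedGreaterThan() {
    const V4 A = {{0xFFFFFFFFu, 1u, 0x80000000u, 5u}};
    const V4 B = {{1u, 0xFFFFFFFFu, 0x7FFFFFFFu, 5u}};
    const V4 R = unsignedGreaterThan(A, B);
    for (std::size_t I = 0; I < 4; ++I)
      assert(R[I] == (A[I] > B[I] ? 0xFFFFFFFFu : 0u));
    (void)R;
  }
  ```

  Loading HighOrderBits from a constant pool would replace whatever register
  sequence currently builds it; the comparison itself would be unchanged.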

* [x86 specific] llvm-mc does not allow lea to take a mem128 memory operand
  when assembling x86-32 code. The current InstX8632Lea::emit() code uses
  Variable::asType() to convert any mem128 Variables into a compatible memory
  operand type. However, the emit code does not do any conversions of
  OperandX8632Mem, so if an OperandX8632Mem is passed to lea as mem128 the
  resulting code will not assemble. One way to fix this is by implementing
  OperandX8632Mem::asType().

* [x86 specific] Lower shl with <4 x i32> using some clever float conversion:
  http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20100726/105087.html
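
  A commonly used form of this trick, which is assumed here to be what the
  linked message describes, builds the float 2^n directly from its bit pattern,
  converts it to an integer, and multiplies. The sketch below models it one
  lane at a time in portable C++; V4 and shlViaFloat are hypothetical names. A
  vector lowering would presumably use a packed shift, add, convert, and
  multiply instead of the loop.

  ```cpp
  #include <array>
  #include <cassert>
  #include <cstddef>
  #include <cstdint>
  #include <cstring>

  using V4 = std::array<uint32_t, 4>;

  // Per-lane model: Value << Amt computed as Value * 2^Amt, where 2^Amt comes
  // from a float whose exponent field is written directly.
  V4 shlViaFloat(const V4 &Value, const V4 &Amt) {
    V4 R{};
    for (std::size_t I = 0; I < 4; ++I) {
      assert(Amt[I] < 32 && "shift amount must be below the lane width");
      // 0x3f800000 is 1.0f; adding Amt << 23 puts Amt + 127 in the exponent
      // field of an IEEE-754 single, which encodes 2^Amt exactly.
      const uint32_t Bits = (Amt[I] << 23) + 0x3f800000u;
      float Pow2;
      std::memcpy(&Pow2, &Bits, sizeof(Pow2));
      // Converting to an unsigned type keeps 2^31 in range in C++.
      const uint32_t Factor = static_cast<uint32_t>(Pow2);
      // Unsigned multiplication wraps modulo 2^32, like a left shift.
      R[I] = Value[I] * Factor;
    }
    return R;
  }

  // Compares the result against the native shift operator on each lane.
  void checkShlViaFloat() {
    const V4 Value = {{1u, 3u, 0xFFFFFFFFu, 0x12345678u}};
    const V4 Amt = {{0u, 4u, 31u, 8u}};
    const V4 R = shlViaFloat(Value, Amt);
    for (std::size_t I = 0; I < 4; ++I)
      assert(R[I] == (Value[I] << Amt[I]));
    (void)R;
  }
  ```

  The sketch only covers shift amounts from 0 to 31; how larger amounts should
  behave is left open here.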

* [x86 specific] Add support for using aligned mov operations (movaps). This
  will require passing alignment information to loads and stores.

x86 SIMD Diversification
========================

* Vector "bitwise" operations have several variant instructions: the AND
  operation can be implemented with pand, andpd, or andps. This pattern also
  holds for ANDN, OR, and XOR.

* Vector "mov" instructions can be diversified (e.g., movdqu instead of movups)
  at the cost of a possible performance penalty.

* Scalar FP arithmetic can be diversified by performing the operations with the
  vector version of the instructions.
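
The choice among equivalent encodings, such as the three AND variants in the
first item, could be driven by a seeded random number generator so that a given
seed reproduces the same output. A minimal sketch, assuming a hypothetical
helper pickVectorAnd that is not part of Subzero:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <random>
#include <string>

// Hypothetical helper: picks one of the interchangeable mnemonics for a
// 128-bit bitwise AND using the caller's generator.
std::string pickVectorAnd(std::mt19937 &Rng) {
  static const std::array<const char *, 3> Variants = {
      {"pand", "andps", "andpd"}};
  std::uniform_int_distribution<std::size_t> Dist(0, Variants.size() - 1);
  return Variants[Dist(Rng)];
}

// Two generators with the same seed make the same sequence of choices.
void checkSameSeedSameChoices() {
  std::mt19937 First(42), Second(42);
  for (int I = 0; I < 16; ++I) {
    const std::string A = pickVectorAnd(First);
    const std::string B = pickVectorAnd(Second);
    assert(A == B);
    assert(A == "pand" || A == "andps" || A == "andpd");
  }
}
```

The same table-driven selection would extend to ANDN, OR, and XOR by adding one
list of variants per operation.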