Target Independent Opportunities:

//===---------------------------------------------------------------------===//

FreeBench/mason contains code like this:

static p_type m0u(p_type p) {
  int m[]={0, 8, 1, 2, 16, 5, 13, 7, 14, 9, 3, 4, 11, 12, 15, 10, 17, 6};
  p_type pu;
  pu.a = m[p.a];
  pu.b = m[p.b];
  pu.c = m[p.c];
  return pu;
}

We currently compile this into a memcpy from a static array into 'm', then
a bunch of loads from 'm'. It would be better to avoid the memcpy and just do
the loads from the static array directly.
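A source-level sketch of the desired form (the struct definition here is a
hypothetical stand-in for p_type): making the table 'static const' keeps it
in read-only memory, so no per-call copy is needed:

typedef struct { int a, b, c; } p_type;  /* hypothetical layout */

static p_type m0u_opt(p_type p) {
  static const int m[] = {0, 8, 1, 2, 16, 5, 13, 7, 14, 9,
                          3, 4, 11, 12, 15, 10, 17, 6};
  p_type pu;
  pu.a = m[p.a];  /* each access is now a single load from the array */
  pu.b = m[p.b];
  pu.c = m[p.c];
  return pu;
}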

//===---------------------------------------------------------------------===//

Make the PPC branch selector target independent.

//===---------------------------------------------------------------------===//

Get the C front-end to expand hypot(x,y) -> llvm.sqrt(x*x+y*y) when errno and
precision don't matter (-ffast-math). Misc/mandel will like this. :)
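At the source level the rewrite is just the naive formula (a sketch; what
-ffast-math licenses us to ignore is errno and hypot's protection against
intermediate overflow/underflow):

#include <math.h>

double hypot_fast(double x, double y) {
  return sqrt(x * x + y * y);  /* sqrt maps onto llvm.sqrt */
}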

//===---------------------------------------------------------------------===//

Solve this DAG isel folding deficiency:

int X, Y;

void fn1(void)
{
  X = X | (Y << 3);
}

compiles to

fn1:
        movl Y, %eax
        shll $3, %eax
        orl X, %eax
        movl %eax, X
        ret

The problem is that the store's chain operand is not the load X but rather
a TokenFactor of the load X and load Y, which prevents the folding.

There are two ways to fix this:

1. The dag combiner can start using alias analysis to realize that X and Y
   don't alias, making the store to X not dependent on the load from Y.
2. The generated isel could be made smarter in cases where it can't
   disambiguate the pointers.

Number 1 is the preferred solution.

This has been "fixed" by a TableGen hack, but that is a short-term workaround
which will be removed once the proper fix is made.

//===---------------------------------------------------------------------===//

Turn this into a signed shift right in instcombine:

int f(unsigned x) {
  return x >> 31 ? -1 : 0;
}
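The desired result is a single arithmetic shift (a sketch; it relies on the
common arithmetic right-shift behavior for negative signed values, which is
what the instcombine transform would target):

int f_opt(unsigned x) {
  return (int)x >> 31;  /* sign bit replicated: 0 -> 0, 1 -> -1 */
}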

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25600
http://gcc.gnu.org/ml/gcc-patches/2006-02/msg01492.html

//===---------------------------------------------------------------------===//

On targets with an expensive 64-bit multiply, we could LSR this:

for (i = ...; ++i) {
  x = 1ULL << i;
}

into:
  long long tmp = 1;
  for (i = ...; ++i, tmp+=tmp)
    x = tmp;

This would be a win on ppc32, but not x86 or ppc64.

//===---------------------------------------------------------------------===//

Shrink: (setlt (loadi32 P), 0) -> (setlt (loadi8 Phi), 0)
(i.e., load only the byte that contains the sign bit)
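A C-level sketch of the shrink (byte index 3 assumes a little-endian target;
the sign byte would be index 0 on a big-endian one):

int is_negative(int *P) {
  return *P < 0;  /* setlt (loadi32 P), 0 */
}

int is_negative_shrunk(int *P) {
  /* only the byte holding the sign bit needs to be loaded */
  return (signed char)((unsigned char *)P)[3] < 0;
}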

//===---------------------------------------------------------------------===//

Reassociate should turn X*X*X*X -> t=(X*X); (t*t) to eliminate a multiply.
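A sketch of the payoff: evaluated left to right, X*X*X*X costs three
multiplies, while squaring the square costs two:

int pow4(int x) {
  int t = x * x;  /* one multiply */
  return t * t;   /* second multiply; x*x*x*x would need three */
}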

//===---------------------------------------------------------------------===//

An interesting testcase for add/shift/mul reassociation:

int bar(int x, int y) {
  return x*x*x+y+x*x*x*x*x*y*y*y*y;
}
int foo(int z, int n) {
  return bar(z, n) + bar(2*z, 2*n);
}

//===---------------------------------------------------------------------===//

These two functions should generate the same code on big-endian systems:

int g(int *j, int *l) { return memcmp(j, l, 4); }
int h(int *j, int *l) { return *j - *l; }

This could be done in SelectionDAGISel.cpp, along with other special cases,
for 1, 2, 4, and 8 bytes.

//===---------------------------------------------------------------------===//

This code:
int rot(unsigned char b) { int a = ((b>>1) ^ (b<<7)) & 0xff; return a; }

can be improved in two ways:

1. The instcombiner should eliminate the type conversions.
2. The X86 backend should turn this into a rotate by one bit, as in the
   sketch below.
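A sketch of the underlying operation, a rotate right by one (the ^ behaves
as | here because the two shifted fields cannot overlap):

unsigned char rotr1(unsigned char b) {
  return (unsigned char)((b >> 1) | (b << 7));  /* single ror on X86 */
}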

//===---------------------------------------------------------------------===//

Add LSR exit value substitution. It'll probably be a win for Ackermann, etc.

//===---------------------------------------------------------------------===//

It would be nice to revert this patch:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20060213/031986.html

and teach the dag combiner enough to simplify the code expanded before
legalize. It seems plausible that this knowledge would let it simplify other
stuff too.

//===---------------------------------------------------------------------===//

For packed types, TargetData.cpp::getTypeInfo() returns alignment that is
equal to the type size. This works, but can be overly conservative, since the
alignment of a specific packed type is target dependent.

//===---------------------------------------------------------------------===//

We should add 'unaligned load/store' nodes and produce them from code like
this:

v4sf example(float *P) {
  return (v4sf){ P[0], P[1], P[2], P[3] };
}

//===---------------------------------------------------------------------===//

We should constant fold packed type casts at the LLVM level, regardless of
the cast. Currently we cannot fold some casts because we don't have
TargetData information in the constant folder, so we don't know the
endianness of the target!

//===---------------------------------------------------------------------===//

Add support for conditional increments and other related patterns. Instead
of:

        movl 136(%esp), %eax
        cmpl $0, %eax
        je LBB16_2      #cond_next
LBB16_1:        #cond_true
        incl _foo
LBB16_2:        #cond_next

emit:

        movl _foo, %eax
        cmpl $1, %edi
        sbbl $-1, %eax
        movl %eax, _foo
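The C idiom behind this pattern is a guarded increment of a global (a
sketch; 'foo' stands in for the global the assembly above updates):

int foo;

void count(int x) {
  if (x)      /* today: compare, branch, increment */
    foo++;    /* desired: the branchless cmp/sbb form, foo += (x != 0) */
}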

//===---------------------------------------------------------------------===//

Combine: a = sin(x), b = cos(x) into a,b = sincos(x).

Expand these to calls of sin/cos and stores:
      double sincos(double x, double *sin, double *cos);
      float sincosf(float x, float *sin, float *cos);
      long double sincosl(long double x, long double *sin, long double *cos);

Doing so could allow SROA of the destination pointers. See also:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17687
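A sketch of the combine direction, using the prototypes above (the combined
call is left in a comment so the sketch compiles without them):

#include <math.h>

double a, b;

void takes_both(double x) {
  a = sin(x);   /* today: two separate calls... */
  b = cos(x);   /* ...that repeat the argument reduction */
  /* desired: one combined call, sincos(x, &a, &b); */
}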

//===---------------------------------------------------------------------===//

Scalar Repl cannot currently promote this testcase to 'ret long cst':

        %struct.X = type { int, int }
        %struct.Y = type { %struct.X }
ulong %bar() {
        %retval = alloca %struct.Y, align 8             ; <%struct.Y*> [#uses=3]
        %tmp12 = getelementptr %struct.Y* %retval, int 0, uint 0, uint 0        ; <int*> [#uses=1]
        store int 0, int* %tmp12
        %tmp15 = getelementptr %struct.Y* %retval, int 0, uint 0, uint 1        ; <int*> [#uses=1]
        store int 1, int* %tmp15
        %retval = cast %struct.Y* %retval to ulong*     ; <ulong*> [#uses=1]
        %retval = load ulong* %retval                   ; <ulong> [#uses=1]
        ret ulong %retval
}

It should be extended to do so.

//===---------------------------------------------------------------------===//

Turn this into a single byte store with no load; the other 3 bytes are
unmodified, since the constants are 0xC5000000 (or) and 0xC5FFFFFF (and),
whose combined effect is just to force the most significant byte to 0xC5:

void %test(uint* %P) {
        %tmp = load uint* %P
        %tmp14 = or uint %tmp, 3305111552       ; 0xC5000000
        %tmp15 = and uint %tmp14, 3321888767    ; 0xC5FFFFFF
        store uint %tmp15, uint* %P
        ret void
}
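A C-level sketch of the desired lowering (byte index 3 is the most
significant byte on a little-endian target; it would be index 0 on a
big-endian one):

void test_opt(unsigned *P) {
  ((unsigned char *)P)[3] = 0xC5;  /* one byte store, no load */
}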

//===---------------------------------------------------------------------===//

dag/inst combine "clz(x)>>5 -> x==0" for 32-bit x.

Compile:

int bar(int x)
{
  int t = __builtin_clz(x);
  return -(t>>5);
}

to:

_bar:   addic r3,r3,-1
        subfe r3,r3,r3
        blr
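The source-level simplification the combine should reach (a sketch; it
relies on clz(0) being 32, as on PPC, so clz(x)>>5 is 1 exactly when
x == 0):

int bar_opt(int x) {
  return -(x == 0);  /* 0 or -1, matching the addic/subfe sequence */
}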

//===---------------------------------------------------------------------===//

Legalize should lower cttz like this:
  cttz(x) = popcnt((x-1) & ~x)

on targets that have popcnt but not cttz. Itanium, what else?
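A sketch of the identity with GCC-style builtins (for x == 0 the formula
yields 32, the usual cttz convention for i32):

unsigned cttz32(unsigned x) {
  /* (x-1) & ~x sets exactly the bits below x's lowest set bit */
  return __builtin_popcount((x - 1) & ~x);
}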