Target Independent Opportunities:

===-------------------------------------------------------------------------===

FreeBench/mason contains code like this:

static p_type m0u(p_type p) {
  int m[]={0, 8, 1, 2, 16, 5, 13, 7, 14, 9, 3, 4, 11, 12, 15, 10, 17, 6};
  p_type pu;
  pu.a = m[p.a];
  pu.b = m[p.b];
  pu.c = m[p.c];
  return pu;
}

We currently compile this into a memcpy from a static copy of the array into
'm', followed by a series of loads from m.  It would be better to avoid the
memcpy and load directly from the static array.

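A sketch of the desired result at the source level (the typedef is a
hypothetical stand-in for the real p_type):

typedef struct { int a, b, c; } p_type;  /* hypothetical */

static const int m_opt[] = {0, 8, 1, 2, 16, 5, 13, 7, 14, 9,
                            3, 4, 11, 12, 15, 10, 17, 6};

static p_type m0u_opt(p_type p) {
  p_type pu;
  /* index the private constant array directly; no stack copy */
  pu.a = m_opt[p.a];
  pu.b = m_opt[p.b];
  pu.c = m_opt[p.c];
  return pu;
}
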
//===---------------------------------------------------------------------===//

Make the PPC branch selector target independent.

//===---------------------------------------------------------------------===//

Get the C front-end to expand hypot(x,y) -> llvm.sqrt(x*x+y*y) when errno and
precision don't matter (-ffast-math).  Misc/mandel will like this. :)

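A minimal source-level sketch of the expansion (function names are
illustrative):

#include <math.h>

double dist(double x, double y) {
  return hypot(x, y);        /* today: a libcall */
}

/* With -ffast-math, the front-end could emit the equivalent of: */
double dist_expanded(double x, double y) {
  return sqrt(x*x + y*y);    /* lowers to llvm.sqrt */
}
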
//===---------------------------------------------------------------------===//

Solve this DAG isel folding deficiency:

int X, Y;

void fn1(void)
{
  X = X | (Y << 3);
}

compiles to

fn1:
        movl Y, %eax
        shll $3, %eax
        orl X, %eax
        movl %eax, X
        ret

The problem is that the store's chain operand is not the load of X but rather
a TokenFactor of the loads of X and Y, which prevents folding the load of X
into the 'or'.

There are two ways to fix this:

1. The dag combiner can start using alias analysis to realize that X and Y
   don't alias, making the store to X not dependent on the load from Y.
2. The generated isel could be made smarter for the case where it can't
   disambiguate the pointers.

Number 1 is the preferred solution.

This has been "fixed" by a TableGen hack, but that is a short-term workaround
which will be removed once the proper fix is made.

//===---------------------------------------------------------------------===//

Turn this into a signed shift right in instcombine:

int f(unsigned x) {
  return x >> 31 ? -1 : 0;
}

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25600
http://gcc.gnu.org/ml/gcc-patches/2006-02/msg01492.html

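A sketch of the desired output, assuming the target's signed >> is an
arithmetic shift (the function name is illustrative):

int f_opt(unsigned x) {
  /* broadcast the sign bit: 0 if the top bit is clear, -1 if set */
  return (int)x >> 31;
}
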
//===---------------------------------------------------------------------===//

On targets with expensive 64-bit multiply, we could LSR this:

for (i = ...; ++i) {
  x = 1ULL << i;
}

into:

long long tmp = 1;
for (i = ...; ++i, tmp += tmp)
  x = tmp;

This would be a win on ppc32, but not x86 or ppc64.

//===---------------------------------------------------------------------===//

Shrink: (setlt (loadi32 P), 0) -> (setlt (loadi8 Phi), 0)

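Only the byte holding the sign bit needs to be loaded.  A source-level sketch
(assuming a little-endian target, where that is the byte at offset 3; the
function name is illustrative):

int is_negative(int *P) {
  /* loadi8 of the most significant byte instead of loadi32 */
  return ((signed char *)P)[3] < 0;
}
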
//===---------------------------------------------------------------------===//

Reassociate should turn X*X*X*X into t = X*X; t*t, eliminating a multiply.

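What that looks like at the source level (two multiplies instead of three):

int pow4(int x) {
  int t = x * x;  /* t = X*X */
  return t * t;   /* (X*X)*(X*X) = X*X*X*X */
}
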
//===---------------------------------------------------------------------===//

An interesting testcase for add/shift/mul reassociation:

int bar(int x, int y) {
  return x*x*x+y+x*x*x*x*x*y*y*y*y;
}
int foo(int z, int n) {
  return bar(z, n) + bar(2*z, 2*n);
}

//===---------------------------------------------------------------------===//

These two functions should generate the same code on big-endian systems:

int g(int *j, int *l) { return memcmp(j, l, 4); }
int h(int *j, int *l) { return *j - *l; }

This could be done in SelectionDAGISel.cpp, along with other special cases,
for sizes of 1, 2, 4, and 8 bytes.

//===---------------------------------------------------------------------===//

This code:

int rot(unsigned char b) { int a = ((b>>1) ^ (b<<7)) & 0xff; return a; }

can be improved in two ways:

1. The instcombiner should eliminate the type conversions.
2. The X86 backend should turn this into a rotate by one bit.

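For reference, the expression is an 8-bit rotate right by one; the ^ acts as
an | here because the two shifted values have no bits in common.  A sketch in
plain unsigned char arithmetic (no masking needed; the name is illustrative):

unsigned char rotr1(unsigned char b) {
  return (unsigned char)((b >> 1) | (b << 7));
}
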
//===---------------------------------------------------------------------===//

Add LSR exit value substitution. It'll probably be a win for Ackermann, etc.

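For reference, exit value substitution rewrites post-loop uses of an
induction variable in closed form.  A hypothetical example of the general
idea:

int count4(int n) {
  int i, s = 0;
  for (i = 0; i < n; ++i)
    s += 4;
  /* exit value substitution: s here is (n > 0 ? 4 * n : 0),
     so the loop can go away entirely */
  return s;
}
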
//===---------------------------------------------------------------------===//

It would be nice to revert this patch:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20060213/031986.html

and instead teach the dag combiner enough to simplify the code expanded before
legalize.  It seems plausible that this knowledge would let it simplify other
things too.

//===---------------------------------------------------------------------===//

The loop unroller should be enhanced to be able to unroll loops that aren't
single basic blocks.  It should be able to handle stuff like this:

  for (i = 0; i < c1; ++i)
     if (c2 & (1 << i))
       foo

where c1/c2 are constants.

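A sketch of why this is profitable, for c1 == 3 (hypothetical names): full
unrolling exposes each guard to constant folding once c2 is a constant.

enum { c2 = 5 };
void foo(void);

void unrolled(void) {
  if (c2 & (1 << 0)) foo();  /* folds to: foo(); */
  if (c2 & (1 << 1)) foo();  /* folds away */
  if (c2 & (1 << 2)) foo();  /* folds to: foo(); */
}
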
//===---------------------------------------------------------------------===//

For packed types, TargetData.cpp::getTypeInfo() returns alignment that is
equal to the type size.  It works but can be overly conservative as the
alignment of specific packed types is target dependent.

//===---------------------------------------------------------------------===//

We should add 'unaligned load/store' nodes, and produce them from code like
this:

typedef float v4sf __attribute__((vector_size(16)));  /* GCC vector extension */

v4sf example(float *P) {
  return (v4sf){P[0], P[1], P[2], P[3]};
}

//===---------------------------------------------------------------------===//

We should constant fold packed type casts at the LLVM level, regardless of the
cast.  Currently we cannot fold some casts because we don't have TargetData
information in the constant folder, so we don't know the endianness of the
target!

//===---------------------------------------------------------------------===//

Consider this:

unsigned short swap_16(unsigned short v) { return (v>>8) | (v<<8); }

Compiled with the ppc backend:

_swap_16:
        slwi r2, r3, 8
        srwi r3, r3, 8
        or r2, r3, r2
        rlwinm r3, r2, 0, 16, 31
        blr

The rlwinm (an AND with 65535) is dead.  The dag combiner should propagate
bits well enough to see this.

//===---------------------------------------------------------------------===//

Add support for conditional increments, and other related patterns.  Instead
of:

        movl 136(%esp), %eax
        cmpl $0, %eax
        je LBB16_2      #cond_next
LBB16_1:        #cond_true
        incl _foo
LBB16_2:        #cond_next

emit:

        movl _foo, %eax
        cmpl $1, %edi
        sbbl $-1, %eax
        movl %eax, _foo

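A plausible C source for this pattern (the names are hypothetical):

int foo;

void maybe_inc(int x) {
  if (x)
    foo++;
}

The branchless form works because cmpl $1, %edi sets the carry flag exactly
when x == 0, and sbbl $-1, %eax then computes eax + 1 - carry.
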
//===---------------------------------------------------------------------===//

Combine: a = sin(x), b = cos(x) into a,b = sincos(x).

Expand these to calls of sin/cos and stores:
      void sincos(double x, double *sin, double *cos);
      void sincosf(float x, float *sin, float *cos);
      void sincosl(long double x, long double *sin, long double *cos);

Doing so could allow SROA of the destination pointers.  See also:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17687

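A sketch of the combine at the source level (sincos is a GNU extension, so
this assumes _GNU_SOURCE):

#define _GNU_SOURCE
#include <math.h>

void before(double x, double *a, double *b) {
  *a = sin(x);   /* two libcalls ... */
  *b = cos(x);
}

void after(double x, double *a, double *b) {
  sincos(x, a, b);  /* ... become one */
}
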
//===---------------------------------------------------------------------===//

Scalar Repl cannot currently promote this testcase to 'ret ulong cst':

%struct.X = type { int, int }
%struct.Y = type { %struct.X }
ulong %bar() {
        %retval = alloca %struct.Y, align 8               ; <%struct.Y*> [#uses=3]
        %tmp12 = getelementptr %struct.Y* %retval, int 0, uint 0, uint 0  ; <int*> [#uses=1]
        store int 0, int* %tmp12
        %tmp15 = getelementptr %struct.Y* %retval, int 0, uint 0, uint 1  ; <int*> [#uses=1]
        store int 1, int* %tmp15
        %retval = cast %struct.Y* %retval to ulong*       ; <ulong*> [#uses=1]
        %retval = load ulong* %retval                     ; <ulong> [#uses=1]
        ret ulong %retval
}

It should be extended to do so.

//===---------------------------------------------------------------------===//

Turn this into a single byte store with no load (the other 3 bytes are
unmodified):

void %test(uint* %P) {
        %tmp = load uint* %P
        %tmp14 = or uint %tmp, 3305111552       ; 0xC5000000
        %tmp15 = and uint %tmp14, 3321888767    ; 0xC5FFFFFF
        store uint %tmp15, uint* %P
        ret void
}

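The or/and pair forces the top byte to 0xC5 and leaves the other three bytes
unchanged.  Equivalent C (the byte offset depends on endianness; offset 3
assumes little-endian):

void test_opt(unsigned *P) {
  /* single byte store replacing the load/or/and/store sequence */
  ((unsigned char *)P)[3] = 0xC5;
}
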
//===---------------------------------------------------------------------===//

dag/inst combine "clz(x)>>5 -> x==0" for 32-bit x.

Compile:

int bar(int x)
{
  int t = __builtin_clz(x);
  return -(t>>5);
}

to:

_bar:   addic r3,r3,-1
        subfe r3,r3,r3
        blr

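Why the combine is valid: clz of a 32-bit value is 32 only for x == 0, so
t>>5 is exactly (x == 0).  Sketched in C (__builtin_clz(0) is undefined in C,
so this relies on the target's ctlz(0) == 32 semantics):

int bar_simplified(int x) {
  return -(x == 0);  /* 0 or -1, matching the addic/subfe sequence */
}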