Target Independent Opportunities:

//===---------------------------------------------------------------------===//

FreeBench/mason contains code like this:

static p_type m0u(p_type p) {
  int m[]={0, 8, 1, 2, 16, 5, 13, 7, 14, 9, 3, 4, 11, 12, 15, 10, 17, 6};
  p_type pu;
  pu.a = m[p.a];
  pu.b = m[p.b];
  pu.c = m[p.c];
  return pu;
}

We currently compile this into a memcpy from a static array into 'm', then
a bunch of loads from 'm'. It would be better to avoid the memcpy and just do
the loads from the static array directly.
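A source-level sketch of the desired form (the struct definition here is a
hypothetical stand-in for p_type): making the table 'static const' keeps it
in read-only memory, so no per-call copy is needed:

typedef struct { int a, b, c; } p_type;  /* hypothetical layout */

static p_type m0u_opt(p_type p) {
  static const int m[] = {0, 8, 1, 2, 16, 5, 13, 7, 14, 9,
                          3, 4, 11, 12, 15, 10, 17, 6};
  p_type pu;
  pu.a = m[p.a];  /* each access is now a single load from the array */
  pu.b = m[p.b];
  pu.c = m[p.c];
  return pu;
}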

//===---------------------------------------------------------------------===//

Make the PPC branch selector target independent.

//===---------------------------------------------------------------------===//

Get the C front-end to expand hypot(x,y) -> llvm.sqrt(x*x+y*y) when errno and
precision don't matter (-ffast-math). Misc/mandel will like this. :)
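At the source level the rewrite is just the naive formula (a sketch; what
-ffast-math licenses us to ignore is errno and hypot's protection against
intermediate overflow/underflow):

#include <math.h>

double hypot_fast(double x, double y) {
  return sqrt(x * x + y * y);  /* sqrt maps onto llvm.sqrt */
}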

//===---------------------------------------------------------------------===//

Solve this DAG isel folding deficiency:

int X, Y;

void fn1(void)
{
  X = X | (Y << 3);
}

compiles to

fn1:
        movl Y, %eax
        shll $3, %eax
        orl X, %eax
        movl %eax, X
        ret

The problem is that the store's chain operand is not the load X but rather
a TokenFactor of the load X and load Y, which prevents the folding.

There are two ways to fix this:

1. The dag combiner can start using alias analysis to realize that X and Y
   don't alias, making the store to X not dependent on the load from Y.
2. The generated isel could be made smarter in cases where it can't
   disambiguate the pointers.

Number 1 is the preferred solution.

This has been "fixed" by a TableGen hack, but that is a short-term workaround
which will be removed once the proper fix is made.

//===---------------------------------------------------------------------===//

Turn this into a signed shift right in instcombine:

int f(unsigned x) {
  return x >> 31 ? -1 : 0;
}
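The desired result is a single arithmetic shift (a sketch; it relies on the
common arithmetic right-shift behavior for negative signed values, which is
what the instcombine transform would target):

int f_opt(unsigned x) {
  return (int)x >> 31;  /* sign bit replicated: 0 -> 0, 1 -> -1 */
}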

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25600
http://gcc.gnu.org/ml/gcc-patches/2006-02/msg01492.html

//===---------------------------------------------------------------------===//

On targets with an expensive 64-bit multiply, we could LSR this:

for (i = ...; ++i) {
  x = 1ULL << i;
}

into:
  long long tmp = 1;
  for (i = ...; ++i, tmp+=tmp)
    x = tmp;

This would be a win on ppc32, but not x86 or ppc64.

//===---------------------------------------------------------------------===//

Shrink: (setlt (loadi32 P), 0) -> (setlt (loadi8 Phi), 0)
(i.e., load only the byte that contains the sign bit)
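A C-level sketch of the shrink (byte index 3 assumes a little-endian target;
the sign byte would be index 0 on a big-endian one):

int is_negative(int *P) {
  return *P < 0;  /* setlt (loadi32 P), 0 */
}

int is_negative_shrunk(int *P) {
  /* only the byte holding the sign bit needs to be loaded */
  return (signed char)((unsigned char *)P)[3] < 0;
}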

//===---------------------------------------------------------------------===//

Reassociate should turn X*X*X*X -> t=(X*X); (t*t) to eliminate a multiply.
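A sketch of the payoff: evaluated left to right, X*X*X*X costs three
multiplies, while squaring the square costs two:

int pow4(int x) {
  int t = x * x;  /* one multiply */
  return t * t;   /* second multiply; x*x*x*x would need three */
}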

//===---------------------------------------------------------------------===//

An interesting testcase for add/shift/mul reassociation:

int bar(int x, int y) {
  return x*x*x+y+x*x*x*x*x*y*y*y*y;
}
int foo(int z, int n) {
  return bar(z, n) + bar(2*z, 2*n);
}

//===---------------------------------------------------------------------===//

These two functions should generate the same code on big-endian systems:

int g(int *j, int *l) { return memcmp(j, l, 4); }
int h(int *j, int *l) { return *j - *l; }

This could be done in SelectionDAGISel.cpp, along with other special cases,
for 1, 2, 4, and 8 bytes.

//===---------------------------------------------------------------------===//

This code:
int rot(unsigned char b) { int a = ((b>>1) ^ (b<<7)) & 0xff; return a; }

can be improved in two ways:

1. The instcombiner should eliminate the type conversions.
2. The X86 backend should turn this into a rotate by one bit, as in the
   sketch below.
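A sketch of the underlying operation, a rotate right by one (the ^ behaves
as | here because the two shifted fields cannot overlap):

unsigned char rotr1(unsigned char b) {
  return (unsigned char)((b >> 1) | (b << 7));  /* single ror on X86 */
}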

//===---------------------------------------------------------------------===//

Add LSR exit value substitution. It'll probably be a win for Ackermann, etc.

//===---------------------------------------------------------------------===//

It would be nice to revert this patch:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20060213/031986.html

and teach the dag combiner enough to simplify the code expanded before
legalize. It seems plausible that this knowledge would let it simplify other
stuff too.

//===---------------------------------------------------------------------===//

For packed types, TargetData.cpp::getTypeInfo() returns alignment that is
equal to the type size. This works, but can be overly conservative, since the
alignment of a specific packed type is target dependent.

//===---------------------------------------------------------------------===//

We should add 'unaligned load/store' nodes and produce them from code like
this:

v4sf example(float *P) {
  return (v4sf){ P[0], P[1], P[2], P[3] };
}

//===---------------------------------------------------------------------===//

We should constant fold packed type casts at the LLVM level, regardless of
the cast. Currently we cannot fold some casts because we don't have
TargetData information in the constant folder, so we don't know the
endianness of the target!

//===---------------------------------------------------------------------===//

Add support for conditional increments and other related patterns. Instead
of:

        movl 136(%esp), %eax
        cmpl $0, %eax
        je LBB16_2      #cond_next
LBB16_1:        #cond_true
        incl _foo
LBB16_2:        #cond_next

emit:

        movl _foo, %eax
        cmpl $1, %edi
        sbbl $-1, %eax
        movl %eax, _foo
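The C idiom behind this pattern is a guarded increment of a global (a
sketch; 'foo' stands in for the global the assembly above updates):

int foo;

void count(int x) {
  if (x)      /* today: compare, branch, increment */
    foo++;    /* desired: the branchless cmp/sbb form, foo += (x != 0) */
}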

//===---------------------------------------------------------------------===//

Combine: a = sin(x), b = cos(x) into a,b = sincos(x).

Expand these to calls of sin/cos and stores:
      double sincos(double x, double *sin, double *cos);
      float sincosf(float x, float *sin, float *cos);
      long double sincosl(long double x, long double *sin, long double *cos);

Doing so could allow SROA of the destination pointers. See also:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17687
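A sketch of the combine direction, using the prototypes above (the combined
call is left in a comment so the sketch compiles without them):

#include <math.h>

double a, b;

void takes_both(double x) {
  a = sin(x);   /* today: two separate calls... */
  b = cos(x);   /* ...that repeat the argument reduction */
  /* desired: one combined call, sincos(x, &a, &b); */
}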

//===---------------------------------------------------------------------===//

Scalar Repl cannot currently promote this testcase to 'ret long cst':

        %struct.X = type { int, int }
        %struct.Y = type { %struct.X }
ulong %bar() {
        %retval = alloca %struct.Y, align 8             ; <%struct.Y*> [#uses=3]
        %tmp12 = getelementptr %struct.Y* %retval, int 0, uint 0, uint 0        ; <int*> [#uses=1]
        store int 0, int* %tmp12
        %tmp15 = getelementptr %struct.Y* %retval, int 0, uint 0, uint 1        ; <int*> [#uses=1]
        store int 1, int* %tmp15
        %retval = cast %struct.Y* %retval to ulong*     ; <ulong*> [#uses=1]
        %retval = load ulong* %retval                   ; <ulong> [#uses=1]
        ret ulong %retval
}

It should be extended to do so.

//===---------------------------------------------------------------------===//

Turn this into a single byte store with no load; the other 3 bytes are
unmodified, since the constants are 0xC5000000 (or) and 0xC5FFFFFF (and),
whose combined effect is just to force the most significant byte to 0xC5:

void %test(uint* %P) {
        %tmp = load uint* %P
        %tmp14 = or uint %tmp, 3305111552       ; 0xC5000000
        %tmp15 = and uint %tmp14, 3321888767    ; 0xC5FFFFFF
        store uint %tmp15, uint* %P
        ret void
}
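A C-level sketch of the desired lowering (byte index 3 is the most
significant byte on a little-endian target; it would be index 0 on a
big-endian one):

void test_opt(unsigned *P) {
  ((unsigned char *)P)[3] = 0xC5;  /* one byte store, no load */
}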

//===---------------------------------------------------------------------===//

dag/inst combine "clz(x)>>5 -> x==0" for 32-bit x.

Compile:

int bar(int x)
{
  int t = __builtin_clz(x);
  return -(t>>5);
}

to:

_bar:   addic r3,r3,-1
        subfe r3,r3,r3
        blr
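The source-level simplification the combine should reach (a sketch; it
relies on clz(0) being 32, as on PPC, so clz(x)>>5 is 1 exactly when
x == 0):

int bar_opt(int x) {
  return -(x == 0);  /* 0 or -1, matching the addic/subfe sequence */
}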

//===---------------------------------------------------------------------===//

Legalize should lower cttz like this:
  cttz(x) = popcnt((x-1) & ~x)

on targets that have popcnt but not cttz. Itanium, what else?
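A sketch of the identity with GCC-style builtins (for x == 0 the formula
yields 32, the usual cttz convention for i32):

unsigned cttz32(unsigned x) {
  /* (x-1) & ~x sets exactly the bits below x's lowest set bit */
  return __builtin_popcount((x - 1) & ~x);
}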