Missing support
===============

* The PNaCl LLVM backend expands shufflevector operations into sequences of
  insertelement and extractelement operations. For instance::

    define <4 x i32> @shuffle(<4 x i32> %arg1, <4 x i32> %arg2) {
    entry:
      %res = shufflevector <4 x i32> %arg1,
                           <4 x i32> %arg2,
                           <4 x i32> <i32 4, i32 5, i32 0, i32 1>
      ret <4 x i32> %res
    }

  gets expanded into::

    define <4 x i32> @shuffle(<4 x i32> %arg1, <4 x i32> %arg2) {
    entry:
      %0 = extractelement <4 x i32> %arg2, i32 0
      %1 = insertelement <4 x i32> undef, i32 %0, i32 0
      %2 = extractelement <4 x i32> %arg2, i32 1
      %3 = insertelement <4 x i32> %1, i32 %2, i32 1
      %4 = extractelement <4 x i32> %arg1, i32 0
      %5 = insertelement <4 x i32> %3, i32 %4, i32 2
      %6 = extractelement <4 x i32> %arg1, i32 1
      %7 = insertelement <4 x i32> %5, i32 %6, i32 3
      ret <4 x i32> %7
    }

  Subzero should recognize these sequences and recombine them into
  shuffle operations where appropriate.

* Add support for vector constants in the backend. The current code
  materializes the vector constants it needs (e.g., for performing icmp on
  unsigned operands) using register operations. It should instead load them
  from a constant pool when the register initialization is too complicated
  (such as in TargetX8632::makeVectorOfHighOrderBits()).

* [x86 specific] llvm-mc does not allow lea to take a mem128 memory operand
  when assembling x86-32 code. The current InstX8632Lea::emit() code uses
  Variable::asType() to convert any mem128 Variables into a compatible memory
  operand type. However, the emit code does not convert OperandX8632Mem
  operands, so if an OperandX8632Mem is passed to lea as mem128, the
  resulting code will not assemble. One way to fix this is to implement
  OperandX8632Mem::asType().

* [x86 specific] Lower shl with <4 x i32> using some clever float conversion:
  http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20100726/105087.html

* [x86 specific] Add support for using aligned mov operations (movaps). This
  will require passing alignment information to loads and stores.
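
For the first item in the list above, one possible way to recover a shuffle
from an expanded chain is to walk the insertelement steps in order and record,
for each destination lane, which source vector and lane it was copied from.
The following is only a minimal sketch under stated assumptions: LaneCopy,
Shuffle, and recombine are hypothetical names for illustration, not Subzero
classes, and source vectors are identified by non-negative integer ids.

```cpp
#include <array>
#include <cassert>
#include <vector>

constexpr int NumElts = 4;   // lanes in <4 x i32>
constexpr int NoVector = -1; // marks "no source vector chosen yet"

// One extractelement/insertelement pair from the expanded chain:
//   %e = extractelement <4 x i32> %SrcVec, i32 SrcLane
//   %v = insertelement  <4 x i32> %prev, i32 %e, i32 DstLane
// Assumption: %prev is the previous step's result (undef for the first step).
struct LaneCopy {
  int SrcVec; // hypothetical non-negative id of the source vector
  int SrcLane;
  int DstLane;
};

// Hypothetical description of the recovered shufflevector.
struct Shuffle {
  int Src1 = NoVector;
  int Src2 = NoVector;
  std::array<int, NumElts> Mask{}; // 0..3 select from Src1, 4..7 from Src2
};

// Returns true and fills Out if the chain defines every lane and reads from
// at most two source vectors; returns false otherwise.
bool recombine(const std::vector<LaneCopy> &Chain, Shuffle &Out) {
  Shuffle Result;
  std::array<bool, NumElts> Written{};
  for (const LaneCopy &Step : Chain) {
    assert(Step.SrcVec >= 0 && "vector ids must be non-negative");
    if (Step.SrcLane < 0 || Step.SrcLane >= NumElts ||
        Step.DstLane < 0 || Step.DstLane >= NumElts)
      return false;
    int Offset = 0;
    if (Result.Src1 == NoVector || Result.Src1 == Step.SrcVec) {
      Result.Src1 = Step.SrcVec; // first vector seen becomes operand 1
    } else if (Result.Src2 == NoVector || Result.Src2 == Step.SrcVec) {
      Result.Src2 = Step.SrcVec; // second distinct vector becomes operand 2
      Offset = NumElts;
    } else {
      return false; // a shuffle has only two vector operands
    }
    Result.Mask[Step.DstLane] = Offset + Step.SrcLane; // later writes win
    Written[Step.DstLane] = true;
  }
  for (bool LaneWritten : Written)
    if (!LaneWritten)
      return false; // some lane would still be undef
  Out = Result;
  return true;
}
```

Applied to the expanded example (taking %arg1 as id 1 and %arg2 as id 2), this
sketch reports %arg2 as the first operand, %arg1 as the second, and the mask
<0, 1, 4, 5>. That selects the same elements as the original mask
<4, 5, 0, 1> with the two operands swapped.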

x86 SIMD Diversification
========================

* Vector "bitwise" operations have several variant instructions: the AND
  operation can be implemented with pand, andpd, or andps. This pattern also
  holds for ANDN, OR, and XOR.

* Vector "mov" instructions can be diversified (e.g., movdqu instead of movups)
  at the cost of a possible performance penalty.

* Scalar FP arithmetic can be diversified by performing the operations with the
  vector version of the instructions.
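
The bitwise item above could be sketched as a table of interchangeable
mnemonics plus a seeded random choice among them. This is only an
illustration: BitwiseOp, variantsFor, and pickVariant are hypothetical names,
and the use of std::mt19937 is an assumption made here, not a description of
how Subzero selects instructions.

```cpp
#include <array>
#include <cstddef>
#include <random>
#include <string>

enum class BitwiseOp { And, Andn, Or, Xor };

// Mnemonics that compute the same 128-bit bitwise result.
// Order: integer form, packed-single form, packed-double form.
const std::array<const char *, 3> &variantsFor(BitwiseOp Op) {
  static const std::array<const char *, 3> And = {{"pand", "andps", "andpd"}};
  static const std::array<const char *, 3> Andn = {{"pandn", "andnps", "andnpd"}};
  static const std::array<const char *, 3> Or = {{"por", "orps", "orpd"}};
  static const std::array<const char *, 3> Xor = {{"pxor", "xorps", "xorpd"}};
  switch (Op) {
  case BitwiseOp::And:
    return And;
  case BitwiseOp::Andn:
    return Andn;
  case BitwiseOp::Or:
    return Or;
  case BitwiseOp::Xor:
    return Xor;
  }
  return Xor; // not reached for valid enum values
}

// Picks one variant using a caller-supplied generator, so a fixed seed gives
// a reproducible choice while different seeds can give different encodings.
std::string pickVariant(BitwiseOp Op, std::mt19937 &Rng) {
  const std::array<const char *, 3> &Variants = variantsFor(Op);
  std::uniform_int_distribution<std::size_t> Dist(0, Variants.size() - 1);
  return Variants[Dist(Rng)];
}
```

Passing the generator in by reference keeps the choice reproducible for a
given seed, which makes a diversified build easier to debug.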