Blame - src/gallium/docs/source/tgsi.rst - platform/external/mesa3d

blob: 4d26c465579e24238dfb20b8428a1bb79727b0ba

Corbin Simpson	c686e17	2009-12-20 15:00:40 -0800	[diff] [blame]	1	TGSI
				2	====
				3
Michal Krol	b665968	2010-01-04 12:52:43 +0100	[diff] [blame]	4	TGSI, Tungsten Graphics Shader Infrastructure, is an intermediate language
Corbin Simpson	c686e17	2009-12-20 15:00:40 -0800	[diff] [blame]	5	for describing shaders. Since Gallium is inherently shaderful, shaders are
				6	an important part of the API. TGSI is the only intermediate representation
				7	used by all drivers.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	8
Corbin Simpson	62ca7b8	2010-02-02 16:36:34 -0800	[diff] [blame]	9	Basics
				10	------
				11
				12	All TGSI instructions, known as opcodes, operate on arbitrary-precision
				13	floating-point four-component vectors. An opcode may have up to one
				14	destination register, known as dst, and between zero and three source
				15	registers, called src0 through src2, or simply src if there is only
				16	one.
				17
				18	Some instructions, like :opcode:`I2F`, permit re-interpretation of vector
				19	components as integers. Other instructions permit using registers as
				20	two-component vectors with double precision; see :ref:`Double Opcodes`.
				21
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	22	When an instruction has a scalar result, the result is usually copied into
				23	each of the components of dst. When this happens, the result is said to be
				24	replicated to dst. :opcode:`RCP` is one such instruction.
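
Replication can be sketched in a few lines of Python. This is an illustration only, not Mesa code: registers are modeled as 4-tuples, and the helper names ``replicate`` and ``rcp`` are hypothetical.

```python
# Hypothetical model: a TGSI register is a 4-tuple (x, y, z, w) of floats.

def replicate(scalar):
    """Copy one scalar result into all four components of dst."""
    return (scalar, scalar, scalar, scalar)

def rcp(src):
    """RCP as described in the text: 1 / src.x, replicated to dst."""
    return replicate(1.0 / src[0])

# Only src.x contributes; the y, z and w inputs are ignored.
print(rcp((4.0, 9.0, 9.0, 9.0)))  # (0.25, 0.25, 0.25, 0.25)
```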
				25
Roland Scheidegger	cb2e678	2013-02-16 02:26:14 +0100	[diff] [blame]	26	Modifiers
				27	^^^^^^^^^^^^^^^
				28
				29	TGSI supports modifiers on inputs, as well as a saturate modifier on instructions.
				30
				31	For inputs which have a floating point type, both absolute value and negation
				32	modifiers are supported (with absolute value being applied first).
				33	TGSI_OPCODE_MOV is considered to have float input type for applying modifiers.
				34
Zack Rusin	999cd79	2013-04-28 10:50:55 -0400	[diff] [blame]	35	For inputs which have a signed or unsigned type, only the negate modifier is
				36	supported.
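
The ordering of the float modifiers matters when both are set. A minimal Python sketch, assuming a scalar input; the function names and keyword arguments are hypothetical and do not correspond to any Mesa API:

```python
def apply_float_modifiers(value, absolute=False, negate=False):
    """Float inputs: absolute value is applied first, then negation."""
    if absolute:
        value = abs(value)
    if negate:
        value = -value
    return value

def apply_int_modifiers(value, negate=False):
    """Signed or unsigned inputs: only the negate modifier is supported."""
    return -value if negate else value

# With both modifiers set, the result is -|value|, never |-value|.
print(apply_float_modifiers(-3.0, absolute=True, negate=True))  # -3.0
print(apply_float_modifiers(3.0, absolute=True, negate=True))   # -3.0
```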
Roland Scheidegger	cb2e678	2013-02-16 02:26:14 +0100	[diff] [blame]	37
Corbin Simpson	5bcd26c	2009-12-21 21:04:10 -0800	[diff] [blame]	38	Instruction Set
				39	---------------
				40
Corbin Simpson	9d4cb6e	2010-06-16 18:34:32 -0700	[diff] [blame]	41	Core ISA
Corbin Simpson	5bcd26c	2009-12-21 21:04:10 -0800	[diff] [blame]	42	^^^^^^^^^^^^^^^^^^^^^^^^^
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	43
Corbin Simpson	9d4cb6e	2010-06-16 18:34:32 -0700	[diff] [blame]	44	These opcodes are guaranteed to be available regardless of the driver being
				45	used.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	46
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	47	.. opcode:: ARL - Address Register Load
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	48
				49	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	50
Corbin Simpson	d92a685	2009-12-21 19:30:29 -0800	[diff] [blame]	51	dst.x = \lfloor src.x\rfloor
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	52
Corbin Simpson	d92a685	2009-12-21 19:30:29 -0800	[diff] [blame]	53	dst.y = \lfloor src.y\rfloor
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	54
Corbin Simpson	d92a685	2009-12-21 19:30:29 -0800	[diff] [blame]	55	dst.z = \lfloor src.z\rfloor
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	56
Corbin Simpson	d92a685	2009-12-21 19:30:29 -0800	[diff] [blame]	57	dst.w = \lfloor src.w\rfloor
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	58
				59
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	60	.. opcode:: MOV - Move
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	61
				62	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	63
				64	dst.x = src.x
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	65
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	66	dst.y = src.y
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	67
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	68	dst.z = src.z
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	69
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	70	dst.w = src.w
				71
				72
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	73	.. opcode:: LIT - Light Coefficients
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	74
				75	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	76
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	77	dst.x = 1
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	78
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	79	dst.y = max(src.x, 0)
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	80
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	81	dst.z = (src.x > 0) ? max(src.y, 0)^{clamp(src.w, -128, 128)} : 0
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	82
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	83	dst.w = 1
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	84
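The LIT equations can be read as the following Python sketch. It assumes that ``clamp(v, lo, hi)`` means ``max(lo, min(v, hi))``, which the source does not spell out, and it does not handle the case of a zero base with a negative exponent, where Python raises an error.

```python
def clamp(value, lo, hi):
    """Assumed meaning of clamp(): limit value to the range [lo, hi]."""
    return max(lo, min(value, hi))

def lit(src):
    """LIT on a register modeled as a 4-tuple (x, y, z, w)."""
    x, y, _, w = src
    if x > 0:
        dst_z = max(y, 0.0) ** clamp(w, -128.0, 128.0)
    else:
        dst_z = 0.0
    return (1.0, max(x, 0.0), dst_z, 1.0)

print(lit((0.5, 0.5, 0.0, 2.0)))   # (1.0, 0.5, 0.25, 1.0)
print(lit((-1.0, 0.5, 0.0, 2.0)))  # (1.0, 0.0, 0.0, 1.0)
```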
				85
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	86	.. opcode:: RCP - Reciprocal
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	87
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	88	This instruction replicates its result.
				89
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	90	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	91
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	92	dst = \frac{1}{src.x}
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	93
				94
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	95	.. opcode:: RSQ - Reciprocal Square Root
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	96
Zack Rusin	00cd455	2013-07-11 12:16:06 -0400	[diff] [blame^]	97	This instruction replicates its result. The results are undefined for src.x <= 0.
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	98
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	99	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	100
Zack Rusin	00cd455	2013-07-11 12:16:06 -0400	[diff] [blame^]	101	dst = \frac{1}{\sqrt{src.x}}
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	102
				103
Brian Paul	d276a40	2013-02-01 10:59:43 -0700	[diff] [blame]	104	.. opcode:: SQRT - Square Root
				105
Zack Rusin	00cd455	2013-07-11 12:16:06 -0400	[diff] [blame^]	106	This instruction replicates its result. The results are undefined for src.x < 0.
Brian Paul	d276a40	2013-02-01 10:59:43 -0700	[diff] [blame]	107
				108	.. math::
				109
				110	dst = {\sqrt{src.x}}
				111
				112
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	113	.. opcode:: EXP - Approximate Exponential Base 2
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	114
				115	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	116
Corbin Simpson	dd801e5	2009-12-21 19:41:09 -0800	[diff] [blame]	117	dst.x = 2^{\lfloor src.x\rfloor}
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	118
Corbin Simpson	d92a685	2009-12-21 19:30:29 -0800	[diff] [blame]	119	dst.y = src.x - \lfloor src.x\rfloor
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	120
Corbin Simpson	dd801e5	2009-12-21 19:41:09 -0800	[diff] [blame]	121	dst.z = 2^{src.x}
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	122
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	123	dst.w = 1
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	124
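As an illustration, EXP splits src.x into an integer part and a fractional part, so that dst.z equals dst.x times two raised to dst.y. The Python sketch below models a register as a 4-tuple; the name ``exp_approx`` is hypothetical, and the sketch uses full-precision math rather than any hardware approximation.

```python
import math

def exp_approx(src):
    """EXP on a register modeled as a 4-tuple (x, y, z, w)."""
    x = src[0]
    floor_x = math.floor(x)
    return (2.0 ** floor_x,   # dst.x: power of two of the integer part
            x - floor_x,      # dst.y: fractional part, in [0, 1)
            2.0 ** x,         # dst.z: the full result
            1.0)              # dst.w

dst = exp_approx((2.5, 0.0, 0.0, 0.0))
print(dst[0], dst[1])  # 4.0 0.5
```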
				125
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	126	.. opcode:: LOG - Approximate Logarithm Base 2
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	127
				128	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	129
Corbin Simpson	14743ac	2009-12-21 19:57:56 -0800	[diff] [blame]	130	dst.x = \lfloor\log_2{\|src.x\|}\rfloor
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	131
Corbin Simpson	14743ac	2009-12-21 19:57:56 -0800	[diff] [blame]	132	dst.y = \frac{\|src.x\|}{2^{\lfloor\log_2{\|src.x\|}\rfloor}}
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	133
Corbin Simpson	14743ac	2009-12-21 19:57:56 -0800	[diff] [blame]	134	dst.z = \log_2{\|src.x\|}
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	135
Corbin Simpson	14743ac	2009-12-21 19:57:56 -0800	[diff] [blame]	136	dst.w = 1
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	137
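LOG is the counterpart of EXP: it returns the exponent and mantissa of the absolute value of src.x separately, alongside the full logarithm. A Python sketch under the same 4-tuple model follows; ``log_approx`` is a hypothetical name, and src.x = 0 is not handled (Python raises ``ValueError`` there).

```python
import math

def log_approx(src):
    """LOG on a register modeled as a 4-tuple (x, y, z, w)."""
    magnitude = abs(src[0])
    exponent = math.floor(math.log2(magnitude))
    return (float(exponent),             # dst.x: floor(log2(|src.x|))
            magnitude / 2.0 ** exponent, # dst.y: mantissa, in [1, 2)
            math.log2(magnitude),        # dst.z: the full result
            1.0)                         # dst.w

dst = log_approx((-10.0, 0.0, 0.0, 0.0))
print(dst[0], dst[1])  # 3.0 1.25
```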
				138
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	139	.. opcode:: MUL - Multiply
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	140
				141	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	142
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	143	dst.x = src0.x \times src1.x
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	144
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	145	dst.y = src0.y \times src1.y
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	146
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	147	dst.z = src0.z \times src1.z
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	148
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	149	dst.w = src0.w \times src1.w
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	150
				151
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	152	.. opcode:: ADD - Add
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	153
				154	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	155
				156	dst.x = src0.x + src1.x
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	157
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	158	dst.y = src0.y + src1.y
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	159
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	160	dst.z = src0.z + src1.z
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	161
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	162	dst.w = src0.w + src1.w
				163
				164
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	165	.. opcode:: DP3 - 3-component Dot Product
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	166
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	167	This instruction replicates its result.
				168
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	169	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	170
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	171	dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	172
				173
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	174	.. opcode:: DP4 - 4-component Dot Product
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	175
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	176	This instruction replicates its result.
				177
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	178	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	179
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	180	dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	181
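DP3 and DP4 differ only in how many components enter the sum. A Python sketch with registers as 4-tuples; the helper ``dot`` and its ``count`` parameter are hypothetical:

```python
def dot(src0, src1, count):
    """Dot product over the first `count` components, replicated to dst."""
    total = sum(a * b for a, b in zip(src0[:count], src1[:count]))
    return (total, total, total, total)

a = (1.0, 2.0, 3.0, 4.0)
b = (5.0, 6.0, 7.0, 8.0)
print(dot(a, b, 3))  # DP3: (38.0, 38.0, 38.0, 38.0)
print(dot(a, b, 4))  # DP4: (70.0, 70.0, 70.0, 70.0)
```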
				182
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	183	.. opcode:: DST - Distance Vector
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	184
				185	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	186
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	187	dst.x = 1
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	188
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	189	dst.y = src0.y \times src1.y
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	190
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	191	dst.z = src0.z
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	192
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	193	dst.w = src1.w
				194
				195
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	196	.. opcode:: MIN - Minimum
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	197
				198	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	199
				200	dst.x = min(src0.x, src1.x)
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	201
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	202	dst.y = min(src0.y, src1.y)
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	203
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	204	dst.z = min(src0.z, src1.z)
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	205
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	206	dst.w = min(src0.w, src1.w)
				207
				208
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	209	.. opcode:: MAX - Maximum
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	210
				211	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	212
				213	dst.x = max(src0.x, src1.x)
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	214
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	215	dst.y = max(src0.y, src1.y)
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	216
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	217	dst.z = max(src0.z, src1.z)
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	218
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	219	dst.w = max(src0.w, src1.w)
				220
				221
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	222	.. opcode:: SLT - Set On Less Than
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	223
				224	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	225
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	226	dst.x = (src0.x < src1.x) ? 1 : 0
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	227
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	228	dst.y = (src0.y < src1.y) ? 1 : 0
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	229
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	230	dst.z = (src0.z < src1.z) ? 1 : 0
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	231
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	232	dst.w = (src0.w < src1.w) ? 1 : 0
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	233
				234
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	235	.. opcode:: SGE - Set On Greater Than Or Equal
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	236
				237	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	238
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	239	dst.x = (src0.x >= src1.x) ? 1 : 0
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	240
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	241	dst.y = (src0.y >= src1.y) ? 1 : 0
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	242
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	243	dst.z = (src0.z >= src1.z) ? 1 : 0
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	244
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	245	dst.w = (src0.w >= src1.w) ? 1 : 0
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	246
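SLT and SGE produce a per-component mask of 1 and 0 values rather than a boolean. A Python sketch with registers as 4-tuples (function names are illustrative only):

```python
def slt(src0, src1):
    """SLT: 1.0 where src0 < src1, else 0.0, per component."""
    return tuple(1.0 if a < b else 0.0 for a, b in zip(src0, src1))

def sge(src0, src1):
    """SGE: 1.0 where src0 >= src1, else 0.0, per component."""
    return tuple(1.0 if a >= b else 0.0 for a, b in zip(src0, src1))

a = (1.0, 2.0, 3.0, 4.0)
b = (2.0, 2.0, 2.0, 2.0)
print(slt(a, b))  # (1.0, 0.0, 0.0, 0.0)
print(sge(a, b))  # (0.0, 1.0, 1.0, 1.0)
```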
				247
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	248	.. opcode:: MAD - Multiply And Add
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	249
				250	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	251
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	252	dst.x = src0.x \times src1.x + src2.x
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	253
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	254	dst.y = src0.y \times src1.y + src2.y
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	255
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	256	dst.z = src0.z \times src1.z + src2.z
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	257
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	258	dst.w = src0.w \times src1.w + src2.w
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	259
				260
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	261	.. opcode:: SUB - Subtract
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	262
				263	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	264
				265	dst.x = src0.x - src1.x
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	266
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	267	dst.y = src0.y - src1.y
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	268
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	269	dst.z = src0.z - src1.z
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	270
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	271	dst.w = src0.w - src1.w
				272
				273
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	274	.. opcode:: LRP - Linear Interpolate
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	275
				276	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	277
Michal Krol	b3567fc	2010-01-04 12:59:17 +0100	[diff] [blame]	278	dst.x = src0.x \times src1.x + (1 - src0.x) \times src2.x
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	279
Michal Krol	b3567fc	2010-01-04 12:59:17 +0100	[diff] [blame]	280	dst.y = src0.y \times src1.y + (1 - src0.y) \times src2.y
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	281
Michal Krol	b3567fc	2010-01-04 12:59:17 +0100	[diff] [blame]	282	dst.z = src0.z \times src1.z + (1 - src0.z) \times src2.z
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	283
Michal Krol	b3567fc	2010-01-04 12:59:17 +0100	[diff] [blame]	284	dst.w = src0.w \times src1.w + (1 - src0.w) \times src2.w
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	285
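In LRP, src0 is the blend weight: a weight of 1 selects src1 and a weight of 0 selects src2. A Python sketch with registers as 4-tuples (the name ``lrp`` is only illustrative):

```python
def lrp(src0, src1, src2):
    """LRP: per-component blend of src1 and src2, weighted by src0."""
    return tuple(w * a + (1.0 - w) * b
                 for w, a, b in zip(src0, src1, src2))

weights = (0.0, 0.25, 0.5, 1.0)
print(lrp(weights, (8.0,) * 4, (4.0,) * 4))  # (4.0, 5.0, 6.0, 8.0)
```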
				286
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	287	.. opcode:: CND - Condition
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	288
				289	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	290
				291	dst.x = (src2.x > 0.5) ? src0.x : src1.x
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	292
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	293	dst.y = (src2.y > 0.5) ? src0.y : src1.y
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	294
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	295	dst.z = (src2.z > 0.5) ? src0.z : src1.z
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	296
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	297	dst.w = (src2.w > 0.5) ? src0.w : src1.w
				298
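Note that the selector in CND is the third operand, and the comparison is strict, so a selector of exactly 0.5 picks src1. A Python sketch with registers as 4-tuples (illustrative names):

```python
def cnd(src0, src1, src2):
    """CND: pick src0 where src2 > 0.5, otherwise src1, per component."""
    return tuple(a if sel > 0.5 else b
                 for a, b, sel in zip(src0, src1, src2))

selector = (0.0, 0.5, 0.75, 1.0)
print(cnd((1.0,) * 4, (2.0,) * 4, selector))  # (2.0, 2.0, 1.0, 1.0)
```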
				299
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	300	.. opcode:: DP2A - 2-component Dot Product And Add
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	301
				302	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	303
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	304	dst.x = src0.x \times src1.x + src0.y \times src1.y + src2.x
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	305
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	306	dst.y = src0.x \times src1.x + src0.y \times src1.y + src2.x
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	307
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	308	dst.z = src0.x \times src1.x + src0.y \times src1.y + src2.x
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	309
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	310	dst.w = src0.x \times src1.x + src0.y \times src1.y + src2.x
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	311
				312
José Fonseca	d9c6ebb	2010-06-01 16:25:05 +0100	[diff] [blame]	313	.. opcode:: FRC - Fraction
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	314
				315	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	316
Corbin Simpson	d92a685	2009-12-21 19:30:29 -0800	[diff] [blame]	317	dst.x = src.x - \lfloor src.x\rfloor
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	318
Corbin Simpson	d92a685	2009-12-21 19:30:29 -0800	[diff] [blame]	319	dst.y = src.y - \lfloor src.y\rfloor
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	320
Corbin Simpson	d92a685	2009-12-21 19:30:29 -0800	[diff] [blame]	321	dst.z = src.z - \lfloor src.z\rfloor
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	322
Corbin Simpson	d92a685	2009-12-21 19:30:29 -0800	[diff] [blame]	323	dst.w = src.w - \lfloor src.w\rfloor
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	324
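Because FRC subtracts the floor rather than truncating, the result is non-negative even for negative inputs. A Python sketch with registers as 4-tuples (the name ``frc`` is only illustrative):

```python
import math

def frc(src):
    """FRC: src - floor(src), per component."""
    return tuple(v - math.floor(v) for v in src)

# -0.25 maps to 0.75, not -0.25, since floor(-0.25) is -1.
print(frc((1.5, -0.25, 2.0, -2.0)))  # (0.5, 0.75, 0.0, 0.0)
```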
				325
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	326	.. opcode:: CLAMP - Clamp
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	327
				328	.. math::
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	329
				330	dst.x = clamp(src0.x, src1.x, src2.x)
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	331
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	332	dst.y = clamp(src0.y, src1.y, src2.y)
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	333
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	334	dst.z = clamp(src0.z, src1.z, src2.z)
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	335
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	336	dst.w = clamp(src0.w, src1.w, src2.w)
				337
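The source does not define ``clamp`` itself. The Python sketch below assumes the usual reading, with src1 as the lower bound and src2 as the upper bound, consistent with how ``clamp(src.w, -128, 128)`` is used in LIT. Treat that operand order as an assumption.

```python
def clamp_op(src0, src1, src2):
    """CLAMP: limit src0 to [src1, src2] per component (assumed order)."""
    return tuple(max(lo, min(v, hi))
                 for v, lo, hi in zip(src0, src1, src2))

values = (-1.0, 0.5, 2.0, 0.25)
print(clamp_op(values, (0.0,) * 4, (1.0,) * 4))  # (0.0, 0.5, 1.0, 0.25)
```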
				338
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	339	.. opcode:: FLR - Floor
Corbin Simpson	d92a685	2009-12-21 19:30:29 -0800	[diff] [blame]	340
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	341	This is identical to :opcode:`ARL`.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	342
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	343	.. math::
				344
Corbin Simpson	d92a685	2009-12-21 19:30:29 -0800	[diff] [blame]	345	dst.x = \lfloor src.x\rfloor
				346
				347	dst.y = \lfloor src.y\rfloor
				348
				349	dst.z = \lfloor src.z\rfloor
				350
				351	dst.w = \lfloor src.w\rfloor
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	352
				353
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	354	.. opcode:: ROUND - Round
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	355
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	356	.. math::
				357
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	358	dst.x = round(src.x)
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	359
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	360	dst.y = round(src.y)
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	361
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	362	dst.z = round(src.z)
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	363
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	364	dst.w = round(src.w)
				365
				366
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	367	.. opcode:: EX2 - Exponential Base 2
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	368
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	369	This instruction replicates its result.
				370
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	371	.. math::
				372
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	373	dst = 2^{src.x}
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	374
				375
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	376	.. opcode:: LG2 - Logarithm Base 2
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	377
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	378	This instruction replicates its result.
				379
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	380	.. math::
				381
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	382	dst = \log_2{src.x}
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	383
				384
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	385	.. opcode:: POW - Power
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	386
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	387	This instruction replicates its result.
				388
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	389	.. math::
				390
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	391	dst = src0.x^{src1.x}
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	392
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	393	.. opcode:: XPD - Cross Product
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	394
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	395	.. math::
				396
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	397	dst.x = src0.y \times src1.z - src1.y \times src0.z
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	398
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	399	dst.y = src0.z \times src1.x - src1.z \times src0.x
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	400
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	401	dst.z = src0.x \times src1.y - src1.x \times src0.y
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	402
				403	dst.w = 1
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	404
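A quick way to check the XPD equations is to cross the x and y unit vectors, which should give the z unit vector. A Python sketch with registers as 4-tuples (the name ``xpd`` is only illustrative); the w inputs are ignored and dst.w is set to 1:

```python
def xpd(src0, src1):
    """XPD: 3-component cross product of src0 and src1, with dst.w = 1."""
    x0, y0, z0, _ = src0
    x1, y1, z1, _ = src1
    return (y0 * z1 - y1 * z0,
            z0 * x1 - z1 * x0,
            x0 * y1 - x1 * y0,
            1.0)

print(xpd((1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0)))  # (0.0, 0.0, 1.0, 1.0)
```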
				405
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	406	.. opcode:: ABS - Absolute
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	407
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	408	.. math::
				409
Corbin Simpson	14743ac	2009-12-21 19:57:56 -0800	[diff] [blame]	410	dst.x = \|src.x\|
				411
				412	dst.y = \|src.y\|
				413
				414	dst.z = \|src.z\|
				415
				416	dst.w = \|src.w\|
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	417
				418
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	419	.. opcode:: RCC - Reciprocal Clamped
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	420
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	421	This instruction replicates its result.
				422
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	423	XXX cleanup on aisle three
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	424
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	425	.. math::
				426
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	427	dst = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	428
				429
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	430	.. opcode:: DPH - Homogeneous Dot Product
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	431
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	432	This instruction replicates its result.
				433
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	434	.. math::
				435
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	436	dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	437
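DPH behaves like DP4 with src0.w treated as 1, so src0.w never enters the result. A Python sketch with registers as 4-tuples (the name ``dph`` is only illustrative):

```python
def dph(src0, src1):
    """DPH: 3-component dot product plus src1.w, replicated to dst."""
    total = (src0[0] * src1[0] + src0[1] * src1[1]
             + src0[2] * src1[2] + src1[3])
    return (total, total, total, total)

# src0.w (99.0 here) is ignored: 1*4 + 2*5 + 3*6 + 7 = 39.
print(dph((1.0, 2.0, 3.0, 99.0), (4.0, 5.0, 6.0, 7.0)))  # (39.0, 39.0, 39.0, 39.0)
```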
				438
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	439	.. opcode:: COS - Cosine
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	440
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	441	This instruction replicates its result.
				442
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	443	.. math::
				444
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	445	dst = \cos{src.x}
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	446
				447
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	448	.. opcode:: DDX - Derivative Relative To X
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	449
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	450	.. math::
				451
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	452	dst.x = partialx(src.x)
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	453
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	454	dst.y = partialx(src.y)
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	455
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	456	dst.z = partialx(src.z)
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	457
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	458	dst.w = partialx(src.w)
				459
				460
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	461	.. opcode:: DDY - Derivative Relative To Y
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	462
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	463	.. math::
				464
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	465	dst.x = partialy(src.x)
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	466
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	467	dst.y = partialy(src.y)
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	468
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	469	dst.z = partialy(src.z)
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	470
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	471	dst.w = partialy(src.w)
				472
				473
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	474	.. opcode:: PK2H - Pack Two 16-bit Floats
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	475
				476	TBD
				477
				478
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	479	.. opcode:: PK2US - Pack Two Unsigned 16-bit Scalars
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	480
				481	TBD
				482
				483
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	484	.. opcode:: PK4B - Pack Four Signed 8-bit Scalars
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	485
				486	TBD
				487
				488
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	489	.. opcode:: PK4UB - Pack Four Unsigned 8-bit Scalars
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	490
				491	TBD
				492
				493
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	494	.. opcode:: RFL - Reflection Vector
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	495
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	496	.. math::
				497
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	498	dst.x = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.x - src1.x
				499
				500	dst.y = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.y - src1.y
				501
				502	dst.z = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.z - src1.z
				503
				504	dst.w = 1
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	505
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	506	.. note::
				507
				508	Considered for removal.
Keith Whitwell	14eacb0	2009-12-21 23:38:29 +0000	[diff] [blame]	509
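As an illustration of the RFL formula above, the following Python sketch evaluates it on plain four-element lists. The function name ``rfl`` and the list representation are assumptions made for this example; they are not part of TGSI or Gallium.

```python
def rfl(src0, src1):
    """Sketch of RFL: reflect src1 about the axis src0 (xyz only), with w = 1."""
    # Dot products over the first three components, as in the formula.
    dot01 = sum(a * b for a, b in zip(src0[:3], src1[:3]))
    dot00 = sum(a * a for a in src0[:3])
    scale = 2.0 * dot01 / dot00
    return [scale * src0[i] - src1[i] for i in range(3)] + [1.0]


# Reflecting (1, 1, 0) about the Y axis flips the sign of x.
print(rfl([0.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]))
```

Note that src0 does not need to be normalised, because the formula divides by its squared length.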
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	510
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	511	.. opcode:: SEQ - Set On Equal
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	512
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	513	.. math::
				514
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	515	dst.x = (src0.x == src1.x) ? 1 : 0
Corbin Simpson	0477191	2010-01-18 17:31:56 -0800	[diff] [blame]	516
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	517	dst.y = (src0.y == src1.y) ? 1 : 0
Corbin Simpson	0477191	2010-01-18 17:31:56 -0800	[diff] [blame]	518
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	519	dst.z = (src0.z == src1.z) ? 1 : 0
Corbin Simpson	0477191	2010-01-18 17:31:56 -0800	[diff] [blame]	520
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	521	dst.w = (src0.w == src1.w) ? 1 : 0
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	522
				523
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	524	.. opcode:: SFL - Set On False
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	525
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	526	This instruction replicates its result.
				527
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	528	.. math::
				529
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	530	dst = 0
Corbin Simpson	0477191	2010-01-18 17:31:56 -0800	[diff] [blame]	531
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	532	.. note::
Corbin Simpson	0477191	2010-01-18 17:31:56 -0800	[diff] [blame]	533
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	534	Considered for removal.
Corbin Simpson	0477191	2010-01-18 17:31:56 -0800	[diff] [blame]	535
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	536
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	537	.. opcode:: SGT - Set On Greater Than
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	538
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	539	.. math::
				540
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	541	dst.x = (src0.x > src1.x) ? 1 : 0
Corbin Simpson	0477191	2010-01-18 17:31:56 -0800	[diff] [blame]	542
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	543	dst.y = (src0.y > src1.y) ? 1 : 0
Corbin Simpson	0477191	2010-01-18 17:31:56 -0800	[diff] [blame]	544
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	545	dst.z = (src0.z > src1.z) ? 1 : 0
Corbin Simpson	0477191	2010-01-18 17:31:56 -0800	[diff] [blame]	546
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	547	dst.w = (src0.w > src1.w) ? 1 : 0
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	548
				549
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	550	.. opcode:: SIN - Sine
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	551
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	552	This instruction replicates its result.
				553
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	554	.. math::
				555
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	556	dst = \sin{src.x}
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	557
				558
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	559	.. opcode:: SLE - Set On Less Than Or Equal
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	560
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	561	.. math::
				562
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	563	dst.x = (src0.x <= src1.x) ? 1 : 0
Corbin Simpson	0477191	2010-01-18 17:31:56 -0800	[diff] [blame]	564
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	565	dst.y = (src0.y <= src1.y) ? 1 : 0
Corbin Simpson	0477191	2010-01-18 17:31:56 -0800	[diff] [blame]	566
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	567	dst.z = (src0.z <= src1.z) ? 1 : 0
Corbin Simpson	0477191	2010-01-18 17:31:56 -0800	[diff] [blame]	568
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	569	dst.w = (src0.w <= src1.w) ? 1 : 0
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	570
				571
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	572	.. opcode:: SNE - Set On Not Equal
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	573
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	574	.. math::
				575
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	576	dst.x = (src0.x != src1.x) ? 1 : 0
Corbin Simpson	0477191	2010-01-18 17:31:56 -0800	[diff] [blame]	577
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	578	dst.y = (src0.y != src1.y) ? 1 : 0
Corbin Simpson	0477191	2010-01-18 17:31:56 -0800	[diff] [blame]	579
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	580	dst.z = (src0.z != src1.z) ? 1 : 0
Corbin Simpson	0477191	2010-01-18 17:31:56 -0800	[diff] [blame]	581
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	582	dst.w = (src0.w != src1.w) ? 1 : 0
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	583
				584
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	585	.. opcode:: STR - Set On True
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	586
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	587	This instruction replicates its result.
				588
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	589	.. math::
				590
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	591	dst = 1
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	592
				593
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	594	.. opcode:: TEX - Texture Lookup
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	595
Brian Paul	2a77c3c	2010-12-14 12:45:36 -0700	[diff] [blame]	596	.. math::
				597
				598	coord = src0
				599
				600	bias = 0.0
				601
				602	dst = texture_sample(unit, coord, bias)
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	603
Dave Airlie	35db326	2011-12-19 16:40:05 +0000	[diff] [blame]	604	For array textures, src0.y contains the slice for 1D arrays
				605	and src0.z contains the slice for 2D arrays.
				606	For shadow textures without arrays, src0.z contains
				607	the reference value.
				608	For shadow textures with arrays, src0.z contains
				609	the reference value for 1D arrays, and src0.w contains
				610	the reference value for 2D arrays.
				611	There is no way to pass a bias in the .w component for
				612	shadow arrays, and GLSL doesn't allow this.
				613	GLSL does allow cube shadow maps to take a bias value;
				614	how this will look in TGSI is still to be determined.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	615
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	616	.. opcode:: TXD - Texture Lookup with Derivatives
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	617
Brian Paul	2a77c3c	2010-12-14 12:45:36 -0700	[diff] [blame]	618	.. math::
				619
				620	coord = src0
				621
				622	ddx = src1
				623
				624	ddy = src2
				625
				626	bias = 0.0
				627
				628	dst = texture_sample_deriv(unit, coord, bias, ddx, ddy)
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	629
				630
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	631	.. opcode:: TXP - Projective Texture Lookup
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	632
Brian Paul	2a77c3c	2010-12-14 12:45:36 -0700	[diff] [blame]	633	.. math::
				634
				635	coord.x = src0.x / src0.w
				636
				637	coord.y = src0.y / src0.w
				638
				639	coord.z = src0.z / src0.w
				640
				641	coord.w = src0.w
				642
				643	bias = 0.0
				644
				645	dst = texture_sample(unit, coord, bias)
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	646
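The projective divide performed by TXP before sampling can be sketched as follows. This assumes the divisor is the w component of src0; the helper name ``txp_coord`` is hypothetical, and the actual texture sampling step is left out.

```python
def txp_coord(src0):
    """Sketch of the TXP coordinate setup: divide x, y, z by w; pass w through."""
    x, y, z, w = src0
    return [x / w, y / w, z / w, w]


print(txp_coord([2.0, 4.0, 6.0, 2.0]))
```

The resulting coordinate would then be passed to the sampler with a bias of 0.0, as in TEX.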
				647
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	648	.. opcode:: UP2H - Unpack Two 16-Bit Floats
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	649
				650	TBD
				651
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	652	.. note::
				653
				654	Considered for removal.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	655
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	656	.. opcode:: UP2US - Unpack Two Unsigned 16-Bit Scalars
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	657
				658	TBD
				659
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	660	.. note::
				661
				662	Considered for removal.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	663
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	664	.. opcode:: UP4B - Unpack Four Signed 8-Bit Values
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	665
				666	TBD
				667
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	668	.. note::
				669
				670	Considered for removal.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	671
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	672	.. opcode:: UP4UB - Unpack Four Unsigned 8-Bit Scalars
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	673
				674	TBD
				675
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	676	.. note::
				677
				678	Considered for removal.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	679
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	680	.. opcode:: X2D - 2D Coordinate Transformation
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	681
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	682	.. math::
				683
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	684	dst.x = src0.x + src1.x \times src2.x + src1.y \times src2.y
Corbin Simpson	0477191	2010-01-18 17:31:56 -0800	[diff] [blame]	685
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	686	dst.y = src0.y + src1.x \times src2.z + src1.y \times src2.w
Corbin Simpson	0477191	2010-01-18 17:31:56 -0800	[diff] [blame]	687
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	688	dst.z = src0.x + src1.x \times src2.x + src1.y \times src2.y
Corbin Simpson	0477191	2010-01-18 17:31:56 -0800	[diff] [blame]	689
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	690	dst.w = src0.y + src1.x \times src2.z + src1.y \times src2.w
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	691
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	692	.. note::
				693
				694	Considered for removal.
Keith Whitwell	14eacb0	2009-12-21 23:38:29 +0000	[diff] [blame]	695
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	696
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	697	.. opcode:: ARA - Address Register Add
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	698
				699	TBD
				700
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	701	.. note::
				702
				703	Considered for removal.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	704
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	705	.. opcode:: ARR - Address Register Load With Round
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	706
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	707	.. math::
				708
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	709	dst.x = round(src.x)
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	710
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	711	dst.y = round(src.y)
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	712
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	713	dst.z = round(src.z)
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	714
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	715	dst.w = round(src.w)
				716
				717
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	718	.. opcode:: SSG - Set Sign
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	719
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	720	.. math::
				721
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	722	dst.x = (src.x > 0) ? 1 : (src.x < 0) ? -1 : 0
				723
				724	dst.y = (src.y > 0) ? 1 : (src.y < 0) ? -1 : 0
				725
				726	dst.z = (src.z > 0) ? 1 : (src.z < 0) ? -1 : 0
				727
				728	dst.w = (src.w > 0) ? 1 : (src.w < 0) ? -1 : 0
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	729
				730
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	731	.. opcode:: CMP - Compare
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	732
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	733	.. math::
				734
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	735	dst.x = (src0.x < 0) ? src1.x : src2.x
				736
				737	dst.y = (src0.y < 0) ? src1.y : src2.y
				738
				739	dst.z = (src0.z < 0) ? src1.z : src2.z
				740
				741	dst.w = (src0.w < 0) ? src1.w : src2.w
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	742
				743
Brian Paul	46205ab	2013-07-11 17:02:37 -0600	[diff] [blame]	744	.. opcode:: KILL_IF - Conditional Discard
				745
				746	Conditional discard. Allowed in fragment shaders only.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	747
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	748	.. math::
				749
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	750	if (src.x < 0 \|\| src.y < 0 \|\| src.z < 0 \|\| src.w < 0)
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	751	discard
				752	endif
				753
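The KILL_IF condition can be sketched in Python as a predicate over the four source components. The name ``kill_if`` is an assumption made for this example; a real driver would discard the fragment rather than return a flag.

```python
def kill_if(src):
    """Return True when KILL_IF would discard: any component is below zero."""
    return any(component < 0 for component in src)


print(kill_if([1.0, 0.0, -0.5, 2.0]))
print(kill_if([0.0, 0.0, 0.0, 0.0]))
```

A component equal to zero does not trigger the discard, because the comparison is strict.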
				754
Brian Paul	46205ab	2013-07-11 17:02:37 -0600	[diff] [blame]	755	.. opcode:: KILL - Discard
Brian Paul	f501baa	2013-07-11 16:00:45 -0600	[diff] [blame]	756
				757	Unconditional discard. Allowed in fragment shaders only.
				758
				759
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	760	.. opcode:: SCS - Sine Cosine
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	761
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	762	.. math::
				763
Corbin Simpson	d92a685	2009-12-21 19:30:29 -0800	[diff] [blame]	764	dst.x = \cos{src.x}
				765
				766	dst.y = \sin{src.x}
				767
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	768	dst.z = 0
Corbin Simpson	d92a685	2009-12-21 19:30:29 -0800	[diff] [blame]	769
Tilman Sauerbeck	d323118	2010-09-19 09:03:11 +0200	[diff] [blame]	770	dst.w = 1
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	771
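A minimal Python sketch of SCS, assuming a four-element list stands in for a register (the name ``scs`` is hypothetical). Only src.x is read; z and w receive the constants from the formula above.

```python
import math


def scs(src):
    """Sketch of SCS: cosine in x, sine in y, then the constants 0 and 1."""
    angle = src[0]
    return [math.cos(angle), math.sin(angle), 0.0, 1.0]


print(scs([0.0, 9.0, 9.0, 9.0]))
```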
				772
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	773	.. opcode:: TXB - Texture Lookup With Bias
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	774
Brian Paul	2a77c3c	2010-12-14 12:45:36 -0700	[diff] [blame]	775	.. math::
				776
				777	coord.x = src.x
				778
				779	coord.y = src.y
				780
				781	coord.z = src.z
				782
				783	coord.w = 1.0
				784
				785	bias = src.w
				786
				787	dst = texture_sample(unit, coord, bias)
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	788
				789
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	790	.. opcode:: NRM - 3-component Vector Normalise
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	791
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	792	.. math::
				793
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	794	dst.x = \frac{src.x}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}}
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	795
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	796	dst.y = \frac{src.y}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}}
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	797
Corbin Simpson	ecb2f2a	2009-12-21 20:07:10 -0800	[diff] [blame]	798	dst.z = \frac{src.z}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}}
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	799
				800	dst.w = 1
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	801
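The following Python sketch shows three-component normalisation as it is conventionally defined, that is, dividing by the Euclidean length (the square root of the dot product of the xyz components). That definition, the name ``nrm``, and the list representation are assumptions of this example.

```python
import math


def nrm(src):
    """Sketch of NRM: scale xyz to unit length and set w to 1."""
    length = math.sqrt(src[0] * src[0] + src[1] * src[1] + src[2] * src[2])
    return [src[0] / length, src[1] / length, src[2] / length, 1.0]


print(nrm([3.0, 0.0, 4.0, 9.0]))
```

The sketch does not handle a zero-length input; it raises ``ZeroDivisionError`` in that case.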
				802
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	803	.. opcode:: DIV - Divide
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	804
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	805	.. math::
				806
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	807	dst.x = \frac{src0.x}{src1.x}
				808
				809	dst.y = \frac{src0.y}{src1.y}
				810
				811	dst.z = \frac{src0.z}{src1.z}
				812
				813	dst.w = \frac{src0.w}{src1.w}
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	814
				815
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	816	.. opcode:: DP2 - 2-component Dot Product
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	817
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	818	This instruction replicates its result.
				819
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	820	.. math::
				821
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	822	dst = src0.x \times src1.x + src0.y \times src1.y
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	823
				824
Brian Paul	2a77c3c	2010-12-14 12:45:36 -0700	[diff] [blame]	825	.. opcode:: TXL - Texture Lookup With Explicit LOD
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	826
Brian Paul	2a77c3c	2010-12-14 12:45:36 -0700	[diff] [blame]	827	.. math::
				828
				829	coord.x = src0.x
				830
				831	coord.y = src0.y
				832
				833	coord.z = src0.z
				834
				835	coord.w = 1.0
				836
				837	lod = src0.w
				838
				839	dst = texture_sample(unit, coord, lod)
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	840
				841
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	842	.. opcode:: PUSHA - Push Address Register On Stack
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	843
				844	push(src.x)
				845	push(src.y)
				846	push(src.z)
				847	push(src.w)
				848
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	849	.. note::
				850
				851	Considered for cleanup.
				852
				853	.. note::
				854
				855	Considered for removal.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	856
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	857	.. opcode:: POPA - Pop Address Register From Stack
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	858
				859	dst.w = pop()
				860	dst.z = pop()
				861	dst.y = pop()
				862	dst.x = pop()
				863
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	864	.. note::
				865
				866	Considered for cleanup.
				867
				868	.. note::
				869
				870	Considered for removal.
Keith Whitwell	14eacb0	2009-12-21 23:38:29 +0000	[diff] [blame]	871
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	872
Roland Scheidegger	65102b7	2013-05-03 23:32:23 +0200	[diff] [blame]	873	.. opcode:: BRA - Branch
				874
				875	pc = target
				876
				877	.. note::
				878
				879	Considered for removal.
				880
				881
				882	.. opcode:: CALLNZ - Subroutine Call If Not Zero
				883
				884	TBD
				885
				886	.. note::
				887
				888	Considered for cleanup.
				889
				890	.. note::
				891
				892	Considered for removal.
				893
				894
Corbin Simpson	9d4cb6e	2010-06-16 18:34:32 -0700	[diff] [blame]	895	Compute ISA
Corbin Simpson	5bcd26c	2009-12-21 21:04:10 -0800	[diff] [blame]	896	^^^^^^^^^^^^^^^^^^^^^^^^
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	897
Corbin Simpson	9d4cb6e	2010-06-16 18:34:32 -0700	[diff] [blame]	898	These opcodes are primarily provided for special-use computational shaders.
Keith Whitwell	14eacb0	2009-12-21 23:38:29 +0000	[diff] [blame]	899	Support for these opcodes is indicated by a special pipe capability bit (TBD).
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	900
Roland Scheidegger	23025ed	2013-05-03 21:34:12 +0200	[diff] [blame]	901	XXX doesn't look like most of the opcodes really belong here.
Corbin Simpson	9d4cb6e	2010-06-16 18:34:32 -0700	[diff] [blame]	902
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	903	.. opcode:: CEIL - Ceiling
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	904
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	905	.. math::
				906
Corbin Simpson	14743ac	2009-12-21 19:57:56 -0800	[diff] [blame]	907	dst.x = \lceil src.x\rceil
				908
				909	dst.y = \lceil src.y\rceil
				910
				911	dst.z = \lceil src.z\rceil
				912
				913	dst.w = \lceil src.w\rceil
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	914
				915
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	916	.. opcode:: TRUNC - Truncate
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	917
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	918	.. math::
				919
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	920	dst.x = trunc(src.x)
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	921
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	922	dst.y = trunc(src.y)
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	923
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	924	dst.z = trunc(src.z)
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	925
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	926	dst.w = trunc(src.w)
				927
				928
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	929	.. opcode:: MOD - Modulus
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	930
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	931	.. math::
				932
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	933	dst.x = src0.x \bmod src1.x
				934
				935	dst.y = src0.y \bmod src1.y
				936
				937	dst.z = src0.z \bmod src1.z
				938
				939	dst.w = src0.w \bmod src1.w
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	940
				941
Bryan Cain	324ac98	2011-09-10 12:31:54 -0500	[diff] [blame]	942	.. opcode:: UARL - Integer Address Register Load
				943
				944	Moves the contents of the source register, assumed to be an integer, into the
				945	destination register, which is assumed to be an address (ADDR) register.
				946
				947
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	948	.. opcode:: SAD - Sum Of Absolute Differences
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	949
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	950	.. math::
				951
Corbin Simpson	14743ac	2009-12-21 19:57:56 -0800	[diff] [blame]	952	dst.x = \|src0.x - src1.x\| + src2.x
				953
				954	dst.y = \|src0.y - src1.y\| + src2.y
				955
				956	dst.z = \|src0.z - src1.z\| + src2.z
				957
				958	dst.w = \|src0.w - src1.w\| + src2.w
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	959
				960
Dave Airlie	2083a27	2011-08-26 10:59:18 +0100	[diff] [blame]	961	.. opcode:: TXF - Texel Fetch (as per NV_gpu_shader4), extract a single texel
				962	from a specified texture image. The source sampler may
				963	not be a CUBE or SHADOW.
				964	src0 is a four-component signed integer vector used to
				965	identify the single texel accessed: three coordinates plus the level.
				966	src1 is a three-component constant signed integer vector,
				967	with each component limited to the range
				968	-8..+8 (hardware only seems to handle this range, although the interface
				969	allows for up to unsigned int).
				970	TXF(uint_vec coord, int_vec offset).
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	971
				972
Dave Airlie	6fb12bf	2011-08-25 13:03:19 +0100	[diff] [blame]	973	.. opcode:: TXQ - Texture Size Query (as per NV_gpu_program4)
				974	Retrieves the dimensions of the texture,
				975	depending on the target: 1D (width), 2D/RECT/CUBE
				976	(width, height), 3D (width, height, depth),
				977	1D array (width, layers), 2D array (width, height, layers).
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	978
Dave Airlie	6fb12bf	2011-08-25 13:03:19 +0100	[diff] [blame]	979	.. math::
				980
Roland Scheidegger	23025ed	2013-05-03 21:34:12 +0200	[diff] [blame]	981	lod = src0.x
Dave Airlie	6fb12bf	2011-08-25 13:03:19 +0100	[diff] [blame]	982
				983	dst.x = texture_width(unit, lod)
				984
				985	dst.y = texture_height(unit, lod)
				986
				987	dst.z = texture_depth(unit, lod)
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	988
				989
Roland Scheidegger	23025ed	2013-05-03 21:34:12 +0200	[diff] [blame]	990	Integer ISA
				991	^^^^^^^^^^^^^^^^^^^^^^^^
				992	These opcodes are used for integer operations.
				993	Support for these opcodes is indicated by PIPE_SHADER_CAP_INTEGERS (all of them?)
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	994
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	995
Roland Scheidegger	23025ed	2013-05-03 21:34:12 +0200	[diff] [blame]	996	.. opcode:: I2F - Signed Integer To Float
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	997
Roland Scheidegger	23025ed	2013-05-03 21:34:12 +0200	[diff] [blame]	998	Rounding is unspecified (round to nearest even suggested).
				999
				1000	.. math::
				1001
				1002	dst.x = (float) src.x
				1003
				1004	dst.y = (float) src.y
				1005
				1006	dst.z = (float) src.z
				1007
				1008	dst.w = (float) src.w
				1009
				1010
				1011	.. opcode:: U2F - Unsigned Integer To Float
				1012
				1013	Rounding is unspecified (round to nearest even suggested).
				1014
				1015	.. math::
				1016
				1017	dst.x = (float) src.x
				1018
				1019	dst.y = (float) src.y
				1020
				1021	dst.z = (float) src.z
				1022
				1023	dst.w = (float) src.w
				1024
				1025
				1026	.. opcode:: F2I - Float to Signed Integer
				1027
				1028	Rounding is towards zero (truncate).
				1029	Values outside signed range (including NaNs) produce undefined results.
				1030
				1031	.. math::
				1032
				1033	dst.x = (int) src.x
				1034
				1035	dst.y = (int) src.y
				1036
				1037	dst.z = (int) src.z
				1038
				1039	dst.w = (int) src.w
				1040
				1041
				1042	.. opcode:: F2U - Float to Unsigned Integer
				1043
				1044	Rounding is towards zero (truncate).
				1045	Values outside unsigned range (including NaNs) produce undefined results.
				1046
				1047	.. math::
				1048
				1049	dst.x = (unsigned) src.x
				1050
				1051	dst.y = (unsigned) src.y
				1052
				1053	dst.z = (unsigned) src.z
				1054
				1055	dst.w = (unsigned) src.w
				1056
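The truncating conversions F2I and F2U can be sketched in Python as below. The helper names are hypothetical. Because the text above leaves out-of-range and NaN inputs undefined, this sketch raises ``ValueError`` for them purely as an illustrative choice; a driver may return anything.

```python
import math


def _truncate(x, lo, hi):
    """Round toward zero, rejecting inputs whose result is undefined."""
    if not math.isfinite(x):
        raise ValueError("undefined result for NaN or infinity")
    value = math.trunc(x)
    if not lo <= value <= hi:
        raise ValueError("undefined result for out-of-range input")
    return value


def f2i(x):
    """Sketch of F2I for one component (signed 32-bit range)."""
    return _truncate(x, -2**31, 2**31 - 1)


def f2u(x):
    """Sketch of F2U for one component (unsigned 32-bit range)."""
    return _truncate(x, 0, 2**32 - 1)


print(f2i(2.7), f2i(-2.7), f2u(3.99))
```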
				1057
				1058	.. opcode:: UADD - Integer Add
				1059
				1060	This instruction works the same for signed and unsigned integers.
				1061	The low 32 bits of the result are returned.
				1062
				1063	.. math::
				1064
				1065	dst.x = src0.x + src1.x
				1066
				1067	dst.y = src0.y + src1.y
				1068
				1069	dst.z = src0.z + src1.z
				1070
				1071	dst.w = src0.w + src1.w
				1072
				1073
				1074	.. opcode:: UMAD - Integer Multiply And Add
				1075
				1076	This instruction works the same for signed and unsigned integers.
				1077	The multiplication returns the low 32 bits (as does the result itself).
				1078
				1079	.. math::
				1080
				1081	dst.x = src0.x \times src1.x + src2.x
				1082
				1083	dst.y = src0.y \times src1.y + src2.y
				1084
				1085	dst.z = src0.z \times src1.z + src2.z
				1086
				1087	dst.w = src0.w \times src1.w + src2.w
				1088
				1089
				1090	.. opcode:: UMUL - Integer Multiply
				1091
				1092	This instruction works the same for signed and unsigned integers.
				1093	The low 32 bits of the result are returned.
				1094
				1095	.. math::
				1096
				1097	dst.x = src0.x \times src1.x
				1098
				1099	dst.y = src0.y \times src1.y
				1100
				1101	dst.z = src0.z \times src1.z
				1102
				1103	dst.w = src0.w \times src1.w
				1104
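Keeping only the low 32 bits, as described for UADD, UMAD and UMUL, can be sketched per component in Python by masking the exact integer result. The function names and the ``MASK32`` constant are assumptions of this example.

```python
MASK32 = 0xFFFFFFFF  # keeps the low 32 bits of a Python integer


def uadd(a, b):
    """Sketch of UADD for one component."""
    return (a + b) & MASK32


def umul(a, b):
    """Sketch of UMUL for one component."""
    return (a * b) & MASK32


def umad(a, b, c):
    """Sketch of UMAD for one component."""
    return (a * b + c) & MASK32


print(hex(uadd(0xFFFFFFFF, 1)), hex(umul(0x10000, 0x10000)), hex(umad(0xFFFFFFFF, 2, 3)))
```

Since the bit pattern 0xFFFFFFFF is also the two's complement encoding of -1, the same results hold when the operands are read as signed integers, which is why a single opcode serves both cases.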
				1105
				1106	.. opcode:: IDIV - Signed Integer Division
				1107
				1108	TBD: behavior for division by zero.
				1109
				1110	.. math::
				1111
				1112	dst.x = src0.x / src1.x
				1113
				1114	dst.y = src0.y / src1.y
				1115
				1116	dst.z = src0.z / src1.z
				1117
				1118	dst.w = src0.w / src1.w
				1119
				1120
				1121	.. opcode:: UDIV - Unsigned Integer Division
				1122
				1123	For division by zero, 0xffffffff is returned.
				1124
				1125	.. math::
				1126
				1127	dst.x = src0.x / src1.x
				1128
				1129	dst.y = src0.y / src1.y
				1130
				1131	dst.z = src0.z / src1.z
				1132
				1133	dst.w = src0.w / src1.w
				1134
				1135
				1136	.. opcode:: UMOD - Unsigned Integer Remainder
				1137
				1138	If the second argument is zero, 0xffffffff is returned.
				1139
				1140	.. math::
				1141
				1142	dst.x = src0.x \bmod src1.x
				1143
				1144	dst.y = src0.y \bmod src1.y
				1145
				1146	dst.z = src0.z \bmod src1.z
				1147
				1148	dst.w = src0.w \bmod src1.w
				1149
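The zero-divisor behaviour stated for UDIV and UMOD can be sketched per component as follows. The function names are hypothetical, and the inputs are assumed to already be unsigned 32-bit values.

```python
ALL_ONES = 0xFFFFFFFF  # value returned when the divisor is zero


def udiv(a, b):
    """Sketch of UDIV for one component."""
    return ALL_ONES if b == 0 else a // b


def umod(a, b):
    """Sketch of UMOD for one component."""
    return ALL_ONES if b == 0 else a % b


print(udiv(7, 2), umod(7, 2), hex(udiv(7, 0)), hex(umod(7, 0)))
```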
				1150
				1151	.. opcode:: NOT - Bitwise Not
				1152
				1153	.. math::
				1154
				1155	dst.x = ~src.x
				1156
				1157	dst.y = ~src.y
				1158
				1159	dst.z = ~src.z
				1160
				1161	dst.w = ~src.w
				1162
				1163
				1164	.. opcode:: AND - Bitwise And
				1165
				1166	.. math::
				1167
				1168	dst.x = src0.x & src1.x
				1169
				1170	dst.y = src0.y & src1.y
				1171
				1172	dst.z = src0.z & src1.z
				1173
				1174	dst.w = src0.w & src1.w
				1175
				1176
				1177	.. opcode:: OR - Bitwise Or
				1178
				1179	.. math::
				1180
				1181	dst.x = src0.x \| src1.x
				1182
				1183	dst.y = src0.y \| src1.y
				1184
				1185	dst.z = src0.z \| src1.z
				1186
				1187	dst.w = src0.w \| src1.w
				1188
				1189
				1190	.. opcode:: XOR - Bitwise Xor
				1191
				1192	.. math::
				1193
				1194	dst.x = src0.x \oplus src1.x
				1195
				1196	dst.y = src0.y \oplus src1.y
				1197
				1198	dst.z = src0.z \oplus src1.z
				1199
				1200	dst.w = src0.w \oplus src1.w
				1201
				1202
				1203	.. opcode:: IMAX - Maximum of Signed Integers
				1204
				1205	.. math::
				1206
				1207	dst.x = max(src0.x, src1.x)
				1208
				1209	dst.y = max(src0.y, src1.y)
				1210
				1211	dst.z = max(src0.z, src1.z)
				1212
				1213	dst.w = max(src0.w, src1.w)
				1214
				1215
				1216	.. opcode:: UMAX - Maximum of Unsigned Integers
				1217
				1218	.. math::
				1219
				1220	dst.x = max(src0.x, src1.x)
				1221
				1222	dst.y = max(src0.y, src1.y)
				1223
				1224	dst.z = max(src0.z, src1.z)
				1225
				1226	dst.w = max(src0.w, src1.w)
				1227
				1228
				1229	.. opcode:: IMIN - Minimum of Signed Integers
				1230
				1231	.. math::
				1232
				1233	dst.x = min(src0.x, src1.x)
				1234
				1235	dst.y = min(src0.y, src1.y)
				1236
				1237	dst.z = min(src0.z, src1.z)
				1238
				1239	dst.w = min(src0.w, src1.w)
				1240
				1241
				1242	.. opcode:: UMIN - Minimum of Unsigned Integers
				1243
				1244	.. math::
				1245
				1246	dst.x = min(src0.x, src1.x)
				1247
				1248	dst.y = min(src0.y, src1.y)
				1249
				1250	dst.z = min(src0.z, src1.z)
				1251
				1252	dst.w = min(src0.w, src1.w)
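The signed and unsigned variants above share the same formulas; they differ only in how each component's bit pattern is interpreted before comparison. A minimal sketch of that difference for a single component, assuming 32-bit components (the width and the helper names are assumptions made for illustration, not part of TGSI):

```python
MASK32 = 0xFFFFFFFF  # assumed component width: 32 bits


def as_signed(bits):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    bits &= MASK32
    return bits - (1 << 32) if bits & 0x80000000 else bits


def imax(a, b):
    """IMAX on one component: compare as signed, return the winning pattern."""
    return (a if as_signed(a) >= as_signed(b) else b) & MASK32


def umax(a, b):
    """UMAX on one component: compare the raw patterns as unsigned."""
    return max(a & MASK32, b & MASK32)


def imin(a, b):
    """IMIN on one component: compare as signed, return the losing pattern."""
    return (a if as_signed(a) <= as_signed(b) else b) & MASK32


def umin(a, b):
    """UMIN on one component: compare the raw patterns as unsigned."""
    return min(a & MASK32, b & MASK32)
```

For the pattern ``0xFFFFFFFF`` (which is -1 when read as signed) against ``1``, the signed and unsigned opcodes pick opposite operands.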
				1253
				1254
				1255	.. opcode:: SHL - Shift Left
				1256
				1257	.. math::
				1258
				1259	dst.x = src0.x << src1.x
				1260
				1261	dst.y = src0.y << src1.x
				1262
				1263	dst.z = src0.z << src1.x
				1264
				1265	dst.w = src0.w << src1.x
				1266
				1267
				1268	.. opcode:: ISHR - Arithmetic Shift Right (of Signed Integer)
				1269
				1270	.. math::
				1271
				1272	dst.x = src0.x >> src1.x
				1273
				1274	dst.y = src0.y >> src1.x
				1275
				1276	dst.z = src0.z >> src1.x
				1277
				1278	dst.w = src0.w >> src1.x
				1279
				1280
				1281	.. opcode:: USHR - Logical Shift Right
				1282
				1283	.. math::
				1284
				1285	dst.x = src0.x >> (unsigned) src1.x
				1286
				1287	dst.y = src0.y >> (unsigned) src1.x
				1288
				1289	dst.z = src0.z >> (unsigned) src1.x
				1290
				1291	dst.w = src0.w >> (unsigned) src1.x
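The two right shifts differ only in what fills the vacated high bits: ISHR copies the sign bit, USHR shifts in zeros. A small scalar sketch, assuming 32-bit components and a shift count in the range 0 to 31 (the behaviour for larger counts is not stated here, so the sketch does not model it):

```python
MASK32 = 0xFFFFFFFF  # assumed component width: 32 bits


def shl(value, count):
    """SHL on one component: bits shifted past bit 31 are discarded."""
    return (value << count) & MASK32


def ushr(value, count):
    """USHR on one component: logical shift, zeros enter from the left."""
    return (value & MASK32) >> count


def ishr(value, count):
    """ISHR on one component: arithmetic shift, the sign bit is replicated."""
    value &= MASK32
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> count) & MASK32
```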
				1292
				1293
Roland Scheidegger	23025ed	2013-05-03 21:34:12 +0200	[diff] [blame]	1294	.. opcode:: UCMP - Integer Conditional Move
				1295
				1296	.. math::
				1297
				1298	dst.x = src0.x ? src1.x : src2.x
				1299
				1300	dst.y = src0.y ? src1.y : src2.y
				1301
				1302	dst.z = src0.z ? src1.z : src2.z
				1303
				1304	dst.w = src0.w ? src1.w : src2.w
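UCMP selects independently in each component, so one instruction can mix values from both sources. A minimal sketch, with registers modelled as 4-tuples of integers (a modelling assumption):

```python
def ucmp(src0, src1, src2):
    """Per-component select: take src1 where src0 is non-zero, else src2."""
    return tuple(b if a != 0 else c for a, b, c in zip(src0, src1, src2))
```

Any non-zero bit pattern in src0 counts as true, not only ``~0``.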
				1305
				1306
Roland Scheidegger	65102b7	2013-05-03 23:32:23 +0200	[diff] [blame]	1307
				1308	.. opcode:: ISSG - Integer Set Sign
				1309
				1310	.. math::
				1311
				1312	dst.x = (src0.x < 0) ? -1 : (src0.x > 0) ? 1 : 0
				1313
				1314	dst.y = (src0.y < 0) ? -1 : (src0.y > 0) ? 1 : 0
				1315
				1316	dst.z = (src0.z < 0) ? -1 : (src0.z > 0) ? 1 : 0
				1317
				1318	dst.w = (src0.w < 0) ? -1 : (src0.w > 0) ? 1 : 0
				1319
				1320
				1321
				1322	.. opcode:: ISLT - Signed Integer Set On Less Than
				1323
				1324	.. math::
				1325
				1326	dst.x = (src0.x < src1.x) ? ~0 : 0
				1327
				1328	dst.y = (src0.y < src1.y) ? ~0 : 0
				1329
				1330	dst.z = (src0.z < src1.z) ? ~0 : 0
				1331
				1332	dst.w = (src0.w < src1.w) ? ~0 : 0
				1333
				1334
				1335	.. opcode:: USLT - Unsigned Integer Set On Less Than
				1336
				1337	.. math::
				1338
				1339	dst.x = (src0.x < src1.x) ? ~0 : 0
				1340
				1341	dst.y = (src0.y < src1.y) ? ~0 : 0
				1342
				1343	dst.z = (src0.z < src1.z) ? ~0 : 0
				1344
				1345	dst.w = (src0.w < src1.w) ? ~0 : 0
				1346
				1347
				1348	.. opcode:: ISGE - Signed Integer Set On Greater Than or Equal
				1349
				1350	.. math::
				1351
				1352	dst.x = (src0.x >= src1.x) ? ~0 : 0
				1353
				1354	dst.y = (src0.y >= src1.y) ? ~0 : 0
				1355
				1356	dst.z = (src0.z >= src1.z) ? ~0 : 0
				1357
				1358	dst.w = (src0.w >= src1.w) ? ~0 : 0
				1359
				1360
				1361	.. opcode:: USGE - Unsigned Integer Set On Greater Than or Equal
				1362
				1363	.. math::
				1364
				1365	dst.x = (src0.x >= src1.x) ? ~0 : 0
				1366
				1367	dst.y = (src0.y >= src1.y) ? ~0 : 0
				1368
				1369	dst.z = (src0.z >= src1.z) ? ~0 : 0
				1370
				1371	dst.w = (src0.w >= src1.w) ? ~0 : 0
				1372
				1373
				1374	.. opcode:: USEQ - Integer Set On Equal
				1375
				1376	.. math::
				1377
				1378	dst.x = (src0.x == src1.x) ? ~0 : 0
				1379
				1380	dst.y = (src0.y == src1.y) ? ~0 : 0
				1381
				1382	dst.z = (src0.z == src1.z) ? ~0 : 0
				1383
				1384	dst.w = (src0.w == src1.w) ? ~0 : 0
				1385
				1386
				1387	.. opcode:: USNE - Integer Set On Not Equal
				1388
				1389	.. math::
				1390
				1391	dst.x = (src0.x != src1.x) ? ~0 : 0
				1392
				1393	dst.y = (src0.y != src1.y) ? ~0 : 0
				1394
				1395	dst.z = (src0.z != src1.z) ? ~0 : 0
				1396
				1397	dst.w = (src0.w != src1.w) ? ~0 : 0
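The integer set-on opcodes return ``~0`` (all bits set) rather than 1, which makes the result usable directly as a bit mask with AND, OR and NOT. A sketch for one component, assuming 32-bit components; ``select`` is a hypothetical helper that shows the masking idiom and is not a TGSI opcode:

```python
MASK32 = 0xFFFFFFFF  # assumed component width: 32 bits
TRUE = MASK32        # ~0 as a 32-bit pattern
FALSE = 0


def as_signed(bits):
    """Interpret a 32-bit pattern as a two's-complement signed integer."""
    bits &= MASK32
    return bits - (1 << 32) if bits & 0x80000000 else bits


def islt(a, b):
    """ISLT on one component: signed comparison."""
    return TRUE if as_signed(a) < as_signed(b) else FALSE


def uslt(a, b):
    """USLT on one component: unsigned comparison."""
    return TRUE if (a & MASK32) < (b & MASK32) else FALSE


def select(mask, if_true, if_false):
    """Blend two values using a comparison mask and AND / NOT / OR."""
    return ((mask & if_true) | (~mask & if_false)) & MASK32
```

With ``0xFFFFFFFF`` and ``1`` as operands, ISLT yields ``~0`` (since -1 < 1) while USLT yields 0, so the same blend picks different values.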
				1398
				1399
				1400	.. opcode:: INEG - Integer Negate
				1401
				1402	Two's complement.
				1403
				1404	.. math::
				1405
				1406	dst.x = -src.x
				1407
				1408	dst.y = -src.y
				1409
				1410	dst.z = -src.z
				1411
				1412	dst.w = -src.w
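In two's complement, negation is the bitwise complement plus one. A one-component sketch, assuming 32-bit components:

```python
MASK32 = 0xFFFFFFFF  # assumed component width: 32 bits


def ineg(bits):
    """INEG on one component: two's-complement negation (NOT, then add 1)."""
    return (~bits + 1) & MASK32
```

Note that the most negative pattern, ``0x80000000``, negates to itself under this arithmetic.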
				1413
				1414
Roland Scheidegger	23025ed	2013-05-03 21:34:12 +0200	[diff] [blame]	1415	.. opcode:: IABS - Integer Absolute Value
				1416
				1417	.. math::
				1418
				1419	dst.x = \|src.x\|
				1420
				1421	dst.y = \|src.y\|
				1422
				1423	dst.z = \|src.z\|
				1424
				1425	dst.w = \|src.w\|
Corbin Simpson	9d4cb6e	2010-06-16 18:34:32 -0700	[diff] [blame]	1426
				1427
				1428	Geometry ISA
Corbin Simpson	5bcd26c	2009-12-21 21:04:10 -0800	[diff] [blame]	1429	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1430
Corbin Simpson	9d4cb6e	2010-06-16 18:34:32 -0700	[diff] [blame]	1431	These opcodes are only supported in geometry shaders; they have no meaning
				1432	in any other type of shader.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1433
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	1434	.. opcode:: EMIT - Emit
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1435
Roland Scheidegger	65102b7	2013-05-03 23:32:23 +0200	[diff] [blame]	1436	Generate a new vertex for the current primitive using the values in the
				1437	output registers.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1438
				1439
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	1440	.. opcode:: ENDPRIM - End Primitive
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1441
Roland Scheidegger	65102b7	2013-05-03 23:32:23 +0200	[diff] [blame]	1442	Complete the current primitive (consisting of the emitted vertices),
				1443	and start a new one.
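The interaction between EMIT and ENDPRIM can be pictured with a toy model: EMIT snapshots the output registers into the current primitive, and ENDPRIM closes that primitive and opens a new one. The class and method names below are hypothetical and exist only for illustration:

```python
class GeometryOutput:
    """Toy model of a geometry shader's output stream (illustrative only)."""

    def __init__(self):
        self.primitives = []  # completed primitives, each a list of vertices
        self.current = []     # vertices emitted since the last ENDPRIM

    def emit(self, output_registers):
        """EMIT: append a copy of the current output register values."""
        self.current.append(dict(output_registers))

    def endprim(self):
        """ENDPRIM: complete the current primitive and start a new one."""
        self.primitives.append(self.current)
        self.current = []
```

Because EMIT copies the register values, later writes to the output registers do not alter vertices that were already emitted.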
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1444
				1445
Corbin Simpson	9d4cb6e	2010-06-16 18:34:32 -0700	[diff] [blame]	1446	GLSL ISA
Corbin Simpson	5bcd26c	2009-12-21 21:04:10 -0800	[diff] [blame]	1447	^^^^^^^^^^
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1448
Corbin Simpson	9d4cb6e	2010-06-16 18:34:32 -0700	[diff] [blame]	1449	These opcodes are part of :term:`GLSL`'s opcode set. Support for these
				1450	opcodes is determined by a special capability bit, ``GLSL``.
Roland Scheidegger	65102b7	2013-05-03 23:32:23 +0200	[diff] [blame]	1451	Some require GLSL version 1.30 (UIF/BREAKC/SWITCH/CASE/DEFAULT/ENDSWITCH).
				1452
				1453	.. opcode:: CAL - Subroutine Call
				1454
				1455	push(pc)
				1456	pc = target
				1457
				1458
				1459	.. opcode:: RET - Subroutine Call Return
				1460
				1461	pc = pop()
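The push/pop pseudocode for CAL and RET can be traced with a tiny interpreter. This is a sketch under stated assumptions: instructions are ``(opcode, operand)`` tuples, the pushed value is the address of the instruction following the CAL, and ``END`` is a hypothetical terminator used only to stop the toy program:

```python
def run(program):
    """Execute a toy program and return the list of visited instruction indices."""
    stack = []
    trace = []
    pc = 0
    while pc < len(program):
        opcode, operand = program[pc]
        trace.append(pc)
        if opcode == "CAL":
            stack.append(pc + 1)  # push(pc): remember where to resume
            pc = operand          # pc = target
        elif opcode == "RET":
            if not stack:         # RET with an empty stack ends this toy program
                break
            pc = stack.pop()      # pc = pop()
        elif opcode == "END":
            break
        else:
            pc += 1               # any other opcode just falls through
    return trace
```

For example, a program that calls a two-instruction subroutine at index 3 visits the indices 0, 3, 4, 1, 2 in that order.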
				1462
				1463
				1464	.. opcode:: CONT - Continue
				1465
				1466	Unconditionally moves the point of execution to the instruction after the
				1467	last bgnloop. The instruction must appear within a bgnloop/endloop.
				1468
				1469	.. note::
				1470
				1471	Support for CONT is determined by a special capability bit,
				1472	``TGSI_CONT_SUPPORTED``. See :ref:`Screen` for more information.
				1473
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1474
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	1475	.. opcode:: BGNLOOP - Begin a Loop
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1476
Roland Scheidegger	65102b7	2013-05-03 23:32:23 +0200	[diff] [blame]	1477	Start a loop. Must have a matching endloop.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1478
				1479
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	1480	.. opcode:: BGNSUB - Begin Subroutine
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1481
Roland Scheidegger	65102b7	2013-05-03 23:32:23 +0200	[diff] [blame]	1482	Starts definition of a subroutine. Must have a matching endsub.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1483
				1484
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	1485	.. opcode:: ENDLOOP - End a Loop
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1486
Roland Scheidegger	65102b7	2013-05-03 23:32:23 +0200	[diff] [blame]	1487	End a loop started with bgnloop.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1488
				1489
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	1490	.. opcode:: ENDSUB - End Subroutine
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1491
Roland Scheidegger	65102b7	2013-05-03 23:32:23 +0200	[diff] [blame]	1492	Ends definition of a subroutine.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1493
				1494
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	1495	.. opcode:: NOP - No Operation
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1496
Michal Krol	8ab89d7	2010-01-04 13:23:41 +0100	[diff] [blame]	1497	Do nothing.
				1498
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1499
Roland Scheidegger	65102b7	2013-05-03 23:32:23 +0200	[diff] [blame]	1500	.. opcode:: BRK - Break
				1501
				1502	Unconditionally moves the point of execution to the instruction after the
				1503	next endloop or endswitch. The instruction must appear within a bgnloop/endloop
				1504	or switch/endswitch.
				1505
				1506
				1507	.. opcode:: BREAKC - Break Conditional
				1508
				1509	Conditionally moves the point of execution to the instruction after the
				1510	next endloop or endswitch. The instruction must appear within a bgnloop/endloop
				1511	or switch/endswitch.
				1512	Condition evaluates to true if src0.x != 0 where src0.x is interpreted
				1513	as an integer register.
				1514
				1515	.. note::
				1516
				1517	Considered for removal because it is inconsistent with other opcodes
				1518	(it could be emulated with UIF/BRK/ENDIF).
				1519
				1520
				1521	.. opcode:: IF - Float If
				1522
				1523	Start an IF ... ELSE ... ENDIF block. Condition evaluates to true if
				1524
				1525	src0.x != 0.0
				1526
				1527	where src0.x is interpreted as a floating point register.
				1528
				1529
				1530	.. opcode:: UIF - Bitwise If
				1531
				1532	Start a UIF ... ELSE ... ENDIF block. Condition evaluates to true if
				1533
				1534	src0.x != 0
				1535
				1536	where src0.x is interpreted as an integer register.
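IF and UIF can disagree on the same register contents, because one tests a float value and the other tests raw bits. Negative zero is the classic case: its bit pattern is non-zero, yet it compares equal to 0.0. A sketch assuming 32-bit IEEE 754 single-precision components (the helper names are illustrative):

```python
import struct


def float_bits(value):
    """Return the 32-bit pattern of a single-precision float."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as a single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def if_taken(bits):
    """IF: condition is src0.x != 0.0, with src0.x read as a float."""
    return bits_to_float(bits) != 0.0


def uif_taken(bits):
    """UIF: condition is src0.x != 0, with src0.x read as an integer."""
    return bits != 0
```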
				1537
				1538
				1539	.. opcode:: ELSE - Else
				1540
				1541	Starts an else block, after an IF or UIF statement.
				1542
				1543
				1544	.. opcode:: ENDIF - End If
				1545
				1546	Ends an IF or UIF block.
				1547
				1548
				1549	.. opcode:: SWITCH - Switch
				1550
				1551	Starts a C-style switch expression. The switch consists of one or multiple
				1552	CASE statements, and at most one DEFAULT statement. Execution of a statement
				1553	ends when a BRK is hit but, just as in C, falling through to other cases
				1554	without a break is allowed. Similarly, the DEFAULT label is allowed anywhere, not
				1555	just as the last statement, and fallthrough is allowed into and out of it.
				1556	CASE src arguments are evaluated at bit level against the SWITCH src argument.
				1557
				1558	Example:
				1559	SWITCH src[0].x
				1560	CASE src[0].x
				1561	(some instructions here)
				1562	(optional BRK here)
				1563	DEFAULT
				1564	(some instructions here)
				1565	(optional BRK here)
				1566	CASE src[0].x
				1567	(some instructions here)
				1568	(optional BRK here)
				1569	ENDSWITCH
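The fallthrough rules can be checked with a toy interpreter for a switch body. This is only a sketch: the tuple encoding of instructions is hypothetical, ``OP`` stands in for "some instructions here", and picking the first matching CASE when several match is an assumption:

```python
def run_switch(selector, body):
    """Return the names of the OP entries executed for a given selector.

    body is a list of ("CASE", value), ("DEFAULT",), ("BRK",) or ("OP", name).
    """
    start = None
    # Look for a CASE whose value matches the selector.
    for index, instruction in enumerate(body):
        if instruction[0] == "CASE" and instruction[1] == selector:
            start = index
            break
    # Otherwise enter at DEFAULT, wherever it sits in the body.
    if start is None:
        for index, instruction in enumerate(body):
            if instruction[0] == "DEFAULT":
                start = index
                break
    executed = []
    if start is None:
        return executed
    # Run until BRK; CASE and DEFAULT labels are passed over (fallthrough).
    for instruction in body[start:]:
        if instruction[0] == "BRK":
            break
        if instruction[0] == "OP":
            executed.append(instruction[1])
    return executed
```

With a body shaped like the example above, where DEFAULT has no BRK and sits before the last CASE, an unmatched selector runs the DEFAULT instructions and then falls through into the following CASE.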
				1570
				1571
				1572	.. opcode:: CASE - Switch case
				1573
				1574	This represents a switch case label. The src arg must be an integer immediate.
				1575
				1576
				1577	.. opcode:: DEFAULT - Switch default
				1578
				1579	This represents the default case in the switch, which is taken if no other
				1580	case matches.
				1581
				1582
				1583	.. opcode:: ENDSWITCH - End of switch
				1584
				1585	Ends a switch expression.
				1586
				1587
Corbin Simpson	8580522	2010-02-02 16:20:12 -0800	[diff] [blame]	1588	.. opcode:: NRM4 - 4-component Vector Normalise
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1589
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	1590	This instruction replicates its result.
				1591
Corbin Simpson	e8ed3b9	2009-12-21 19:12:55 -0800	[diff] [blame]	1592	.. math::
				1593
Corbin Simpson	17c2a44	2010-02-02 17:02:28 -0800	[diff] [blame]	1594	dst = \frac{src.x}{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1595
				1596
Corbin Simpson	62ca7b8	2010-02-02 16:36:34 -0800	[diff] [blame]	1597	.. _doubleopcodes:
				1598
Corbin Simpson	9d4cb6e	2010-06-16 18:34:32 -0700	[diff] [blame]	1599	Double ISA
Igor Oliveira	db89bf4	2010-01-25 19:23:04 -0400	[diff] [blame]	1600	^^^^^^^^^^^^^^^
				1601
Corbin Simpson	dbc95e8	2010-06-16 18:34:51 -0700	[diff] [blame]	1602	The double-precision opcodes reinterpret four-component vectors into
				1603	two-component vectors with doubled precision in each component.
				1604
				1605	Support for these opcodes is currently undecided.
				1606
				1607	.. opcode:: DADD - Add
Igor Oliveira	db89bf4	2010-01-25 19:23:04 -0400	[diff] [blame]	1608
				1609	.. math::
				1610
				1611	dst.xy = src0.xy + src1.xy
				1612
				1613	dst.zw = src0.zw + src1.zw
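The ``xy`` / ``zw`` notation means that each pair of 32-bit components is read as one 64-bit double. A sketch of that reinterpretation and of DADD on top of it; placing the low 32 bits in ``x`` (and ``z``) and the high 32 bits in ``y`` (and ``w``) is an assumption made for illustration, since the word order is not stated here:

```python
import struct


def pack_doubles(first, second):
    """Return four 32-bit words (x, y, z, w) holding two doubles."""
    return struct.unpack("<4I", struct.pack("<2d", first, second))


def unpack_doubles(xyzw):
    """Read (x, y) and (z, w) back as two doubles."""
    return struct.unpack("<2d", struct.pack("<4I", *xyzw))


def dadd(src0, src1):
    """DADD: add the xy doubles and the zw doubles independently."""
    a_xy, a_zw = unpack_doubles(src0)
    b_xy, b_zw = unpack_doubles(src1)
    return pack_doubles(a_xy + b_xy, a_zw + b_zw)
```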
				1614
				1615
Corbin Simpson	dbc95e8	2010-06-16 18:34:51 -0700	[diff] [blame]	1616	.. opcode:: DDIV - Divide
Igor Oliveira	db89bf4	2010-01-25 19:23:04 -0400	[diff] [blame]	1617
				1618	.. math::
				1619
				1620	dst.xy = src0.xy / src1.xy
				1621
				1622	dst.zw = src0.zw / src1.zw
				1623
Corbin Simpson	dbc95e8	2010-06-16 18:34:51 -0700	[diff] [blame]	1624	.. opcode:: DSEQ - Set on Equal
Igor Oliveira	db89bf4	2010-01-25 19:23:04 -0400	[diff] [blame]	1625
				1626	.. math::
				1627
				1628	dst.xy = src0.xy == src1.xy ? 1.0F : 0.0F
				1629
				1630	dst.zw = src0.zw == src1.zw ? 1.0F : 0.0F
				1631
Corbin Simpson	dbc95e8	2010-06-16 18:34:51 -0700	[diff] [blame]	1632	.. opcode:: DSLT - Set on Less than
Igor Oliveira	db89bf4	2010-01-25 19:23:04 -0400	[diff] [blame]	1633
				1634	.. math::
				1635
				1636	dst.xy = src0.xy < src1.xy ? 1.0F : 0.0F
				1637
				1638	dst.zw = src0.zw < src1.zw ? 1.0F : 0.0F
				1639
Corbin Simpson	dbc95e8	2010-06-16 18:34:51 -0700	[diff] [blame]	1640	.. opcode:: DFRAC - Fraction
Igor Oliveira	db89bf4	2010-01-25 19:23:04 -0400	[diff] [blame]	1641
				1642	.. math::
				1643
				1644	dst.xy = src.xy - \lfloor src.xy\rfloor
				1645
				1646	dst.zw = src.zw - \lfloor src.zw\rfloor
				1647
				1648
Corbin Simpson	dbc95e8	2010-06-16 18:34:51 -0700	[diff] [blame]	1649	.. opcode:: DFRACEXP - Convert Number to Fractional and Integral Components
Igor Oliveira	db89bf4	2010-01-25 19:23:04 -0400	[diff] [blame]	1650
Corbin Simpson	f98c462	2010-06-16 18:45:50 -0700	[diff] [blame]	1651	Like the ``frexp()`` routine in many math libraries, this opcode stores the
				1652	exponent of its source to ``dst0``, and the significand to ``dst1``, such that
				1653	:math:`dst1 \times 2^{dst0} = src` .
Igor Oliveira	db89bf4	2010-01-25 19:23:04 -0400	[diff] [blame]	1654
				1655	.. math::
				1656
Corbin Simpson	f98c462	2010-06-16 18:45:50 -0700	[diff] [blame]	1657	dst0.xy = exp(src.xy)
Igor Oliveira	db89bf4	2010-01-25 19:23:04 -0400	[diff] [blame]	1658
Corbin Simpson	f98c462	2010-06-16 18:45:50 -0700	[diff] [blame]	1659	dst1.xy = frac(src.xy)
				1660
				1661	dst0.zw = exp(src.zw)
				1662
				1663	dst1.zw = frac(src.zw)
				1664
				1665	.. opcode:: DLDEXP - Multiply Number by Integral Power of 2
				1666
				1667	This opcode is the inverse of :opcode:`DFRACEXP`.
				1668
				1669	.. math::
				1670
				1671	dst.xy = src0.xy \times 2^{src1.xy}
				1672
				1673	dst.zw = src0.zw \times 2^{src1.zw}
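The round trip between DFRACEXP and DLDEXP mirrors ``frexp()`` and ``ldexp()``. A scalar sketch using Python's ``math`` module; the significand range of [0.5, 1) is the C library convention and is assumed here rather than taken from this document:

```python
import math


def dfracexp(value):
    """Return (dst0, dst1) = (exponent, significand) for one double."""
    significand, exponent = math.frexp(value)
    return exponent, significand


def dldexp(significand, exponent):
    """Return significand * 2**exponent, the inverse of dfracexp."""
    return math.ldexp(significand, exponent)
```

Splitting a value and recombining it is exact, because only the exponent field changes.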
Igor Oliveira	db89bf4	2010-01-25 19:23:04 -0400	[diff] [blame]	1674
Corbin Simpson	dbc95e8	2010-06-16 18:34:51 -0700	[diff] [blame]	1675	.. opcode:: DMIN - Minimum
Igor Oliveira	db89bf4	2010-01-25 19:23:04 -0400	[diff] [blame]	1676
				1677	.. math::
				1678
				1679	dst.xy = min(src0.xy, src1.xy)
				1680
				1681	dst.zw = min(src0.zw, src1.zw)
				1682
Corbin Simpson	dbc95e8	2010-06-16 18:34:51 -0700	[diff] [blame]	1683	.. opcode:: DMAX - Maximum
Igor Oliveira	db89bf4	2010-01-25 19:23:04 -0400	[diff] [blame]	1684
				1685	.. math::
				1686
				1687	dst.xy = max(src0.xy, src1.xy)
				1688
				1689	dst.zw = max(src0.zw, src1.zw)
				1690
Corbin Simpson	dbc95e8	2010-06-16 18:34:51 -0700	[diff] [blame]	1691	.. opcode:: DMUL - Multiply
Igor Oliveira	db89bf4	2010-01-25 19:23:04 -0400	[diff] [blame]	1692
				1693	.. math::
				1694
				1695	dst.xy = src0.xy \times src1.xy
				1696
				1697	dst.zw = src0.zw \times src1.zw
				1698
				1699
Corbin Simpson	dbc95e8	2010-06-16 18:34:51 -0700	[diff] [blame]	1700	.. opcode:: DMAD - Multiply And Add
Igor Oliveira	db89bf4	2010-01-25 19:23:04 -0400	[diff] [blame]	1701
				1702	.. math::
				1703
				1704	dst.xy = src0.xy \times src1.xy + src2.xy
				1705
				1706	dst.zw = src0.zw \times src1.zw + src2.zw
				1707
				1708
Corbin Simpson	dbc95e8	2010-06-16 18:34:51 -0700	[diff] [blame]	1709	.. opcode:: DRCP - Reciprocal
Igor Oliveira	db89bf4	2010-01-25 19:23:04 -0400	[diff] [blame]	1710
				1711	.. math::
				1712
				1713	dst.xy = \frac{1}{src.xy}
				1714
				1715	dst.zw = \frac{1}{src.zw}
				1716
Corbin Simpson	dbc95e8	2010-06-16 18:34:51 -0700	[diff] [blame]	1717	.. opcode:: DSQRT - Square Root
Igor Oliveira	db89bf4	2010-01-25 19:23:04 -0400	[diff] [blame]	1718
				1719	.. math::
				1720
				1721	dst.xy = \sqrt{src.xy}
				1722
				1723	dst.zw = \sqrt{src.zw}
				1724
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	1725
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1726	.. _samplingopcodes:
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1727
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1728	Resource Sampling Opcodes
				1729	^^^^^^^^^^^^^^^^^^^^^^^^^
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1730
				1731	Those opcodes follow very closely semantics of the respective Direct3D
				1732	instructions. If in doubt double check Direct3D documentation.
				1733
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1734	.. opcode:: SAMPLE - Using provided address, sample data from the
				1735	specified texture using the filtering mode identified
				1736	by the given sampler. The source data may come from
				1737	any resource type other than buffers.
				1738	SAMPLE dst, address, sampler_view, sampler
				1739	e.g.
				1740	SAMPLE TEMP[0], TEMP[1], SVIEW[0], SAMP[0]
				1741
				1742	.. opcode:: SAMPLE_I - Simplified alternative to the SAMPLE instruction.
				1743	Using the provided integer address, SAMPLE_I fetches data
				1744	from the specified sampler view without any filtering.
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1745	The source data may come from any resource type other
				1746	than CUBE.
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1747	SAMPLE_I dst, address, sampler_view
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1748	e.g.
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1749	SAMPLE_I TEMP[0], TEMP[1], SVIEW[0]
Zack Rusin	3fa814d	2011-01-24 21:45:37 -0500	[diff] [blame]	1750	The 'address' is specified as unsigned integers. If the
				1751	'address' is out of range [0...(# texels - 1)] the
				1752	result of the fetch is always 0 in all components.
				1753	As such, the instruction doesn't honor address wrap
				1754	modes; in cases where that behavior is desirable, the
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1755	'SAMPLE' instruction should be used.
Zack Rusin	3fa814d	2011-01-24 21:45:37 -0500	[diff] [blame]	1756	address.w always provides an unsigned integer mipmap
				1757	level. If the value is out of the range then the
				1758	instruction always returns 0 in all components.
				1759	address.yz are ignored for buffers and 1d textures.
				1760	address.z is ignored for 1d texture arrays and 2d
				1761	textures.
				1762	For 1D texture arrays address.y provides the array
				1763	index (also as unsigned integer). If the value is
				1764	out of the range of available array indices
				1765	[0... (array size - 1)] then the opcode always returns
				1766	0 in all components.
				1767	For 2D texture arrays address.z provides the array
				1768	index, otherwise it exhibits the same behavior as in
				1769	the case for 1D texture arrays.
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1770	The exact semantics of the source address are presented
Zack Rusin	3fa814d	2011-01-24 21:45:37 -0500	[diff] [blame]	1771	in the table below:
				1772	resource type          X    Y    Z    W
				1773	---------------------  ---  ---  ---  -------
				1774	PIPE_BUFFER            x              ignored
				1775	PIPE_TEXTURE_1D        x              mpl
				1776	PIPE_TEXTURE_2D        x    y         mpl
				1777	PIPE_TEXTURE_3D        x    y    z    mpl
				1778	PIPE_TEXTURE_RECT      x    y         mpl
				1779	PIPE_TEXTURE_CUBE      not allowed as source
				1780	PIPE_TEXTURE_1D_ARRAY  x    idx       mpl
				1781	PIPE_TEXTURE_2D_ARRAY  x    y    idx  mpl
				1782
				1783	Where 'mpl' is a mipmap level and 'idx' is the
				1784	array index.
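The out-of-range rules for SAMPLE_I can be summarised with a small model of the 2D texture case. This is a sketch only: the texture is a hypothetical list of mip levels, each a list of rows of RGBA tuples, and the address components are assumed to be non-negative integers:

```python
ZERO = (0, 0, 0, 0)


def sample_i_2d(mip_levels, address):
    """Unfiltered fetch from a 2D texture; address is (x, y, z, w)."""
    x, y, _z, level = address  # z is ignored for 2D textures, w is the mip level
    if level >= len(mip_levels):
        return ZERO            # mip level out of range
    texels = mip_levels[level]
    if y >= len(texels) or x >= len(texels[0]):
        return ZERO            # no wrap modes: out-of-range texels read as 0
    return texels[y][x]
```

Unlike SAMPLE, nothing here clamps or wraps the coordinates, so any address outside the texel grid yields zero in every component.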
				1785
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1786	.. opcode:: SAMPLE_I_MS - Just like SAMPLE_I but allows fetching data from
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1787	multi-sampled surfaces.
Roland Scheidegger	9870459	2013-02-12 16:48:52 +0100	[diff] [blame]	1788	SAMPLE_I_MS dst, address, sampler_view, sample
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1789
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1790	.. opcode:: SAMPLE_B - Just like the SAMPLE instruction with the
Roland Scheidegger	9870459	2013-02-12 16:48:52 +0100	[diff] [blame]	1791	exception that an additional bias is applied to the
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1792	level of detail computed as part of the instruction
				1793	execution.
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1794	SAMPLE_B dst, address, sampler_view, sampler, lod_bias
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1795	e.g.
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1796	SAMPLE_B TEMP[0], TEMP[1], SVIEW[0], SAMP[0], TEMP[2].x
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1797
Zack Rusin	3fa814d	2011-01-24 21:45:37 -0500	[diff] [blame]	1798	.. opcode:: SAMPLE_C - Similar to the SAMPLE instruction but it
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1799	performs a comparison filter. The operands to SAMPLE_C
Roland Scheidegger	9870459	2013-02-12 16:48:52 +0100	[diff] [blame]	1800	are identical to SAMPLE, except that there is an additional
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1801	float32 operand, the reference value, which must be a
				1802	single-component register or a scalar literal.
				1803	SAMPLE_C makes the hardware use the current sampler's
				1804	compare_func (in pipe_sampler_state) to compare the
				1805	reference value against the red component value of the
				1806	source resource at each texel that the currently configured
				1807	texture filter covers based on the provided coordinates.
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1808	SAMPLE_C dst, address, sampler_view.r, sampler, ref_value
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1809	e.g.
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1810	SAMPLE_C TEMP[0], TEMP[1], SVIEW[0].r, SAMP[0], TEMP[2].x
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1811
				1812	.. opcode:: SAMPLE_C_LZ - Same as SAMPLE_C, but LOD is 0 and derivatives
				1813	are ignored. The LZ stands for level-zero.
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1814	SAMPLE_C_LZ dst, address, sampler_view.r, sampler, ref_value
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1815	e.g.
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1816	SAMPLE_C_LZ TEMP[0], TEMP[1], SVIEW[0].r, SAMP[0], TEMP[2].x
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1817
				1818
				1819	.. opcode:: SAMPLE_D - SAMPLE_D is identical to the SAMPLE opcode except
				1820	that the derivatives for the source address in the x
				1821	direction and the y direction are provided by extra
				1822	parameters.
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1823	SAMPLE_D dst, address, sampler_view, sampler, der_x, der_y
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1824	e.g.
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1825	SAMPLE_D TEMP[0], TEMP[1], SVIEW[0], SAMP[0], TEMP[2], TEMP[3]
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1826
				1827	.. opcode:: SAMPLE_L - SAMPLE_L is identical to the SAMPLE opcode except
				1828	that the LOD is provided directly as a scalar value,
Roland Scheidegger	427d36a	2013-02-12 16:41:56 +0100	[diff] [blame]	1829	representing no anisotropy.
				1830	SAMPLE_L dst, address, sampler_view, sampler, explicit_lod
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1831	e.g.
Roland Scheidegger	427d36a	2013-02-12 16:41:56 +0100	[diff] [blame]	1832	SAMPLE_L TEMP[0], TEMP[1], SVIEW[0], SAMP[0], TEMP[2].x
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1833
				1834	.. opcode:: GATHER4 - Gathers the four texels to be used in a bi-linear
				1835	filtering operation and packs them into a single register.
Brian Paul	0cd6800	2012-03-30 09:41:42 -0600	[diff] [blame]	1836	Only works with 2D, 2D array, cubemap, and cubemap array textures.
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1837	For 2D textures, only the addressing modes of the sampler and
				1838	the top level of any mip pyramid are used. Set W to zero.
				1839	It behaves like the SAMPLE instruction, but a filtered
				1840	sample is not generated. The four samples that contribute
Brian Paul	0cd6800	2012-03-30 09:41:42 -0600	[diff] [blame]	1841	to filtering are placed into xyzw in counter-clockwise order,
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1842	starting with the (u,v) texture coordinate delta at the
				1843	following locations (-, +), (+, +), (+, -), (-, -), where
				1844	the magnitude of each delta is half a texel.
				1845
				1846
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1847	.. opcode:: SVIEWINFO - query the dimensions of a given sampler view.
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1848	dst receives width, height, depth or array size and
Roland Scheidegger	614982d3	2013-02-05 13:37:57 -0800	[diff] [blame]	1849	number of mipmap levels as int4. The dst can have a writemask
Zack Rusin	3fa814d	2011-01-24 21:45:37 -0500	[diff] [blame]	1850	that specifies which information the caller is interested
				1851	in.
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1852	SVIEWINFO dst, src_mip_level, sampler_view
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1853	e.g.
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1854	SVIEWINFO TEMP[0], TEMP[1].x, SVIEW[0]
Zack Rusin	3fa814d	2011-01-24 21:45:37 -0500	[diff] [blame]	1855	src_mip_level is an unsigned integer scalar. If it is
				1856	out of range, the opcode returns 0 for width, height and
				1857	depth/array size, but the total number of mipmap levels is
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1858	still returned correctly for the given sampler view.
Zack Rusin	3fa814d	2011-01-24 21:45:37 -0500	[diff] [blame]	1859	The returned width, height and depth values are for
				1860	the mipmap level selected by the src_mip_level and
				1861	are in the number of texels.
				1862	For 1d texture array width is in dst.x, array size
				1863	is in dst.y and dst.zw are always 0.
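A sketch of the SVIEWINFO result for a 3D texture follows. The function name and parameters are hypothetical, and halving each dimension per level with a floor of 1 is the usual mipmap convention, assumed here rather than stated by this document:

```python
def sviewinfo_3d(width, height, depth, num_levels, src_mip_level):
    """Return (width, height, depth, levels) for the requested mip level."""
    if src_mip_level >= num_levels:
        # Out of range: sizes read as 0, the level count is still reported.
        return (0, 0, 0, num_levels)

    def minify(size):
        return max(1, size >> src_mip_level)

    return (minify(width), minify(height), minify(depth), num_levels)
```

A writemask on dst would simply discard the components the caller is not interested in.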
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1864
				1865	.. opcode:: SAMPLE_POS - query the position of a given sample.
				1866	dst receives float4 (x, y, 0, 0) indicating where the
				1867	sample is located. If the resource is not a multi-sample
				1868	resource and not a render target, the result is 0.
				1869
Zack Rusin	3fa814d	2011-01-24 21:45:37 -0500	[diff] [blame]	1870	.. opcode:: SAMPLE_INFO - dst receives number of samples in x.
				1871	If the resource is not a multi-sample resource and
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	1872	not a render target, the result is 0.
				1873
				1874
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1875	.. _resourceopcodes:
				1876
				1877	Resource Access Opcodes
				1878	^^^^^^^^^^^^^^^^^^^^^^^
				1879
				1880	.. opcode:: LOAD - Fetch data from a shader resource
				1881
				1882	Syntax: ``LOAD dst, resource, address``
				1883
				1884	Example: ``LOAD TEMP[0], RES[0], TEMP[1]``
				1885
				1886	Using the provided integer address, LOAD fetches data
				1887	from the specified buffer or texture without any
				1888	filtering.
				1889
				1890	The 'address' is specified as a vector of unsigned
				1891	integers. If the 'address' is out of range the result
				1892	is unspecified.
				1893
				1894	Only the first mipmap level of a resource can be read
				1895	from using this instruction.
				1896
				1897	For 1D or 2D texture arrays, the array index is
				1898	provided as an unsigned integer in address.y or
				1899	address.z, respectively. address.yz are ignored for
				1900	buffers and 1D textures. address.z is ignored for 1D
				1901	texture arrays and 2D textures. address.w is always
				1902	ignored.
				1903
Francisco Jerez	b8e808f	2012-04-30 20:20:29 +0200	[diff] [blame]	1904	.. opcode:: STORE - Write data to a shader resource
				1905
				1906	Syntax: ``STORE resource, address, src``
				1907
				1908	Example: ``STORE RES[0], TEMP[0], TEMP[1]``
				1909
				1910	Using the provided integer address, STORE writes data
				1911	to the specified buffer or texture.
				1912
				1913	The 'address' is specified as a vector of unsigned
				1914	integers. If the 'address' is out of range the result
				1915	is unspecified.
				1916
				1917	Only the first mipmap level of a resource can be
				1918	written to using this instruction.
				1919
				1920	For 1D or 2D texture arrays, the array index is
				1921	provided as an unsigned integer in address.y or
				1922	address.z, respectively. address.yz are ignored for
				1923	buffers and 1D textures. address.z is ignored for 1D
				1924	texture arrays and 2D textures. address.w is always
				1925	ignored.
				1926
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	1927
Francisco Jerez	9e550c3	2012-04-30 20:21:38 +0200	[diff] [blame]	1928	.. _threadsyncopcodes:
				1929
				1930	Inter-thread synchronization opcodes
				1931	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				1932
				1933	These opcodes are intended for communication between threads running
				1934	within the same compute grid. For now they're only valid in compute
				1935	programs.
				1936
				1937	.. opcode:: MFENCE - Memory fence
				1938
				1939	Syntax: ``MFENCE resource``
				1940
				1941	Example: ``MFENCE RES[0]``
				1942
				1943	This opcode forces strong ordering between any memory access
				1944	operations that affect the specified resource. This means that
				1945	previous loads and stores (and only those) will be performed and
				1946	visible to other threads before the program execution continues.
				1947
				1948
				1949	.. opcode:: LFENCE - Load memory fence
				1950
				1951	Syntax: ``LFENCE resource``
				1952
				1953	Example: ``LFENCE RES[0]``
				1954
				1955	Similar to MFENCE, but it only affects the ordering of memory loads.
				1956
				1957
				1958	.. opcode:: SFENCE - Store memory fence
				1959
				1960	Syntax: ``SFENCE resource``
				1961
				1962	Example: ``SFENCE RES[0]``
				1963
				1964	Similar to MFENCE, but it only affects the ordering of memory stores.
				1965
				1966
				1967	.. opcode:: BARRIER - Thread group barrier
				1968
				1969	``BARRIER``
				1970
				1971	This opcode suspends the execution of the current thread until all
				1972	the remaining threads in the working group reach the same point of
				1973	the program. Results are unspecified if any of the remaining
				1974	threads terminates or never reaches an executed BARRIER instruction.
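The two-phase pattern that BARRIER enables can be modelled on the CPU. The following is a minimal sketch, assuming a hypothetical work group of four threads and using Python's ``threading.Barrier`` as a stand-in for the opcode; ``GROUP_SIZE``, ``shared`` and ``run_group`` are illustrative names, not part of TGSI. It does not model the memory-ordering questions that the fence opcodes address.

```python
import threading

GROUP_SIZE = 4                      # hypothetical work-group size
shared = [0] * GROUP_SIZE           # stands in for memory shared by the group
results = [0] * GROUP_SIZE
barrier = threading.Barrier(GROUP_SIZE)


def thread_main(tid):
    shared[tid] = tid + 1           # phase 1: each thread writes its own slot
    barrier.wait()                  # BARRIER: block until every thread arrives
    results[tid] = sum(shared)      # phase 2: all phase-1 writes have happened


def run_group():
    threads = [threading.Thread(target=thread_main, args=(t,))
               for t in range(GROUP_SIZE)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

If one thread returned before reaching ``barrier.wait()``, the others would block forever in this model, which mirrors why the results are unspecified in that case.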
				1975
				1976
Francisco Jerez	c2d31a8	2012-04-30 20:22:23 +0200	[diff] [blame]	1977	.. _atomopcodes:
				1978
				1979	Atomic opcodes
				1980	^^^^^^^^^^^^^^
				1981
				1982	These opcodes provide atomic variants of some common arithmetic and
				1983	logical operations. In this context atomicity means that another
				1984	concurrent memory access operation that affects the same memory
				1985	location is guaranteed to be performed strictly before or after the
				1986	entire execution of the atomic operation.
				1987
				1988	For the moment they're only valid in compute programs.
				1989
				1990	.. opcode:: ATOMUADD - Atomic integer addition
				1991
				1992	Syntax: ``ATOMUADD dst, resource, offset, src``
				1993
				1994	Example: ``ATOMUADD TEMP[0], RES[0], TEMP[1], TEMP[2]``
				1995
				1996	The following operation is performed atomically on each component:
				1997
				1998	.. math::
				1999
				2000	dst_i = resource[offset]_i
				2001
				2002	resource[offset]_i = dst_i + src_i
				2003
				2004
				2005	.. opcode:: ATOMXCHG - Atomic exchange
				2006
				2007	Syntax: ``ATOMXCHG dst, resource, offset, src``
				2008
				2009	Example: ``ATOMXCHG TEMP[0], RES[0], TEMP[1], TEMP[2]``
				2010
				2011	The following operation is performed atomically on each component:
				2012
				2013	.. math::
				2014
				2015	dst_i = resource[offset]_i
				2016
				2017	resource[offset]_i = src_i
				2018
				2019
				2020	.. opcode:: ATOMCAS - Atomic compare-and-exchange
				2021
				2022	Syntax: ``ATOMCAS dst, resource, offset, cmp, src``
				2023
				2024	Example: ``ATOMCAS TEMP[0], RES[0], TEMP[1], TEMP[2], TEMP[3]``
				2025
				2026	The following operation is performed atomically on each component:
				2027
				2028	.. math::
				2029
				2030	dst_i = resource[offset]_i
				2031
				2032	resource[offset]_i = (dst_i == cmp_i ? src_i : dst_i)
				2033
				2034
				2035	.. opcode:: ATOMAND - Atomic bitwise And
				2036
				2037	Syntax: ``ATOMAND dst, resource, offset, src``
				2038
				2039	Example: ``ATOMAND TEMP[0], RES[0], TEMP[1], TEMP[2]``
				2040
				2041	The following operation is performed atomically on each component:
				2042
				2043	.. math::
				2044
				2045	dst_i = resource[offset]_i
				2046
				2047	resource[offset]_i = dst_i \& src_i
				2048
				2049
				2050	.. opcode:: ATOMOR - Atomic bitwise Or
				2051
				2052	Syntax: ``ATOMOR dst, resource, offset, src``
				2053
				2054	Example: ``ATOMOR TEMP[0], RES[0], TEMP[1], TEMP[2]``
				2055
				2056	The following operation is performed atomically on each component:
				2057
				2058	.. math::
				2059
				2060	dst_i = resource[offset]_i
				2061
				2062	resource[offset]_i = dst_i | src_i
				2063
				2064
				2065	.. opcode:: ATOMXOR - Atomic bitwise Xor
				2066
				2067	Syntax: ``ATOMXOR dst, resource, offset, src``
				2068
				2069	Example: ``ATOMXOR TEMP[0], RES[0], TEMP[1], TEMP[2]``
				2070
				2071	The following operation is performed atomically on each component:
				2072
				2073	.. math::
				2074
				2075	dst_i = resource[offset]_i
				2076
				2077	resource[offset]_i = dst_i \oplus src_i
				2078
				2079
				2080	.. opcode:: ATOMUMIN - Atomic unsigned minimum
				2081
				2082	Syntax: ``ATOMUMIN dst, resource, offset, src``
				2083
				2084	Example: ``ATOMUMIN TEMP[0], RES[0], TEMP[1], TEMP[2]``
				2085
				2086	The following operation is performed atomically on each component:
				2087
				2088	.. math::
				2089
				2090	dst_i = resource[offset]_i
				2091
				2092	resource[offset]_i = (dst_i < src_i ? dst_i : src_i)
				2093
				2094
				2095	.. opcode:: ATOMUMAX - Atomic unsigned maximum
				2096
				2097	Syntax: ``ATOMUMAX dst, resource, offset, src``
				2098
				2099	Example: ``ATOMUMAX TEMP[0], RES[0], TEMP[1], TEMP[2]``
				2100
				2101	The following operation is performed atomically on each component:
				2102
				2103	.. math::
				2104
				2105	dst_i = resource[offset]_i
				2106
				2107	resource[offset]_i = (dst_i > src_i ? dst_i : src_i)
				2108
				2109
				2110	.. opcode:: ATOMIMIN - Atomic signed minimum
				2111
				2112	Syntax: ``ATOMIMIN dst, resource, offset, src``
				2113
				2114	Example: ``ATOMIMIN TEMP[0], RES[0], TEMP[1], TEMP[2]``
				2115
				2116	The following operation is performed atomically on each component:
				2117
				2118	.. math::
				2119
				2120	dst_i = resource[offset]_i
				2121
				2122	resource[offset]_i = (dst_i < src_i ? dst_i : src_i)
				2123
				2124
				2125	.. opcode:: ATOMIMAX - Atomic signed maximum
				2126
				2127	Syntax: ``ATOMIMAX dst, resource, offset, src``
				2128
				2129	Example: ``ATOMIMAX TEMP[0], RES[0], TEMP[1], TEMP[2]``
				2130
				2131	The following operation is performed atomically on each component:
				2132
				2133	.. math::
				2134
				2135	dst_i = resource[offset]_i
				2136
				2137	resource[offset]_i = (dst_i > src_i ? dst_i : src_i)
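The read-modify-write pattern shared by the atomic opcodes can be summarised in a small CPU-side model. This is a sketch only: it assumes 32-bit components, wrap-around addition for ATOMUADD, and a single global lock standing in for hardware atomicity, none of which is stated above. It also shows that ATOMUMIN/ATOMUMAX and ATOMIMIN/ATOMIMAX use the same formula and differ only in how the bit pattern is interpreted. The names ``atomic``, ``atomcas`` and ``to_signed`` are hypothetical helpers.

```python
import threading

MASK32 = 0xFFFFFFFF  # assumption: components are 32 bits wide


def to_signed(u):
    """Reinterpret a 32-bit unsigned bit pattern as two's-complement signed."""
    return u - (1 << 32) if u & 0x80000000 else u


# New memory value as a function of (old, src), per component.
ATOMIC_OPS = {
    "ATOMUADD": lambda old, src: (old + src) & MASK32,
    "ATOMXCHG": lambda old, src: src,
    "ATOMAND":  lambda old, src: old & src,
    "ATOMOR":   lambda old, src: old | src,
    "ATOMXOR":  lambda old, src: old ^ src,
    "ATOMUMIN": lambda old, src: old if old < src else src,
    "ATOMUMAX": lambda old, src: old if old > src else src,
    "ATOMIMIN": lambda old, src: old if to_signed(old) < to_signed(src) else src,
    "ATOMIMAX": lambda old, src: old if to_signed(old) > to_signed(src) else src,
}

_lock = threading.Lock()  # stands in for hardware atomicity


def atomic(op, memory, offset, src):
    """Apply op to memory[offset] and return the previous value (dst)."""
    with _lock:
        old = memory[offset]
        memory[offset] = ATOMIC_OPS[op](old, src)
        return old


def atomcas(memory, offset, cmp, src):
    """Compare-and-exchange: store src only if the old value equals cmp."""
    with _lock:
        old = memory[offset]
        memory[offset] = src if old == cmp else old
        return old
```

In every case the value returned in dst is the value the location held before the operation.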
				2138
				2139
				2140
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	2141	Explanation of symbols used
Corbin Simpson	5bcd26c	2009-12-21 21:04:10 -0800	[diff] [blame]	2142	------------------------------
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	2143
				2144
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	2145	Functions
Corbin Simpson	5bcd26c	2009-12-21 21:04:10 -0800	[diff] [blame]	2146	^^^^^^^^^^^^^^
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	2147
				2148
Corbin Simpson	14743ac	2009-12-21 19:57:56 -0800	[diff] [blame]	2149	:math:`|x|` Absolute value of `x`.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	2150
Corbin Simpson	14743ac	2009-12-21 19:57:56 -0800	[diff] [blame]	2151	:math:`\lceil x \rceil` Ceiling of `x`.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	2152
				2153	clamp(x,y,z) Clamp x between y and z.
				2154	(x < y) ? y : (x > z) ? z : x
				2155
Corbin Simpson	dd801e5	2009-12-21 19:41:09 -0800	[diff] [blame]	2156	:math:`\lfloor x\rfloor` Floor of `x`.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	2157
Corbin Simpson	14743ac	2009-12-21 19:57:56 -0800	[diff] [blame]	2158	:math:`\log_2{x}` Logarithm of `x`, base 2.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	2159
				2160	max(x,y) Maximum of x and y.
				2161	(x > y) ? x : y
				2162
				2163	min(x,y) Minimum of x and y.
				2164	(x < y) ? x : y
				2165
				2166	partialx(x) Derivative of x relative to fragment's X.
				2167
				2168	partialy(x) Derivative of x relative to fragment's Y.
				2169
				2170	pop() Pop from stack.
				2171
Corbin Simpson	dd801e5	2009-12-21 19:41:09 -0800	[diff] [blame]	2172	:math:`x^y` `x` to the power `y`.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	2173
				2174	push(x) Push x on stack.
				2175
				2176	round(x) Round x.
				2177
Michal Krol	07f416c	2010-01-04 13:21:32 +0100	[diff] [blame]	2178	trunc(x) Truncate x, i.e. drop the fraction bits.
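For readers who prefer code to notation, the scalar helpers above can be written out directly. This is a sketch that follows the ternary definitions given in this section; the ``tgsi_`` prefix is only there to avoid shadowing Python built-ins.

```python
import math


def clamp(x, y, z):
    """(x < y) ? y : (x > z) ? z : x"""
    return y if x < y else (z if x > z else x)


def tgsi_max(x, y):
    """(x > y) ? x : y"""
    return x if x > y else y


def tgsi_min(x, y):
    """(x < y) ? x : y"""
    return x if x < y else y


def trunc(x):
    """Drop the fraction bits, i.e. round toward zero."""
    return float(math.trunc(x))
```

Note that ``trunc`` and floor differ for negative inputs: ``trunc(-1.7)`` gives ``-1.0`` whereas the floor of ``-1.7`` is ``-2``.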
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	2179
				2180
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	2181	Keywords
Corbin Simpson	5bcd26c	2009-12-21 21:04:10 -0800	[diff] [blame]	2182	^^^^^^^^^^^^^
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	2183
				2184
				2185	discard Discard fragment.
				2186
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	2187	pc Program counter.
				2188
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	2189	target Label of target instruction.
				2190
				2191
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	2192	Other tokens
Corbin Simpson	5bcd26c	2009-12-21 21:04:10 -0800	[diff] [blame]	2193	---------------
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	2194
				2195
Michal Krol	63d6097	2010-02-03 15:45:32 +0100	[diff] [blame]	2196	Declaration
				2197	^^^^^^^^^^^
				2198
				2199
				2200	Declares a register that will be referenced as an operand in Instruction
				2201	tokens.
				2202
				2203	The File field contains the register file that is being declared and is one
				2204	of TGSI_FILE_*.
				2205
				2206	The UsageMask field specifies which of the register components can be accessed
				2207	and is one of TGSI_WRITEMASK_*.
				2208
Francisco Jerez	2644952	2012-03-18 19:21:36 +0100	[diff] [blame]	2209	The Local flag specifies that a given value isn't intended for
				2210	subroutine parameter passing and, as a result, the implementation
				2211	isn't required to give any guarantees of it being preserved across
				2212	subroutine boundaries. As it's merely a compiler hint, the
				2213	implementation is free to ignore it.
				2214
Michal Krol	63d6097	2010-02-03 15:45:32 +0100	[diff] [blame]	2215	If Dimension flag is set to 1, a Declaration Dimension token follows.
				2216
				2217	If Semantic flag is set to 1, a Declaration Semantic token follows.
				2218
Francisco Jerez	1279923	2012-04-30 18:27:52 +0200	[diff] [blame]	2219	If Interpolate flag is set to 1, a Declaration Interpolate token follows.
Michal Krol	63d6097	2010-02-03 15:45:32 +0100	[diff] [blame]	2220
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	2221	If file is TGSI_FILE_RESOURCE, a Declaration Resource token follows.
				2222
Christian König	897303f	2013-03-14 11:10:16 +0100	[diff] [blame]	2223	If Array flag is set to 1, a Declaration Array token follows.
				2224
				2225	Array Declaration
				2226	^^^^^^^^^^^^^^^^^^^^^^^^
				2227
				2228	Declarations can optionally have an ArrayID attribute which can be referred to
				2229	by indirect addressing operands. An ArrayID of zero is reserved and treated as
				2230	if no ArrayID is specified.
				2231
				2232	If an indirect addressing operand refers to a specific declaration by using
				2233	an ArrayID, only the registers in this declaration are guaranteed to be
				2234	accessed; accessing any register outside this declaration results in undefined
				2235	behavior. Note that for compatibility the effective index is zero-based and
				2236	not relative to the specified declaration.
				2237
				2238	If no ArrayID is specified with an indirect addressing operand the whole
				2239	register file might be accessed by this operand. This is strongly discouraged
				2240	and will prevent packing of scalar/vec2 arrays and effective alias analysis.
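The addressing rule can be illustrated with a small model. This is a sketch under assumptions: ``resolve_indirect``, ``base`` (the constant part of the operand's index) and the ``declarations`` mapping are hypothetical names, and raising an exception merely marks the cases the text calls undefined.

```python
class UndefinedBehavior(Exception):
    """Marks an access that TGSI leaves undefined."""


def resolve_indirect(file_size, declarations, base, addr, array_id=0):
    """Return the register index touched by an indirect operand.

    declarations maps ArrayID -> (first, last) register of that declaration.
    The effective index is zero-based within the register file, not
    relative to the start of the referenced declaration.
    """
    index = base + addr
    if array_id == 0:
        # No ArrayID: the whole register file may be accessed.
        first, last = 0, file_size - 1
    else:
        first, last = declarations[array_id]
    if not first <= index <= last:
        raise UndefinedBehavior("register %d is outside the declaration" % index)
    return index
```

With a declaration covering registers 4 to 7 under ArrayID 1, an operand with ``base=4`` and an address value of 2 touches register 6, not register 2 of the array.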
Michal Krol	63d6097	2010-02-03 15:45:32 +0100	[diff] [blame]	2241
Corbin Simpson	da65ac6	2009-12-21 20:32:46 -0800	[diff] [blame]	2242	Declaration Semantic
Corbin Simpson	5bcd26c	2009-12-21 21:04:10 -0800	[diff] [blame]	2243	^^^^^^^^^^^^^^^^^^^^^^^^
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	2244
Brian Paul	05a18f4	2010-06-24 07:21:15 -0600	[diff] [blame]	2245	Vertex and fragment shader input and output registers may be labeled
				2246	with semantic information consisting of a name and index.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	2247
				2248	Follows Declaration token if Semantic bit is set.
				2249
				2250	Since its purpose is to link a shader with other stages of the pipeline,
				2251	it is valid to follow only those Declaration tokens that declare a register
				2252	either in INPUT or OUTPUT file.
				2253
				2254	SemanticName field contains the semantic name of the register being declared.
				2255	There is no default value.
				2256
				2257	SemanticIndex is an optional subscript that can be used to distinguish
				2258	different register declarations with the same semantic name. The default value
				2259	is 0.
				2260
				2261	The meanings of the individual semantic names are explained in the following
				2262	sections.
				2263
Corbin Simpson	54ddf64	2009-12-23 23:36:06 -0800	[diff] [blame]	2264	TGSI_SEMANTIC_POSITION
				2265	""""""""""""""""""""""
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	2266
Brian Paul	50b3f2e	2010-06-23 17:00:10 -0600	[diff] [blame]	2267	For vertex shaders, TGSI_SEMANTIC_POSITION indicates the vertex shader
				2268	output register which contains the homogeneous vertex position in the clip
				2269	space coordinate system. After clipping, the X, Y and Z components of the
				2270	vertex will be divided by the W value to get normalized device coordinates.
Keith Whitwell	a62aaa7	2009-12-21 23:25:15 +0000	[diff] [blame]	2271
Brian Paul	50b3f2e	2010-06-23 17:00:10 -0600	[diff] [blame]	2272	For fragment shaders, TGSI_SEMANTIC_POSITION is used to indicate that
				2273	fragment shader input contains the fragment's window position. The X
				2274	component starts at zero and always increases from left to right.
				2275	The Y component starts at zero and always increases but Y=0 may either
				2276	indicate the top of the window or the bottom depending on the fragment
				2277	coordinate origin convention (see TGSI_PROPERTY_FS_COORD_ORIGIN).
				2278	The Z coordinate ranges from 0 to 1 to represent depth from the front
				2279	to the back of the Z buffer. The W component contains the reciprocal
				2280	of the interpolated vertex position W component.
Corbin Simpson	54ddf64	2009-12-23 23:36:06 -0800	[diff] [blame]	2281
Brian Paul	05a18f4	2010-06-24 07:21:15 -0600	[diff] [blame]	2282	Fragment shaders may also declare an output register with
				2283	TGSI_SEMANTIC_POSITION. Only the Z component is writable. This allows
				2284	the fragment shader to change the fragment's Z position.
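The relationship between the vertex shader's clip-space output and the values seen later can be sketched as follows. The function names are hypothetical, and the sketch assumes clipping has already guaranteed a non-zero W.

```python
def clip_to_ndc(clip):
    """Perspective divide: X, Y and Z are divided by W after clipping."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)


def fragment_w(interpolated_clip_w):
    """The fragment shader's POSITION.w is the reciprocal of the
    interpolated vertex position W component."""
    return 1.0 / interpolated_clip_w
```

For example, a clip-space position of ``(2.0, -4.0, 1.0, 2.0)`` maps to normalized device coordinates ``(1.0, -2.0, 0.5)``.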
				2285
Corbin Simpson	54ddf64	2009-12-23 23:36:06 -0800	[diff] [blame]	2286
Corbin Simpson	54ddf64	2009-12-23 23:36:06 -0800	[diff] [blame]	2287
				2288	TGSI_SEMANTIC_COLOR
				2289	"""""""""""""""""""
				2290
Brian Paul	50b3f2e	2010-06-23 17:00:10 -0600	[diff] [blame]	2291	For vertex shader outputs or fragment shader inputs/outputs, this
				2292	label indicates that the register contains an R,G,B,A color.
Corbin Simpson	54ddf64	2009-12-23 23:36:06 -0800	[diff] [blame]	2293
Brian Paul	50b3f2e	2010-06-23 17:00:10 -0600	[diff] [blame]	2294	Several shader inputs/outputs may contain colors so the semantic index
				2295	is used to distinguish them. For example, color[0] may be the diffuse
				2296	color while color[1] may be the specular color.
				2297
				2298	This label is needed so that the flat/smooth shading can be applied
				2299	to the right interpolants during rasterization.
				2300
				2301
Corbin Simpson	54ddf64	2009-12-23 23:36:06 -0800	[diff] [blame]	2302
				2303	TGSI_SEMANTIC_BCOLOR
				2304	""""""""""""""""""""
				2305
				2306	Back-facing colors are only used for back-facing polygons, and are only valid
				2307	in vertex shader outputs. After rasterization, all polygons are front-facing
Brian Paul	50b3f2e	2010-06-23 17:00:10 -0600	[diff] [blame]	2308	and COLOR and BCOLOR end up occupying the same slots in the fragment shader,
				2309	so all BCOLORs effectively become regular COLORs in the fragment shader.
				2310
Corbin Simpson	54ddf64	2009-12-23 23:36:06 -0800	[diff] [blame]	2311
				2312	TGSI_SEMANTIC_FOG
				2313	"""""""""""""""""
				2314
Brian Paul	05a18f4	2010-06-24 07:21:15 -0600	[diff] [blame]	2315	Vertex shader inputs and outputs and fragment shader inputs may be
				2316	labeled with TGSI_SEMANTIC_FOG to indicate that the register contains
				2317	a fog coordinate in the form (F, 0, 0, 1). Typically, the fragment
				2318	shader will use the fog coordinate to compute a fog blend factor which
				2319	is used to blend the normal fragment color with a constant fog color.
Corbin Simpson	54ddf64	2009-12-23 23:36:06 -0800	[diff] [blame]	2320
Brian Paul	05a18f4	2010-06-24 07:21:15 -0600	[diff] [blame]	2321	Only the first component matters when writing from the vertex shader;
				2322	the driver will ensure that the coordinate is in this format when used
				2323	as a fragment shader input.
				2324
Corbin Simpson	54ddf64	2009-12-23 23:36:06 -0800	[diff] [blame]	2325
				2326	TGSI_SEMANTIC_PSIZE
				2327	"""""""""""""""""""
				2328
Brian Paul	05a18f4	2010-06-24 07:21:15 -0600	[diff] [blame]	2329	Vertex shader input and output registers may be labeled with
				2330	TGSI_SEMANTIC_PSIZE to indicate that the register contains a point size
				2331	in the form (S, 0, 0, 1). The point size controls the width or diameter
				2332	of points for rasterization. This label cannot be used in fragment
				2333	shaders.
Corbin Simpson	54ddf64	2009-12-23 23:36:06 -0800	[diff] [blame]	2334
				2335	When using this semantic, be sure to set the appropriate state in the
				2336	:ref:`rasterizer` first.
				2337
Brian Paul	05a18f4	2010-06-24 07:21:15 -0600	[diff] [blame]	2338
Christoph Bumiller	8acaf86	2013-03-15 22:11:31 +0100	[diff] [blame]	2339	TGSI_SEMANTIC_TEXCOORD
				2340	""""""""""""""""""""""
				2341
				2342	Only available if PIPE_CAP_TGSI_TEXCOORD is exposed.
				2343
				2344	Vertex shader outputs and fragment shader inputs may be labeled with
				2345	this semantic to make them replaceable by sprite coordinates via the
				2346	sprite_coord_enable state in the :ref:`rasterizer`.
				2347	The semantic index permitted with this semantic is limited to <= 7.
				2348
				2349	If the driver does not support TEXCOORD, sprite coordinate replacement
				2350	applies to inputs with the GENERIC semantic instead.
				2351
				2352	The intended use case for this semantic is gl_TexCoord.
				2353
				2354
				2355	TGSI_SEMANTIC_PCOORD
				2356	""""""""""""""""""""
				2357
				2358	Only available if PIPE_CAP_TGSI_TEXCOORD is exposed.
				2359
				2360	Fragment shader inputs may be labeled with TGSI_SEMANTIC_PCOORD to indicate
				2361	that the register contains sprite coordinates in the form (x, y, 0, 1), if
				2362	the current primitive is a point and point sprites are enabled. Otherwise,
				2363	the contents of the register are undefined.
				2364
				2365	The intended use case for this semantic is gl_PointCoord.
				2366
				2367
Corbin Simpson	54ddf64	2009-12-23 23:36:06 -0800	[diff] [blame]	2368	TGSI_SEMANTIC_GENERIC
				2369	"""""""""""""""""""""
				2370
Brian Paul	05a18f4	2010-06-24 07:21:15 -0600	[diff] [blame]	2371	All vertex/fragment shader inputs/outputs not labeled with any other
				2372	semantic label can be considered to be generic attributes. Typical
				2373	uses of generic inputs/outputs are texcoords and user-defined values.
Corbin Simpson	54ddf64	2009-12-23 23:36:06 -0800	[diff] [blame]	2374
Corbin Simpson	54ddf64	2009-12-23 23:36:06 -0800	[diff] [blame]	2375
				2376	TGSI_SEMANTIC_NORMAL
				2377	""""""""""""""""""""
				2378
Brian Paul	05a18f4	2010-06-24 07:21:15 -0600	[diff] [blame]	2379	Indicates that a vertex shader input is a normal vector. This is
				2380	typically only used for legacy graphics APIs.
				2381
Corbin Simpson	54ddf64	2009-12-23 23:36:06 -0800	[diff] [blame]	2382
				2383	TGSI_SEMANTIC_FACE
				2384	""""""""""""""""""
				2385
Brian Paul	05a18f4	2010-06-24 07:21:15 -0600	[diff] [blame]	2386	This label applies to fragment shader inputs only and indicates that
				2387	the register contains front/back-face information of the form (F, 0,
				2388	0, 1). The first component will be positive when the fragment belongs
				2389	to a front-facing polygon, and negative when the fragment belongs to a
				2390	back-facing polygon.
				2391
Corbin Simpson	54ddf64	2009-12-23 23:36:06 -0800	[diff] [blame]	2392
				2393	TGSI_SEMANTIC_EDGEFLAG
				2394	""""""""""""""""""""""
				2395
Brian Paul	7315300	2010-06-23 17:38:58 -0600	[diff] [blame]	2396	For vertex shaders, this semantic label indicates that an input or
				2397	output is a boolean edge flag. The register layout is [F, x, x, x]
				2398	where F is 0.0 or 1.0 and x = don't care. Normally, the vertex shader
				2399	simply copies the edge flag input to the edgeflag output.
				2400
				2401	Edge flags are used to control which lines or points are actually
				2402	drawn when the polygon mode converts triangles/quads/polygons into
				2403	points or lines.
				2404
Dave Airlie	4ecb2c1	2010-10-06 09:28:46 +1000	[diff] [blame]	2405
Roland Scheidegger	6b53e2b	2013-06-01 20:02:17 +0200	[diff] [blame]	2406	TGSI_SEMANTIC_STENCIL
				2407	"""""""""""""""""""""
				2408
				2409	For fragment shaders, this semantic label indicates that an output
Dave Airlie	4ecb2c1	2010-10-06 09:28:46 +1000	[diff] [blame]	2410	is a writable stencil reference value. Only the Y component is writable.
				2411	This allows the fragment shader to change the fragment's stencil reference value.
Luca Barbieri	7331713	2010-01-21 05:36:14 +0100	[diff] [blame]	2412
				2413
Roland Scheidegger	6b53e2b	2013-06-01 20:02:17 +0200	[diff] [blame]	2414	TGSI_SEMANTIC_VIEWPORT_INDEX
				2415	""""""""""""""""""""""""""""
				2416
				2417	For geometry shaders, this semantic label indicates that an output
				2418	contains the index of the viewport (and scissor) to use.
				2419	Only the X value is used.
				2420
				2421
				2422	TGSI_SEMANTIC_LAYER
				2423	"""""""""""""""""""
				2424
				2425	For geometry shaders, this semantic label indicates that an output
				2426	contains the layer value to use for the color and depth/stencil surfaces.
				2427	Only the X value is used. (Also known as rendertarget array index.)
				2428
				2429
Zack Rusin	3d08ead	2013-06-06 09:04:11 -0400	[diff] [blame]	2430	TGSI_SEMANTIC_CULLDIST
				2431	""""""""""""""""""""""
				2432
				2433	Used as distance to plane for performing application-defined culling
				2434	of individual primitives against a plane. When components of vertex
				2435	elements are given this label, these values are assumed to be a
				2436	float32 signed distance to a plane. Primitives will be completely
				2437	discarded if the plane distances for all of the vertices in the
				2438	primitive are < 0. If a vertex has a cull distance of NaN, that
				2439	vertex counts as "out" (as if it's < 0).
Zack Rusin	5507c11	2013-06-10 23:36:59 -0400	[diff] [blame]	2440	The limits on both clip and cull distances are bound
				2441	by the PIPE_MAX_CLIP_OR_CULL_DISTANCE_COUNT define which defines
				2442	the maximum number of components that can be used to hold the
				2443	distances and by the PIPE_MAX_CLIP_OR_CULL_DISTANCE_ELEMENT_COUNT
				2444	which specifies the maximum number of registers which can be
				2445	annotated with those semantics.
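The culling rule for a single plane can be written as a short predicate. This is a sketch with hypothetical function names; it takes one cull distance per vertex of the primitive.

```python
import math


def vertex_is_out(distance):
    """A vertex is "out" if its cull distance is negative or NaN."""
    return math.isnan(distance) or distance < 0.0


def primitive_is_culled(distances):
    """Discard the primitive only if every vertex is out for this plane."""
    return all(vertex_is_out(d) for d in distances)
```

A triangle with distances ``[-1.0, 0.0, -2.0]`` survives, because one vertex lies exactly on the plane and is therefore not out.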
				2446
				2447
				2448	TGSI_SEMANTIC_CLIPDIST
				2449	""""""""""""""""""""""
				2450
				2451	When components of vertex elements are identified this way, these
				2452	values are each assumed to be a float32 signed distance to a plane.
				2453	Primitive setup only invokes rasterization on pixels for which
				2454	the interpolated plane distances are >= 0. Multiple clip planes
				2455	can be implemented simultaneously, by annotating multiple
				2456	components of one or more vertex elements with the above specified
				2457	semantic. The limits on both clip and cull distances are bound
				2458	by the PIPE_MAX_CLIP_OR_CULL_DISTANCE_COUNT define which defines
				2459	the maximum number of components that can be used to hold the
				2460	distances and by the PIPE_MAX_CLIP_OR_CULL_DISTANCE_ELEMENT_COUNT
				2461	which specifies the maximum number of registers which can be
				2462	annotated with those semantics.
Zack Rusin	3d08ead	2013-06-06 09:04:11 -0400	[diff] [blame]	2463
Roland Scheidegger	6b53e2b	2013-06-01 20:02:17 +0200	[diff] [blame]	2464
Francisco Jerez	1279923	2012-04-30 18:27:52 +0200	[diff] [blame]	2465	Declaration Interpolate
				2466	^^^^^^^^^^^^^^^^^^^^^^^
				2467
				2468	This token is only valid for fragment shader INPUT declarations.
				2469
				2470	The Interpolate field specifies the way input is being interpolated by
				2471	the rasteriser and is one of TGSI_INTERPOLATE_*.
				2472
				2473	The CylindricalWrap bitfield specifies which register components
				2474	should be subject to cylindrical wrapping when interpolating by the
				2475	rasteriser. If TGSI_CYLINDRICAL_WRAP_X is set to 1, the X component
				2476	should be interpolated according to cylindrical wrapping rules.
				2477
				2478
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	2479	Declaration Sampler View
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	2480	^^^^^^^^^^^^^^^^^^^^^^^^
				2481
Francisco Jerez	a5f44cc	2012-05-01 02:38:51 +0200	[diff] [blame]	2482	Follows Declaration token if file is TGSI_FILE_SAMPLER_VIEW.
				2483
				2484	DCL SVIEW[#], resource, type(s)
				2485
				2486	Declares a shader input sampler view and assigns it to a SVIEW[#]
				2487	register.
				2488
				2489	resource can be one of BUFFER, 1D, 2D, 3D, 1DArray and 2DArray.
				2490
				2491	type must be 1 or 4 entries (if specifying on a per-component
				2492	level) out of UNORM, SNORM, SINT, UINT and FLOAT.
				2493
				2494
				2495	Declaration Resource
				2496	^^^^^^^^^^^^^^^^^^^^
				2497
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	2498	Follows Declaration token if file is TGSI_FILE_RESOURCE.
				2499
Francisco Jerez	b8e808f	2012-04-30 20:20:29 +0200	[diff] [blame]	2500	DCL RES[#], resource [, WR] [, RAW]
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	2501
				2502	Declares a shader input resource and assigns it to a RES[#]
				2503	register.
				2504
				2505	resource can be one of BUFFER, 1D, 2D, 3D, CUBE, 1DArray and
				2506	2DArray.
				2507
Francisco Jerez	82c90b2	2012-04-30 19:08:55 +0200	[diff] [blame]	2508	If the RAW keyword is not specified, the texture data will be
				2509	subject to conversion, swizzling and scaling as required to yield
				2510	the specified data type from the physical data format of the bound
				2511	resource.
				2512
				2513	If the RAW keyword is specified, no channel conversion will be
				2514	performed: the values read for each of the channels (X,Y,Z,W) will
				2515	correspond to consecutive words in the same order and format
				2516	they're found in memory. No element-to-address conversion will be
				2517	performed either: the value of the provided X coordinate will be
				2518	interpreted in byte units instead of texel units. The result of
				2519	accessing a misaligned address is undefined.
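The effect of the RAW keyword on addressing can be sketched with Python's ``struct`` module. This model makes assumptions the text does not state: words are 32 bits wide, stored little-endian, and "misaligned" means not a multiple of four bytes. ``raw_load`` is a hypothetical helper, and the exception only marks the case that is undefined in TGSI.

```python
import struct


def raw_load(buffer, byte_offset):
    """Read four consecutive 32-bit words (X, Y, Z, W) starting at a
    byte offset, with no format conversion or swizzling."""
    if byte_offset % 4 != 0:
        raise ValueError("misaligned access: result is undefined in TGSI")
    return struct.unpack_from("<4I", buffer, byte_offset)
```

Given a buffer holding the words 10 to 15, a load at byte offset 4 yields ``(11, 12, 13, 14)``: the X coordinate counts bytes, not texels.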
				2520
Francisco Jerez	b8e808f	2012-04-30 20:20:29 +0200	[diff] [blame]	2521	Usage of the STORE opcode is only allowed if the WR (writable) flag
				2522	is set.
				2523
Zack Rusin	bdbe77f	2011-01-24 17:47:10 -0500	[diff] [blame]	2524
Luca Barbieri	7331713	2010-01-21 05:36:14 +0100	[diff] [blame]	2525	Properties
				2526	^^^^^^^^^^^^^^^^^^^^^^^^
				2527
				2528
				2529	Properties are general directives that apply to the whole TGSI program.
				2530
				2531	FS_COORD_ORIGIN
				2532	"""""""""""""""
				2533
				2534	Specifies the fragment shader TGSI_SEMANTIC_POSITION coordinate origin.
				2535	The default value is UPPER_LEFT.
				2536
				2537	If UPPER_LEFT, the position will be (0,0) at the upper left corner and
				2538	increase downward and rightward.
				2539	If LOWER_LEFT, the position will be (0,0) at the lower left corner and
				2540	increase upward and rightward.
				2541
				2542	OpenGL defaults to LOWER_LEFT, and is configurable with the
				2543	GL_ARB_fragment_coord_conventions extension.
				2544
				2545	DirectX 9/10 use UPPER_LEFT.
				2546
				2547	FS_COORD_PIXEL_CENTER
				2548	"""""""""""""""""""""
				2549
				2550	Specifies the fragment shader TGSI_SEMANTIC_POSITION pixel center convention.
				2551	The default value is HALF_INTEGER.
				2552
				2553	If HALF_INTEGER, the fractional part of the position will be 0.5.
				2554	If INTEGER, the fractional part of the position will be 0.0.
				2555
				2556	Note that this does not affect the set of fragments generated by
José Fonseca	2737abb	2013-04-23 19:40:05 +0100	[diff] [blame]	2557	rasterization, which is instead controlled by half_pixel_center in the
Luca Barbieri	7331713	2010-01-21 05:36:14 +0100	[diff] [blame]	2558	rasterizer.
				2559
				2560	OpenGL defaults to HALF_INTEGER, and is configurable with the
				2561	GL_ARB_fragment_coord_conventions extension.
				2562
				2563	DirectX 9 uses INTEGER.
				2564	DirectX 10 uses HALF_INTEGER.
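How FS_COORD_ORIGIN and FS_COORD_PIXEL_CENTER combine can be sketched with a small function. ``fragment_position`` is a hypothetical helper; it assumes pixels are identified by integer column and row indices counted from the upper-left corner of a window that is ``height`` pixels tall.

```python
def fragment_position(col, row, height,
                      origin="UPPER_LEFT", pixel_center="HALF_INTEGER"):
    """Return the (x, y) a fragment shader would see in POSITION.xy."""
    frac = 0.5 if pixel_center == "HALF_INTEGER" else 0.0
    # LOWER_LEFT flips the row index so that y increases upward.
    y_index = row if origin == "UPPER_LEFT" else height - 1 - row
    return (col + frac, y_index + frac)
```

For the top-left pixel of a 480-pixel-tall window this gives ``(0.5, 0.5)`` with the defaults and ``(0.5, 479.5)`` with ``origin="LOWER_LEFT"``.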
Brian Paul	4778f46	2010-02-02 08:14:40 -0700	[diff] [blame]	2565
Dave Airlie	c9c8a5e	2010-12-18 10:34:35 +1000	[diff] [blame]	2566	FS_COLOR0_WRITES_ALL_CBUFS
				2567	""""""""""""""""""""""""""
				2568	Specifies that writes to the fragment shader color 0 are replicated to all
				2569	bound cbufs. This facilitates OpenGL's fragColor output vs fragData[0] where
				2570	fragData is directed to a single color buffer, but fragColor is broadcast.
Brian Paul	4778f46	2010-02-02 08:14:40 -0700	[diff] [blame]	2571
Marek Olšák	dc4c821	2012-01-10 00:19:00 +0100	[diff] [blame]	2572	VS_PROHIBIT_UCPS
				2573	""""""""""""""""""""""""""
				2574	If this property is set on the program bound to the shader stage before the
				2575	fragment shader, user clip planes should have no effect (be disabled) even if
				2576	that shader does not write to any clip distance outputs and the rasterizer's
				2577	clip_plane_enable is non-zero.
				2578	This property is only supported by drivers that also support shader clip
				2579	distance outputs.
				2580	This is useful for APIs that don't have UCPs and where clip distances written
				2581	by a shader cannot be disabled.
				2582
Brian Paul	4778f46	2010-02-02 08:14:40 -0700	[diff] [blame]	2583
				2584	Texture Sampling and Texture Formats
				2585	------------------------------------
				2586
Corbin Simpson	797dcc0	2010-02-02 17:07:26 -0800	[diff] [blame]	2587	This table shows how texture image components are returned as (x,y,z,w) tuples
				2588	by TGSI texture instructions, such as :opcode:`TEX`, :opcode:`TXD`, and
				2589	:opcode:`TXP`. For reference, OpenGL and Direct3D conventions are shown as
				2590	well.
Brian Paul	4778f46	2010-02-02 08:14:40 -0700	[diff] [blame]	2591
Corbin Simpson	516e715	2010-02-02 12:44:22 -0800	[diff] [blame]	2592	+--------------------+--------------+--------------------+--------------+
				2593	| Texture Components | Gallium      | OpenGL             | Direct3D 9   |
				2594	+====================+==============+====================+==============+
Corbin Simpson	92867dc	2010-06-16 16:56:55 -0700	[diff] [blame]	2595	| R                  | (r, 0, 0, 1) | (r, 0, 0, 1)       | (r, 1, 1, 1) |
Corbin Simpson	516e715	2010-02-02 12:44:22 -0800	[diff] [blame]	2596	+--------------------+--------------+--------------------+--------------+
Corbin Simpson	92867dc	2010-06-16 16:56:55 -0700	[diff] [blame]	2597	| RG                 | (r, g, 0, 1) | (r, g, 0, 1)       | (r, g, 1, 1) |
Corbin Simpson	516e715	2010-02-02 12:44:22 -0800	[diff] [blame]	2598	+--------------------+--------------+--------------------+--------------+
				2599	| RGB                | (r, g, b, 1) | (r, g, b, 1)       | (r, g, b, 1) |
				2600	+--------------------+--------------+--------------------+--------------+
				2601	| RGBA               | (r, g, b, a) | (r, g, b, a)       | (r, g, b, a) |
				2602	+--------------------+--------------+--------------------+--------------+
				2603	| A                  | (0, 0, 0, a) | (0, 0, 0, a)       | (0, 0, 0, a) |
				2604	+--------------------+--------------+--------------------+--------------+
				2605	| L                  | (l, l, l, 1) | (l, l, l, 1)       | (l, l, l, 1) |
				2606	+--------------------+--------------+--------------------+--------------+
				2607	| LA                 | (l, l, l, a) | (l, l, l, a)       | (l, l, l, a) |
				2608	+--------------------+--------------+--------------------+--------------+
				2609	| I                  | (i, i, i, i) | (i, i, i, i)       | N/A          |
				2610	+--------------------+--------------+--------------------+--------------+
				2611	| UV                 | XXX TBD      | (0, 0, 0, 1)       | (u, v, 1, 1) |
				2612	|                    |              | [#envmap-bumpmap]_ |              |
				2613	+--------------------+--------------+--------------------+--------------+
Brian Paul	3e572eb	2010-02-02 16:27:07 -0700	[diff] [blame]	2614	| Z                  | XXX TBD      | (z, z, z, 1)       | (0, z, 0, 1) |
Corbin Simpson	516e715	2010-02-02 12:44:22 -0800	[diff] [blame]	2615	|                    |              | [#depth-tex-mode]_ |              |
				2616	+--------------------+--------------+--------------------+--------------+
Dave Airlie	66a0d1e	2010-10-06 09:30:17 +1000	[diff] [blame]	2617	| S                  | (s, s, s, s) | unknown            | unknown      |
				2618	+--------------------+--------------+--------------------+--------------+
Brian Paul	4778f46	2010-02-02 08:14:40 -0700	[diff] [blame]	2619
Corbin Simpson	516e715	2010-02-02 12:44:22 -0800	[diff] [blame]	2620	.. [#envmap-bumpmap] http://www.opengl.org/registry/specs/ATI/envmap_bumpmap.txt
Brian Paul	3e572eb	2010-02-02 16:27:07 -0700	[diff] [blame]	2621	.. [#depth-tex-mode] the default is (z, z, z, 1) but may also be (0, 0, 0, z)
Corbin Simpson	797dcc0	2010-02-02 17:07:26 -0800	[diff] [blame]	2622	or (z, z, z, z) depending on the value of GL_DEPTH_TEXTURE_MODE.
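The Gallium column of the table can be expressed as a lookup, which may be convenient when checking a driver's output by hand. This is a sketch only: ``GALLIUM_LAYOUT`` and ``sample_result`` are hypothetical names, and the UV and Z rows are omitted because the table marks them as TBD.

```python
# Each entry lists, for (x, y, z, w), either the texel channel to copy
# or a constant to substitute.
GALLIUM_LAYOUT = {
    "R":    ("r", 0, 0, 1),
    "RG":   ("r", "g", 0, 1),
    "RGB":  ("r", "g", "b", 1),
    "RGBA": ("r", "g", "b", "a"),
    "A":    (0, 0, 0, "a"),
    "L":    ("l", "l", "l", 1),
    "LA":   ("l", "l", "l", "a"),
    "I":    ("i", "i", "i", "i"),
    "S":    ("s", "s", "s", "s"),
}


def sample_result(components, texel):
    """Expand a texel (dict of channel name -> value) to an (x, y, z, w) tuple."""
    return tuple(texel[c] if isinstance(c, str) else float(c)
                 for c in GALLIUM_LAYOUT[components])
```

For instance, an LA texel with ``l = 0.25`` and ``a = 0.5`` is returned as ``(0.25, 0.25, 0.25, 0.5)``.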