TGSI
====

TGSI, Tungsten Graphics Shader Infrastructure, is an intermediate language
for describing shaders. Since Gallium is inherently shaderful, shaders are
an important part of the API. TGSI is the only intermediate representation
used by all drivers.

Basics
------

All TGSI instructions, known as *opcodes*, operate on arbitrary-precision
floating-point four-component vectors. An opcode may have up to one
destination register, known as *dst*, and between zero and three source
registers, called *src0* through *src2*, or simply *src* if there is only
one.

Some instructions, like :opcode:`I2F`, permit re-interpretation of vector
components as integers. Other instructions permit using registers as
two-component vectors with double precision; see :ref:`Double Opcodes`.

When an instruction has a scalar result, the result is usually copied into
each of the components of *dst*. When this happens, the result is said to be
*replicated* to *dst*. :opcode:`RCP` is one such instruction.
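As a rough illustration of replication, the following C sketch models a
register as an array of four floats and writes the scalar result of RCP into
every component. The function name ``ref_rcp`` and the array layout are
assumptions made for this example; they are not part of TGSI or of any driver.

```c
/* Hypothetical reference model: a register is four floats (x, y, z, w). */
static void ref_rcp(float dst[4], const float src[4])
{
    /* RCP has a scalar result, computed from src.x only. */
    const float r = 1.0f / src[0];

    /* Replicate the scalar into every component of dst. */
    for (int i = 0; i < 4; i++)
        dst[i] = r;
}
```

With ``src = (4, 1, 2, 3)`` every component of ``dst`` ends up holding 0.25;
the y, z and w components of the source are ignored.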

Instruction Set
---------------

Core ISA
^^^^^^^^

These opcodes are guaranteed to be available regardless of the driver being
used.

.. opcode:: ARL - Address Register Load

.. math::

  dst.x = \lfloor src.x\rfloor

  dst.y = \lfloor src.y\rfloor

  dst.z = \lfloor src.z\rfloor

  dst.w = \lfloor src.w\rfloor


.. opcode:: MOV - Move

.. math::

  dst.x = src.x

  dst.y = src.y

  dst.z = src.z

  dst.w = src.w


.. opcode:: LIT - Light Coefficients

.. math::

  dst.x = 1

  dst.y = max(src.x, 0)

  dst.z = (src.x > 0) ? max(src.y, 0)^{clamp(src.w, -128, 128)} : 0

  dst.w = 1


.. opcode:: RCP - Reciprocal

This instruction replicates its result.

.. math::

  dst = \frac{1}{src.x}


.. opcode:: RSQ - Reciprocal Square Root

This instruction replicates its result.

.. math::

  dst = \frac{1}{\sqrt{|src.x|}}


.. opcode:: EXP - Approximate Exponential Base 2

.. math::

  dst.x = 2^{\lfloor src.x\rfloor}

  dst.y = src.x - \lfloor src.x\rfloor

  dst.z = 2^{src.x}

  dst.w = 1


.. opcode:: LOG - Approximate Logarithm Base 2

.. math::

  dst.x = \lfloor\log_2{|src.x|}\rfloor

  dst.y = \frac{|src.x|}{2^{\lfloor\log_2{|src.x|}\rfloor}}

  dst.z = \log_2{|src.x|}

  dst.w = 1


.. opcode:: MUL - Multiply

.. math::

  dst.x = src0.x \times src1.x

  dst.y = src0.y \times src1.y

  dst.z = src0.z \times src1.z

  dst.w = src0.w \times src1.w


.. opcode:: ADD - Add

.. math::

  dst.x = src0.x + src1.x

  dst.y = src0.y + src1.y

  dst.z = src0.z + src1.z

  dst.w = src0.w + src1.w


.. opcode:: DP3 - 3-component Dot Product

This instruction replicates its result.

.. math::

  dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z


.. opcode:: DP4 - 4-component Dot Product

This instruction replicates its result.

.. math::

  dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w


.. opcode:: DST - Distance Vector

.. math::

  dst.x = 1

  dst.y = src0.y \times src1.y

  dst.z = src0.z

  dst.w = src1.w


.. opcode:: MIN - Minimum

.. math::

  dst.x = min(src0.x, src1.x)

  dst.y = min(src0.y, src1.y)

  dst.z = min(src0.z, src1.z)

  dst.w = min(src0.w, src1.w)


.. opcode:: MAX - Maximum

.. math::

  dst.x = max(src0.x, src1.x)

  dst.y = max(src0.y, src1.y)

  dst.z = max(src0.z, src1.z)

  dst.w = max(src0.w, src1.w)


.. opcode:: SLT - Set On Less Than

.. math::

  dst.x = (src0.x < src1.x) ? 1 : 0

  dst.y = (src0.y < src1.y) ? 1 : 0

  dst.z = (src0.z < src1.z) ? 1 : 0

  dst.w = (src0.w < src1.w) ? 1 : 0
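The comparison opcodes all follow the same per-component pattern. A minimal C
sketch of SLT is shown here, assuming that the 1 and 0 in the formulas are
stored as the floating-point values 1.0 and 0.0, in line with the
floating-point vector model described under Basics. The name ``ref_slt`` is
hypothetical.

```c
/* Hypothetical reference model of SLT on four-float registers. */
static void ref_slt(float dst[4], const float src0[4], const float src1[4])
{
    /* Each component is compared independently of the others. */
    for (int i = 0; i < 4; i++)
        dst[i] = (src0[i] < src1[i]) ? 1.0f : 0.0f;
}
```

Under that assumption the result can be used directly as a multiplicative
mask, for example as an operand of :opcode:`MUL`.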


.. opcode:: SGE - Set On Greater Equal Than

.. math::

  dst.x = (src0.x >= src1.x) ? 1 : 0

  dst.y = (src0.y >= src1.y) ? 1 : 0

  dst.z = (src0.z >= src1.z) ? 1 : 0

  dst.w = (src0.w >= src1.w) ? 1 : 0


.. opcode:: MAD - Multiply And Add

.. math::

  dst.x = src0.x \times src1.x + src2.x

  dst.y = src0.y \times src1.y + src2.y

  dst.z = src0.z \times src1.z + src2.z

  dst.w = src0.w \times src1.w + src2.w


.. opcode:: SUB - Subtract

.. math::

  dst.x = src0.x - src1.x

  dst.y = src0.y - src1.y

  dst.z = src0.z - src1.z

  dst.w = src0.w - src1.w


.. opcode:: LRP - Linear Interpolate

.. math::

  dst.x = src0.x \times src1.x + (1 - src0.x) \times src2.x

  dst.y = src0.y \times src1.y + (1 - src0.y) \times src2.y

  dst.z = src0.z \times src1.z + (1 - src0.z) \times src2.z

  dst.w = src0.w \times src1.w + (1 - src0.w) \times src2.w
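The operand order of LRP is easy to get wrong: *src0* is the per-component
weight, *src1* is the value selected when the weight is 1, and *src2* is the
value selected when the weight is 0. The following C sketch transcribes the
formulas above; ``ref_lrp`` is a hypothetical name used only for illustration.

```c
/* Hypothetical reference model of LRP on four-float registers. */
static void ref_lrp(float dst[4], const float src0[4],
                    const float src1[4], const float src2[4])
{
    /* src0 is the weight: 1 selects src1, 0 selects src2. */
    for (int i = 0; i < 4; i++)
        dst[i] = src0[i] * src1[i] + (1.0f - src0[i]) * src2[i];
}
```

For weights ``(0, 1, 0.5, 0.25)`` blending a constant 10 (*src1*) with a
constant 20 (*src2*), the result is ``(20, 10, 15, 17.5)``.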


.. opcode:: CND - Condition

.. math::

  dst.x = (src2.x > 0.5) ? src0.x : src1.x

  dst.y = (src2.y > 0.5) ? src0.y : src1.y

  dst.z = (src2.z > 0.5) ? src0.z : src1.z

  dst.w = (src2.w > 0.5) ? src0.w : src1.w


.. opcode:: DP2A - 2-component Dot Product And Add

.. math::

  dst.x = src0.x \times src1.x + src0.y \times src1.y + src2.x

  dst.y = src0.x \times src1.x + src0.y \times src1.y + src2.x

  dst.z = src0.x \times src1.x + src0.y \times src1.y + src2.x

  dst.w = src0.x \times src1.x + src0.y \times src1.y + src2.x


.. opcode:: FRC - Fraction

.. math::

  dst.x = src.x - \lfloor src.x\rfloor

  dst.y = src.y - \lfloor src.y\rfloor

  dst.z = src.z - \lfloor src.z\rfloor

  dst.w = src.w - \lfloor src.w\rfloor


.. opcode:: CLAMP - Clamp

.. math::

  dst.x = clamp(src0.x, src1.x, src2.x)

  dst.y = clamp(src0.y, src1.y, src2.y)

  dst.z = clamp(src0.z, src1.z, src2.z)

  dst.w = clamp(src0.w, src1.w, src2.w)


.. opcode:: FLR - Floor

This is identical to :opcode:`ARL`.

.. math::

  dst.x = \lfloor src.x\rfloor

  dst.y = \lfloor src.y\rfloor

  dst.z = \lfloor src.z\rfloor

  dst.w = \lfloor src.w\rfloor


.. opcode:: ROUND - Round

.. math::

  dst.x = round(src.x)

  dst.y = round(src.y)

  dst.z = round(src.z)

  dst.w = round(src.w)


.. opcode:: EX2 - Exponential Base 2

This instruction replicates its result.

.. math::

  dst = 2^{src.x}


.. opcode:: LG2 - Logarithm Base 2

This instruction replicates its result.

.. math::

  dst = \log_2{src.x}


.. opcode:: POW - Power

This instruction replicates its result.

.. math::

  dst = src0.x^{src1.x}

.. opcode:: XPD - Cross Product

.. math::

  dst.x = src0.y \times src1.z - src1.y \times src0.z

  dst.y = src0.z \times src1.x - src1.z \times src0.x

  dst.z = src0.x \times src1.y - src1.x \times src0.y

  dst.w = 1
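The XPD formulas can be checked against the familiar identity that the x axis
crossed with the y axis gives the z axis. The C sketch below transcribes them
directly. ``ref_xpd`` is a hypothetical name, and the use of temporaries so
that *dst* may alias a source is a choice made for this example, not something
the text above specifies.

```c
/* Hypothetical reference model of XPD on four-float registers. */
static void ref_xpd(float dst[4], const float src0[4], const float src1[4])
{
    /* Compute into temporaries so dst may alias either source. */
    const float x = src0[1] * src1[2] - src1[1] * src0[2];
    const float y = src0[2] * src1[0] - src1[2] * src0[0];
    const float z = src0[0] * src1[1] - src1[0] * src0[1];

    dst[0] = x;
    dst[1] = y;
    dst[2] = z;
    dst[3] = 1.0f; /* w is always set to 1; the source w values are unused. */
}
```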


.. opcode:: ABS - Absolute

.. math::

  dst.x = |src.x|

  dst.y = |src.y|

  dst.z = |src.z|

  dst.w = |src.w|


.. opcode:: RCC - Reciprocal Clamped

This instruction replicates its result.

XXX cleanup on aisle three

.. math::

  dst = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
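The RCC formula is dense, so a small C sketch may help. It clamps the
magnitude of the reciprocal into a fixed range while preserving its sign, then
replicates the result. The bounds are copied from the formula above. The names
``ref_rcc``, ``ref_clampf``, ``RCC_MIN`` and ``RCC_MAX`` are hypothetical, and
single-precision IEEE arithmetic is assumed.

```c
/* Bounds copied from the RCC formula. */
#define RCC_MIN 5.42101e-20f
#define RCC_MAX 1.884467e+19f

/* Hypothetical helper: clamp v into [lo, hi]. */
static float ref_clampf(float v, float lo, float hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical reference model of RCC on four-float registers. */
static void ref_rcc(float dst[4], const float src[4])
{
    const float r = 1.0f / src[0];

    /* Clamp the magnitude, keeping the sign of the reciprocal. */
    const float c = (r > 0.0f) ? ref_clampf(r, RCC_MIN, RCC_MAX)
                               : ref_clampf(r, -RCC_MAX, -RCC_MIN);

    /* Replicate the scalar result. */
    for (int i = 0; i < 4; i++)
        dst[i] = c;
}
```

In this sketch a source of 2 gives 0.5 unchanged, while a source of positive
zero, whose reciprocal is positive infinity under IEEE rules, is clamped to
``RCC_MAX``.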


.. opcode:: DPH - Homogeneous Dot Product

This instruction replicates its result.

.. math::

  dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w


.. opcode:: COS - Cosine

This instruction replicates its result.

.. math::

  dst = \cos{src.x}


.. opcode:: DDX - Derivative Relative To X

.. math::

  dst.x = partialx(src.x)

  dst.y = partialx(src.y)

  dst.z = partialx(src.z)

  dst.w = partialx(src.w)


.. opcode:: DDY - Derivative Relative To Y

.. math::

  dst.x = partialy(src.x)

  dst.y = partialy(src.y)

  dst.z = partialy(src.z)

  dst.w = partialy(src.w)


.. opcode:: KILP - Predicated Discard

  discard


.. opcode:: PK2H - Pack Two 16-bit Floats

  TBD


.. opcode:: PK2US - Pack Two Unsigned 16-bit Scalars

  TBD


.. opcode:: PK4B - Pack Four Signed 8-bit Scalars

  TBD


.. opcode:: PK4UB - Pack Four Unsigned 8-bit Scalars

  TBD


.. opcode:: RFL - Reflection Vector

.. math::

  dst.x = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.x - src1.x

  dst.y = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.y - src1.y

  dst.z = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.z - src1.z

  dst.w = 1

.. note::

   Considered for removal.


.. opcode:: SEQ - Set On Equal

.. math::

  dst.x = (src0.x == src1.x) ? 1 : 0

  dst.y = (src0.y == src1.y) ? 1 : 0

  dst.z = (src0.z == src1.z) ? 1 : 0

  dst.w = (src0.w == src1.w) ? 1 : 0


.. opcode:: SFL - Set On False

This instruction replicates its result.

.. math::

  dst = 0

.. note::

   Considered for removal.


.. opcode:: SGT - Set On Greater Than

.. math::

  dst.x = (src0.x > src1.x) ? 1 : 0

  dst.y = (src0.y > src1.y) ? 1 : 0

  dst.z = (src0.z > src1.z) ? 1 : 0

  dst.w = (src0.w > src1.w) ? 1 : 0


.. opcode:: SIN - Sine

This instruction replicates its result.

.. math::

  dst = \sin{src.x}


.. opcode:: SLE - Set On Less Equal Than

.. math::

  dst.x = (src0.x <= src1.x) ? 1 : 0

  dst.y = (src0.y <= src1.y) ? 1 : 0

  dst.z = (src0.z <= src1.z) ? 1 : 0

  dst.w = (src0.w <= src1.w) ? 1 : 0


.. opcode:: SNE - Set On Not Equal

.. math::

  dst.x = (src0.x != src1.x) ? 1 : 0

  dst.y = (src0.y != src1.y) ? 1 : 0

  dst.z = (src0.z != src1.z) ? 1 : 0

  dst.w = (src0.w != src1.w) ? 1 : 0


.. opcode:: STR - Set On True

This instruction replicates its result.

.. math::

  dst = 1


.. opcode:: TEX - Texture Lookup

.. math::

  coord = src0

  bias = 0.0

  dst = texture_sample(unit, coord, bias)

For array textures, src0.y contains the slice for 1D arrays and src0.z
contains the slice for 2D arrays.

For shadow textures without arrays, src0.z contains the reference value.

For shadow textures with arrays, src0.z contains the reference value for 1D
arrays and src0.w contains the reference value for 2D arrays.

There is no way to pass a bias in the .w component for shadow arrays, and GLSL
does not allow this. GLSL does allow cube shadow maps to take a bias value;
how this will look in TGSI is still to be determined.

.. opcode:: TXD - Texture Lookup with Derivatives

.. math::

  coord = src0

  ddx = src1

  ddy = src2

  bias = 0.0

  dst = texture_sample_deriv(unit, coord, bias, ddx, ddy)


.. opcode:: TXP - Projective Texture Lookup

.. math::

  coord.x = src0.x / src0.w

  coord.y = src0.y / src0.w

  coord.z = src0.z / src0.w

  coord.w = src0.w

  bias = 0.0

  dst = texture_sample(unit, coord, bias)
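The only difference between TXP and :opcode:`TEX` in the formulas above is the
projective divide applied to the coordinate before sampling. A minimal C
sketch of that step follows. ``ref_txp_coord`` is a hypothetical name, and the
sampling itself (``texture_sample``) is deliberately not modelled.

```c
/* Hypothetical sketch of the coordinate setup performed by TXP. */
static void ref_txp_coord(float coord[4], const float src0[4])
{
    const float w = src0[3];

    /* x, y and z are divided by w; w itself is passed through unchanged. */
    coord[0] = src0[0] / w;
    coord[1] = src0[1] / w;
    coord[2] = src0[2] / w;
    coord[3] = w;
}
```

For ``src0 = (2, 4, 6, 2)`` this yields ``coord = (1, 2, 3, 2)``.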


.. opcode:: UP2H - Unpack Two 16-Bit Floats

  TBD

.. note::

   Considered for removal.

.. opcode:: UP2US - Unpack Two Unsigned 16-Bit Scalars

  TBD

.. note::

   Considered for removal.

.. opcode:: UP4B - Unpack Four Signed 8-Bit Values

  TBD

.. note::

   Considered for removal.

.. opcode:: UP4UB - Unpack Four Unsigned 8-Bit Scalars

  TBD

.. note::

   Considered for removal.

.. opcode:: X2D - 2D Coordinate Transformation

.. math::

  dst.x = src0.x + src1.x \times src2.x + src1.y \times src2.y

  dst.y = src0.y + src1.x \times src2.z + src1.y \times src2.w

  dst.z = src0.x + src1.x \times src2.x + src1.y \times src2.y

  dst.w = src0.y + src1.x \times src2.z + src1.y \times src2.w

.. note::

   Considered for removal.


.. opcode:: ARA - Address Register Add

  TBD

.. note::

   Considered for removal.

.. opcode:: ARR - Address Register Load With Round
Keith Whitwella62aaa72009-12-21 23:25:15 +0000690
Corbin Simpsone8ed3b92009-12-21 19:12:55 -0800691.. math::
692
Keith Whitwella62aaa72009-12-21 23:25:15 +0000693 dst.x = round(src.x)
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800694
Keith Whitwella62aaa72009-12-21 23:25:15 +0000695 dst.y = round(src.y)
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800696
Keith Whitwella62aaa72009-12-21 23:25:15 +0000697 dst.z = round(src.z)
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800698
Keith Whitwella62aaa72009-12-21 23:25:15 +0000699 dst.w = round(src.w)
700
701
Corbin Simpson85805222010-02-02 16:20:12 -0800702.. opcode:: BRA - Branch
Keith Whitwella62aaa72009-12-21 23:25:15 +0000703
704 pc = target
705
Corbin Simpson17c2a442010-02-02 17:02:28 -0800706.. note::
707
708 Considered for removal.
Keith Whitwella62aaa72009-12-21 23:25:15 +0000709
Corbin Simpson85805222010-02-02 16:20:12 -0800710.. opcode:: CAL - Subroutine Call
Keith Whitwella62aaa72009-12-21 23:25:15 +0000711
712 push(pc)
713 pc = target
714
715
Corbin Simpson85805222010-02-02 16:20:12 -0800716.. opcode:: RET - Subroutine Call Return
Keith Whitwella62aaa72009-12-21 23:25:15 +0000717
718 pc = pop()
719
720
Corbin Simpson85805222010-02-02 16:20:12 -0800721.. opcode:: SSG - Set Sign
Keith Whitwella62aaa72009-12-21 23:25:15 +0000722
Corbin Simpsone8ed3b92009-12-21 19:12:55 -0800723.. math::
724
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800725 dst.x = (src.x > 0) ? 1 : (src.x < 0) ? -1 : 0
726
727 dst.y = (src.y > 0) ? 1 : (src.y < 0) ? -1 : 0
728
729 dst.z = (src.z > 0) ? 1 : (src.z < 0) ? -1 : 0
730
731 dst.w = (src.w > 0) ? 1 : (src.w < 0) ? -1 : 0
Keith Whitwella62aaa72009-12-21 23:25:15 +0000732
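As a rough Python sketch of the SSG formulas above (the name ``ssg`` is illustrative only), each component is mapped independently to 1, -1 or 0:

```python
def ssg(src):
    """Component-wise sign: 1 if positive, -1 if negative, 0 otherwise."""
    return tuple(1.0 if c > 0 else -1.0 if c < 0 else 0.0 for c in src)
```

Note that, read literally, the formulas map negative zero to 0 because neither comparison holds.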
733
Corbin Simpson85805222010-02-02 16:20:12 -0800734.. opcode:: CMP - Compare
Keith Whitwella62aaa72009-12-21 23:25:15 +0000735
Corbin Simpsone8ed3b92009-12-21 19:12:55 -0800736.. math::
737
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800738 dst.x = (src0.x < 0) ? src1.x : src2.x
739
740 dst.y = (src0.y < 0) ? src1.y : src2.y
741
742 dst.z = (src0.z < 0) ? src1.z : src2.z
743
744 dst.w = (src0.w < 0) ? src1.w : src2.w
Keith Whitwella62aaa72009-12-21 23:25:15 +0000745
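A minimal Python sketch of the per-component select performed by CMP, assuming plain tuples stand in for registers (``tgsi_cmp`` is a hypothetical helper name):

```python
def tgsi_cmp(src0, src1, src2):
    """Pick src1 where src0 is negative, src2 elsewhere, per component."""
    return tuple(b if a < 0 else c for a, b, c in zip(src0, src1, src2))
```

Each component of *src0* is tested on its own, so a single instruction can mix values from *src1* and *src2*.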
746
Corbin Simpson85805222010-02-02 16:20:12 -0800747.. opcode:: KIL - Conditional Discard
Keith Whitwella62aaa72009-12-21 23:25:15 +0000748
Corbin Simpsone8ed3b92009-12-21 19:12:55 -0800749.. math::
750
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800751 if (src.x < 0 || src.y < 0 || src.z < 0 || src.w < 0)
Keith Whitwella62aaa72009-12-21 23:25:15 +0000752 discard
753 endif
754
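The discard condition of KIL can be expressed as a single predicate. The sketch below only models the test itself; ``kil_discards`` is an illustrative name, and what "discard" does to the fragment is left to the driver.

```python
def kil_discards(src):
    """Return True if any component of src is negative."""
    return any(c < 0 for c in src)
```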
755
Corbin Simpson85805222010-02-02 16:20:12 -0800756.. opcode:: SCS - Sine Cosine
Keith Whitwella62aaa72009-12-21 23:25:15 +0000757
Corbin Simpsone8ed3b92009-12-21 19:12:55 -0800758.. math::
759
Corbin Simpsond92a6852009-12-21 19:30:29 -0800760 dst.x = \cos{src.x}
761
762 dst.y = \sin{src.x}
763
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800764 dst.z = 0
Corbin Simpsond92a6852009-12-21 19:30:29 -0800765
Tilman Sauerbeckd3231182010-09-19 09:03:11 +0200766 dst.w = 1
Keith Whitwella62aaa72009-12-21 23:25:15 +0000767
768
Corbin Simpson85805222010-02-02 16:20:12 -0800769.. opcode:: TXB - Texture Lookup With Bias
Keith Whitwella62aaa72009-12-21 23:25:15 +0000770
Brian Paul2a77c3c2010-12-14 12:45:36 -0700771.. math::
772
773 coord.x = src.x
774
775 coord.y = src.y
776
777 coord.z = src.z
778
779 coord.w = 1.0
780
  bias = src.w
782
783 dst = texture_sample(unit, coord, bias)
Keith Whitwella62aaa72009-12-21 23:25:15 +0000784
785
Corbin Simpson85805222010-02-02 16:20:12 -0800786.. opcode:: NRM - 3-component Vector Normalise
Keith Whitwella62aaa72009-12-21 23:25:15 +0000787
Corbin Simpsone8ed3b92009-12-21 19:12:55 -0800788.. math::
789
  dst.x = src.x / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}

  dst.y = src.y / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}

  dst.z = src.z / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800795
796 dst.w = 1
Keith Whitwella62aaa72009-12-21 23:25:15 +0000797
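A small Python sketch of NRM, under the assumption that the divisor is the Euclidean length of the xyz part (the square root of the sum of squares), which is what a normalise operation conventionally computes. The name ``nrm`` is illustrative only, and a zero-length input is not handled.

```python
import math


def nrm(src):
    """Normalise the xyz part of src; w is set to 1."""
    x, y, z, _ = src
    # Assumption: divide by the vector length, not by the squared length.
    length = math.sqrt(x * x + y * y + z * z)
    return (x / length, y / length, z / length, 1.0)
```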
798
Corbin Simpson85805222010-02-02 16:20:12 -0800799.. opcode:: DIV - Divide
Keith Whitwella62aaa72009-12-21 23:25:15 +0000800
Corbin Simpsone8ed3b92009-12-21 19:12:55 -0800801.. math::
802
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800803 dst.x = \frac{src0.x}{src1.x}
804
805 dst.y = \frac{src0.y}{src1.y}
806
807 dst.z = \frac{src0.z}{src1.z}
808
809 dst.w = \frac{src0.w}{src1.w}
Keith Whitwella62aaa72009-12-21 23:25:15 +0000810
811
Corbin Simpson85805222010-02-02 16:20:12 -0800812.. opcode:: DP2 - 2-component Dot Product
Keith Whitwella62aaa72009-12-21 23:25:15 +0000813
Corbin Simpson17c2a442010-02-02 17:02:28 -0800814This instruction replicates its result.
815
Corbin Simpsone8ed3b92009-12-21 19:12:55 -0800816.. math::
817
Corbin Simpson17c2a442010-02-02 17:02:28 -0800818 dst = src0.x \times src1.x + src0.y \times src1.y
Keith Whitwella62aaa72009-12-21 23:25:15 +0000819
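To illustrate what replication means for DP2, the following Python sketch computes the scalar dot product and copies it into all four components of the result. The name ``dp2`` is a hypothetical helper.

```python
def dp2(src0, src1):
    """2-component dot product, replicated to x, y, z and w."""
    scalar = src0[0] * src1[0] + src0[1] * src1[1]
    return (scalar,) * 4
```

The z and w components of the sources do not contribute to the result.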
820
Brian Paul2a77c3c2010-12-14 12:45:36 -0700821.. opcode:: TXL - Texture Lookup With explicit LOD
Keith Whitwella62aaa72009-12-21 23:25:15 +0000822
Brian Paul2a77c3c2010-12-14 12:45:36 -0700823.. math::
824
825 coord.x = src0.x
826
827 coord.y = src0.y
828
829 coord.z = src0.z
830
831 coord.w = 1.0
832
833 lod = src0.w
834
835 dst = texture_sample(unit, coord, lod)
Keith Whitwella62aaa72009-12-21 23:25:15 +0000836
837
Corbin Simpson85805222010-02-02 16:20:12 -0800838.. opcode:: BRK - Break
Keith Whitwella62aaa72009-12-21 23:25:15 +0000839
840 TBD
841
842
Corbin Simpson85805222010-02-02 16:20:12 -0800843.. opcode:: IF - If
Keith Whitwella62aaa72009-12-21 23:25:15 +0000844
845 TBD
846
847
Corbin Simpson85805222010-02-02 16:20:12 -0800848.. opcode:: ELSE - Else
Keith Whitwella62aaa72009-12-21 23:25:15 +0000849
850 TBD
851
852
Corbin Simpson85805222010-02-02 16:20:12 -0800853.. opcode:: ENDIF - End If
Keith Whitwella62aaa72009-12-21 23:25:15 +0000854
855 TBD
856
857
Corbin Simpson85805222010-02-02 16:20:12 -0800858.. opcode:: PUSHA - Push Address Register On Stack
Keith Whitwella62aaa72009-12-21 23:25:15 +0000859
860 push(src.x)
861 push(src.y)
862 push(src.z)
863 push(src.w)
864
Corbin Simpson17c2a442010-02-02 17:02:28 -0800865.. note::
866
867 Considered for cleanup.
868
869.. note::
870
871 Considered for removal.
Keith Whitwella62aaa72009-12-21 23:25:15 +0000872
Corbin Simpson85805222010-02-02 16:20:12 -0800873.. opcode:: POPA - Pop Address Register From Stack
Keith Whitwella62aaa72009-12-21 23:25:15 +0000874
875 dst.w = pop()
876 dst.z = pop()
877 dst.y = pop()
878 dst.x = pop()
879
Corbin Simpson17c2a442010-02-02 17:02:28 -0800880.. note::
881
882 Considered for cleanup.
883
884.. note::
885
886 Considered for removal.
Keith Whitwell14eacb02009-12-21 23:38:29 +0000887
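The push and pop orders of PUSHA and POPA mirror each other, so a push followed by a pop restores the original register. A minimal Python sketch, using a list as the stack (``pusha`` and ``popa`` are illustrative names):

```python
def pusha(stack, src):
    """Push x, y, z, w in that order."""
    for component in src:
        stack.append(component)


def popa(stack):
    """Pop w, z, y, x in that order and return them as (x, y, z, w)."""
    w = stack.pop()
    z = stack.pop()
    y = stack.pop()
    x = stack.pop()
    return (x, y, z, w)
```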
Keith Whitwella62aaa72009-12-21 23:25:15 +0000888
Corbin Simpson9d4cb6e2010-06-16 18:34:32 -0700889Compute ISA
Corbin Simpson5bcd26c2009-12-21 21:04:10 -0800890^^^^^^^^^^^^^^^^^^^^^^^^
Keith Whitwella62aaa72009-12-21 23:25:15 +0000891
These opcodes are primarily provided for special-use computational shaders.
Support for these opcodes is indicated by a special pipe capability bit (TBD).
Keith Whitwella62aaa72009-12-21 23:25:15 +0000894
Corbin Simpson9d4cb6e2010-06-16 18:34:32 -0700895XXX so let's discuss it, yeah?
896
Corbin Simpson85805222010-02-02 16:20:12 -0800897.. opcode:: CEIL - Ceiling
Keith Whitwella62aaa72009-12-21 23:25:15 +0000898
Corbin Simpsone8ed3b92009-12-21 19:12:55 -0800899.. math::
900
Corbin Simpson14743ac2009-12-21 19:57:56 -0800901 dst.x = \lceil src.x\rceil
902
903 dst.y = \lceil src.y\rceil
904
905 dst.z = \lceil src.z\rceil
906
907 dst.w = \lceil src.w\rceil
Keith Whitwella62aaa72009-12-21 23:25:15 +0000908
909
Corbin Simpson85805222010-02-02 16:20:12 -0800910.. opcode:: I2F - Integer To Float
Keith Whitwella62aaa72009-12-21 23:25:15 +0000911
Corbin Simpsone8ed3b92009-12-21 19:12:55 -0800912.. math::
913
Keith Whitwella62aaa72009-12-21 23:25:15 +0000914 dst.x = (float) src.x
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800915
Keith Whitwella62aaa72009-12-21 23:25:15 +0000916 dst.y = (float) src.y
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800917
Keith Whitwella62aaa72009-12-21 23:25:15 +0000918 dst.z = (float) src.z
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800919
Keith Whitwella62aaa72009-12-21 23:25:15 +0000920 dst.w = (float) src.w
921
922
Corbin Simpson85805222010-02-02 16:20:12 -0800923.. opcode:: NOT - Bitwise Not
Keith Whitwella62aaa72009-12-21 23:25:15 +0000924
Corbin Simpsone8ed3b92009-12-21 19:12:55 -0800925.. math::
926
Keith Whitwella62aaa72009-12-21 23:25:15 +0000927 dst.x = ~src.x
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800928
Keith Whitwella62aaa72009-12-21 23:25:15 +0000929 dst.y = ~src.y
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800930
Keith Whitwella62aaa72009-12-21 23:25:15 +0000931 dst.z = ~src.z
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800932
Keith Whitwella62aaa72009-12-21 23:25:15 +0000933 dst.w = ~src.w
934
935
Corbin Simpson85805222010-02-02 16:20:12 -0800936.. opcode:: TRUNC - Truncate
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800937
Corbin Simpsone8ed3b92009-12-21 19:12:55 -0800938.. math::
939
Keith Whitwella62aaa72009-12-21 23:25:15 +0000940 dst.x = trunc(src.x)
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800941
Keith Whitwella62aaa72009-12-21 23:25:15 +0000942 dst.y = trunc(src.y)
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800943
Keith Whitwella62aaa72009-12-21 23:25:15 +0000944 dst.z = trunc(src.z)
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800945
Keith Whitwella62aaa72009-12-21 23:25:15 +0000946 dst.w = trunc(src.w)
947
948
Corbin Simpson85805222010-02-02 16:20:12 -0800949.. opcode:: SHL - Shift Left
Keith Whitwella62aaa72009-12-21 23:25:15 +0000950
Corbin Simpsone8ed3b92009-12-21 19:12:55 -0800951.. math::
952
Keith Whitwella62aaa72009-12-21 23:25:15 +0000953 dst.x = src0.x << src1.x
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800954
Keith Whitwella62aaa72009-12-21 23:25:15 +0000955 dst.y = src0.y << src1.x
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800956
Keith Whitwella62aaa72009-12-21 23:25:15 +0000957 dst.z = src0.z << src1.x
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800958
Keith Whitwella62aaa72009-12-21 23:25:15 +0000959 dst.w = src0.w << src1.x
960
961
Corbin Simpson85805222010-02-02 16:20:12 -0800962.. opcode:: SHR - Shift Right
Keith Whitwella62aaa72009-12-21 23:25:15 +0000963
Corbin Simpsone8ed3b92009-12-21 19:12:55 -0800964.. math::
965
Keith Whitwella62aaa72009-12-21 23:25:15 +0000966 dst.x = src0.x >> src1.x
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800967
Keith Whitwella62aaa72009-12-21 23:25:15 +0000968 dst.y = src0.y >> src1.x
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800969
Keith Whitwella62aaa72009-12-21 23:25:15 +0000970 dst.z = src0.z >> src1.x
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800971
Keith Whitwella62aaa72009-12-21 23:25:15 +0000972 dst.w = src0.w >> src1.x
973
974
Corbin Simpson85805222010-02-02 16:20:12 -0800975.. opcode:: AND - Bitwise And
Keith Whitwella62aaa72009-12-21 23:25:15 +0000976
Corbin Simpsone8ed3b92009-12-21 19:12:55 -0800977.. math::
978
Keith Whitwella62aaa72009-12-21 23:25:15 +0000979 dst.x = src0.x & src1.x
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800980
Keith Whitwella62aaa72009-12-21 23:25:15 +0000981 dst.y = src0.y & src1.y
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800982
Keith Whitwella62aaa72009-12-21 23:25:15 +0000983 dst.z = src0.z & src1.z
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800984
Keith Whitwella62aaa72009-12-21 23:25:15 +0000985 dst.w = src0.w & src1.w
986
987
Corbin Simpson85805222010-02-02 16:20:12 -0800988.. opcode:: OR - Bitwise Or
Keith Whitwella62aaa72009-12-21 23:25:15 +0000989
Corbin Simpsone8ed3b92009-12-21 19:12:55 -0800990.. math::
991
Keith Whitwella62aaa72009-12-21 23:25:15 +0000992 dst.x = src0.x | src1.x
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800993
Keith Whitwella62aaa72009-12-21 23:25:15 +0000994 dst.y = src0.y | src1.y
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800995
Keith Whitwella62aaa72009-12-21 23:25:15 +0000996 dst.z = src0.z | src1.z
Corbin Simpsonda65ac62009-12-21 20:32:46 -0800997
Keith Whitwella62aaa72009-12-21 23:25:15 +0000998 dst.w = src0.w | src1.w
999
1000
Corbin Simpson85805222010-02-02 16:20:12 -08001001.. opcode:: MOD - Modulus
Keith Whitwella62aaa72009-12-21 23:25:15 +00001002
Corbin Simpsone8ed3b92009-12-21 19:12:55 -08001003.. math::
1004
Corbin Simpsonda65ac62009-12-21 20:32:46 -08001005 dst.x = src0.x \bmod src1.x
1006
1007 dst.y = src0.y \bmod src1.y
1008
1009 dst.z = src0.z \bmod src1.z
1010
1011 dst.w = src0.w \bmod src1.w
Keith Whitwella62aaa72009-12-21 23:25:15 +00001012
1013
Corbin Simpson85805222010-02-02 16:20:12 -08001014.. opcode:: XOR - Bitwise Xor
Keith Whitwella62aaa72009-12-21 23:25:15 +00001015
Corbin Simpsone8ed3b92009-12-21 19:12:55 -08001016.. math::
1017
Corbin Simpsonf90733c2010-01-18 17:37:25 -08001018 dst.x = src0.x \oplus src1.x
Corbin Simpsonda65ac62009-12-21 20:32:46 -08001019
Corbin Simpsonf90733c2010-01-18 17:37:25 -08001020 dst.y = src0.y \oplus src1.y
Corbin Simpsonda65ac62009-12-21 20:32:46 -08001021
Corbin Simpsonf90733c2010-01-18 17:37:25 -08001022 dst.z = src0.z \oplus src1.z
Corbin Simpsonda65ac62009-12-21 20:32:46 -08001023
Corbin Simpsonf90733c2010-01-18 17:37:25 -08001024 dst.w = src0.w \oplus src1.w
Keith Whitwella62aaa72009-12-21 23:25:15 +00001025
1026
Bryan Cain324ac982011-09-10 12:31:54 -05001027.. opcode:: UCMP - Integer Conditional Move
1028
1029.. math::
1030
1031 dst.x = src0.x ? src1.x : src2.x
1032
1033 dst.y = src0.y ? src1.y : src2.y
1034
1035 dst.z = src0.z ? src1.z : src2.z
1036
1037 dst.w = src0.w ? src1.w : src2.w
1038
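UCMP differs from CMP in that the condition is an integer tested against zero rather than a float tested for being negative. A Python sketch, with ``ucmp`` as a hypothetical helper name:

```python
def ucmp(src0, src1, src2):
    """Pick src1 where src0 is non-zero, src2 where it is zero."""
    return tuple(b if a != 0 else c for a, b, c in zip(src0, src1, src2))
```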
1039
1040.. opcode:: UARL - Integer Address Register Load
1041
1042 Moves the contents of the source register, assumed to be an integer, into the
1043 destination register, which is assumed to be an address (ADDR) register.
1044
1045
Bryan Cain4c0f1fb2012-01-07 10:43:04 -06001046.. opcode:: IABS - Integer Absolute Value
1047
1048.. math::
1049
1050 dst.x = |src.x|
1051
1052 dst.y = |src.y|
1053
1054 dst.z = |src.z|
1055
1056 dst.w = |src.w|
1057
1058
Corbin Simpson85805222010-02-02 16:20:12 -08001059.. opcode:: SAD - Sum Of Absolute Differences
Keith Whitwella62aaa72009-12-21 23:25:15 +00001060
Corbin Simpsone8ed3b92009-12-21 19:12:55 -08001061.. math::
1062
Corbin Simpson14743ac2009-12-21 19:57:56 -08001063 dst.x = |src0.x - src1.x| + src2.x
1064
1065 dst.y = |src0.y - src1.y| + src2.y
1066
1067 dst.z = |src0.z - src1.z| + src2.z
1068
1069 dst.w = |src0.w - src1.w| + src2.w
Keith Whitwella62aaa72009-12-21 23:25:15 +00001070
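The SAD formulas can be sketched in Python as an absolute difference plus an accumulator per component. This sketch uses unbounded Python integers and therefore ignores any wrap-around a fixed-width implementation would have; ``sad`` is an illustrative name.

```python
def sad(src0, src1, src2):
    """Per component: |src0 - src1| + src2."""
    return tuple(abs(a - b) + c for a, b, c in zip(src0, src1, src2))
```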
1071
.. opcode:: TXF - Texel Fetch (as per NV_gpu_shader4), extract a single texel
                   from a specified texture image. The source sampler may
                   not be a CUBE or SHADOW.
                   src 0 is a four-component signed integer vector used to
                   identify the single texel accessed: 3 components + level.
                   src 1 is a 3-component constant signed integer vector,
                   with each component having a range of only
                   -8..+8 (hardware only seems to deal with this range; the
                   interface allows for up to unsigned int).
                   TXF(uint_vec coord, int_vec offset).
Keith Whitwella62aaa72009-12-21 23:25:15 +00001082
1083
.. opcode:: TXQ - Texture Size Query (as per NV_gpu_program4),
                   retrieve the dimensions of the texture
                   depending on the target: for 1D (width), 2D/RECT/CUBE
                   (width, height), 3D (width, height, depth),
                   1D array (width, layers), 2D array (width, height, layers).
Keith Whitwella62aaa72009-12-21 23:25:15 +00001089
Dave Airlie6fb12bf2011-08-25 13:03:19 +01001090.. math::
1091
1092 lod = src0
1093
1094 dst.x = texture_width(unit, lod)
1095
1096 dst.y = texture_height(unit, lod)
1097
1098 dst.z = texture_depth(unit, lod)
Keith Whitwella62aaa72009-12-21 23:25:15 +00001099
1100
Corbin Simpson85805222010-02-02 16:20:12 -08001101.. opcode:: CONT - Continue
Keith Whitwella62aaa72009-12-21 23:25:15 +00001102
1103 TBD
1104
Corbin Simpson9d4cb6e2010-06-16 18:34:32 -07001105.. note::
Keith Whitwella62aaa72009-12-21 23:25:15 +00001106
Corbin Simpson9d4cb6e2010-06-16 18:34:32 -07001107 Support for CONT is determined by a special capability bit,
1108 ``TGSI_CONT_SUPPORTED``. See :ref:`Screen` for more information.
1109
1110
1111Geometry ISA
Corbin Simpson5bcd26c2009-12-21 21:04:10 -08001112^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Keith Whitwella62aaa72009-12-21 23:25:15 +00001113
Corbin Simpson9d4cb6e2010-06-16 18:34:32 -07001114These opcodes are only supported in geometry shaders; they have no meaning
1115in any other type of shader.
Keith Whitwella62aaa72009-12-21 23:25:15 +00001116
Corbin Simpson85805222010-02-02 16:20:12 -08001117.. opcode:: EMIT - Emit
Keith Whitwella62aaa72009-12-21 23:25:15 +00001118
1119 TBD
1120
1121
Corbin Simpson85805222010-02-02 16:20:12 -08001122.. opcode:: ENDPRIM - End Primitive
Keith Whitwella62aaa72009-12-21 23:25:15 +00001123
1124 TBD
1125
1126
Corbin Simpson9d4cb6e2010-06-16 18:34:32 -07001127GLSL ISA
Corbin Simpson5bcd26c2009-12-21 21:04:10 -08001128^^^^^^^^^^
Keith Whitwella62aaa72009-12-21 23:25:15 +00001129
Corbin Simpson9d4cb6e2010-06-16 18:34:32 -07001130These opcodes are part of :term:`GLSL`'s opcode set. Support for these
1131opcodes is determined by a special capability bit, ``GLSL``.
Keith Whitwella62aaa72009-12-21 23:25:15 +00001132
Corbin Simpson85805222010-02-02 16:20:12 -08001133.. opcode:: BGNLOOP - Begin a Loop
Keith Whitwella62aaa72009-12-21 23:25:15 +00001134
1135 TBD
1136
1137
Corbin Simpson85805222010-02-02 16:20:12 -08001138.. opcode:: BGNSUB - Begin Subroutine
Keith Whitwella62aaa72009-12-21 23:25:15 +00001139
1140 TBD
1141
1142
Corbin Simpson85805222010-02-02 16:20:12 -08001143.. opcode:: ENDLOOP - End a Loop
Keith Whitwella62aaa72009-12-21 23:25:15 +00001144
1145 TBD
1146
1147
Corbin Simpson85805222010-02-02 16:20:12 -08001148.. opcode:: ENDSUB - End Subroutine
Keith Whitwella62aaa72009-12-21 23:25:15 +00001149
1150 TBD
1151
1152
Corbin Simpson85805222010-02-02 16:20:12 -08001153.. opcode:: NOP - No Operation
Keith Whitwella62aaa72009-12-21 23:25:15 +00001154
Michal Krol8ab89d72010-01-04 13:23:41 +01001155 Do nothing.
1156
Keith Whitwella62aaa72009-12-21 23:25:15 +00001157
Corbin Simpson85805222010-02-02 16:20:12 -08001158.. opcode:: NRM4 - 4-component Vector Normalise
Keith Whitwella62aaa72009-12-21 23:25:15 +00001159
Corbin Simpson17c2a442010-02-02 17:02:28 -08001160This instruction replicates its result.
1161
Corbin Simpsone8ed3b92009-12-21 19:12:55 -08001162.. math::
1163
  dst = \frac{src.x}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}}
Keith Whitwella62aaa72009-12-21 23:25:15 +00001165
1166
Corbin Simpsonda65ac62009-12-21 20:32:46 -08001167ps_2_x
Corbin Simpson5bcd26c2009-12-21 21:04:10 -08001168^^^^^^^^^^^^
Keith Whitwella62aaa72009-12-21 23:25:15 +00001169
Corbin Simpson9d4cb6e2010-06-16 18:34:32 -07001170XXX wait what
Keith Whitwella62aaa72009-12-21 23:25:15 +00001171
Corbin Simpson85805222010-02-02 16:20:12 -08001172.. opcode:: CALLNZ - Subroutine Call If Not Zero
Keith Whitwella62aaa72009-12-21 23:25:15 +00001173
1174 TBD
1175
1176
Corbin Simpson85805222010-02-02 16:20:12 -08001177.. opcode:: IFC - If
Keith Whitwella62aaa72009-12-21 23:25:15 +00001178
1179 TBD
1180
1181
Corbin Simpson85805222010-02-02 16:20:12 -08001182.. opcode:: BREAKC - Break Conditional
Keith Whitwella62aaa72009-12-21 23:25:15 +00001183
1184 TBD
1185
Corbin Simpson62ca7b82010-02-02 16:36:34 -08001186.. _doubleopcodes:
1187
Corbin Simpson9d4cb6e2010-06-16 18:34:32 -07001188Double ISA
Igor Oliveiradb89bf42010-01-25 19:23:04 -04001189^^^^^^^^^^^^^^^
1190
Corbin Simpsondbc95e82010-06-16 18:34:51 -07001191The double-precision opcodes reinterpret four-component vectors into
1192two-component vectors with doubled precision in each component.
1193
1194Support for these opcodes is XXX undecided. :T
1195
1196.. opcode:: DADD - Add
Igor Oliveiradb89bf42010-01-25 19:23:04 -04001197
1198.. math::
1199
1200 dst.xy = src0.xy + src1.xy
1201
1202 dst.zw = src0.zw + src1.zw
1203
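The pairing of components into doubles can be sketched in Python with the ``struct`` module. This is an assumption-laden illustration: it treats each register as four 32-bit words and assumes the first word of each pair holds the low half of the double (little-endian order), which this document does not specify. All function names are hypothetical.

```python
import struct


def pack_double(value):
    """Split a double into two 32-bit words (assumed order: low, high)."""
    return struct.unpack("<II", struct.pack("<d", value))


def unpack_double(lo, hi):
    """Rebuild a double from two 32-bit words (assumed order: low, high)."""
    return struct.unpack("<d", struct.pack("<II", lo, hi))[0]


def dadd(src0, src1):
    """DADD on registers of four 32-bit words: xy and zw are added as doubles."""
    xy = unpack_double(src0[0], src0[1]) + unpack_double(src1[0], src1[1])
    zw = unpack_double(src0[2], src0[3]) + unpack_double(src1[2], src1[3])
    return pack_double(xy) + pack_double(zw)
```

A register built as ``pack_double(1.5) + pack_double(-2.0)`` therefore carries two doubles in its four components.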
1204
Corbin Simpsondbc95e82010-06-16 18:34:51 -07001205.. opcode:: DDIV - Divide
Igor Oliveiradb89bf42010-01-25 19:23:04 -04001206
1207.. math::
1208
1209 dst.xy = src0.xy / src1.xy
1210
1211 dst.zw = src0.zw / src1.zw
1212
Corbin Simpsondbc95e82010-06-16 18:34:51 -07001213.. opcode:: DSEQ - Set on Equal
Igor Oliveiradb89bf42010-01-25 19:23:04 -04001214
1215.. math::
1216
1217 dst.xy = src0.xy == src1.xy ? 1.0F : 0.0F
1218
1219 dst.zw = src0.zw == src1.zw ? 1.0F : 0.0F
1220
Corbin Simpsondbc95e82010-06-16 18:34:51 -07001221.. opcode:: DSLT - Set on Less than
Igor Oliveiradb89bf42010-01-25 19:23:04 -04001222
1223.. math::
1224
1225 dst.xy = src0.xy < src1.xy ? 1.0F : 0.0F
1226
1227 dst.zw = src0.zw < src1.zw ? 1.0F : 0.0F
1228
Corbin Simpsondbc95e82010-06-16 18:34:51 -07001229.. opcode:: DFRAC - Fraction
Igor Oliveiradb89bf42010-01-25 19:23:04 -04001230
1231.. math::
1232
1233 dst.xy = src.xy - \lfloor src.xy\rfloor
1234
1235 dst.zw = src.zw - \lfloor src.zw\rfloor
1236
1237
Corbin Simpsondbc95e82010-06-16 18:34:51 -07001238.. opcode:: DFRACEXP - Convert Number to Fractional and Integral Components
Igor Oliveiradb89bf42010-01-25 19:23:04 -04001239
Corbin Simpsonf98c4622010-06-16 18:45:50 -07001240Like the ``frexp()`` routine in many math libraries, this opcode stores the
1241exponent of its source to ``dst0``, and the significand to ``dst1``, such that
1242:math:`dst1 \times 2^{dst0} = src` .
Igor Oliveiradb89bf42010-01-25 19:23:04 -04001243
1244.. math::
1245
Corbin Simpsonf98c4622010-06-16 18:45:50 -07001246 dst0.xy = exp(src.xy)
Igor Oliveiradb89bf42010-01-25 19:23:04 -04001247
Corbin Simpsonf98c4622010-06-16 18:45:50 -07001248 dst1.xy = frac(src.xy)
1249
1250 dst0.zw = exp(src.zw)
1251
1252 dst1.zw = frac(src.zw)
1253
1254.. opcode:: DLDEXP - Multiply Number by Integral Power of 2
1255
1256This opcode is the inverse of :opcode:`DFRACEXP`.
1257
1258.. math::
1259
1260 dst.xy = src0.xy \times 2^{src1.xy}
1261
1262 dst.zw = src0.zw \times 2^{src1.zw}
Igor Oliveiradb89bf42010-01-25 19:23:04 -04001263
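The inverse relationship between DFRACEXP and DLDEXP can be checked with Python's ``math.frexp`` and ``math.ldexp``. The sketch assumes the same normalisation as the C library ``frexp()`` (significand magnitude in [0.5, 1)); whether TGSI implementations use that exact range is not stated here. The names ``dfracexp`` and ``dldexp`` are illustrative and operate on a single double.

```python
import math


def dfracexp(value):
    """Return (dst0, dst1) = (exponent, significand) of value."""
    significand, exponent = math.frexp(value)
    return exponent, significand


def dldexp(significand, exponent):
    """Return significand * 2**exponent."""
    return math.ldexp(significand, exponent)
```

For 48.0 the exponent is 6 and the significand is 0.75, and ``dldexp(0.75, 6)`` gives back 48.0.
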
Corbin Simpsondbc95e82010-06-16 18:34:51 -07001264.. opcode:: DMIN - Minimum
Igor Oliveiradb89bf42010-01-25 19:23:04 -04001265
1266.. math::
1267
1268 dst.xy = min(src0.xy, src1.xy)
1269
1270 dst.zw = min(src0.zw, src1.zw)
1271
Corbin Simpsondbc95e82010-06-16 18:34:51 -07001272.. opcode:: DMAX - Maximum
Igor Oliveiradb89bf42010-01-25 19:23:04 -04001273
1274.. math::
1275
1276 dst.xy = max(src0.xy, src1.xy)
1277
1278 dst.zw = max(src0.zw, src1.zw)
1279
Corbin Simpsondbc95e82010-06-16 18:34:51 -07001280.. opcode:: DMUL - Multiply
Igor Oliveiradb89bf42010-01-25 19:23:04 -04001281
1282.. math::
1283
1284 dst.xy = src0.xy \times src1.xy
1285
1286 dst.zw = src0.zw \times src1.zw
1287
1288
Corbin Simpsondbc95e82010-06-16 18:34:51 -07001289.. opcode:: DMAD - Multiply And Add
Igor Oliveiradb89bf42010-01-25 19:23:04 -04001290
1291.. math::
1292
1293 dst.xy = src0.xy \times src1.xy + src2.xy
1294
1295 dst.zw = src0.zw \times src1.zw + src2.zw
1296
1297
Corbin Simpsondbc95e82010-06-16 18:34:51 -07001298.. opcode:: DRCP - Reciprocal
Igor Oliveiradb89bf42010-01-25 19:23:04 -04001299
1300.. math::
1301
1302 dst.xy = \frac{1}{src.xy}
1303
1304 dst.zw = \frac{1}{src.zw}
1305
Corbin Simpsondbc95e82010-06-16 18:34:51 -07001306.. opcode:: DSQRT - Square Root
Igor Oliveiradb89bf42010-01-25 19:23:04 -04001307
1308.. math::
1309
1310 dst.xy = \sqrt{src.xy}
1311
1312 dst.zw = \sqrt{src.zw}
1313
Keith Whitwella62aaa72009-12-21 23:25:15 +00001314
Francisco Jereza5f44cc2012-05-01 02:38:51 +02001315.. _samplingopcodes:
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001316
Francisco Jereza5f44cc2012-05-01 02:38:51 +02001317Resource Sampling Opcodes
1318^^^^^^^^^^^^^^^^^^^^^^^^^
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001319
These opcodes closely follow the semantics of the respective Direct3D
instructions. If in doubt, double-check the Direct3D documentation.
1322
.. opcode:: SAMPLE - Using the provided address, sample data from the
               specified texture using the filtering mode identified
               by the given sampler. The source data may come from
               any resource type other than buffers.
               SAMPLE dst, address, sampler_view, sampler
               e.g.
               SAMPLE TEMP[0], TEMP[1], SVIEW[0], SAMP[0]
1330
.. opcode:: SAMPLE_I - Simplified alternative to the SAMPLE instruction.
               Using the provided integer address, SAMPLE_I fetches data
               from the specified sampler view without any filtering.
               The source data may come from any resource type other
               than CUBE.
               SAMPLE_I dst, address, sampler_view
               e.g.
               SAMPLE_I TEMP[0], TEMP[1], SVIEW[0]
               The 'address' is specified as unsigned integers. If the
               'address' is out of range [0...(# texels - 1)], the
               result of the fetch is always 0 in all components.
               As such, the instruction doesn't honor address wrap
               modes; in cases where that behavior is desirable, the
               'SAMPLE' instruction should be used.
               address.w always provides an unsigned integer mipmap
               level. If the value is out of range, then the
               instruction always returns 0 in all components.
               address.yz are ignored for buffers and 1D textures.
               address.z is ignored for 1D texture arrays and 2D
               textures.
               For 1D texture arrays, address.y provides the array
               index (also as an unsigned integer). If the value is
               out of the range of available array indices
               [0...(array size - 1)], then the opcode always returns
               0 in all components.
               For 2D texture arrays, address.z provides the array
               index; otherwise it exhibits the same behavior as in
               the case of 1D texture arrays.
               The exact semantics of the source address are presented
               in the table below:
               resource type         X     Y     Z       W
               -------------         ------------------------
               PIPE_BUFFER           x                ignored
               PIPE_TEXTURE_1D       x                  mpl
               PIPE_TEXTURE_2D       x     y            mpl
               PIPE_TEXTURE_3D       x     y     z      mpl
               PIPE_TEXTURE_RECT     x     y            mpl
               PIPE_TEXTURE_CUBE     not allowed as source
               PIPE_TEXTURE_1D_ARRAY x    idx           mpl
               PIPE_TEXTURE_2D_ARRAY x     y    idx     mpl

               Where 'mpl' is a mipmap level and 'idx' is the
               array index.
1374
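The address layout described for SAMPLE_I can be written down as a small lookup. The following Python sketch only decodes which components act as texel coordinates, array index and mipmap level for each resource type; it performs no fetch and no range checking. The helper name ``decode_sample_i_address`` and the dictionary are hypothetical, not part of Gallium.

```python
# Resource type -> (coordinate components, array index component or None).
SAMPLE_I_LAYOUT = {
    "PIPE_BUFFER": ("x", None),
    "PIPE_TEXTURE_1D": ("x", None),
    "PIPE_TEXTURE_2D": ("xy", None),
    "PIPE_TEXTURE_3D": ("xyz", None),
    "PIPE_TEXTURE_RECT": ("xy", None),
    "PIPE_TEXTURE_1D_ARRAY": ("x", "y"),
    "PIPE_TEXTURE_2D_ARRAY": ("xy", "z"),
}


def decode_sample_i_address(resource_type, address):
    """Return (coords, array_index, mip_level) for a 4-component address."""
    if resource_type not in SAMPLE_I_LAYOUT:
        # PIPE_TEXTURE_CUBE is not allowed as a source for SAMPLE_I.
        raise ValueError("unsupported resource type: " + resource_type)
    components = dict(zip("xyzw", address))
    coord_names, index_name = SAMPLE_I_LAYOUT[resource_type]
    coords = tuple(components[name] for name in coord_names)
    array_index = components[index_name] if index_name else None
    # W carries the mipmap level, except for buffers where it is ignored.
    mip_level = None if resource_type == "PIPE_BUFFER" else components["w"]
    return coords, array_index, mip_level
```

For instance, a ``PIPE_TEXTURE_2D_ARRAY`` address of ``(3, 4, 5, 1)`` decodes to texel ``(3, 4)`` in array slice 5 at mipmap level 1.
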
.. opcode:: SAMPLE_I_MS - Just like SAMPLE_I but allows fetching data from
               multi-sampled surfaces.
1377
.. opcode:: SAMPLE_B - Just like the SAMPLE instruction with the
               exception that an additional bias is applied to the
               level of detail computed as part of the instruction
               execution.
               SAMPLE_B dst, address, sampler_view, sampler, lod_bias
               e.g.
               SAMPLE_B TEMP[0], TEMP[1], SVIEW[0], SAMP[0], TEMP[2].x
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001385
.. opcode:: SAMPLE_C - Similar to the SAMPLE instruction but it
               performs a comparison filter. The operands to SAMPLE_C
               are identical to SAMPLE, except that there is an additional
               float32 operand, the reference value, which must be a
               single-component register or a scalar literal.
               SAMPLE_C makes the hardware use the current sampler's
               compare_func (in pipe_sampler_state) to compare the
               reference value against the red component value for the
               source resource at each texel that the currently configured
               texture filter covers based on the provided coordinates.
               SAMPLE_C dst, address, sampler_view.r, sampler, ref_value
               e.g.
               SAMPLE_C TEMP[0], TEMP[1], SVIEW[0].r, SAMP[0], TEMP[2].x
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001399
1400.. opcode:: SAMPLE_C_LZ - Same as SAMPLE_C, but LOD is 0 and derivatives
1401 are ignored. The LZ stands for level-zero.
Francisco Jereza5f44cc2012-05-01 02:38:51 +02001402 SAMPLE_C_LZ dst, address, sampler_view.r, sampler, ref_value
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001403 e.g.
Francisco Jereza5f44cc2012-05-01 02:38:51 +02001404 SAMPLE_C_LZ TEMP[0], TEMP[1], SVIEW[0].r, SAMP[0], TEMP[2].x
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001405
1406
1407.. opcode:: SAMPLE_D - SAMPLE_D is identical to the SAMPLE opcode except
1408 that the derivatives for the source address in the x
1409 direction and the y direction are provided by extra
1410 parameters.
Francisco Jereza5f44cc2012-05-01 02:38:51 +02001411 SAMPLE_D dst, address, sampler_view, sampler, der_x, der_y
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001412 e.g.
Francisco Jereza5f44cc2012-05-01 02:38:51 +02001413 SAMPLE_D TEMP[0], TEMP[1], SVIEW[0], SAMP[0], TEMP[2], TEMP[3]
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001414
.. opcode:: SAMPLE_L - SAMPLE_L is identical to the SAMPLE opcode except
               that the LOD is provided directly as a scalar value,
               representing no anisotropy. The source address's A channel
               is used as the LOD.
               SAMPLE_L dst, address, sampler_view, sampler
               e.g.
               SAMPLE_L TEMP[0], TEMP[1], SVIEW[0], SAMP[0]
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001422
.. opcode:: GATHER4 - Gathers the four texels to be used in a bi-linear
   filtering operation and packs them into a single register.
   Only works with 2D, 2D array, cubemap, and cubemap array textures.
   For 2D textures, only the addressing modes of the sampler and
   the top level of any mip pyramid are used. Set W to zero.
   It behaves like the SAMPLE instruction, but a filtered
   sample is not generated. The four samples that contribute
   to filtering are placed into xyzw in counter-clockwise order,
   starting with the (u,v) texture coordinate delta at the
   following locations (-, +), (+, +), (+, -), (-, -), where
   the magnitude of the deltas is half a texel.
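
The ordering of the four GATHER4 samples can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that (u, v) is already expressed in texel units, so that half a texel is simply 0.5.

```python
# Sign pattern of the (u, v) deltas for the x, y, z, w result components.
GATHER4_SIGNS = [(-1, +1), (+1, +1), (+1, -1), (-1, -1)]

def gather4_coords(u, v):
    """Return the four sample positions in xyzw order, each half a
    texel away from (u, v) in both directions."""
    return [(u + 0.5 * su, v + 0.5 * sv) for su, sv in GATHER4_SIGNS]
```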
1434
1435
.. opcode:: SVIEWINFO - Query the dimensions of a given sampler view.
   dst receives width, height, depth or array size and
   number of mipmap levels. The dst can have a writemask
   which specifies which information the caller is
   interested in.
   SVIEWINFO dst, src_mip_level, sampler_view
   e.g.
   SVIEWINFO TEMP[0], TEMP[1].x, SVIEW[0]
   src_mip_level is an unsigned integer scalar. If it is
   out of range, 0 is returned for width, height and
   depth/array size, but the total number of mipmap levels
   is still returned correctly for the given sampler view.
   The returned width, height and depth values are for
   the mipmap level selected by src_mip_level and
   are given in texels.
   For a 1D texture array, width is in dst.x, array size
   is in dst.y and dst.zw are always 0.
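
A small Python model may help make the out-of-range behaviour of SVIEWINFO concrete. The function name is hypothetical. Two further assumptions are not stated in the text: the mipmap level count is returned in the fourth component, and each level halves the previous one with a lower bound of 1 texel (the usual mipmap convention).

```python
def sviewinfo(width, height, depth, num_levels, mip_level):
    """Model the (x, y, z, w) result for a sampler view with the given
    base-level size and number of mipmap levels."""
    if mip_level >= num_levels:
        # Out-of-range level: sizes are 0, the level count is still valid.
        return (0, 0, 0, num_levels)

    def minify(size):
        # Assumed convention: halve per level, never below one texel.
        return max(1, size >> mip_level)

    return (minify(width), minify(height), minify(depth), num_levels)
```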
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001453
.. opcode:: SAMPLE_POS - Query the position of a given sample.
   dst receives float4 (x, y, 0, 0) indicating where the
   sample is located. If the resource is not a multi-sample
   resource and not a render target, the result is 0.
1458
Zack Rusin3fa814d2011-01-24 21:45:37 -05001459.. opcode:: SAMPLE_INFO - dst receives number of samples in x.
1460 If the resource is not a multi-sample resource and
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001461 not a render target, the result is 0.
1462
1463
Francisco Jereza5f44cc2012-05-01 02:38:51 +02001464.. _resourceopcodes:
1465
1466Resource Access Opcodes
1467^^^^^^^^^^^^^^^^^^^^^^^
1468
1469.. opcode:: LOAD - Fetch data from a shader resource
1470
1471 Syntax: ``LOAD dst, resource, address``
1472
1473 Example: ``LOAD TEMP[0], RES[0], TEMP[1]``
1474
1475 Using the provided integer address, LOAD fetches data
1476 from the specified buffer or texture without any
1477 filtering.
1478
1479 The 'address' is specified as a vector of unsigned
1480 integers. If the 'address' is out of range the result
1481 is unspecified.
1482
1483 Only the first mipmap level of a resource can be read
1484 from using this instruction.
1485
1486 For 1D or 2D texture arrays, the array index is
1487 provided as an unsigned integer in address.y or
1488 address.z, respectively. address.yz are ignored for
1489 buffers and 1D textures. address.z is ignored for 1D
1490 texture arrays and 2D textures. address.w is always
1491 ignored.
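
The address rules above can be summarised in a short Python sketch. The table and function names are hypothetical, and only the resource types explicitly covered by the rules are listed.

```python
# Which address components are significant for each resource type,
# following the rules above. "layer" names the array-index component.
ADDRESS_LAYOUT = {
    "BUFFER":  {"coords": "x",  "layer": None},
    "1D":      {"coords": "x",  "layer": None},
    "1DArray": {"coords": "x",  "layer": "y"},
    "2D":      {"coords": "xy", "layer": None},
    "2DArray": {"coords": "xy", "layer": "z"},
}

def decode_address(resource, address):
    """Split an (x, y, z, w) address into texel coordinates and an
    optional array layer; ignored components are dropped."""
    named = dict(zip("xyzw", address))
    layout = ADDRESS_LAYOUT[resource]
    coords = tuple(named[c] for c in layout["coords"])
    layer = named[layout["layer"]] if layout["layer"] else None
    return coords, layer
```

For example, ``decode_address("1DArray", (5, 2, 9, 9))`` keeps x as the coordinate and y as the array index, and discards z and w.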
1492
Francisco Jerezb8e808f2012-04-30 20:20:29 +02001493.. opcode:: STORE - Write data to a shader resource
1494
1495 Syntax: ``STORE resource, address, src``
1496
1497 Example: ``STORE RES[0], TEMP[0], TEMP[1]``
1498
1499 Using the provided integer address, STORE writes data
1500 to the specified buffer or texture.
1501
1502 The 'address' is specified as a vector of unsigned
1503 integers. If the 'address' is out of range the result
1504 is unspecified.
1505
1506 Only the first mipmap level of a resource can be
1507 written to using this instruction.
1508
1509 For 1D or 2D texture arrays, the array index is
1510 provided as an unsigned integer in address.y or
1511 address.z, respectively. address.yz are ignored for
1512 buffers and 1D textures. address.z is ignored for 1D
1513 texture arrays and 2D textures. address.w is always
1514 ignored.
1515
Francisco Jereza5f44cc2012-05-01 02:38:51 +02001516
Francisco Jerez9e550c32012-04-30 20:21:38 +02001517.. _threadsyncopcodes:
1518
1519Inter-thread synchronization opcodes
1520^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1521
1522These opcodes are intended for communication between threads running
1523within the same compute grid. For now they're only valid in compute
1524programs.
1525
1526.. opcode:: MFENCE - Memory fence
1527
1528 Syntax: ``MFENCE resource``
1529
1530 Example: ``MFENCE RES[0]``
1531
1532 This opcode forces strong ordering between any memory access
1533 operations that affect the specified resource. This means that
1534 previous loads and stores (and only those) will be performed and
1535 visible to other threads before the program execution continues.
1536
1537
1538.. opcode:: LFENCE - Load memory fence
1539
1540 Syntax: ``LFENCE resource``
1541
1542 Example: ``LFENCE RES[0]``
1543
1544 Similar to MFENCE, but it only affects the ordering of memory loads.
1545
1546
1547.. opcode:: SFENCE - Store memory fence
1548
1549 Syntax: ``SFENCE resource``
1550
1551 Example: ``SFENCE RES[0]``
1552
1553 Similar to MFENCE, but it only affects the ordering of memory stores.
1554
1555
1556.. opcode:: BARRIER - Thread group barrier
1557
1558 ``BARRIER``
1559
1560 This opcode suspends the execution of the current thread until all
1561 the remaining threads in the working group reach the same point of
1562 the program. Results are unspecified if any of the remaining
1563 threads terminates or never reaches an executed BARRIER instruction.
1564
1565
Francisco Jerezc2d31a82012-04-30 20:22:23 +02001566.. _atomopcodes:
1567
1568Atomic opcodes
1569^^^^^^^^^^^^^^
1570
1571These opcodes provide atomic variants of some common arithmetic and
1572logical operations. In this context atomicity means that another
1573concurrent memory access operation that affects the same memory
1574location is guaranteed to be performed strictly before or after the
1575entire execution of the atomic operation.
1576
1577For the moment they're only valid in compute programs.
1578
1579.. opcode:: ATOMUADD - Atomic integer addition
1580
1581 Syntax: ``ATOMUADD dst, resource, offset, src``
1582
1583 Example: ``ATOMUADD TEMP[0], RES[0], TEMP[1], TEMP[2]``
1584
1585 The following operation is performed atomically on each component:
1586
1587.. math::
1588
1589 dst_i = resource[offset]_i
1590
1591 resource[offset]_i = dst_i + src_i
1592
1593
1594.. opcode:: ATOMXCHG - Atomic exchange
1595
1596 Syntax: ``ATOMXCHG dst, resource, offset, src``
1597
1598 Example: ``ATOMXCHG TEMP[0], RES[0], TEMP[1], TEMP[2]``
1599
1600 The following operation is performed atomically on each component:
1601
1602.. math::
1603
1604 dst_i = resource[offset]_i
1605
1606 resource[offset]_i = src_i
1607
1608
1609.. opcode:: ATOMCAS - Atomic compare-and-exchange
1610
1611 Syntax: ``ATOMCAS dst, resource, offset, cmp, src``
1612
1613 Example: ``ATOMCAS TEMP[0], RES[0], TEMP[1], TEMP[2], TEMP[3]``
1614
1615 The following operation is performed atomically on each component:
1616
1617.. math::
1618
1619 dst_i = resource[offset]_i
1620
1621 resource[offset]_i = (dst_i == cmp_i ? src_i : dst_i)
1622
1623
1624.. opcode:: ATOMAND - Atomic bitwise And
1625
1626 Syntax: ``ATOMAND dst, resource, offset, src``
1627
1628 Example: ``ATOMAND TEMP[0], RES[0], TEMP[1], TEMP[2]``
1629
1630 The following operation is performed atomically on each component:
1631
1632.. math::
1633
1634 dst_i = resource[offset]_i
1635
1636 resource[offset]_i = dst_i \& src_i
1637
1638
1639.. opcode:: ATOMOR - Atomic bitwise Or
1640
1641 Syntax: ``ATOMOR dst, resource, offset, src``
1642
1643 Example: ``ATOMOR TEMP[0], RES[0], TEMP[1], TEMP[2]``
1644
1645 The following operation is performed atomically on each component:
1646
1647.. math::
1648
1649 dst_i = resource[offset]_i
1650
1651 resource[offset]_i = dst_i | src_i
1652
1653
1654.. opcode:: ATOMXOR - Atomic bitwise Xor
1655
1656 Syntax: ``ATOMXOR dst, resource, offset, src``
1657
1658 Example: ``ATOMXOR TEMP[0], RES[0], TEMP[1], TEMP[2]``
1659
1660 The following operation is performed atomically on each component:
1661
1662.. math::
1663
1664 dst_i = resource[offset]_i
1665
1666 resource[offset]_i = dst_i \oplus src_i
1667
1668
1669.. opcode:: ATOMUMIN - Atomic unsigned minimum
1670
1671 Syntax: ``ATOMUMIN dst, resource, offset, src``
1672
1673 Example: ``ATOMUMIN TEMP[0], RES[0], TEMP[1], TEMP[2]``
1674
1675 The following operation is performed atomically on each component:
1676
1677.. math::
1678
1679 dst_i = resource[offset]_i
1680
1681 resource[offset]_i = (dst_i < src_i ? dst_i : src_i)
1682
1683
1684.. opcode:: ATOMUMAX - Atomic unsigned maximum
1685
1686 Syntax: ``ATOMUMAX dst, resource, offset, src``
1687
1688 Example: ``ATOMUMAX TEMP[0], RES[0], TEMP[1], TEMP[2]``
1689
1690 The following operation is performed atomically on each component:
1691
1692.. math::
1693
1694 dst_i = resource[offset]_i
1695
1696 resource[offset]_i = (dst_i > src_i ? dst_i : src_i)
1697
1698
1699.. opcode:: ATOMIMIN - Atomic signed minimum
1700
1701 Syntax: ``ATOMIMIN dst, resource, offset, src``
1702
1703 Example: ``ATOMIMIN TEMP[0], RES[0], TEMP[1], TEMP[2]``
1704
1705 The following operation is performed atomically on each component:
1706
1707.. math::
1708
1709 dst_i = resource[offset]_i
1710
1711 resource[offset]_i = (dst_i < src_i ? dst_i : src_i)
1712
1713
1714.. opcode:: ATOMIMAX - Atomic signed maximum
1715
1716 Syntax: ``ATOMIMAX dst, resource, offset, src``
1717
1718 Example: ``ATOMIMAX TEMP[0], RES[0], TEMP[1], TEMP[2]``
1719
1720 The following operation is performed atomically on each component:
1721
1722.. math::
1723
1724 dst_i = resource[offset]_i
1725
1726 resource[offset]_i = (dst_i > src_i ? dst_i : src_i)
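
The common pattern of the atomic opcodes (dst receives the old value, the resource receives the updated one) can be modelled in Python as below. This is only a sketch: the ``Resource`` class and its method names are hypothetical, a lock stands in for hardware atomicity, and a 32-bit, single-component memory word is assumed. It also shows why ATOMUMIN and ATOMIMIN differ even though their formulas look the same: the comparison interprets the same bit pattern as unsigned or signed.

```python
import threading

def to_signed(value):
    """Reinterpret a 32-bit pattern as a two's-complement signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value

class Resource:
    """Toy one-component resource; a lock stands in for hardware atomicity."""

    def __init__(self, values):
        self.values = list(values)
        self._lock = threading.Lock()

    def _atomic(self, offset, update):
        with self._lock:
            old = self.values[offset]
            self.values[offset] = update(old) & 0xFFFFFFFF
            return old  # dst receives the value from before the update

    def atomuadd(self, offset, src):
        return self._atomic(offset, lambda old: old + src)

    def atomxchg(self, offset, src):
        return self._atomic(offset, lambda old: src)

    def atomcas(self, offset, cmp, src):
        return self._atomic(offset, lambda old: src if old == cmp else old)

    def atomumin(self, offset, src):
        return self._atomic(offset, lambda old: min(old, src))

    def atomimin(self, offset, src):
        return self._atomic(offset, lambda old: min(old, src, key=to_signed))
```

In this model, applying ``atomumin`` with 3 to a word holding ``0xFFFFFFFF`` stores 3, while ``atomimin`` leaves the word unchanged because the same pattern reads as -1 when signed.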
1727
1728
1729
Corbin Simpsonda65ac62009-12-21 20:32:46 -08001730Explanation of symbols used
Corbin Simpson5bcd26c2009-12-21 21:04:10 -08001731------------------------------
Keith Whitwella62aaa72009-12-21 23:25:15 +00001732
1733
Corbin Simpsonda65ac62009-12-21 20:32:46 -08001734Functions
Corbin Simpson5bcd26c2009-12-21 21:04:10 -08001735^^^^^^^^^^^^^^
Keith Whitwella62aaa72009-12-21 23:25:15 +00001736
1737
Corbin Simpson14743ac2009-12-21 19:57:56 -08001738 :math:`|x|` Absolute value of `x`.
Keith Whitwella62aaa72009-12-21 23:25:15 +00001739
Corbin Simpson14743ac2009-12-21 19:57:56 -08001740 :math:`\lceil x \rceil` Ceiling of `x`.
Keith Whitwella62aaa72009-12-21 23:25:15 +00001741
1742 clamp(x,y,z) Clamp x between y and z.
1743 (x < y) ? y : (x > z) ? z : x
1744
Corbin Simpsondd801e52009-12-21 19:41:09 -08001745 :math:`\lfloor x\rfloor` Floor of `x`.
Keith Whitwella62aaa72009-12-21 23:25:15 +00001746
Corbin Simpson14743ac2009-12-21 19:57:56 -08001747 :math:`\log_2{x}` Logarithm of `x`, base 2.
Keith Whitwella62aaa72009-12-21 23:25:15 +00001748
1749 max(x,y) Maximum of x and y.
1750 (x > y) ? x : y
1751
1752 min(x,y) Minimum of x and y.
1753 (x < y) ? x : y
1754
1755 partialx(x) Derivative of x relative to fragment's X.
1756
1757 partialy(x) Derivative of x relative to fragment's Y.
1758
1759 pop() Pop from stack.
1760
Corbin Simpsondd801e52009-12-21 19:41:09 -08001761 :math:`x^y` `x` to the power `y`.
Keith Whitwella62aaa72009-12-21 23:25:15 +00001762
1763 push(x) Push x on stack.
1764
1765 round(x) Round x.
1766
Michal Krol07f416c2010-01-04 13:21:32 +01001767 trunc(x) Truncate x, i.e. drop the fraction bits.
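
For readers who prefer code to notation, a few of these functions can be written out in Python. This is an illustrative sketch, not a definition used by any driver; it reads "drop the fraction bits" as rounding toward zero, which is where ``trunc`` differs from the floor for negative inputs.

```python
import math

def clamp(x, y, z):
    """Clamp x between y and z: (x < y) ? y : (x > z) ? z : x."""
    return y if x < y else (z if x > z else x)

def trunc(x):
    """Drop the fractional part, i.e. round toward zero."""
    return float(math.trunc(x))

def floor(x):
    """Largest integer not greater than x, returned as a float."""
    return float(math.floor(x))
```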
Keith Whitwella62aaa72009-12-21 23:25:15 +00001768
1769
Corbin Simpsonda65ac62009-12-21 20:32:46 -08001770Keywords
Corbin Simpson5bcd26c2009-12-21 21:04:10 -08001771^^^^^^^^^^^^^
Keith Whitwella62aaa72009-12-21 23:25:15 +00001772
1773
1774 discard Discard fragment.
1775
Keith Whitwella62aaa72009-12-21 23:25:15 +00001776 pc Program counter.
1777
Keith Whitwella62aaa72009-12-21 23:25:15 +00001778 target Label of target instruction.
1779
1780
Corbin Simpsonda65ac62009-12-21 20:32:46 -08001781Other tokens
Corbin Simpson5bcd26c2009-12-21 21:04:10 -08001782---------------
Keith Whitwella62aaa72009-12-21 23:25:15 +00001783
1784
Michal Krol63d60972010-02-03 15:45:32 +01001785Declaration
1786^^^^^^^^^^^
1787
1788
Declares a register that will be referenced as an operand in Instruction
tokens.
1791
1792File field contains register file that is being declared and is one
1793of TGSI_FILE.
1794
1795UsageMask field specifies which of the register components can be accessed
1796and is one of TGSI_WRITEMASK.
1797
Francisco Jerez26449522012-03-18 19:21:36 +01001798The Local flag specifies that a given value isn't intended for
1799subroutine parameter passing and, as a result, the implementation
1800isn't required to give any guarantees of it being preserved across
1801subroutine boundaries. As it's merely a compiler hint, the
1802implementation is free to ignore it.
1803
Michal Krol63d60972010-02-03 15:45:32 +01001804If Dimension flag is set to 1, a Declaration Dimension token follows.
1805
1806If Semantic flag is set to 1, a Declaration Semantic token follows.
1807
Francisco Jerez12799232012-04-30 18:27:52 +02001808If Interpolate flag is set to 1, a Declaration Interpolate token follows.
Michal Krol63d60972010-02-03 15:45:32 +01001809
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001810If file is TGSI_FILE_RESOURCE, a Declaration Resource token follows.
1811
Michal Krol63d60972010-02-03 15:45:32 +01001812
Corbin Simpsonda65ac62009-12-21 20:32:46 -08001813Declaration Semantic
Corbin Simpson5bcd26c2009-12-21 21:04:10 -08001814^^^^^^^^^^^^^^^^^^^^^^^^
Keith Whitwella62aaa72009-12-21 23:25:15 +00001815
Brian Paul05a18f42010-06-24 07:21:15 -06001816 Vertex and fragment shader input and output registers may be labeled
1817 with semantic information consisting of a name and index.
Keith Whitwella62aaa72009-12-21 23:25:15 +00001818
1819 Follows Declaration token if Semantic bit is set.
1820
1821 Since its purpose is to link a shader with other stages of the pipeline,
1822 it is valid to follow only those Declaration tokens that declare a register
1823 either in INPUT or OUTPUT file.
1824
1825 SemanticName field contains the semantic name of the register being declared.
1826 There is no default value.
1827
1828 SemanticIndex is an optional subscript that can be used to distinguish
1829 different register declarations with the same semantic name. The default value
1830 is 0.
1831
1832 The meanings of the individual semantic names are explained in the following
1833 sections.
1834
Corbin Simpson54ddf642009-12-23 23:36:06 -08001835TGSI_SEMANTIC_POSITION
1836""""""""""""""""""""""
Keith Whitwella62aaa72009-12-21 23:25:15 +00001837
Brian Paul50b3f2e2010-06-23 17:00:10 -06001838For vertex shaders, TGSI_SEMANTIC_POSITION indicates the vertex shader
1839output register which contains the homogeneous vertex position in the clip
1840space coordinate system. After clipping, the X, Y and Z components of the
1841vertex will be divided by the W value to get normalized device coordinates.
Keith Whitwella62aaa72009-12-21 23:25:15 +00001842
Brian Paul50b3f2e2010-06-23 17:00:10 -06001843For fragment shaders, TGSI_SEMANTIC_POSITION is used to indicate that
1844fragment shader input contains the fragment's window position. The X
1845component starts at zero and always increases from left to right.
1846The Y component starts at zero and always increases but Y=0 may either
1847indicate the top of the window or the bottom depending on the fragment
1848coordinate origin convention (see TGSI_PROPERTY_FS_COORD_ORIGIN).
1849The Z coordinate ranges from 0 to 1 to represent depth from the front
to the back of the Z buffer. The W component contains the reciprocal
1851of the interpolated vertex position W component.
Corbin Simpson54ddf642009-12-23 23:36:06 -08001852
Brian Paul05a18f42010-06-24 07:21:15 -06001853Fragment shaders may also declare an output register with
1854TGSI_SEMANTIC_POSITION. Only the Z component is writable. This allows
1855the fragment shader to change the fragment's Z position.
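
The two uses of W described in this section can be illustrated with a minimal Python sketch. The function names are hypothetical and clipping is left out.

```python
def clip_to_ndc(x, y, z, w):
    """Divide the clip-space X, Y and Z by W to get normalized
    device coordinates (the perspective divide)."""
    return (x / w, y / w, z / w)

def fragment_position_w(vertex_w):
    """The fragment shader's POSITION.w holds the reciprocal of the
    interpolated vertex position W."""
    return 1.0 / vertex_w
```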
1856
Corbin Simpson54ddf642009-12-23 23:36:06 -08001857
Corbin Simpson54ddf642009-12-23 23:36:06 -08001858
1859TGSI_SEMANTIC_COLOR
1860"""""""""""""""""""
1861
Brian Paul50b3f2e2010-06-23 17:00:10 -06001862For vertex shader outputs or fragment shader inputs/outputs, this
label indicates that the register contains an R,G,B,A color.
Corbin Simpson54ddf642009-12-23 23:36:06 -08001864
Brian Paul50b3f2e2010-06-23 17:00:10 -06001865Several shader inputs/outputs may contain colors so the semantic index
1866is used to distinguish them. For example, color[0] may be the diffuse
1867color while color[1] may be the specular color.
1868
1869This label is needed so that the flat/smooth shading can be applied
1870to the right interpolants during rasterization.
1871
1872
Corbin Simpson54ddf642009-12-23 23:36:06 -08001873
1874TGSI_SEMANTIC_BCOLOR
1875""""""""""""""""""""
1876
1877Back-facing colors are only used for back-facing polygons, and are only valid
1878in vertex shader outputs. After rasterization, all polygons are front-facing
Brian Paul50b3f2e2010-06-23 17:00:10 -06001879and COLOR and BCOLOR end up occupying the same slots in the fragment shader,
1880so all BCOLORs effectively become regular COLORs in the fragment shader.
1881
Corbin Simpson54ddf642009-12-23 23:36:06 -08001882
1883TGSI_SEMANTIC_FOG
1884"""""""""""""""""
1885
Brian Paul05a18f42010-06-24 07:21:15 -06001886Vertex shader inputs and outputs and fragment shader inputs may be
1887labeled with TGSI_SEMANTIC_FOG to indicate that the register contains
1888a fog coordinate in the form (F, 0, 0, 1). Typically, the fragment
1889shader will use the fog coordinate to compute a fog blend factor which
1890is used to blend the normal fragment color with a constant fog color.
Corbin Simpson54ddf642009-12-23 23:36:06 -08001891
Brian Paul05a18f42010-06-24 07:21:15 -06001892Only the first component matters when writing from the vertex shader;
1893the driver will ensure that the coordinate is in this format when used
1894as a fragment shader input.
1895
Corbin Simpson54ddf642009-12-23 23:36:06 -08001896
1897TGSI_SEMANTIC_PSIZE
1898"""""""""""""""""""
1899
Brian Paul05a18f42010-06-24 07:21:15 -06001900Vertex shader input and output registers may be labeled with
TGSI_SEMANTIC_PSIZE to indicate that the register contains a point size
1902in the form (S, 0, 0, 1). The point size controls the width or diameter
1903of points for rasterization. This label cannot be used in fragment
1904shaders.
Corbin Simpson54ddf642009-12-23 23:36:06 -08001905
1906When using this semantic, be sure to set the appropriate state in the
1907:ref:`rasterizer` first.
1908
Brian Paul05a18f42010-06-24 07:21:15 -06001909
Corbin Simpson54ddf642009-12-23 23:36:06 -08001910TGSI_SEMANTIC_GENERIC
1911"""""""""""""""""""""
1912
Brian Paul05a18f42010-06-24 07:21:15 -06001913All vertex/fragment shader inputs/outputs not labeled with any other
1914semantic label can be considered to be generic attributes. Typical
1915uses of generic inputs/outputs are texcoords and user-defined values.
Corbin Simpson54ddf642009-12-23 23:36:06 -08001916
Corbin Simpson54ddf642009-12-23 23:36:06 -08001917
1918TGSI_SEMANTIC_NORMAL
1919""""""""""""""""""""
1920
Brian Paul05a18f42010-06-24 07:21:15 -06001921Indicates that a vertex shader input is a normal vector. This is
1922typically only used for legacy graphics APIs.
1923
Corbin Simpson54ddf642009-12-23 23:36:06 -08001924
1925TGSI_SEMANTIC_FACE
1926""""""""""""""""""
1927
This label applies to fragment shader inputs only and indicates that
the register contains front/back-face information of the form (F, 0,
0, 1). The first component will be positive when the fragment belongs
to a front-facing polygon, and negative when the fragment belongs to a
back-facing polygon.
1933
Corbin Simpson54ddf642009-12-23 23:36:06 -08001934
1935TGSI_SEMANTIC_EDGEFLAG
1936""""""""""""""""""""""
1937
For vertex shaders, this semantic label indicates that an input or
1939output is a boolean edge flag. The register layout is [F, x, x, x]
1940where F is 0.0 or 1.0 and x = don't care. Normally, the vertex shader
1941simply copies the edge flag input to the edgeflag output.
1942
1943Edge flags are used to control which lines or points are actually
1944drawn when the polygon mode converts triangles/quads/polygons into
1945points or lines.
1946
Dave Airlie4ecb2c12010-10-06 09:28:46 +10001947TGSI_SEMANTIC_STENCIL
1948""""""""""""""""""""""
1949
For fragment shaders, this semantic label indicates that an output
is a writable stencil reference value. Only the Y component is writable.
This allows the fragment shader to change the fragment's stencilref value.
Luca Barbieri73317132010-01-21 05:36:14 +01001953
1954
Francisco Jerez12799232012-04-30 18:27:52 +02001955Declaration Interpolate
1956^^^^^^^^^^^^^^^^^^^^^^^
1957
1958This token is only valid for fragment shader INPUT declarations.
1959
The Interpolate field specifies the way input is being interpolated by
the rasteriser and is one of TGSI_INTERPOLATE_*.
1962
1963The CylindricalWrap bitfield specifies which register components
1964should be subject to cylindrical wrapping when interpolating by the
1965rasteriser. If TGSI_CYLINDRICAL_WRAP_X is set to 1, the X component
1966should be interpolated according to cylindrical wrapping rules.
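
The text does not spell out the cylindrical wrapping rules. The Python sketch below assumes the common interpretation, in which values are treated as positions on a cycle of period 1 and interpolation takes the shorter way around. The function name and the 0.5 threshold are part of that assumption.

```python
def cylindrical_lerp(a, b, t):
    """Interpolate from a to b along the shorter way around a unit
    cycle (assumed wrapping rule, see the note above)."""
    delta = b - a
    if delta > 0.5:
        delta -= 1.0
    elif delta < -0.5:
        delta += 1.0
    return a + t * delta
```

Under this assumption, interpolating halfway between 0.9 and 0.1 yields a value near 1.0 (equivalent to 0.0 on the cycle) rather than 0.5.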
1967
1968
Francisco Jereza5f44cc2012-05-01 02:38:51 +02001969Declaration Sampler View
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001970^^^^^^^^^^^^^^^^^^^^^^^^
1971
Francisco Jereza5f44cc2012-05-01 02:38:51 +02001972 Follows Declaration token if file is TGSI_FILE_SAMPLER_VIEW.
1973
1974 DCL SVIEW[#], resource, type(s)
1975
1976 Declares a shader input sampler view and assigns it to a SVIEW[#]
1977 register.
1978
1979 resource can be one of BUFFER, 1D, 2D, 3D, 1DArray and 2DArray.
1980
1981 type must be 1 or 4 entries (if specifying on a per-component
1982 level) out of UNORM, SNORM, SINT, UINT and FLOAT.
1983
1984
1985Declaration Resource
1986^^^^^^^^^^^^^^^^^^^^
1987
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001988 Follows Declaration token if file is TGSI_FILE_RESOURCE.
1989
Francisco Jerezb8e808f2012-04-30 20:20:29 +02001990 DCL RES[#], resource [, WR] [, RAW]
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001991
1992 Declares a shader input resource and assigns it to a RES[#]
1993 register.
1994
1995 resource can be one of BUFFER, 1D, 2D, 3D, CUBE, 1DArray and
1996 2DArray.
1997
Francisco Jerez82c90b22012-04-30 19:08:55 +02001998 If the RAW keyword is not specified, the texture data will be
1999 subject to conversion, swizzling and scaling as required to yield
2000 the specified data type from the physical data format of the bound
2001 resource.
2002
2003 If the RAW keyword is specified, no channel conversion will be
2004 performed: the values read for each of the channels (X,Y,Z,W) will
2005 correspond to consecutive words in the same order and format
2006 they're found in memory. No element-to-address conversion will be
2007 performed either: the value of the provided X coordinate will be
2008 interpreted in byte units instead of texel units. The result of
2009 accessing a misaligned address is undefined.
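
Because a RAW resource is addressed in bytes rather than texels, a caller that thinks in elements has to scale the X coordinate itself. The following Python sketch shows that conversion. Both helper names are hypothetical, and the 4-byte word size used for the alignment check is an assumption.

```python
def raw_byte_offset(element_index, bytes_per_element):
    """Convert an element-unit X coordinate into the byte-unit X
    coordinate that a RAW resource expects."""
    return element_index * bytes_per_element

def is_aligned(byte_offset, word_size=4):
    """Misaligned accesses are undefined, so a caller may want to
    check the offset first (4-byte words assumed)."""
    return byte_offset % word_size == 0
```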
2010
Francisco Jerezb8e808f2012-04-30 20:20:29 +02002011 Usage of the STORE opcode is only allowed if the WR (writable) flag
2012 is set.
2013
Zack Rusinbdbe77f2011-01-24 17:47:10 -05002014
Luca Barbieri73317132010-01-21 05:36:14 +01002015Properties
2016^^^^^^^^^^^^^^^^^^^^^^^^
2017
2018
2019 Properties are general directives that apply to the whole TGSI program.
2020
2021FS_COORD_ORIGIN
2022"""""""""""""""
2023
2024Specifies the fragment shader TGSI_SEMANTIC_POSITION coordinate origin.
2025The default value is UPPER_LEFT.
2026
2027If UPPER_LEFT, the position will be (0,0) at the upper left corner and
2028increase downward and rightward.
2029If LOWER_LEFT, the position will be (0,0) at the lower left corner and
2030increase upward and rightward.
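
Converting a fragment Y coordinate between the two origin conventions is a single subtraction, as this Python sketch shows. The function name is hypothetical and the window height is given in pixels.

```python
def flip_origin(y, window_height):
    """Convert a fragment Y coordinate between the UPPER_LEFT and
    LOWER_LEFT conventions (the same formula works both ways)."""
    return window_height - y
```

With half-integer pixel centers, the top row of a 4-pixel-high window is y = 0.5 under UPPER_LEFT and y = 3.5 under LOWER_LEFT.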
2031
2032OpenGL defaults to LOWER_LEFT, and is configurable with the
2033GL_ARB_fragment_coord_conventions extension.
2034
2035DirectX 9/10 use UPPER_LEFT.
2036
2037FS_COORD_PIXEL_CENTER
2038"""""""""""""""""""""
2039
2040Specifies the fragment shader TGSI_SEMANTIC_POSITION pixel center convention.
2041The default value is HALF_INTEGER.
2042
If HALF_INTEGER, the fractional part of the position will be 0.5.
If INTEGER, the fractional part of the position will be 0.0.
2045
2046Note that this does not affect the set of fragments generated by
2047rasterization, which is instead controlled by gl_rasterization_rules in the
2048rasterizer.
2049
2050OpenGL defaults to HALF_INTEGER, and is configurable with the
2051GL_ARB_fragment_coord_conventions extension.
2052
2053DirectX 9 uses INTEGER.
2054DirectX 10 uses HALF_INTEGER.
Brian Paul4778f462010-02-02 08:14:40 -07002055
Dave Airliec9c8a5e2010-12-18 10:34:35 +10002056FS_COLOR0_WRITES_ALL_CBUFS
2057""""""""""""""""""""""""""
2058Specifies that writes to the fragment shader color 0 are replicated to all
2059bound cbufs. This facilitates OpenGL's fragColor output vs fragData[0] where
2060fragData is directed to a single color buffer, but fragColor is broadcast.
Brian Paul4778f462010-02-02 08:14:40 -07002061
Marek Olšákdc4c8212012-01-10 00:19:00 +01002062VS_PROHIBIT_UCPS
2063""""""""""""""""""""""""""
2064If this property is set on the program bound to the shader stage before the
2065fragment shader, user clip planes should have no effect (be disabled) even if
2066that shader does not write to any clip distance outputs and the rasterizer's
2067clip_plane_enable is non-zero.
2068This property is only supported by drivers that also support shader clip
2069distance outputs.
2070This is useful for APIs that don't have UCPs and where clip distances written
2071by a shader cannot be disabled.
2072
Brian Paul4778f462010-02-02 08:14:40 -07002073
2074Texture Sampling and Texture Formats
2075------------------------------------
2076
Corbin Simpson797dcc02010-02-02 17:07:26 -08002077This table shows how texture image components are returned as (x,y,z,w) tuples
2078by TGSI texture instructions, such as :opcode:`TEX`, :opcode:`TXD`, and
2079:opcode:`TXP`. For reference, OpenGL and Direct3D conventions are shown as
2080well.
Brian Paul4778f462010-02-02 08:14:40 -07002081
+--------------------+--------------+--------------------+--------------+
| Texture Components | Gallium      | OpenGL             | Direct3D 9   |
+====================+==============+====================+==============+
| R                  | (r, 0, 0, 1) | (r, 0, 0, 1)       | (r, 1, 1, 1) |
+--------------------+--------------+--------------------+--------------+
| RG                 | (r, g, 0, 1) | (r, g, 0, 1)       | (r, g, 1, 1) |
+--------------------+--------------+--------------------+--------------+
| RGB                | (r, g, b, 1) | (r, g, b, 1)       | (r, g, b, 1) |
+--------------------+--------------+--------------------+--------------+
| RGBA               | (r, g, b, a) | (r, g, b, a)       | (r, g, b, a) |
+--------------------+--------------+--------------------+--------------+
| A                  | (0, 0, 0, a) | (0, 0, 0, a)       | (0, 0, 0, a) |
+--------------------+--------------+--------------------+--------------+
| L                  | (l, l, l, 1) | (l, l, l, 1)       | (l, l, l, 1) |
+--------------------+--------------+--------------------+--------------+
| LA                 | (l, l, l, a) | (l, l, l, a)       | (l, l, l, a) |
+--------------------+--------------+--------------------+--------------+
| I                  | (i, i, i, i) | (i, i, i, i)       | N/A          |
+--------------------+--------------+--------------------+--------------+
| UV                 | XXX TBD      | (0, 0, 0, 1)       | (u, v, 1, 1) |
|                    |              | [#envmap-bumpmap]_ |              |
+--------------------+--------------+--------------------+--------------+
| Z                  | XXX TBD      | (z, z, z, 1)       | (0, z, 0, 1) |
|                    |              | [#depth-tex-mode]_ |              |
+--------------------+--------------+--------------------+--------------+
| S                  | (s, s, s, s) | unknown            | unknown      |
+--------------------+--------------+--------------------+--------------+

.. [#envmap-bumpmap] http://www.opengl.org/registry/specs/ATI/envmap_bumpmap.txt
.. [#depth-tex-mode] the default is (z, z, z, 1) but may also be (0, 0, 0, z)
   or (z, z, z, z) depending on the value of GL_DEPTH_TEXTURE_MODE.
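
The Gallium column of the table can also be read as a lookup from the stored components to the returned (x, y, z, w) tuple. The Python sketch below encodes only the rows with a defined Gallium value (UV and Z are marked TBD and therefore omitted). The dictionary and function names are hypothetical.

```python
# Gallium column of the table: 0 and 1 are constants, letters name
# components stored in the texture.
GALLIUM_SWIZZLES = {
    "R":    ("r", 0, 0, 1),
    "RG":   ("r", "g", 0, 1),
    "RGB":  ("r", "g", "b", 1),
    "RGBA": ("r", "g", "b", "a"),
    "A":    (0, 0, 0, "a"),
    "L":    ("l", "l", "l", 1),
    "LA":   ("l", "l", "l", "a"),
    "I":    ("i", "i", "i", "i"),
    "S":    ("s", "s", "s", "s"),
}

def expand_texel(components, stored):
    """Expand the stored component values into the (x, y, z, w) tuple
    returned by a texture instruction."""
    return tuple(
        float(stored[entry]) if isinstance(entry, str) else float(entry)
        for entry in GALLIUM_SWIZZLES[components]
    )
```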