TGSI
====

TGSI, Tungsten Graphics Shader Infrastructure, is an intermediate language
for describing shaders. Since Gallium is inherently shaderful, shaders are
an important part of the API. TGSI is the only intermediate representation
used by all drivers.

Basics
------

All TGSI instructions, known as *opcodes*, operate on arbitrary-precision
floating-point four-component vectors. An opcode may have up to one
destination register, known as *dst*, and between zero and three source
registers, called *src0* through *src2*, or simply *src* if there is only
one.

Some instructions, like :opcode:`I2F`, permit re-interpretation of vector
components as integers. Other instructions permit using registers as
two-component vectors with double precision; see :ref:`Double Opcodes`.

When an instruction has a scalar result, the result is usually copied into
each of the components of *dst*. When this happens, the result is said to be
*replicated* to *dst*. :opcode:`RCP` is one such instruction.

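As a rough illustration of replication, the following Python sketch models a
register as a 4-tuple and copies a scalar result into every component. The
helper names ``replicate`` and ``rcp`` are hypothetical and not part of any
Gallium API; write masks and swizzles are ignored.

```python
def replicate(scalar):
    """Copy a scalar result into all four components (x, y, z, w) of dst."""
    return (scalar, scalar, scalar, scalar)


def rcp(src):
    """Model of RCP: reciprocal of src.x, replicated to every component."""
    return replicate(1.0 / src[0])


# Only src.x is read; the other source components do not affect the result.
dst = rcp((4.0, 9.0, 16.0, 25.0))
```

Here every component of ``dst`` holds the same value, which is what the text
above calls a replicated result.
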
Instruction Set
---------------

Core ISA
^^^^^^^^

These opcodes are guaranteed to be available regardless of the driver being
used.

.. opcode:: ARL - Address Register Load

.. math::

  dst.x = \lfloor src.x\rfloor

  dst.y = \lfloor src.y\rfloor

  dst.z = \lfloor src.z\rfloor

  dst.w = \lfloor src.w\rfloor


.. opcode:: MOV - Move

.. math::

  dst.x = src.x

  dst.y = src.y

  dst.z = src.z

  dst.w = src.w


.. opcode:: LIT - Light Coefficients

.. math::

  dst.x = 1

  dst.y = max(src.x, 0)

  dst.z = (src.x > 0) ? max(src.y, 0)^{clamp(src.w, -128, 128)} : 0

  dst.w = 1


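The LIT equations can be sketched in Python as follows. This is an
illustrative model only: registers are plain 4-tuples, and the names ``lit``
and ``clamp`` are hypothetical helpers, not driver code.

```python
def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(value, high))


def lit(src):
    """Model of LIT, following the four component equations above."""
    x, y, _z, w = src
    dst_x = 1.0
    dst_y = max(x, 0.0)
    # The specular term is only evaluated when src.x is positive.
    # Note: Python raises ZeroDivisionError for 0.0 ** negative; what real
    # hardware does in that case is not covered by this sketch.
    dst_z = max(y, 0.0) ** clamp(w, -128.0, 128.0) if x > 0.0 else 0.0
    dst_w = 1.0
    return (dst_x, dst_y, dst_z, dst_w)
```

For example, ``lit((0.5, 0.5, 0.0, 2.0))`` squares ``src.y`` for the z
component, while any input with ``src.x <= 0`` yields zero in both y and z.
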
.. opcode:: RCP - Reciprocal

This instruction replicates its result.

.. math::

  dst = \frac{1}{src.x}


.. opcode:: RSQ - Reciprocal Square Root

This instruction replicates its result.

.. math::

  dst = \frac{1}{\sqrt{|src.x|}}


.. opcode:: SQRT - Square Root

This instruction replicates its result.

.. math::

  dst = {\sqrt{src.x}}


.. opcode:: EXP - Approximate Exponential Base 2

.. math::

  dst.x = 2^{\lfloor src.x\rfloor}

  dst.y = src.x - \lfloor src.x\rfloor

  dst.z = 2^{src.x}

  dst.w = 1


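A small Python sketch of the EXP equations, assuming a register is a 4-tuple
and using exact library functions where hardware may only approximate. The
name ``exp_approx`` is a hypothetical helper.

```python
import math


def exp_approx(src):
    """Model of EXP: integer-part power, fraction, and full power of two."""
    x = src[0]
    floor_x = math.floor(x)
    return (
        2.0 ** floor_x,   # dst.x: 2 raised to the integer part of src.x
        x - floor_x,      # dst.y: fractional part of src.x
        2.0 ** x,         # dst.z: 2 raised to src.x itself
        1.0,              # dst.w
    )
```

The x and y components split ``src.x`` into an integer part and a fraction,
so ``dst.x`` multiplied by ``2 ** dst.y`` gives the same value as ``dst.z``.
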
.. opcode:: LOG - Approximate Logarithm Base 2

.. math::

  dst.x = \lfloor\log_2{|src.x|}\rfloor

  dst.y = \frac{|src.x|}{2^{\lfloor\log_2{|src.x|}\rfloor}}

  dst.z = \log_2{|src.x|}

  dst.w = 1


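The LOG equations can be modelled the same way. This sketch assumes a
non-zero ``src.x`` (Python's ``math.log2`` raises ``ValueError`` for zero) and
uses the hypothetical helper name ``log_approx``.

```python
import math


def log_approx(src):
    """Model of LOG: exponent, mantissa, and full base-2 logarithm of |src.x|."""
    magnitude = abs(src[0])
    exponent = math.floor(math.log2(magnitude))
    return (
        float(exponent),              # dst.x: floor(log2(|src.x|))
        magnitude / 2.0 ** exponent,  # dst.y: mantissa in [1, 2)
        math.log2(magnitude),         # dst.z: log2(|src.x|)
        1.0,                          # dst.w
    )
```

Because the absolute value is taken first, a negative input such as ``-10.0``
produces the same result as ``10.0``.
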
.. opcode:: MUL - Multiply

.. math::

  dst.x = src0.x \times src1.x

  dst.y = src0.y \times src1.y

  dst.z = src0.z \times src1.z

  dst.w = src0.w \times src1.w


.. opcode:: ADD - Add

.. math::

  dst.x = src0.x + src1.x

  dst.y = src0.y + src1.y

  dst.z = src0.z + src1.z

  dst.w = src0.w + src1.w


.. opcode:: DP3 - 3-component Dot Product

This instruction replicates its result.

.. math::

  dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z


.. opcode:: DP4 - 4-component Dot Product

This instruction replicates its result.

.. math::

  dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w


.. opcode:: DST - Distance Vector

.. math::

  dst.x = 1

  dst.y = src0.y \times src1.y

  dst.z = src0.z

  dst.w = src1.w


.. opcode:: MIN - Minimum

.. math::

  dst.x = min(src0.x, src1.x)

  dst.y = min(src0.y, src1.y)

  dst.z = min(src0.z, src1.z)

  dst.w = min(src0.w, src1.w)


.. opcode:: MAX - Maximum

.. math::

  dst.x = max(src0.x, src1.x)

  dst.y = max(src0.y, src1.y)

  dst.z = max(src0.z, src1.z)

  dst.w = max(src0.w, src1.w)


.. opcode:: SLT - Set On Less Than

.. math::

  dst.x = (src0.x < src1.x) ? 1 : 0

  dst.y = (src0.y < src1.y) ? 1 : 0

  dst.z = (src0.z < src1.z) ? 1 : 0

  dst.w = (src0.w < src1.w) ? 1 : 0


.. opcode:: SGE - Set On Greater Equal Than

.. math::

  dst.x = (src0.x >= src1.x) ? 1 : 0

  dst.y = (src0.y >= src1.y) ? 1 : 0

  dst.z = (src0.z >= src1.z) ? 1 : 0

  dst.w = (src0.w >= src1.w) ? 1 : 0


.. opcode:: MAD - Multiply And Add

.. math::

  dst.x = src0.x \times src1.x + src2.x

  dst.y = src0.y \times src1.y + src2.y

  dst.z = src0.z \times src1.z + src2.z

  dst.w = src0.w \times src1.w + src2.w


.. opcode:: SUB - Subtract

.. math::

  dst.x = src0.x - src1.x

  dst.y = src0.y - src1.y

  dst.z = src0.z - src1.z

  dst.w = src0.w - src1.w


.. opcode:: LRP - Linear Interpolate

.. math::

  dst.x = src0.x \times src1.x + (1 - src0.x) \times src2.x

  dst.y = src0.y \times src1.y + (1 - src0.y) \times src2.y

  dst.z = src0.z \times src1.z + (1 - src0.z) \times src2.z

  dst.w = src0.w \times src1.w + (1 - src0.w) \times src2.w


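A minimal Python sketch of LRP, assuming registers are 4-tuples; ``lrp`` is a
hypothetical helper name. Note that *src0* is the per-component blend weight,
not one of the two endpoints.

```python
def lrp(src0, src1, src2):
    """Model of LRP: per-component blend of src1 and src2 weighted by src0."""
    return tuple(
        weight * a + (1.0 - weight) * b
        for weight, a, b in zip(src0, src1, src2)
    )
```

A weight of 1 selects *src1*, a weight of 0 selects *src2*, and values in
between mix the two linearly.
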
.. opcode:: CND - Condition

.. math::

  dst.x = (src2.x > 0.5) ? src0.x : src1.x

  dst.y = (src2.y > 0.5) ? src0.y : src1.y

  dst.z = (src2.z > 0.5) ? src0.z : src1.z

  dst.w = (src2.w > 0.5) ? src0.w : src1.w


.. opcode:: DP2A - 2-component Dot Product And Add

.. math::

  dst.x = src0.x \times src1.x + src0.y \times src1.y + src2.x

  dst.y = src0.x \times src1.x + src0.y \times src1.y + src2.x

  dst.z = src0.x \times src1.x + src0.y \times src1.y + src2.x

  dst.w = src0.x \times src1.x + src0.y \times src1.y + src2.x


.. opcode:: FRC - Fraction

.. math::

  dst.x = src.x - \lfloor src.x\rfloor

  dst.y = src.y - \lfloor src.y\rfloor

  dst.z = src.z - \lfloor src.z\rfloor

  dst.w = src.w - \lfloor src.w\rfloor


.. opcode:: CLAMP - Clamp

.. math::

  dst.x = clamp(src0.x, src1.x, src2.x)

  dst.y = clamp(src0.y, src1.y, src2.y)

  dst.z = clamp(src0.z, src1.z, src2.z)

  dst.w = clamp(src0.w, src1.w, src2.w)


.. opcode:: FLR - Floor

This is identical to :opcode:`ARL`.

.. math::

  dst.x = \lfloor src.x\rfloor

  dst.y = \lfloor src.y\rfloor

  dst.z = \lfloor src.z\rfloor

  dst.w = \lfloor src.w\rfloor


.. opcode:: ROUND - Round

.. math::

  dst.x = round(src.x)

  dst.y = round(src.y)

  dst.z = round(src.z)

  dst.w = round(src.w)


.. opcode:: EX2 - Exponential Base 2

This instruction replicates its result.

.. math::

  dst = 2^{src.x}


.. opcode:: LG2 - Logarithm Base 2

This instruction replicates its result.

.. math::

  dst = \log_2{src.x}


.. opcode:: POW - Power

This instruction replicates its result.

.. math::

  dst = src0.x^{src1.x}

.. opcode:: XPD - Cross Product

.. math::

  dst.x = src0.y \times src1.z - src1.y \times src0.z

  dst.y = src0.z \times src1.x - src1.z \times src0.x

  dst.z = src0.x \times src1.y - src1.x \times src0.y

  dst.w = 1


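The XPD equations are the usual three-component cross product with ``dst.w``
forced to 1. A Python sketch, with the hypothetical helper name ``xpd`` and
registers modelled as 4-tuples:

```python
def xpd(src0, src1):
    """Model of XPD: cross product of the xyz parts, with dst.w set to 1."""
    return (
        src0[1] * src1[2] - src1[1] * src0[2],
        src0[2] * src1[0] - src1[2] * src0[0],
        src0[0] * src1[1] - src1[0] * src0[1],
        1.0,
    )
```

For instance, crossing the x axis with the y axis gives the z axis, and the w
components of the sources are never read.
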
.. opcode:: ABS - Absolute

.. math::

  dst.x = |src.x|

  dst.y = |src.y|

  dst.z = |src.z|

  dst.w = |src.w|


.. opcode:: RCC - Reciprocal Clamped

This instruction replicates its result.

XXX cleanup on aisle three

.. math::

  dst = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)


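Read literally, the RCC formula clamps the magnitude of the reciprocal while
preserving its sign. The Python sketch below follows that reading; the names
``rcc``, ``RCC_MIN`` and ``RCC_MAX`` are hypothetical, and a zero ``src.x``
(which raises ``ZeroDivisionError`` in Python) is outside what it models.

```python
RCC_MIN = 5.42101e-20    # smallest magnitude allowed by the formula
RCC_MAX = 1.884467e+19   # largest magnitude allowed by the formula


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(value, high))


def rcc(src):
    """Model of RCC: sign-preserving clamped reciprocal, replicated."""
    reciprocal = 1.0 / src[0]
    if reciprocal > 0.0:
        result = clamp(reciprocal, RCC_MIN, RCC_MAX)
    else:
        result = clamp(reciprocal, -RCC_MAX, -RCC_MIN)
    return (result, result, result, result)
```

Ordinary inputs pass through unchanged; only very large or very small
reciprocals are pulled back to the stated limits.
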
.. opcode:: DPH - Homogeneous Dot Product

This instruction replicates its result.

.. math::

  dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w


.. opcode:: COS - Cosine

This instruction replicates its result.

.. math::

  dst = \cos{src.x}


.. opcode:: DDX - Derivative Relative To X

.. math::

  dst.x = partialx(src.x)

  dst.y = partialx(src.y)

  dst.z = partialx(src.z)

  dst.w = partialx(src.w)


.. opcode:: DDY - Derivative Relative To Y

.. math::

  dst.x = partialy(src.x)

  dst.y = partialy(src.y)

  dst.z = partialy(src.z)

  dst.w = partialy(src.w)


.. opcode:: KILP - Predicated Discard

  discard


.. opcode:: PK2H - Pack Two 16-bit Floats

  TBD


.. opcode:: PK2US - Pack Two Unsigned 16-bit Scalars

  TBD


.. opcode:: PK4B - Pack Four Signed 8-bit Scalars

  TBD


.. opcode:: PK4UB - Pack Four Unsigned 8-bit Scalars

  TBD


.. opcode:: RFL - Reflection Vector

.. math::

  dst.x = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.x - src1.x

  dst.y = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.y - src1.y

  dst.z = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.z - src1.z

  dst.w = 1

.. note::

   Considered for removal.


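The three long RFL equations share one scale factor: twice the dot product of
*src0* and *src1*, divided by the squared length of *src0*. A Python sketch
that factors it out, with hypothetical helper names ``dot3`` and ``rfl`` and
a non-zero *src0* assumed:

```python
def dot3(a, b):
    """Three-component dot product; the w components are ignored."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def rfl(src0, src1):
    """Model of RFL: reflect src1 about the axis src0, with dst.w set to 1."""
    scale = 2.0 * dot3(src0, src1) / dot3(src0, src0)
    return (
        scale * src0[0] - src1[0],
        scale * src0[1] - src1[1],
        scale * src0[2] - src1[2],
        1.0,
    )
```

Because of the division by the squared length, *src0* does not need to be
normalized in this model.
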
.. opcode:: SEQ - Set On Equal

.. math::

  dst.x = (src0.x == src1.x) ? 1 : 0

  dst.y = (src0.y == src1.y) ? 1 : 0

  dst.z = (src0.z == src1.z) ? 1 : 0

  dst.w = (src0.w == src1.w) ? 1 : 0


.. opcode:: SFL - Set On False

This instruction replicates its result.

.. math::

  dst = 0

.. note::

   Considered for removal.


.. opcode:: SGT - Set On Greater Than

.. math::

  dst.x = (src0.x > src1.x) ? 1 : 0

  dst.y = (src0.y > src1.y) ? 1 : 0

  dst.z = (src0.z > src1.z) ? 1 : 0

  dst.w = (src0.w > src1.w) ? 1 : 0


.. opcode:: SIN - Sine

This instruction replicates its result.

.. math::

  dst = \sin{src.x}


.. opcode:: SLE - Set On Less Equal Than

.. math::

  dst.x = (src0.x <= src1.x) ? 1 : 0

  dst.y = (src0.y <= src1.y) ? 1 : 0

  dst.z = (src0.z <= src1.z) ? 1 : 0

  dst.w = (src0.w <= src1.w) ? 1 : 0


.. opcode:: SNE - Set On Not Equal

.. math::

  dst.x = (src0.x != src1.x) ? 1 : 0

  dst.y = (src0.y != src1.y) ? 1 : 0

  dst.z = (src0.z != src1.z) ? 1 : 0

  dst.w = (src0.w != src1.w) ? 1 : 0


.. opcode:: STR - Set On True

This instruction replicates its result.

.. math::

  dst = 1


.. opcode:: TEX - Texture Lookup

.. math::

  coord = src0

  bias = 0.0

  dst = texture_sample(unit, coord, bias)

For array textures, src0.y contains the slice for 1D arrays and src0.z
contains the slice for 2D arrays. For shadow textures without arrays,
src0.z contains the reference value. For shadow textures with arrays,
src0.z contains the reference value for 1D arrays and src0.w contains the
reference value for 2D arrays. There is no way to pass a bias in the .w
value for shadow arrays, and GLSL doesn't allow this. GLSL does allow cube
shadow maps to take a bias value; how this will look in TGSI is still to be
determined.

.. opcode:: TXD - Texture Lookup with Derivatives

.. math::

  coord = src0

  ddx = src1

  ddy = src2

  bias = 0.0

  dst = texture_sample_deriv(unit, coord, bias, ddx, ddy)


.. opcode:: TXP - Projective Texture Lookup

.. math::

  coord.x = src0.x / src0.w

  coord.y = src0.y / src0.w

  coord.z = src0.z / src0.w

  coord.w = src0.w

  bias = 0.0

  dst = texture_sample(unit, coord, bias)


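The only difference between TXP and TEX in these equations is the projective
divide applied to the coordinate before sampling. A Python sketch of just that
step, using the hypothetical helper name ``txp_coord`` and assuming a non-zero
``src0.w``; the sampling itself is not modelled:

```python
def txp_coord(src0):
    """Model of the TXP coordinate setup: divide x, y, z by w and keep w."""
    x, y, z, w = src0
    return (x / w, y / w, z / w, w)
```

The resulting tuple plays the role of ``coord`` in the equations above.
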
.. opcode:: UP2H - Unpack Two 16-Bit Floats

  TBD

.. note::

   Considered for removal.

.. opcode:: UP2US - Unpack Two Unsigned 16-Bit Scalars

  TBD

.. note::

   Considered for removal.

.. opcode:: UP4B - Unpack Four Signed 8-Bit Values

  TBD

.. note::

   Considered for removal.

.. opcode:: UP4UB - Unpack Four Unsigned 8-Bit Scalars

  TBD

.. note::

   Considered for removal.

.. opcode:: X2D - 2D Coordinate Transformation

.. math::

  dst.x = src0.x + src1.x \times src2.x + src1.y \times src2.y

  dst.y = src0.y + src1.x \times src2.z + src1.y \times src2.w

  dst.z = src0.x + src1.x \times src2.x + src1.y \times src2.y

  dst.w = src0.y + src1.x \times src2.z + src1.y \times src2.w

.. note::

   Considered for removal.


.. opcode:: ARA - Address Register Add

  TBD

.. note::

   Considered for removal.

.. opcode:: ARR - Address Register Load With Round

.. math::

  dst.x = round(src.x)

  dst.y = round(src.y)

  dst.z = round(src.z)

  dst.w = round(src.w)


.. opcode:: BRA - Branch

  pc = target

.. note::

   Considered for removal.

.. opcode:: CAL - Subroutine Call

  push(pc)
  pc = target


.. opcode:: RET - Subroutine Call Return

  pc = pop()


.. opcode:: SSG - Set Sign

.. math::

  dst.x = (src.x > 0) ? 1 : (src.x < 0) ? -1 : 0

  dst.y = (src.y > 0) ? 1 : (src.y < 0) ? -1 : 0

  dst.z = (src.z > 0) ? 1 : (src.z < 0) ? -1 : 0

  dst.w = (src.w > 0) ? 1 : (src.w < 0) ? -1 : 0


.. opcode:: CMP - Compare

.. math::

  dst.x = (src0.x < 0) ? src1.x : src2.x

  dst.y = (src0.y < 0) ? src1.y : src2.y

  dst.z = (src0.z < 0) ? src1.z : src2.z

  dst.w = (src0.w < 0) ? src1.w : src2.w
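
SSG and CMP both act on each component independently. As a rough
illustration, the sketch below models registers as Python 4-tuples; the
function names ``ssg`` and ``cmp`` are chosen here for illustration and are
not part of any Gallium API.

.. code-block:: python

   def ssg(src):
       """SSG: per-component sign, yielding 1, -1 or 0."""
       return tuple(1.0 if c > 0 else -1.0 if c < 0 else 0.0 for c in src)


   def cmp(src0, src1, src2):
       """CMP: take src1 where src0 is negative, src2 everywhere else."""
       return tuple(b if a < 0 else c for a, b, c in zip(src0, src1, src2))


   print(ssg((3.0, -0.5, 0.0, 7.0)))  # (1.0, -1.0, 0.0, 1.0)
   print(cmp((-1.0, 0.0, 2.0, -3.0), (10, 11, 12, 13), (20, 21, 22, 23)))

Note that a zero in src0 selects src2, because the CMP test is strictly
"less than zero".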


.. opcode:: KIL - Conditional Discard

.. math::

  if (src.x < 0 || src.y < 0 || src.z < 0 || src.w < 0)
    discard
  endif


.. opcode:: SCS - Sine Cosine

.. math::

  dst.x = \cos{src.x}

  dst.y = \sin{src.x}

  dst.z = 0

  dst.w = 1


.. opcode:: TXB - Texture Lookup With Bias

.. math::

  coord.x = src.x

  coord.y = src.y

  coord.z = src.z

  coord.w = 1.0

  bias = src.w

  dst = texture_sample(unit, coord, bias)


.. opcode:: NRM - 3-component Vector Normalise

.. math::

  dst.x = src.x / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}

  dst.y = src.y / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}

  dst.z = src.z / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}

  dst.w = 1
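
A small numeric check of NRM, written as a Python sketch. It assumes the
conventional meaning of normalisation, that is, division by the Euclidean
length of the xyz part; ``nrm`` is a hypothetical helper name, and a
zero-length input is deliberately not handled.

.. code-block:: python

   import math


   def nrm(src):
       """Normalise the xyz part of a 4-tuple; w is set to 1."""
       x, y, z, _ = src
       length = math.sqrt(x * x + y * y + z * z)
       return (x / length, y / length, z / length, 1.0)


   print(nrm((3.0, 0.0, 4.0, 9.0)))  # (0.6, 0.0, 0.8, 1.0)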


.. opcode:: DIV - Divide

.. math::

  dst.x = \frac{src0.x}{src1.x}

  dst.y = \frac{src0.y}{src1.y}

  dst.z = \frac{src0.z}{src1.z}

  dst.w = \frac{src0.w}{src1.w}


.. opcode:: DP2 - 2-component Dot Product

This instruction replicates its result.

.. math::

  dst = src0.x \times src1.x + src0.y \times src1.y
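
To make *replicated* concrete, here is a minimal Python sketch of DP2 with
registers modelled as 4-tuples. The name ``dp2`` is illustrative only.

.. code-block:: python

   def dp2(src0, src1):
       """DP2: dot product of the xy parts, replicated to all four components."""
       scalar = src0[0] * src1[0] + src0[1] * src1[1]
       return (scalar,) * 4


   print(dp2((1.0, 2.0, 9.0, 9.0), (3.0, 4.0, 9.0, 9.0)))  # (11.0, 11.0, 11.0, 11.0)

The z and w components of both sources do not influence the result.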


.. opcode:: TXL - Texture Lookup With Explicit LOD

.. math::

  coord.x = src0.x

  coord.y = src0.y

  coord.z = src0.z

  coord.w = 1.0

  lod = src0.w

  dst = texture_sample(unit, coord, lod)


.. opcode:: BRK - Break

  TBD


.. opcode:: IF - If

  TBD


.. opcode:: ELSE - Else

  TBD


.. opcode:: ENDIF - End If

  TBD


.. opcode:: PUSHA - Push Address Register On Stack

  push(src.x)
  push(src.y)
  push(src.z)
  push(src.w)

.. note::

   Considered for cleanup.

.. note::

   Considered for removal.

.. opcode:: POPA - Pop Address Register From Stack

  dst.w = pop()
  dst.z = pop()
  dst.y = pop()
  dst.x = pop()

.. note::

   Considered for cleanup.

.. note::

   Considered for removal.
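
PUSHA pushes x first and w last, so POPA has to pop w first for every value
to land back in its original component. A minimal sketch, using a plain
Python list as the stack (the helper names are hypothetical):

.. code-block:: python

   def pusha(stack, src):
       """PUSHA: push x, y, z, w in that order."""
       for component in src:
           stack.append(component)


   def popa(stack):
       """POPA: pop in reverse order so each value returns to its component."""
       w = stack.pop()
       z = stack.pop()
       y = stack.pop()
       x = stack.pop()
       return (x, y, z, w)


   stack = []
   pusha(stack, (1, 2, 3, 4))
   print(popa(stack))  # (1, 2, 3, 4)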


Compute ISA
^^^^^^^^^^^^^^^^^^^^^^^^

These opcodes are primarily provided for special-use computational shaders.
Support for these opcodes is indicated by a special pipe capability bit (TBD).

XXX so let's discuss it, yeah?

.. opcode:: CEIL - Ceiling

.. math::

  dst.x = \lceil src.x\rceil

  dst.y = \lceil src.y\rceil

  dst.z = \lceil src.z\rceil

  dst.w = \lceil src.w\rceil


.. opcode:: I2F - Integer To Float

.. math::

  dst.x = (float) src.x

  dst.y = (float) src.y

  dst.z = (float) src.z

  dst.w = (float) src.w


.. opcode:: NOT - Bitwise Not

.. math::

  dst.x = ~src.x

  dst.y = ~src.y

  dst.z = ~src.z

  dst.w = ~src.w


.. opcode:: TRUNC - Truncate

.. math::

  dst.x = trunc(src.x)

  dst.y = trunc(src.y)

  dst.z = trunc(src.z)

  dst.w = trunc(src.w)


.. opcode:: SHL - Shift Left

.. math::

  dst.x = src0.x << src1.x

  dst.y = src0.y << src1.x

  dst.z = src0.z << src1.x

  dst.w = src0.w << src1.x


.. opcode:: SHR - Shift Right

.. math::

  dst.x = src0.x >> src1.x

  dst.y = src0.y >> src1.x

  dst.z = src0.z >> src1.x

  dst.w = src0.w >> src1.x
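
Unlike most opcodes in this section, SHL and SHR take the shift count for all
four components from src1.x alone. The Python sketch below illustrates this.
It assumes components are unsigned 32-bit integers and that SHR is a logical
shift; the text above specifies neither, so treat both as assumptions of this
example.

.. code-block:: python

   MASK32 = 0xFFFFFFFF  # assumption: components are unsigned 32-bit integers


   def shl(src0, src1):
       """SHL: shift every component of src0 left by src1.x."""
       count = src1[0]
       return tuple((c << count) & MASK32 for c in src0)


   def shr(src0, src1):
       """SHR: shift every component of src0 right by src1.x (logical shift)."""
       count = src1[0]
       return tuple((c & MASK32) >> count for c in src0)


   print(shl((1, 2, 3, 0x80000000), (4, 99, 99, 99)))  # (16, 32, 48, 0)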


.. opcode:: AND - Bitwise And

.. math::

  dst.x = src0.x & src1.x

  dst.y = src0.y & src1.y

  dst.z = src0.z & src1.z

  dst.w = src0.w & src1.w


.. opcode:: OR - Bitwise Or

.. math::

  dst.x = src0.x | src1.x

  dst.y = src0.y | src1.y

  dst.z = src0.z | src1.z

  dst.w = src0.w | src1.w


.. opcode:: MOD - Modulus

.. math::

  dst.x = src0.x \bmod src1.x

  dst.y = src0.y \bmod src1.y

  dst.z = src0.z \bmod src1.z

  dst.w = src0.w \bmod src1.w


.. opcode:: XOR - Bitwise Xor

.. math::

  dst.x = src0.x \oplus src1.x

  dst.y = src0.y \oplus src1.y

  dst.z = src0.z \oplus src1.z

  dst.w = src0.w \oplus src1.w


.. opcode:: UCMP - Integer Conditional Move

.. math::

  dst.x = src0.x ? src1.x : src2.x

  dst.y = src0.y ? src1.y : src2.y

  dst.z = src0.z ? src1.z : src2.z

  dst.w = src0.w ? src1.w : src2.w


.. opcode:: UARL - Integer Address Register Load

  Moves the contents of the source register, assumed to be an integer, into the
  destination register, which is assumed to be an address (ADDR) register.


.. opcode:: IABS - Integer Absolute Value

.. math::

  dst.x = |src.x|

  dst.y = |src.y|

  dst.z = |src.z|

  dst.w = |src.w|


.. opcode:: SAD - Sum Of Absolute Differences

.. math::

  dst.x = |src0.x - src1.x| + src2.x

  dst.y = |src0.y - src1.y| + src2.y

  dst.z = |src0.z - src1.z| + src2.z

  dst.w = |src0.w - src1.w| + src2.w
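
As a worked example of SAD, the sketch below evaluates the formula on integer
4-tuples. The name ``sad`` is illustrative; src2 acts as a per-component
accumulator.

.. code-block:: python

   def sad(src0, src1, src2):
       """SAD: |src0 - src1| + src2, computed per component."""
       return tuple(abs(a - b) + c for a, b, c in zip(src0, src1, src2))


   print(sad((5, 1, 7, 0), (2, 4, 7, 9), (10, 10, 10, 10)))  # (13, 13, 10, 19)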


.. opcode:: TXF - Texel Fetch

  As per NV_gpu_shader4, extract a single texel from a specified texture
  image. The source sampler may not be a CUBE or SHADOW. src0 is a
  four-component signed integer vector used to identify the single texel
  accessed: three coordinate components plus the level. src1 is a
  three-component constant signed integer vector, with each component limited
  to the range -8..+8 (hardware only seems to deal with this range, although
  the interface allows for up to an unsigned int).

  ``TXF(uint_vec coord, int_vec offset)``


.. opcode:: TXQ - Texture Size Query

  As per NV_gpu_program4, retrieve the dimensions of the texture depending on
  the target: 1D (width), 2D/RECT/CUBE (width, height), 3D (width, height,
  depth), 1D array (width, layers), 2D array (width, height, layers).

.. math::

  lod = src0

  dst.x = texture_width(unit, lod)

  dst.y = texture_height(unit, lod)

  dst.z = texture_depth(unit, lod)


.. opcode:: CONT - Continue

  TBD

.. note::

   Support for CONT is determined by a special capability bit,
   ``TGSI_CONT_SUPPORTED``. See :ref:`Screen` for more information.


Geometry ISA
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These opcodes are only supported in geometry shaders; they have no meaning
in any other type of shader.

.. opcode:: EMIT - Emit

  TBD


.. opcode:: ENDPRIM - End Primitive

  TBD


GLSL ISA
^^^^^^^^^^

These opcodes are part of :term:`GLSL`'s opcode set. Support for these
opcodes is determined by a special capability bit, ``GLSL``.

.. opcode:: BGNLOOP - Begin a Loop

  TBD


.. opcode:: BGNSUB - Begin Subroutine

  TBD


.. opcode:: ENDLOOP - End a Loop

  TBD


.. opcode:: ENDSUB - End Subroutine

  TBD


.. opcode:: NOP - No Operation

  Do nothing.


.. opcode:: NRM4 - 4-component Vector Normalise

This instruction replicates its result.

.. math::

  dst = \frac{src.x}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}}


ps_2_x
^^^^^^^^^^^^

XXX wait what

.. opcode:: CALLNZ - Subroutine Call If Not Zero

  TBD


.. opcode:: IFC - If

  TBD


.. opcode:: BREAKC - Break Conditional

  TBD

.. _doubleopcodes:

Double ISA
^^^^^^^^^^^^^^^

The double-precision opcodes reinterpret four-component vectors into
two-component vectors with doubled precision in each component.
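
One way to picture this reinterpretation: each double occupies a pair of
32-bit channels, xy or zw. The Python sketch below splits a double into two
32-bit words and reassembles it. Which channel of a pair holds the low word
is not stated in this document; the low-word-first order used here is an
assumption of the example, as are the helper names.

.. code-block:: python

   import struct


   def pack_double(value):
       """Split a double into two 32-bit words (assumed order: low, high)."""
       return struct.unpack("<II", struct.pack("<d", value))


   def unpack_double(lo, hi):
       """Reassemble a double from the two 32-bit words."""
       return struct.unpack("<d", struct.pack("<II", lo, hi))[0]


   lo, hi = pack_double(1.0)
   print(hex(lo), hex(hi))       # 0x0 0x3ff00000
   print(unpack_double(lo, hi))  # 1.0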

Support for these opcodes is XXX undecided. :T

.. opcode:: DADD - Add

.. math::

  dst.xy = src0.xy + src1.xy

  dst.zw = src0.zw + src1.zw


.. opcode:: DDIV - Divide

.. math::

  dst.xy = src0.xy / src1.xy

  dst.zw = src0.zw / src1.zw

.. opcode:: DSEQ - Set on Equal

.. math::

  dst.xy = src0.xy == src1.xy ? 1.0F : 0.0F

  dst.zw = src0.zw == src1.zw ? 1.0F : 0.0F

.. opcode:: DSLT - Set on Less than

.. math::

  dst.xy = src0.xy < src1.xy ? 1.0F : 0.0F

  dst.zw = src0.zw < src1.zw ? 1.0F : 0.0F

.. opcode:: DFRAC - Fraction

.. math::

  dst.xy = src.xy - \lfloor src.xy\rfloor

  dst.zw = src.zw - \lfloor src.zw\rfloor


.. opcode:: DFRACEXP - Convert Number to Fractional and Integral Components

Like the ``frexp()`` routine in many math libraries, this opcode stores the
exponent of its source to ``dst0``, and the significand to ``dst1``, such that
:math:`dst1 \times 2^{dst0} = src`.

.. math::

  dst0.xy = exp(src.xy)

  dst1.xy = frac(src.xy)

  dst0.zw = exp(src.zw)

  dst1.zw = frac(src.zw)

.. opcode:: DLDEXP - Multiply Number by Integral Power of 2

This opcode is the inverse of :opcode:`DFRACEXP`.

.. math::

  dst.xy = src0.xy \times 2^{src1.xy}

  dst.zw = src0.zw \times 2^{src1.zw}
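
The round trip between DFRACEXP and DLDEXP can be illustrated with the
``frexp()`` and ``ldexp()`` routines from Python's standard ``math`` module.
This is a sketch on a single double; the wrapper names are hypothetical, and
it assumes the opcode uses the same significand range as C ``frexp()``
(magnitude in [0.5, 1)), which this document does not state.

.. code-block:: python

   import math


   def dfracexp(value):
       """Return (dst0, dst1) = (exponent, significand) for one double."""
       significand, exponent = math.frexp(value)
       return exponent, significand


   def dldexp(significand, exponent):
       """Return significand * 2**exponent, the inverse of dfracexp."""
       return math.ldexp(significand, exponent)


   dst0, dst1 = dfracexp(12.0)
   print(dst0, dst1)          # 4 0.75
   print(dldexp(dst1, dst0))  # 12.0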

.. opcode:: DMIN - Minimum

.. math::

  dst.xy = min(src0.xy, src1.xy)

  dst.zw = min(src0.zw, src1.zw)

.. opcode:: DMAX - Maximum

.. math::

  dst.xy = max(src0.xy, src1.xy)

  dst.zw = max(src0.zw, src1.zw)

.. opcode:: DMUL - Multiply

.. math::

  dst.xy = src0.xy \times src1.xy

  dst.zw = src0.zw \times src1.zw


.. opcode:: DMAD - Multiply And Add

.. math::

  dst.xy = src0.xy \times src1.xy + src2.xy

  dst.zw = src0.zw \times src1.zw + src2.zw


.. opcode:: DRCP - Reciprocal

.. math::

  dst.xy = \frac{1}{src.xy}

  dst.zw = \frac{1}{src.zw}

.. opcode:: DSQRT - Square Root

.. math::

  dst.xy = \sqrt{src.xy}

  dst.zw = \sqrt{src.zw}


.. _samplingopcodes:

Resource Sampling Opcodes
^^^^^^^^^^^^^^^^^^^^^^^^^

These opcodes closely follow the semantics of the respective Direct3D
instructions. If in doubt, double-check the Direct3D documentation.

.. opcode:: SAMPLE - Sample Data From Texture

  Using the provided address, sample data from the specified texture using the
  filtering mode identified by the given sampler. The source data may come
  from any resource type other than buffers.

  Syntax: ``SAMPLE dst, address, sampler_view, sampler``

  Example: ``SAMPLE TEMP[0], TEMP[1], SVIEW[0], SAMP[0]``

.. opcode:: SAMPLE_I - Fetch Without Filtering

  Simplified alternative to the SAMPLE instruction. Using the provided integer
  address, SAMPLE_I fetches data from the specified sampler view without any
  filtering. The source data may come from any resource type other than CUBE.

  Syntax: ``SAMPLE_I dst, address, sampler_view``

  Example: ``SAMPLE_I TEMP[0], TEMP[1], SVIEW[0]``

  The 'address' is specified as unsigned integers. If the 'address' is out of
  the range [0...(# texels - 1)], the result of the fetch is always 0 in all
  components. As such, the instruction doesn't honor address wrap modes; in
  cases where that behavior is desirable, the SAMPLE instruction should be
  used.

  address.w always provides an unsigned integer mipmap level. If the value is
  out of range, the instruction always returns 0 in all components.

  address.yz are ignored for buffers and 1D textures. address.z is ignored for
  1D texture arrays and 2D textures.

  For 1D texture arrays, address.y provides the array index (also as an
  unsigned integer). If the value is out of the range of available array
  indices [0...(array size - 1)], the opcode always returns 0 in all
  components.

  For 2D texture arrays, address.z provides the array index; otherwise it
  exhibits the same behavior as in the case of 1D texture arrays.

  The exact semantics of the source address are presented in the table below:

  +-----------------------+-----+-----+-----+---------+
  | Resource type         | X   | Y   | Z   | W       |
  +=======================+=====+=====+=====+=========+
  | PIPE_BUFFER           | x   |     |     | ignored |
  +-----------------------+-----+-----+-----+---------+
  | PIPE_TEXTURE_1D       | x   |     |     | mpl     |
  +-----------------------+-----+-----+-----+---------+
  | PIPE_TEXTURE_2D       | x   | y   |     | mpl     |
  +-----------------------+-----+-----+-----+---------+
  | PIPE_TEXTURE_3D       | x   | y   | z   | mpl     |
  +-----------------------+-----+-----+-----+---------+
  | PIPE_TEXTURE_RECT     | x   | y   |     | mpl     |
  +-----------------------+-----+-----+-----+---------+
  | PIPE_TEXTURE_CUBE     | not allowed as source     |
  +-----------------------+-----+-----+-----+---------+
  | PIPE_TEXTURE_1D_ARRAY | x   | idx |     | mpl     |
  +-----------------------+-----+-----+-----+---------+
  | PIPE_TEXTURE_2D_ARRAY | x   | y   | idx | mpl     |
  +-----------------------+-----+-----+-----+---------+

  Where 'mpl' is a mipmap level and 'idx' is the array index.
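
The out-of-range rules for SAMPLE_I can be summarised in a few lines of
Python. This is a reference sketch for the 1D texture array case only, where
the address is (x, idx, ignored, mpl). The nested-list texture layout
``levels[mpl][idx][x]`` and the function name are assumptions made for the
example, not part of any Gallium interface.

.. code-block:: python

   ZERO = (0, 0, 0, 0)


   def sample_i_1d_array(levels, address):
       """Fetch an unfiltered texel; any out-of-range index yields zeros."""
       x, idx, _, mpl = address  # address.z is ignored for 1D arrays
       if not 0 <= mpl < len(levels):
           return ZERO           # mipmap level out of range
       layers = levels[mpl]
       if not 0 <= idx < len(layers):
           return ZERO           # array index out of range
       row = layers[idx]
       if not 0 <= x < len(row):
           return ZERO           # texel out of range, no wrap mode applied
       return row[x]


   # Two mip levels, two array layers each; texels are 4-tuples.
   levels = [
       [[(1, 1, 1, 1), (2, 2, 2, 2)], [(3, 3, 3, 3), (4, 4, 4, 4)]],
       [[(5, 5, 5, 5)], [(6, 6, 6, 6)]],
   ]
   print(sample_i_1d_array(levels, (1, 1, 0, 0)))  # (4, 4, 4, 4)
   print(sample_i_1d_array(levels, (2, 0, 0, 0)))  # (0, 0, 0, 0)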

.. opcode:: SAMPLE_I_MS - Fetch From Multi-Sampled Surface

  Just like SAMPLE_I but allows fetching data from multi-sampled surfaces.

.. opcode:: SAMPLE_B - Sample With LOD Bias

  Just like the SAMPLE instruction with the exception that an additional bias
  is applied to the level of detail computed as part of the instruction
  execution.

  Syntax: ``SAMPLE_B dst, address, sampler_view, sampler, lod_bias``

  Example: ``SAMPLE_B TEMP[0], TEMP[1], SVIEW[0], SAMP[0], TEMP[2].x``

.. opcode:: SAMPLE_C - Sample With Comparison Filter

  Similar to the SAMPLE instruction, but it performs a comparison filter. The
  operands to SAMPLE_C are identical to SAMPLE, except that there is an
  additional float32 operand, the reference value, which must be a register
  with a single component, or a scalar literal. SAMPLE_C makes the hardware use
  the current sampler's compare_func (in pipe_sampler_state) to compare the
  reference value against the red component value for the source resource at
  each texel that the currently configured texture filter covers based on the
  provided coordinates.

  Syntax: ``SAMPLE_C dst, address, sampler_view.r, sampler, ref_value``

  Example: ``SAMPLE_C TEMP[0], TEMP[1], SVIEW[0].r, SAMP[0], TEMP[2].x``

.. opcode:: SAMPLE_C_LZ - Sample With Comparison Filter At Level Zero

  Same as SAMPLE_C, but LOD is 0 and derivatives are ignored. The LZ stands
  for level-zero.

  Syntax: ``SAMPLE_C_LZ dst, address, sampler_view.r, sampler, ref_value``

  Example: ``SAMPLE_C_LZ TEMP[0], TEMP[1], SVIEW[0].r, SAMP[0], TEMP[2].x``


.. opcode:: SAMPLE_D - Sample With Derivatives

  SAMPLE_D is identical to the SAMPLE opcode except that the derivatives for
  the source address in the x direction and the y direction are provided by
  extra parameters.

  Syntax: ``SAMPLE_D dst, address, sampler_view, sampler, der_x, der_y``

  Example: ``SAMPLE_D TEMP[0], TEMP[1], SVIEW[0], SAMP[0], TEMP[2], TEMP[3]``

.. opcode:: SAMPLE_L - Sample With Explicit LOD

  SAMPLE_L is identical to the SAMPLE opcode except that the LOD is provided
  directly as a scalar value, representing no anisotropy.

  Syntax: ``SAMPLE_L dst, address, sampler_view, sampler, explicit_lod``

  Example: ``SAMPLE_L TEMP[0], TEMP[1], SVIEW[0], SAMP[0], TEMP[2].x``

1431.. opcode:: GATHER4 - Gathers the four texels to be used in a bi-linear
1432 filtering operation and packs them into a single register.
Brian Paul0cd68002012-03-30 09:41:42 -06001433 Only works with 2D, 2D array, cubemaps, and cubemaps arrays.
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001434 For 2D textures, only the addressing modes of the sampler and
1435 the top level of any mip pyramid are used. Set W to zero.
1436 It behaves like the SAMPLE instruction, but a filtered
1437 sample is not generated. The four samples that contribute
Brian Paul0cd68002012-03-30 09:41:42 -06001438 to filtering are placed into xyzw in counter-clockwise order,
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001439 starting with the (u,v) texture coordinate delta at the
1440 following locations (-, +), (+, +), (+, -), (-, -), where
1441 the magnitude of the deltas are half a texel.
1442
1443
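
The GATHER4 component ordering can be written out explicitly. The sketch
below assumes (u, v) is given in unnormalized texel units; the name
``gather4_coords`` is hypothetical and only illustrates which delta feeds
which destination component.

.. code-block:: python

   # (du, dv) deltas in the order they are packed into x, y, z, w.
   GATHER4_DELTAS = ((-0.5, +0.5), (+0.5, +0.5), (+0.5, -0.5), (-0.5, -0.5))

   def gather4_coords(u, v):
       """Map each destination component to the texel location it reads."""
       return {comp: (u + du, v + dv)
               for comp, (du, dv) in zip("xyzw", GATHER4_DELTAS)}
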
.. opcode:: SVIEWINFO - query the dimensions of a given sampler view.
   dst receives width, height, depth or array size and
   number of mipmap levels as int4. The dst can have a writemask
   which specifies what info the caller is interested
   in.
   SVIEWINFO dst, src_mip_level, sampler_view
   e.g.
   SVIEWINFO TEMP[0], TEMP[1].x, SVIEW[0]
   src_mip_level is an unsigned integer scalar. If it's
   out of range then 0 is returned for width, height and
   depth/array size but the total number of mipmap levels is
   still returned correctly for the given sampler view.
   The returned width, height and depth values are for
   the mipmap level selected by the src_mip_level and
   are in the number of texels.
   For 1D texture arrays, width is in dst.x, array size
   is in dst.y and dst.zw are always 0.

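
The out-of-range behaviour of SVIEWINFO can be modelled as follows. This is
a sketch for a 3D sampler view only; ``sviewinfo`` is a hypothetical helper,
and halving each dimension per mipmap level (with a minimum of 1) is the
conventional rule, assumed here rather than stated above.

.. code-block:: python

   def sviewinfo(base_size, num_levels, mip_level):
       """Return (width, height, depth, num_levels) for a 3D sampler view."""
       if not 0 <= mip_level < num_levels:
           # Out of range: sizes are 0, the level count is still reported.
           return (0, 0, 0, num_levels)
       w, h, d = (max(1, s >> mip_level) for s in base_size)
       return (w, h, d, num_levels)
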
.. opcode:: SAMPLE_POS - query the position of a given sample.
   dst receives float4 (x, y, 0, 0) indicating where the
   sample is located. If the resource is not a multi-sample
   resource and not a render target, the result is 0.

.. opcode:: SAMPLE_INFO - dst receives number of samples in x.
   If the resource is not a multi-sample resource and
   not a render target, the result is 0.


.. _resourceopcodes:

Resource Access Opcodes
^^^^^^^^^^^^^^^^^^^^^^^

.. opcode:: LOAD - Fetch data from a shader resource

   Syntax: ``LOAD dst, resource, address``

   Example: ``LOAD TEMP[0], RES[0], TEMP[1]``

   Using the provided integer address, LOAD fetches data
   from the specified buffer or texture without any
   filtering.

   The 'address' is specified as a vector of unsigned
   integers. If the 'address' is out of range the result
   is unspecified.

   Only the first mipmap level of a resource can be read
   from using this instruction.

   For 1D or 2D texture arrays, the array index is
   provided as an unsigned integer in address.y or
   address.z, respectively. address.yz are ignored for
   buffers and 1D textures. address.z is ignored for 1D
   texture arrays and 2D textures. address.w is always
   ignored.

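
The address rules for LOAD (and, identically, STORE) reduce to a small lookup
of which components are consulted per resource type. The table and the
``used_address`` helper below are a hypothetical summary of the paragraph
above, limited to the resource types it mentions explicitly.

.. code-block:: python

   # Address components that are read; everything else is ignored.
   USED_ADDRESS_COMPONENTS = {
       "BUFFER": "x",
       "1D": "x",
       "1DArray": "xy",   # y holds the array index
       "2D": "xy",
       "2DArray": "xyz",  # z holds the array index
   }

   def used_address(resource, address):
       """Drop the components of a 4-vector address that are ignored."""
       names = "xyzw"
       return tuple(address[names.index(c)]
                    for c in USED_ADDRESS_COMPONENTS[resource])
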
.. opcode:: STORE - Write data to a shader resource

   Syntax: ``STORE resource, address, src``

   Example: ``STORE RES[0], TEMP[0], TEMP[1]``

   Using the provided integer address, STORE writes data
   to the specified buffer or texture.

   The 'address' is specified as a vector of unsigned
   integers. If the 'address' is out of range the result
   is unspecified.

   Only the first mipmap level of a resource can be
   written to using this instruction.

   For 1D or 2D texture arrays, the array index is
   provided as an unsigned integer in address.y or
   address.z, respectively. address.yz are ignored for
   buffers and 1D textures. address.z is ignored for 1D
   texture arrays and 2D textures. address.w is always
   ignored.


.. _threadsyncopcodes:

Inter-thread synchronization opcodes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These opcodes are intended for communication between threads running
within the same compute grid. For now they're only valid in compute
programs.

.. opcode:: MFENCE - Memory fence

   Syntax: ``MFENCE resource``

   Example: ``MFENCE RES[0]``

   This opcode forces strong ordering between any memory access
   operations that affect the specified resource. This means that
   previous loads and stores (and only those) will be performed and
   visible to other threads before the program execution continues.


.. opcode:: LFENCE - Load memory fence

   Syntax: ``LFENCE resource``

   Example: ``LFENCE RES[0]``

   Similar to MFENCE, but it only affects the ordering of memory loads.


.. opcode:: SFENCE - Store memory fence

   Syntax: ``SFENCE resource``

   Example: ``SFENCE RES[0]``

   Similar to MFENCE, but it only affects the ordering of memory stores.


.. opcode:: BARRIER - Thread group barrier

   ``BARRIER``

   This opcode suspends the execution of the current thread until all
   the remaining threads in the working group reach the same point of
   the program. Results are unspecified if any of the remaining
   threads terminates or never reaches an executed BARRIER instruction.


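
As a CPU-side analogy for BARRIER, the sketch below uses Python's
``threading.Barrier``. It is not TGSI and says nothing about how a driver
implements the opcode; ``run_group`` and its arrays are hypothetical. It
only illustrates the guarantee that no thread proceeds past the barrier until
every thread in the group has arrived.

.. code-block:: python

   import threading

   def run_group(num_threads):
       """Each thread writes one slot, waits, then reads every slot."""
       barrier = threading.Barrier(num_threads)
       shared = [0] * num_threads
       sums = [0] * num_threads

       def worker(tid):
           shared[tid] = tid + 1      # phase 1: write this thread's slot
           barrier.wait()             # wait for the whole group
           sums[tid] = sum(shared)    # phase 2: all writes are visible

       threads = [threading.Thread(target=worker, args=(t,))
                  for t in range(num_threads)]
       for t in threads:
           t.start()
       for t in threads:
           t.join()
       return sums

Without the ``barrier.wait()`` call, a thread could compute its sum before
the others had written their slots.
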
.. _atomopcodes:

Atomic opcodes
^^^^^^^^^^^^^^

These opcodes provide atomic variants of some common arithmetic and
logical operations. In this context atomicity means that another
concurrent memory access operation that affects the same memory
location is guaranteed to be performed strictly before or after the
entire execution of the atomic operation.

For the moment they're only valid in compute programs.

.. opcode:: ATOMUADD - Atomic integer addition

   Syntax: ``ATOMUADD dst, resource, offset, src``

   Example: ``ATOMUADD TEMP[0], RES[0], TEMP[1], TEMP[2]``

   The following operation is performed atomically on each component:

.. math::

  dst_i = resource[offset]_i

  resource[offset]_i = dst_i + src_i


.. opcode:: ATOMXCHG - Atomic exchange

   Syntax: ``ATOMXCHG dst, resource, offset, src``

   Example: ``ATOMXCHG TEMP[0], RES[0], TEMP[1], TEMP[2]``

   The following operation is performed atomically on each component:

.. math::

  dst_i = resource[offset]_i

  resource[offset]_i = src_i


.. opcode:: ATOMCAS - Atomic compare-and-exchange

   Syntax: ``ATOMCAS dst, resource, offset, cmp, src``

   Example: ``ATOMCAS TEMP[0], RES[0], TEMP[1], TEMP[2], TEMP[3]``

   The following operation is performed atomically on each component:

.. math::

  dst_i = resource[offset]_i

  resource[offset]_i = (dst_i == cmp_i ? src_i : dst_i)


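
The read-modify-write pattern shared by ATOMUADD and ATOMCAS can be modelled
for a single component as below. This is a sketch under assumptions: the
``Resource`` class is hypothetical, a lock stands in for the hardware's
atomicity guarantee, and wrapping the sum to 32 bits is assumed rather than
specified above.

.. code-block:: python

   import threading

   class Resource:
       """Toy model of a resource holding one word per offset."""

       def __init__(self, size):
           self.words = [0] * size
           self._lock = threading.Lock()  # stands in for hardware atomicity

       def atomuadd(self, offset, src):
           with self._lock:
               old = self.words[offset]
               self.words[offset] = (old + src) & 0xFFFFFFFF  # assumed wrap
               return old  # dst receives the previous value

       def atomcas(self, offset, cmp, src):
           with self._lock:
               old = self.words[offset]
               if old == cmp:
                   self.words[offset] = src
               return old  # dst receives the previous value either way

In both cases dst receives the value that was stored before the operation,
so a caller of ``atomcas`` can tell whether the exchange happened by
comparing the result with ``cmp``.
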
.. opcode:: ATOMAND - Atomic bitwise And

   Syntax: ``ATOMAND dst, resource, offset, src``

   Example: ``ATOMAND TEMP[0], RES[0], TEMP[1], TEMP[2]``

   The following operation is performed atomically on each component:

.. math::

  dst_i = resource[offset]_i

  resource[offset]_i = dst_i \& src_i


.. opcode:: ATOMOR - Atomic bitwise Or

   Syntax: ``ATOMOR dst, resource, offset, src``

   Example: ``ATOMOR TEMP[0], RES[0], TEMP[1], TEMP[2]``

   The following operation is performed atomically on each component:

.. math::

  dst_i = resource[offset]_i

  resource[offset]_i = dst_i | src_i


.. opcode:: ATOMXOR - Atomic bitwise Xor

   Syntax: ``ATOMXOR dst, resource, offset, src``

   Example: ``ATOMXOR TEMP[0], RES[0], TEMP[1], TEMP[2]``

   The following operation is performed atomically on each component:

.. math::

  dst_i = resource[offset]_i

  resource[offset]_i = dst_i \oplus src_i


.. opcode:: ATOMUMIN - Atomic unsigned minimum

   Syntax: ``ATOMUMIN dst, resource, offset, src``

   Example: ``ATOMUMIN TEMP[0], RES[0], TEMP[1], TEMP[2]``

   The following operation is performed atomically on each component:

.. math::

  dst_i = resource[offset]_i

  resource[offset]_i = (dst_i < src_i ? dst_i : src_i)


.. opcode:: ATOMUMAX - Atomic unsigned maximum

   Syntax: ``ATOMUMAX dst, resource, offset, src``

   Example: ``ATOMUMAX TEMP[0], RES[0], TEMP[1], TEMP[2]``

   The following operation is performed atomically on each component:

.. math::

  dst_i = resource[offset]_i

  resource[offset]_i = (dst_i > src_i ? dst_i : src_i)


.. opcode:: ATOMIMIN - Atomic signed minimum

   Syntax: ``ATOMIMIN dst, resource, offset, src``

   Example: ``ATOMIMIN TEMP[0], RES[0], TEMP[1], TEMP[2]``

   The following operation is performed atomically on each component:

.. math::

  dst_i = resource[offset]_i

  resource[offset]_i = (dst_i < src_i ? dst_i : src_i)


.. opcode:: ATOMIMAX - Atomic signed maximum

   Syntax: ``ATOMIMAX dst, resource, offset, src``

   Example: ``ATOMIMAX TEMP[0], RES[0], TEMP[1], TEMP[2]``

   The following operation is performed atomically on each component:

.. math::

  dst_i = resource[offset]_i

  resource[offset]_i = (dst_i > src_i ? dst_i : src_i)



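
The unsigned and signed minimum/maximum opcodes share the same formula and
differ only in how the stored bits are interpreted during the comparison. The
sketch below assumes 32-bit two's complement words; the helper names are
hypothetical and the atomicity itself is not modelled.

.. code-block:: python

   def as_signed32(word):
       """Reinterpret a 32-bit word as a two's complement integer."""
       return word - (1 << 32) if word & 0x80000000 else word

   def atomumin(old, src):
       """Value stored by ATOMUMIN: unsigned comparison of the raw words."""
       return old if old < src else src

   def atomimin(old, src):
       """Value stored by ATOMIMIN: signed comparison of the same words."""
       return old if as_signed32(old) < as_signed32(src) else src

For the word 0xFFFFFFFF and the value 1, the unsigned variant stores 1, while
the signed variant keeps 0xFFFFFFFF because it is read as -1.
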
Explanation of symbols used
------------------------------


Functions
^^^^^^^^^^^^^^


  :math:`|x|`       Absolute value of `x`.

  :math:`\lceil x \rceil` Ceiling of `x`.

  clamp(x,y,z)      Clamp x between y and z.
                    (x < y) ? y : (x > z) ? z : x

  :math:`\lfloor x\rfloor` Floor of `x`.

  :math:`\log_2{x}` Logarithm of `x`, base 2.

  max(x,y)          Maximum of x and y.
                    (x > y) ? x : y

  min(x,y)          Minimum of x and y.
                    (x < y) ? x : y

  partialx(x)       Derivative of x relative to fragment's X.

  partialy(x)       Derivative of x relative to fragment's Y.

  pop()             Pop from stack.

  :math:`x^y`       `x` to the power `y`.

  push(x)           Push x on stack.

  round(x)          Round x.

  trunc(x)          Truncate x, i.e. drop the fraction bits.


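
The conditional expressions given for clamp, max and min translate directly
into code. The sketch below transcribes them into Python; the ``tgsi_``
prefixes are hypothetical and only avoid shadowing the Python built-ins.

.. code-block:: python

   import math

   def clamp(x, y, z):
       """(x < y) ? y : (x > z) ? z : x"""
       return y if x < y else (z if x > z else x)

   def tgsi_max(x, y):
       """(x > y) ? x : y"""
       return x if x > y else y

   def tgsi_min(x, y):
       """(x < y) ? x : y"""
       return x if x < y else y

   def trunc(x):
       """Drop the fraction bits, rounding toward zero."""
       return float(math.trunc(x))
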
Keywords
^^^^^^^^^^^^^


  discard           Discard fragment.

  pc                Program counter.

  target            Label of target instruction.


Other tokens
---------------


Declaration
^^^^^^^^^^^


Declares a register that will be referenced as an operand in Instruction
tokens.

The File field contains the register file that is being declared and is one
of TGSI_FILE_*.

The UsageMask field specifies which of the register components can be accessed
and is one of TGSI_WRITEMASK_*.

The Local flag specifies that a given value isn't intended for
subroutine parameter passing and, as a result, the implementation
isn't required to give any guarantees of it being preserved across
subroutine boundaries. As it's merely a compiler hint, the
implementation is free to ignore it.

If Dimension flag is set to 1, a Declaration Dimension token follows.

If Semantic flag is set to 1, a Declaration Semantic token follows.

If Interpolate flag is set to 1, a Declaration Interpolate token follows.

If file is TGSI_FILE_RESOURCE, a Declaration Resource token follows.


Declaration Semantic
^^^^^^^^^^^^^^^^^^^^^^^^

  Vertex and fragment shader input and output registers may be labeled
  with semantic information consisting of a name and index.

  Follows Declaration token if Semantic bit is set.

  Since its purpose is to link a shader with other stages of the pipeline,
  it is valid to follow only those Declaration tokens that declare a register
  either in INPUT or OUTPUT file.

  SemanticName field contains the semantic name of the register being declared.
  There is no default value.

  SemanticIndex is an optional subscript that can be used to distinguish
  different register declarations with the same semantic name. The default value
  is 0.

  The meanings of the individual semantic names are explained in the following
  sections.

TGSI_SEMANTIC_POSITION
""""""""""""""""""""""

For vertex shaders, TGSI_SEMANTIC_POSITION indicates the vertex shader
output register which contains the homogeneous vertex position in the clip
space coordinate system. After clipping, the X, Y and Z components of the
vertex will be divided by the W value to get normalized device coordinates.

For fragment shaders, TGSI_SEMANTIC_POSITION is used to indicate that
fragment shader input contains the fragment's window position. The X
component starts at zero and always increases from left to right.
The Y component starts at zero and always increases but Y=0 may either
indicate the top of the window or the bottom depending on the fragment
coordinate origin convention (see TGSI_PROPERTY_FS_COORD_ORIGIN).
The Z coordinate ranges from 0 to 1 to represent depth from the front
to the back of the Z buffer. The W component contains the reciprocal
of the interpolated vertex position W component.

Fragment shaders may also declare an output register with
TGSI_SEMANTIC_POSITION. Only the Z component is writable. This allows
the fragment shader to change the fragment's Z position.

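
The two uses of W described for TGSI_SEMANTIC_POSITION can be sketched as
follows. The function names are hypothetical, and the viewport transform that
turns normalized device coordinates into window coordinates is left out.

.. code-block:: python

   def clip_to_ndc(clip):
       """Perspective divide: clip-space (x, y, z, w) to normalized
       device coordinates."""
       x, y, z, w = clip
       return (x / w, y / w, z / w)

   def fragment_w(interpolated_clip_w):
       """W component seen by the fragment shader: the reciprocal of the
       interpolated vertex position W."""
       return 1.0 / interpolated_clip_w
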


TGSI_SEMANTIC_COLOR
"""""""""""""""""""

For vertex shader outputs or fragment shader inputs/outputs, this
label indicates that the register contains an R,G,B,A color.

Several shader inputs/outputs may contain colors so the semantic index
is used to distinguish them. For example, color[0] may be the diffuse
color while color[1] may be the specular color.

This label is needed so that the flat/smooth shading can be applied
to the right interpolants during rasterization.



TGSI_SEMANTIC_BCOLOR
""""""""""""""""""""

Back-facing colors are only used for back-facing polygons, and are only valid
in vertex shader outputs. After rasterization, all polygons are front-facing
and COLOR and BCOLOR end up occupying the same slots in the fragment shader,
so all BCOLORs effectively become regular COLORs in the fragment shader.


TGSI_SEMANTIC_FOG
"""""""""""""""""

Vertex shader inputs and outputs and fragment shader inputs may be
labeled with TGSI_SEMANTIC_FOG to indicate that the register contains
a fog coordinate in the form (F, 0, 0, 1). Typically, the fragment
shader will use the fog coordinate to compute a fog blend factor which
is used to blend the normal fragment color with a constant fog color.

Only the first component matters when writing from the vertex shader;
the driver will ensure that the coordinate is in this format when used
as a fragment shader input.


TGSI_SEMANTIC_PSIZE
"""""""""""""""""""

Vertex shader input and output registers may be labeled with
TGSI_SEMANTIC_PSIZE to indicate that the register contains a point size
in the form (S, 0, 0, 1). The point size controls the width or diameter
of points for rasterization. This label cannot be used in fragment
shaders.

When using this semantic, be sure to set the appropriate state in the
:ref:`rasterizer` first.


TGSI_SEMANTIC_GENERIC
"""""""""""""""""""""

All vertex/fragment shader inputs/outputs not labeled with any other
semantic label can be considered to be generic attributes. Typical
uses of generic inputs/outputs are texcoords and user-defined values.


TGSI_SEMANTIC_NORMAL
""""""""""""""""""""

Indicates that a vertex shader input is a normal vector. This is
typically only used for legacy graphics APIs.


TGSI_SEMANTIC_FACE
""""""""""""""""""

This label applies to fragment shader inputs only and indicates that
the register contains front/back-face information of the form (F, 0,
0, 1). The first component will be positive when the fragment belongs
to a front-facing polygon, and negative when the fragment belongs to a
back-facing polygon.


TGSI_SEMANTIC_EDGEFLAG
""""""""""""""""""""""

For vertex shaders, this semantic label indicates that an input or
output is a boolean edge flag. The register layout is [F, x, x, x]
where F is 0.0 or 1.0 and x = don't care. Normally, the vertex shader
simply copies the edge flag input to the edgeflag output.

Edge flags are used to control which lines or points are actually
drawn when the polygon mode converts triangles/quads/polygons into
points or lines.

TGSI_SEMANTIC_STENCIL
""""""""""""""""""""""

For fragment shaders, this semantic label indicates that an output
is a writable stencil reference value. Only the Y component is writable.
This allows the fragment shader to change the fragment's stencil reference
value.


Declaration Interpolate
^^^^^^^^^^^^^^^^^^^^^^^

This token is only valid for fragment shader INPUT declarations.

The Interpolate field specifies the way input is being interpolated by
the rasteriser and is one of TGSI_INTERPOLATE_*.

The CylindricalWrap bitfield specifies which register components
should be subject to cylindrical wrapping when interpolating by the
rasteriser. If TGSI_CYLINDRICAL_WRAP_X is set to 1, the X component
should be interpolated according to cylindrical wrapping rules.


Declaration Sampler View
^^^^^^^^^^^^^^^^^^^^^^^^

  Follows Declaration token if file is TGSI_FILE_SAMPLER_VIEW.

    DCL SVIEW[#], resource, type(s)

  Declares a shader input sampler view and assigns it to a SVIEW[#]
  register.

  resource can be one of BUFFER, 1D, 2D, 3D, 1DArray and 2DArray.

  type must be 1 or 4 entries (if specifying on a per-component
  level) out of UNORM, SNORM, SINT, UINT and FLOAT.


Declaration Resource
^^^^^^^^^^^^^^^^^^^^

  Follows Declaration token if file is TGSI_FILE_RESOURCE.

    DCL RES[#], resource [, WR] [, RAW]

  Declares a shader input resource and assigns it to a RES[#]
  register.

  resource can be one of BUFFER, 1D, 2D, 3D, CUBE, 1DArray and
  2DArray.

  If the RAW keyword is not specified, the texture data will be
  subject to conversion, swizzling and scaling as required to yield
  the specified data type from the physical data format of the bound
  resource.

  If the RAW keyword is specified, no channel conversion will be
  performed: the values read for each of the channels (X,Y,Z,W) will
  correspond to consecutive words in the same order and format
  they're found in memory. No element-to-address conversion will be
  performed either: the value of the provided X coordinate will be
  interpreted in byte units instead of texel units. The result of
  accessing a misaligned address is undefined.

  Usage of the STORE opcode is only allowed if the WR (writable) flag
  is set.

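
The effect of the RAW keyword on addressing can be illustrated with a byte
buffer. The sketch below assumes 32-bit little-endian words and a 4-byte
alignment requirement, neither of which is stated above; ``raw_load`` is a
hypothetical helper, and raising an error is just one way of modelling the
undefined misaligned case.

.. code-block:: python

   import struct

   def raw_load(buffer, x):
       """Read the X, Y, Z, W channels as four consecutive words,
       starting at byte offset x."""
       if x % 4:
           raise ValueError("misaligned RAW access is undefined")
       return struct.unpack_from("<4I", buffer, x)

Note that the X coordinate counts bytes here: an offset of 8 skips two
words, not eight texels.
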

Properties
^^^^^^^^^^^^^^^^^^^^^^^^


  Properties are general directives that apply to the whole TGSI program.

FS_COORD_ORIGIN
"""""""""""""""

Specifies the fragment shader TGSI_SEMANTIC_POSITION coordinate origin.
The default value is UPPER_LEFT.

If UPPER_LEFT, the position will be (0,0) at the upper left corner and
increase downward and rightward.
If LOWER_LEFT, the position will be (0,0) at the lower left corner and
increase upward and rightward.

OpenGL defaults to LOWER_LEFT, and is configurable with the
GL_ARB_fragment_coord_conventions extension.

DirectX 9/10 use UPPER_LEFT.

FS_COORD_PIXEL_CENTER
"""""""""""""""""""""

Specifies the fragment shader TGSI_SEMANTIC_POSITION pixel center convention.
The default value is HALF_INTEGER.

If HALF_INTEGER, the fractional part of the position will be 0.5.
If INTEGER, the fractional part of the position will be 0.0.

Note that this does not affect the set of fragments generated by
rasterization, which is instead controlled by gl_rasterization_rules in the
rasterizer.

OpenGL defaults to HALF_INTEGER, and is configurable with the
GL_ARB_fragment_coord_conventions extension.

DirectX 9 uses INTEGER.
DirectX 10 uses HALF_INTEGER.

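
The combined effect of FS_COORD_ORIGIN and FS_COORD_PIXEL_CENTER on the
position seen by the fragment shader can be sketched as below. The helper
``fragment_position`` and its string arguments are hypothetical; pixels are
identified by their column and by their row counted from the top of a window
of the given height.

.. code-block:: python

   def fragment_position(col, row_from_top, height,
                         origin="UPPER_LEFT", pixel_center="HALF_INTEGER"):
       """Return the (x, y) position reported for one pixel."""
       frac = 0.5 if pixel_center == "HALF_INTEGER" else 0.0
       if origin == "UPPER_LEFT":
           row = row_from_top
       else:  # LOWER_LEFT: rows are counted from the bottom instead
           row = height - 1 - row_from_top
       return (col + frac, row + frac)

With the defaults, the top-left pixel of a 480-row window is reported at
(0.5, 0.5); with LOWER_LEFT the same pixel is reported at (0.5, 479.5).
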
FS_COLOR0_WRITES_ALL_CBUFS
""""""""""""""""""""""""""
Specifies that writes to the fragment shader color 0 are replicated to all
bound cbufs. This facilitates OpenGL's fragColor output vs fragData[0] where
fragData is directed to a single color buffer, but fragColor is broadcast.

VS_PROHIBIT_UCPS
""""""""""""""""""""""""""
If this property is set on the program bound to the shader stage before the
fragment shader, user clip planes should have no effect (be disabled) even if
that shader does not write to any clip distance outputs and the rasterizer's
clip_plane_enable is non-zero.
This property is only supported by drivers that also support shader clip
distance outputs.
This is useful for APIs that don't have UCPs and where clip distances written
by a shader cannot be disabled.


Texture Sampling and Texture Formats
------------------------------------

This table shows how texture image components are returned as (x,y,z,w) tuples
by TGSI texture instructions, such as :opcode:`TEX`, :opcode:`TXD`, and
:opcode:`TXP`. For reference, OpenGL and Direct3D conventions are shown as
well.

+--------------------+--------------+--------------------+--------------+
| Texture Components | Gallium      | OpenGL             | Direct3D 9   |
+====================+==============+====================+==============+
| R                  | (r, 0, 0, 1) | (r, 0, 0, 1)       | (r, 1, 1, 1) |
+--------------------+--------------+--------------------+--------------+
| RG                 | (r, g, 0, 1) | (r, g, 0, 1)       | (r, g, 1, 1) |
+--------------------+--------------+--------------------+--------------+
| RGB                | (r, g, b, 1) | (r, g, b, 1)       | (r, g, b, 1) |
+--------------------+--------------+--------------------+--------------+
| RGBA               | (r, g, b, a) | (r, g, b, a)       | (r, g, b, a) |
+--------------------+--------------+--------------------+--------------+
| A                  | (0, 0, 0, a) | (0, 0, 0, a)       | (0, 0, 0, a) |
+--------------------+--------------+--------------------+--------------+
| L                  | (l, l, l, 1) | (l, l, l, 1)       | (l, l, l, 1) |
+--------------------+--------------+--------------------+--------------+
| LA                 | (l, l, l, a) | (l, l, l, a)       | (l, l, l, a) |
+--------------------+--------------+--------------------+--------------+
| I                  | (i, i, i, i) | (i, i, i, i)       | N/A          |
+--------------------+--------------+--------------------+--------------+
| UV                 | XXX TBD      | (0, 0, 0, 1)       | (u, v, 1, 1) |
|                    |              | [#envmap-bumpmap]_ |              |
+--------------------+--------------+--------------------+--------------+
| Z                  | XXX TBD      | (z, z, z, 1)       | (0, z, 0, 1) |
|                    |              | [#depth-tex-mode]_ |              |
+--------------------+--------------+--------------------+--------------+
| S                  | (s, s, s, s) | unknown            | unknown      |
+--------------------+--------------+--------------------+--------------+

.. [#envmap-bumpmap] http://www.opengl.org/registry/specs/ATI/envmap_bumpmap.txt
.. [#depth-tex-mode] the default is (z, z, z, 1) but may also be (0, 0, 0, z)
   or (z, z, z, z) depending on the value of GL_DEPTH_TEXTURE_MODE.
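
The Gallium column of the table can be read as a per-format expansion rule.
The sketch below encodes the rows whose Gallium behaviour is defined (UV and
Z are left out because they are marked TBD); ``GALLIUM_EXPANSION`` and
``expand`` are hypothetical names used only for illustration.

.. code-block:: python

   # Each entry names the source of x, y, z, w: a stored channel or a constant.
   GALLIUM_EXPANSION = {
       "R": ("r", "0", "0", "1"),
       "RG": ("r", "g", "0", "1"),
       "RGB": ("r", "g", "b", "1"),
       "RGBA": ("r", "g", "b", "a"),
       "A": ("0", "0", "0", "a"),
       "L": ("l", "l", "l", "1"),
       "LA": ("l", "l", "l", "a"),
       "I": ("i", "i", "i", "i"),
       "S": ("s", "s", "s", "s"),
   }

   def expand(components, texel):
       """Expand the stored channels of a texel to an (x, y, z, w) tuple."""
       constants = {"0": 0.0, "1": 1.0}
       return tuple(constants[c] if c in constants else texel[c]
                    for c in GALLIUM_EXPANSION[components])
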