TGSI
====

TGSI, Tungsten Graphics Shader Infrastructure, is an intermediate language
for describing shaders. Since Gallium is inherently shaderful, shaders are
an important part of the API. TGSI is the only intermediate representation
used by all drivers.

Basics
------

All TGSI instructions, known as *opcodes*, operate on arbitrary-precision
floating-point four-component vectors. An opcode may have up to one
destination register, known as *dst*, and between zero and three source
registers, called *src0* through *src2*, or simply *src* if there is only
one.

Some instructions, like :opcode:`I2F`, permit re-interpretation of vector
components as integers. Other instructions permit using registers as
two-component vectors with double precision; see :ref:`Double Opcodes`.

When an instruction has a scalar result, the result is usually copied into
each of the components of *dst*. When this happens, the result is said to be
*replicated* to *dst*. :opcode:`RCP` is one such instruction.

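As a rough illustration of replication, the following Python sketch models a
register as a four-element tuple ``(x, y, z, w)``. The helper names
``replicate`` and ``rcp`` are assumptions made for this example; they are not
part of TGSI or Gallium.

```python
def replicate(scalar):
    """Copy a scalar result into all four components (x, y, z, w)."""
    return (scalar, scalar, scalar, scalar)


def rcp(src):
    """Model of RCP: only src.x is read, and the reciprocal is replicated."""
    return replicate(1.0 / src[0])


# Only the x component (4.0) influences the result.
rcp_result = rcp((4.0, 9.0, 16.0, 25.0))
```

With these inputs every component of ``rcp_result`` is 0.25, because the
other three source components are ignored.
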
Instruction Set
---------------

Core ISA
^^^^^^^^

These opcodes are guaranteed to be available regardless of the driver being
used.

.. opcode:: ARL - Address Register Load

.. math::

  dst.x = \lfloor src.x\rfloor

  dst.y = \lfloor src.y\rfloor

  dst.z = \lfloor src.z\rfloor

  dst.w = \lfloor src.w\rfloor


.. opcode:: MOV - Move

.. math::

  dst.x = src.x

  dst.y = src.y

  dst.z = src.z

  dst.w = src.w


.. opcode:: LIT - Light Coefficients

.. math::

  dst.x = 1

  dst.y = max(src.x, 0)

  dst.z = (src.x > 0) ? max(src.y, 0)^{clamp(src.w, -128, 128)} : 0

  dst.w = 1


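The LIT formulas can be sketched in Python as follows. This is a minimal
model under stated assumptions: registers are four-element tuples, and the
names ``clamp`` and ``lit`` are hypothetical helpers introduced for this
example only.

```python
def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(value, high))


def lit(src):
    """Model of LIT on a 4-tuple (x, y, z, w); src.z is not read."""
    x, y, _, w = src
    exponent = clamp(w, -128.0, 128.0)
    # The specular term is only computed when src.x is positive.
    # Note: 0.0 raised to a negative exponent raises ZeroDivisionError in
    # Python; this sketch does not try to model that case.
    specular = max(y, 0.0) ** exponent if x > 0.0 else 0.0
    return (1.0, max(x, 0.0), specular, 1.0)


lit_front = lit((0.5, 0.5, 0.0, 2.0))   # src.x > 0: specular term is kept
lit_back = lit((-1.0, 0.5, 0.0, 2.0))   # src.x <= 0: specular term is 0
```

With these inputs ``lit_front`` is ``(1.0, 0.5, 0.25, 1.0)`` and ``lit_back``
is ``(1.0, 0.0, 0.0, 1.0)``.
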
.. opcode:: RCP - Reciprocal

This instruction replicates its result.

.. math::

  dst = \frac{1}{src.x}


.. opcode:: RSQ - Reciprocal Square Root

This instruction replicates its result.

.. math::

  dst = \frac{1}{\sqrt{|src.x|}}


.. opcode:: SQRT - Square Root

This instruction replicates its result.

.. math::

  dst = {\sqrt{src.x}}


.. opcode:: EXP - Approximate Exponential Base 2

.. math::

  dst.x = 2^{\lfloor src.x\rfloor}

  dst.y = src.x - \lfloor src.x\rfloor

  dst.z = 2^{src.x}

  dst.w = 1


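The three EXP outputs are related: ``dst.z`` equals ``dst.x`` multiplied by
two raised to ``dst.y``. The Python sketch below illustrates this, modelling
a register as a four-element tuple; the function name ``exp_approx`` is an
assumption made for this example.

```python
import math


def exp_approx(src):
    """Model of EXP: split src.x into integer and fractional parts."""
    x = src[0]
    floor_x = math.floor(x)
    return (2.0 ** floor_x, x - floor_x, 2.0 ** x, 1.0)


# For src.x = 2.5 the integer part is 2 and the fractional part is 0.5.
exp_result = exp_approx((2.5, 0.0, 0.0, 0.0))
```

Here ``exp_result[0]`` is 4.0 and ``exp_result[1]`` is 0.5, and
``exp_result[2]`` matches ``4.0 * 2.0 ** 0.5`` up to rounding.
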
.. opcode:: LOG - Approximate Logarithm Base 2

.. math::

  dst.x = \lfloor\log_2{|src.x|}\rfloor

  dst.y = \frac{|src.x|}{2^{\lfloor\log_2{|src.x|}\rfloor}}

  dst.z = \log_2{|src.x|}

  dst.w = 1


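LOG splits the magnitude of ``src.x`` into an exponent (``dst.x``) and a
mantissa (``dst.y``), so that ``dst.z`` equals ``dst.x`` plus the base-2
logarithm of ``dst.y``. The Python sketch below shows this under the
assumption that a register is a four-element tuple; ``log_approx`` is a
hypothetical name used only here.

```python
import math


def log_approx(src):
    """Model of LOG: exponent, mantissa and full log2 of |src.x|."""
    magnitude = abs(src[0])
    # math.log2 raises ValueError for 0; that case is not modelled here.
    log2_value = math.log2(magnitude)
    exponent = math.floor(log2_value)
    return (float(exponent), magnitude / 2.0 ** exponent, log2_value, 1.0)


# The sign of src.x is discarded: |-10| = 10 = 1.25 * 2^3.
log_result = log_approx((-10.0, 0.0, 0.0, 0.0))
```

For this input the exponent is 3.0 and the mantissa is 1.25.
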
.. opcode:: MUL - Multiply

.. math::

  dst.x = src0.x \times src1.x

  dst.y = src0.y \times src1.y

  dst.z = src0.z \times src1.z

  dst.w = src0.w \times src1.w


.. opcode:: ADD - Add

.. math::

  dst.x = src0.x + src1.x

  dst.y = src0.y + src1.y

  dst.z = src0.z + src1.z

  dst.w = src0.w + src1.w


.. opcode:: DP3 - 3-component Dot Product

This instruction replicates its result.

.. math::

  dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z


.. opcode:: DP4 - 4-component Dot Product

This instruction replicates its result.

.. math::

  dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w


.. opcode:: DST - Distance Vector

.. math::

  dst.x = 1

  dst.y = src0.y \times src1.y

  dst.z = src0.z

  dst.w = src1.w


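The component selection in DST makes more sense with specially prepared
operands. The Python sketch below assumes the convention used by the similar
DST instruction in ARB_vertex_program, where ``src0`` holds the squared
distance in y and z and ``src1`` holds the reciprocal distance in y and w;
whether TGSI producers follow that convention is an assumption here, and the
function name ``dst`` is chosen only for this example.

```python
def dst(src0, src1):
    """Model of DST on two 4-tuples (x, y, z, w)."""
    return (1.0, src0[1] * src1[1], src0[2], src1[3])


d = 2.0  # an example distance
# Assumed operand layout: src0 = (-, d^2, d^2, -), src1 = (-, 1/d, -, 1/d).
dst_result = dst((0.0, d * d, d * d, 0.0), (0.0, 1.0 / d, 0.0, 1.0 / d))
```

With ``d = 2.0`` the result is ``(1.0, 2.0, 4.0, 0.5)``, that is
``(1, d, d^2, 1/d)``.
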
.. opcode:: MIN - Minimum

.. math::

  dst.x = min(src0.x, src1.x)

  dst.y = min(src0.y, src1.y)

  dst.z = min(src0.z, src1.z)

  dst.w = min(src0.w, src1.w)


.. opcode:: MAX - Maximum

.. math::

  dst.x = max(src0.x, src1.x)

  dst.y = max(src0.y, src1.y)

  dst.z = max(src0.z, src1.z)

  dst.w = max(src0.w, src1.w)


.. opcode:: SLT - Set On Less Than

.. math::

  dst.x = (src0.x < src1.x) ? 1 : 0

  dst.y = (src0.y < src1.y) ? 1 : 0

  dst.z = (src0.z < src1.z) ? 1 : 0

  dst.w = (src0.w < src1.w) ? 1 : 0


.. opcode:: SGE - Set On Greater Equal Than

.. math::

  dst.x = (src0.x >= src1.x) ? 1 : 0

  dst.y = (src0.y >= src1.y) ? 1 : 0

  dst.z = (src0.z >= src1.z) ? 1 : 0

  dst.w = (src0.w >= src1.w) ? 1 : 0


.. opcode:: MAD - Multiply And Add

.. math::

  dst.x = src0.x \times src1.x + src2.x

  dst.y = src0.y \times src1.y + src2.y

  dst.z = src0.z \times src1.z + src2.z

  dst.w = src0.w \times src1.w + src2.w


.. opcode:: SUB - Subtract

.. math::

  dst.x = src0.x - src1.x

  dst.y = src0.y - src1.y

  dst.z = src0.z - src1.z

  dst.w = src0.w - src1.w


.. opcode:: LRP - Linear Interpolate

.. math::

  dst.x = src0.x \times src1.x + (1 - src0.x) \times src2.x

  dst.y = src0.y \times src1.y + (1 - src0.y) \times src2.y

  dst.z = src0.z \times src1.z + (1 - src0.z) \times src2.z

  dst.w = src0.w \times src1.w + (1 - src0.w) \times src2.w


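In LRP the first operand is the per-component weight: a weight of 1 selects
``src1`` and a weight of 0 selects ``src2``. The Python sketch below models
this with four-element tuples; the function name ``lrp`` is an assumption
made for this example.

```python
def lrp(src0, src1, src2):
    """Model of LRP: blend src1 and src2 using src0 as the weight."""
    return tuple(w * a + (1.0 - w) * b for w, a, b in zip(src0, src1, src2))


# A different weight in each component, blending 8.0 against 4.0.
lrp_result = lrp((0.0, 0.25, 0.5, 1.0), (8.0,) * 4, (4.0,) * 4)
```

The result is ``(4.0, 5.0, 6.0, 8.0)``: the weight 0.0 yields the ``src2``
value, the weight 1.0 yields the ``src1`` value, and the others fall in
between.
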
.. opcode:: CND - Condition

.. math::

  dst.x = (src2.x > 0.5) ? src0.x : src1.x

  dst.y = (src2.y > 0.5) ? src0.y : src1.y

  dst.z = (src2.z > 0.5) ? src0.z : src1.z

  dst.w = (src2.w > 0.5) ? src0.w : src1.w


.. opcode:: DP2A - 2-component Dot Product And Add

.. math::

  dst.x = src0.x \times src1.x + src0.y \times src1.y + src2.x

  dst.y = src0.x \times src1.x + src0.y \times src1.y + src2.x

  dst.z = src0.x \times src1.x + src0.y \times src1.y + src2.x

  dst.w = src0.x \times src1.x + src0.y \times src1.y + src2.x


.. opcode:: FRC - Fraction

.. math::

  dst.x = src.x - \lfloor src.x\rfloor

  dst.y = src.y - \lfloor src.y\rfloor

  dst.z = src.z - \lfloor src.z\rfloor

  dst.w = src.w - \lfloor src.w\rfloor


.. opcode:: CLAMP - Clamp

.. math::

  dst.x = clamp(src0.x, src1.x, src2.x)

  dst.y = clamp(src0.y, src1.y, src2.y)

  dst.z = clamp(src0.z, src1.z, src2.z)

  dst.w = clamp(src0.w, src1.w, src2.w)


.. opcode:: FLR - Floor

This is identical to :opcode:`ARL`.

.. math::

  dst.x = \lfloor src.x\rfloor

  dst.y = \lfloor src.y\rfloor

  dst.z = \lfloor src.z\rfloor

  dst.w = \lfloor src.w\rfloor


.. opcode:: ROUND - Round

.. math::

  dst.x = round(src.x)

  dst.y = round(src.y)

  dst.z = round(src.z)

  dst.w = round(src.w)


.. opcode:: EX2 - Exponential Base 2

This instruction replicates its result.

.. math::

  dst = 2^{src.x}


.. opcode:: LG2 - Logarithm Base 2

This instruction replicates its result.

.. math::

  dst = \log_2{src.x}


.. opcode:: POW - Power

This instruction replicates its result.

.. math::

  dst = src0.x^{src1.x}

.. opcode:: XPD - Cross Product

.. math::

  dst.x = src0.y \times src1.z - src1.y \times src0.z

  dst.y = src0.z \times src1.x - src1.z \times src0.x

  dst.z = src0.x \times src1.y - src1.x \times src0.y

  dst.w = 1


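A quick way to check the XPD component ordering is to cross the x and y unit
vectors, which should give the z unit vector. The Python sketch below does
this with four-element tuples; the function name ``xpd`` is an assumption
made for this example.

```python
def xpd(src0, src1):
    """Model of XPD: 3-component cross product, with dst.w fixed to 1."""
    return (
        src0[1] * src1[2] - src1[1] * src0[2],
        src0[2] * src1[0] - src1[2] * src0[0],
        src0[0] * src1[1] - src1[0] * src0[1],
        1.0,
    )


# x axis crossed with y axis; the w components of the sources are not read.
xpd_result = xpd((1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0))
```

The result is ``(0.0, 0.0, 1.0, 1.0)``: the z unit vector in the first three
components, and the constant 1 in w.
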
.. opcode:: ABS - Absolute

.. math::

  dst.x = |src.x|

  dst.y = |src.y|

  dst.z = |src.z|

  dst.w = |src.w|


.. opcode:: RCC - Reciprocal Clamped

This instruction replicates its result.

XXX cleanup on aisle three

.. math::

  dst = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)


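The RCC formula keeps the magnitude of the reciprocal inside a fixed range
while preserving its sign. The Python sketch below uses the two bounds
exactly as written in the formula; the names ``RCC_MIN``, ``RCC_MAX``,
``clamp`` and ``rcc`` are hypothetical and exist only in this example.

```python
RCC_MIN = 5.42101e-20   # smallest magnitude, taken from the formula
RCC_MAX = 1.884467e19   # largest magnitude, taken from the formula


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(value, high))


def rcc(src):
    """Model of RCC: clamped reciprocal of src.x, replicated."""
    # Python raises ZeroDivisionError for src.x == 0; not modelled here.
    reciprocal = 1.0 / src[0]
    if reciprocal > 0.0:
        reciprocal = clamp(reciprocal, RCC_MIN, RCC_MAX)
    else:
        reciprocal = clamp(reciprocal, -RCC_MAX, -RCC_MIN)
    return (reciprocal,) * 4


rcc_normal = rcc((4.0, 0.0, 0.0, 0.0))     # in range, not clamped
rcc_large = rcc((1e-30, 0.0, 0.0, 0.0))    # 1e30 is clamped to RCC_MAX
rcc_small = rcc((-1e30, 0.0, 0.0, 0.0))    # -1e-30 is clamped to -RCC_MIN
```

A tiny input therefore cannot produce an arbitrarily large result, and a huge
input cannot produce a result that underflows towards zero.
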
.. opcode:: DPH - Homogeneous Dot Product

This instruction replicates its result.

.. math::

  dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w


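DPH behaves like a four-component dot product in which ``src0.w`` is treated
as 1, so the actual value of ``src0.w`` is never read. The Python sketch
below illustrates this with four-element tuples; the function name ``dph`` is
an assumption made for this example.

```python
def dph(src0, src1):
    """Model of DPH: 3-component dot product plus src1.w, replicated."""
    result = (src0[0] * src1[0] + src0[1] * src1[1]
              + src0[2] * src1[2] + src1[3])
    return (result,) * 4


# src0.w (99.0) has no effect on the result.
dph_result = dph((1.0, 2.0, 3.0, 99.0), (4.0, 5.0, 6.0, 7.0))
```

Every component of ``dph_result`` is 39.0, that is 4 + 10 + 18 + 7.
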
.. opcode:: COS - Cosine

This instruction replicates its result.

.. math::

  dst = \cos{src.x}


.. opcode:: DDX - Derivative Relative To X

.. math::

  dst.x = partialx(src.x)

  dst.y = partialx(src.y)

  dst.z = partialx(src.z)

  dst.w = partialx(src.w)


.. opcode:: DDY - Derivative Relative To Y

.. math::

  dst.x = partialy(src.x)

  dst.y = partialy(src.y)

  dst.z = partialy(src.z)

  dst.w = partialy(src.w)


.. opcode:: KILP - Predicated Discard

  discard


.. opcode:: PK2H - Pack Two 16-bit Floats

  TBD


.. opcode:: PK2US - Pack Two Unsigned 16-bit Scalars

  TBD


.. opcode:: PK4B - Pack Four Signed 8-bit Scalars

  TBD


.. opcode:: PK4UB - Pack Four Unsigned 8-bit Scalars

  TBD


.. opcode:: RFL - Reflection Vector

.. math::

  dst.x = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.x - src1.x

  dst.y = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.y - src1.y

  dst.z = 2 \times (src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z) / (src0.x \times src0.x + src0.y \times src0.y + src0.z \times src0.z) \times src0.z - src1.z

  dst.w = 1

.. note::

  Considered for removal.


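The three RFL formulas share one scalar factor, twice the dot product of the
two sources divided by the squared length of ``src0``. The Python sketch
below factors it out; the names ``dot3`` and ``rfl`` are hypothetical helpers
introduced for this example, and registers are modelled as four-element
tuples.

```python
def dot3(a, b):
    """3-component dot product of two 4-tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def rfl(src0, src1):
    """Model of RFL: reflect src1 about the axis given by src0."""
    # A zero-length src0 raises ZeroDivisionError; not modelled here.
    scale = 2.0 * dot3(src0, src1) / dot3(src0, src0)
    return (
        scale * src0[0] - src1[0],
        scale * src0[1] - src1[1],
        scale * src0[2] - src1[2],
        1.0,
    )


# Reflect (1, 1, 0) about the y axis.
rfl_result = rfl((0.0, 1.0, 0.0, 0.0), (1.0, 1.0, 0.0, 0.0))
```

The result is ``(-1.0, 1.0, 0.0, 1.0)``: the component along the axis is
kept and the perpendicular component is negated.
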
.. opcode:: SEQ - Set On Equal

.. math::

  dst.x = (src0.x == src1.x) ? 1 : 0

  dst.y = (src0.y == src1.y) ? 1 : 0

  dst.z = (src0.z == src1.z) ? 1 : 0

  dst.w = (src0.w == src1.w) ? 1 : 0


.. opcode:: SFL - Set On False

This instruction replicates its result.

.. math::

  dst = 0

.. note::

  Considered for removal.


.. opcode:: SGT - Set On Greater Than

.. math::

  dst.x = (src0.x > src1.x) ? 1 : 0

  dst.y = (src0.y > src1.y) ? 1 : 0

  dst.z = (src0.z > src1.z) ? 1 : 0

  dst.w = (src0.w > src1.w) ? 1 : 0


.. opcode:: SIN - Sine

This instruction replicates its result.

.. math::

  dst = \sin{src.x}


.. opcode:: SLE - Set On Less Equal Than

.. math::

  dst.x = (src0.x <= src1.x) ? 1 : 0

  dst.y = (src0.y <= src1.y) ? 1 : 0

  dst.z = (src0.z <= src1.z) ? 1 : 0

  dst.w = (src0.w <= src1.w) ? 1 : 0


.. opcode:: SNE - Set On Not Equal

.. math::

  dst.x = (src0.x != src1.x) ? 1 : 0

  dst.y = (src0.y != src1.y) ? 1 : 0

  dst.z = (src0.z != src1.z) ? 1 : 0

  dst.w = (src0.w != src1.w) ? 1 : 0


.. opcode:: STR - Set On True

This instruction replicates its result.

.. math::

  dst = 1


.. opcode:: TEX - Texture Lookup

.. math::

  coord = src0

  bias = 0.0

  dst = texture_sample(unit, coord, bias)

For array textures, src0.y contains the slice for 1D arrays and src0.z
contains the slice for 2D arrays.

For shadow textures without arrays, src0.z contains the reference value.

For shadow textures with arrays, src0.z contains the reference value for
1D arrays and src0.w contains the reference value for 2D arrays.

There is no way to pass a bias in the .w component for shadow arrays, and
GLSL does not allow this. GLSL does allow cube shadow maps to take a bias
value; how that will be expressed in TGSI is still to be determined.

.. opcode:: TXD - Texture Lookup with Derivatives

.. math::

  coord = src0

  ddx = src1

  ddy = src2

  bias = 0.0

  dst = texture_sample_deriv(unit, coord, bias, ddx, ddy)


.. opcode:: TXP - Projective Texture Lookup

.. math::

  coord.x = src0.x / src0.w

  coord.y = src0.y / src0.w

  coord.z = src0.z / src0.w

  coord.w = src0.w

  bias = 0.0

  dst = texture_sample(unit, coord, bias)


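Only the coordinate setup of TXP is easy to show in isolation: the first
three components are divided by the w component of the source, and w itself
is passed through. The Python sketch below models just that step; the
function name ``txp_coord`` is an assumption made for this example, and the
actual texture sampling is not modelled.

```python
def txp_coord(src0):
    """Model of the TXP projective divide on a 4-tuple (x, y, z, w)."""
    # A zero w raises ZeroDivisionError; not modelled here.
    w = src0[3]
    return (src0[0] / w, src0[1] / w, src0[2] / w, w)


txp_result = txp_coord((2.0, 4.0, 6.0, 2.0))
```

For this input the coordinate passed on to the sampler would be
``(1.0, 2.0, 3.0, 2.0)``.
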
.. opcode:: UP2H - Unpack Two 16-Bit Floats

  TBD

.. note::

  Considered for removal.

.. opcode:: UP2US - Unpack Two Unsigned 16-Bit Scalars

  TBD

.. note::

  Considered for removal.

.. opcode:: UP4B - Unpack Four Signed 8-Bit Values

  TBD

.. note::

  Considered for removal.

.. opcode:: UP4UB - Unpack Four Unsigned 8-Bit Scalars

  TBD

.. note::

  Considered for removal.

Corbin Simpson85805222010-02-02 16:20:12 -0800673.. opcode:: X2D - 2D Coordinate Transformation
Keith Whitwella62aaa72009-12-21 23:25:15 +0000674
Corbin Simpsone8ed3b92009-12-21 19:12:55 -0800675.. math::
676
Corbin Simpsonecb2f2a2009-12-21 20:07:10 -0800677 dst.x = src0.x + src1.x \times src2.x + src1.y \times src2.y
Corbin Simpson04771912010-01-18 17:31:56 -0800678
Corbin Simpsonecb2f2a2009-12-21 20:07:10 -0800679 dst.y = src0.y + src1.x \times src2.z + src1.y \times src2.w
Corbin Simpson04771912010-01-18 17:31:56 -0800680
Corbin Simpsonecb2f2a2009-12-21 20:07:10 -0800681 dst.z = src0.x + src1.x \times src2.x + src1.y \times src2.y
Corbin Simpson04771912010-01-18 17:31:56 -0800682
Corbin Simpsonecb2f2a2009-12-21 20:07:10 -0800683 dst.w = src0.y + src1.x \times src2.z + src1.y \times src2.w
Keith Whitwella62aaa72009-12-21 23:25:15 +0000684
Corbin Simpson17c2a442010-02-02 17:02:28 -0800685.. note::
686
687 Considered for removal.
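
As a rough illustration of the X2D formulas, the sketch below treats ``src2``
as a 2x2 matrix stored as (xx, xy, yx, yy) and ``src0`` as an offset. The
function name ``x2d`` and the tuple representation of registers are
assumptions made for this example only.

.. code-block:: python

   def x2d(src0, src1, src2):
       """Model X2D: offset src0.xy plus src1.xy transformed by the 2x2 in src2."""
       x = src0[0] + src1[0] * src2[0] + src1[1] * src2[1]
       y = src0[1] + src1[0] * src2[2] + src1[1] * src2[3]
       # dst.zw repeat dst.xy, as in the formulas above.
       return (x, y, x, y)


   identity = (1.0, 0.0, 0.0, 1.0)
   print(x2d((1.0, 2.0, 0.0, 0.0), (3.0, 4.0, 0.0, 0.0), identity))
   # (4.0, 6.0, 4.0, 6.0)

With an identity matrix in ``src2`` the result is simply ``src0.xy + src1.xy``.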


.. opcode:: ARA - Address Register Add

  TBD

.. note::

   Considered for removal.

.. opcode:: ARR - Address Register Load With Round

.. math::

  dst.x = round(src.x)

  dst.y = round(src.y)

  dst.z = round(src.z)

  dst.w = round(src.w)


.. opcode:: BRA - Branch

  pc = target

.. note::

   Considered for removal.

.. opcode:: CAL - Subroutine Call

  push(pc)
  pc = target


.. opcode:: RET - Subroutine Call Return

  pc = pop()


.. opcode:: SSG - Set Sign

.. math::

  dst.x = (src.x > 0) ? 1 : (src.x < 0) ? -1 : 0

  dst.y = (src.y > 0) ? 1 : (src.y < 0) ? -1 : 0

  dst.z = (src.z > 0) ? 1 : (src.z < 0) ? -1 : 0

  dst.w = (src.w > 0) ? 1 : (src.w < 0) ? -1 : 0
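
The per-component sign selection of SSG can be modelled in a few lines of
Python. ``ssg`` is a hypothetical helper written for this example; registers
are represented as 4-tuples of floats.

.. code-block:: python

   def ssg(src):
       """Model SSG: 1.0 for positive, -1.0 for negative, 0.0 otherwise."""
       # (c > 0) - (c < 0) evaluates to 1, -1 or 0 for each component.
       return tuple(float((c > 0) - (c < 0)) for c in src)


   print(ssg((-2.5, 0.0, 3.0, -0.0)))  # (-1.0, 0.0, 1.0, 0.0)
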


.. opcode:: CMP - Compare

.. math::

  dst.x = (src0.x < 0) ? src1.x : src2.x

  dst.y = (src0.y < 0) ? src1.y : src2.y

  dst.z = (src0.z < 0) ? src1.z : src2.z

  dst.w = (src0.w < 0) ? src1.w : src2.w


.. opcode:: KIL - Conditional Discard

.. math::

  if (src.x < 0 || src.y < 0 || src.z < 0 || src.w < 0)
    discard
  endif


.. opcode:: SCS - Sine Cosine

.. math::

  dst.x = \cos{src.x}

  dst.y = \sin{src.x}

  dst.z = 0

  dst.w = 1


.. opcode:: TXB - Texture Lookup With Bias

.. math::

  coord.x = src.x

  coord.y = src.y

  coord.z = src.z

  coord.w = 1.0

  bias = src.w

  dst = texture_sample(unit, coord, bias)


.. opcode:: NRM - 3-component Vector Normalise

.. math::

  dst.x = src.x / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}

  dst.y = src.y / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}

  dst.z = src.z / \sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z}

  dst.w = 1
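
A minimal Python sketch of NRM follows, assuming the conventional definition of
normalisation: each of x, y and z is divided by the vector length, that is, the
square root of the sum of squares. ``nrm`` is a hypothetical helper, and the
sketch does not model what a driver does for a zero-length input.

.. code-block:: python

   import math


   def nrm(src):
       """Model NRM: scale xyz to unit length and set w to 1."""
       x, y, z, _w = src
       # A zero-length vector raises ZeroDivisionError in this sketch.
       length = math.sqrt(x * x + y * y + z * z)
       return (x / length, y / length, z / length, 1.0)


   print(nrm((3.0, 0.0, 4.0, 7.0)))  # (0.6, 0.0, 0.8, 1.0)
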


.. opcode:: DIV - Divide

.. math::

  dst.x = \frac{src0.x}{src1.x}

  dst.y = \frac{src0.y}{src1.y}

  dst.z = \frac{src0.z}{src1.z}

  dst.w = \frac{src0.w}{src1.w}


.. opcode:: DP2 - 2-component Dot Product

This instruction replicates its result.

.. math::

  dst = src0.x \times src1.x + src0.y \times src1.y
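
To make "replicates its result" concrete, the sketch below computes the DP2
scalar and copies it into all four destination components. ``dp2`` is a
hypothetical helper; registers are modelled as 4-tuples.

.. code-block:: python

   def dp2(src0, src1):
       """Model DP2: dot product of the xy components, replicated to xyzw."""
       scalar = src0[0] * src1[0] + src0[1] * src1[1]
       return (scalar,) * 4


   print(dp2((1.0, 2.0, 9.0, 9.0), (3.0, 4.0, 9.0, 9.0)))
   # (11.0, 11.0, 11.0, 11.0)

The z and w components of the sources do not contribute to the result.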


.. opcode:: TXL - Texture Lookup With Explicit LOD

.. math::

  coord.x = src0.x

  coord.y = src0.y

  coord.z = src0.z

  coord.w = 1.0

  lod = src0.w

  dst = texture_sample(unit, coord, lod)


.. opcode:: BRK - Break

  TBD


.. opcode:: IF - If

  TBD


.. opcode:: ELSE - Else

  TBD


.. opcode:: ENDIF - End If

  TBD


.. opcode:: PUSHA - Push Address Register On Stack

  push(src.x)
  push(src.y)
  push(src.z)
  push(src.w)

.. note::

   Considered for cleanup.

.. note::

   Considered for removal.

.. opcode:: POPA - Pop Address Register From Stack

  dst.w = pop()
  dst.z = pop()
  dst.y = pop()
  dst.x = pop()

.. note::

   Considered for cleanup.

.. note::

   Considered for removal.
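
PUSHA pushes components in x, y, z, w order, and POPA pops them in the reverse
order, so a PUSHA followed by a POPA restores the register. The sketch below
models this with a plain Python list as the stack; ``stack``, ``pusha`` and
``popa`` are hypothetical names used only for illustration.

.. code-block:: python

   stack = []


   def pusha(src):
       """Model PUSHA: push x, then y, then z, then w."""
       for component in src:
           stack.append(component)


   def popa():
       """Model POPA: pop w first and x last, undoing a PUSHA."""
       w = stack.pop()
       z = stack.pop()
       y = stack.pop()
       x = stack.pop()
       return (x, y, z, w)


   pusha((1, 2, 3, 4))
   print(popa())  # (1, 2, 3, 4)
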


Compute ISA
^^^^^^^^^^^^^^^^^^^^^^^^

These opcodes are primarily provided for special-use computational shaders.
Support for these opcodes is indicated by a special pipe capability bit (TBD).

XXX so let's discuss it, yeah?

.. opcode:: CEIL - Ceiling

.. math::

  dst.x = \lceil src.x\rceil

  dst.y = \lceil src.y\rceil

  dst.z = \lceil src.z\rceil

  dst.w = \lceil src.w\rceil


.. opcode:: I2F - Integer To Float

.. math::

  dst.x = (float) src.x

  dst.y = (float) src.y

  dst.z = (float) src.z

  dst.w = (float) src.w


.. opcode:: NOT - Bitwise Not

.. math::

  dst.x = ~src.x

  dst.y = ~src.y

  dst.z = ~src.z

  dst.w = ~src.w


.. opcode:: TRUNC - Truncate

.. math::

  dst.x = trunc(src.x)

  dst.y = trunc(src.y)

  dst.z = trunc(src.z)

  dst.w = trunc(src.w)


.. opcode:: SHL - Shift Left

.. math::

  dst.x = src0.x << src1.x

  dst.y = src0.y << src1.x

  dst.z = src0.z << src1.x

  dst.w = src0.w << src1.x


.. opcode:: SHR - Shift Right

.. math::

  dst.x = src0.x >> src1.x

  dst.y = src0.y >> src1.x

  dst.z = src0.z >> src1.x

  dst.w = src0.w >> src1.x
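
Note that SHL and SHR take the shift count for every component from
``src1.x``. The sketch below illustrates this under two assumptions that the
text does not state: components are 32-bit unsigned integers, and SHR is
therefore a logical shift. ``shl``, ``shr`` and ``MASK32`` are hypothetical
names.

.. code-block:: python

   MASK32 = 0xFFFFFFFF  # assumed 32-bit register components


   def shl(src0, src1):
       """Model SHL: shift every component left by src1.x."""
       count = src1[0]
       return tuple((c << count) & MASK32 for c in src0)


   def shr(src0, src1):
       """Model SHR on unsigned values: shift every component right by src1.x."""
       count = src1[0]
       return tuple(c >> count for c in src0)


   print(shl((1, 2, 3, 4), (2, 9, 9, 9)))   # (4, 8, 12, 16)
   print(shr((16, 8, 4, 2), (1, 9, 9, 9)))  # (8, 4, 2, 1)

The y, z and w components of ``src1`` are ignored in both functions.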


.. opcode:: AND - Bitwise And

.. math::

  dst.x = src0.x & src1.x

  dst.y = src0.y & src1.y

  dst.z = src0.z & src1.z

  dst.w = src0.w & src1.w


.. opcode:: OR - Bitwise Or

.. math::

  dst.x = src0.x | src1.x

  dst.y = src0.y | src1.y

  dst.z = src0.z | src1.z

  dst.w = src0.w | src1.w


.. opcode:: MOD - Modulus

.. math::

  dst.x = src0.x \bmod src1.x

  dst.y = src0.y \bmod src1.y

  dst.z = src0.z \bmod src1.z

  dst.w = src0.w \bmod src1.w


.. opcode:: XOR - Bitwise Xor

.. math::

  dst.x = src0.x \oplus src1.x

  dst.y = src0.y \oplus src1.y

  dst.z = src0.z \oplus src1.z

  dst.w = src0.w \oplus src1.w


.. opcode:: UCMP - Integer Conditional Move

.. math::

  dst.x = src0.x ? src1.x : src2.x

  dst.y = src0.y ? src1.y : src2.y

  dst.z = src0.z ? src1.z : src2.z

  dst.w = src0.w ? src1.w : src2.w
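
UCMP and :opcode:`CMP` select between the same two sources but test different
conditions: CMP picks ``src1`` where ``src0`` is negative, while UCMP picks
``src1`` where ``src0`` is non-zero. The following sketch contrasts the two;
``cmp_float`` and ``ucmp`` are hypothetical helper names.

.. code-block:: python

   def cmp_float(src0, src1, src2):
       """Model CMP: choose src1 where src0 < 0, else src2."""
       return tuple(b if a < 0 else c for a, b, c in zip(src0, src1, src2))


   def ucmp(src0, src1, src2):
       """Model UCMP: choose src1 where src0 is non-zero, else src2."""
       return tuple(b if a != 0 else c for a, b, c in zip(src0, src1, src2))


   src1 = (1, 2, 3, 4)
   src2 = (5, 6, 7, 8)
   print(cmp_float((-1.0, 0.0, 2.0, -0.5), src1, src2))  # (1, 6, 7, 4)
   print(ucmp((1, 0, 7, 0), src1, src2))                 # (1, 6, 3, 8)
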


.. opcode:: UARL - Integer Address Register Load

  Moves the contents of the source register, assumed to be an integer, into the
  destination register, which is assumed to be an address (ADDR) register.


.. opcode:: IABS - Integer Absolute Value

.. math::

  dst.x = |src.x|

  dst.y = |src.y|

  dst.z = |src.z|

  dst.w = |src.w|


.. opcode:: SAD - Sum Of Absolute Differences

.. math::

  dst.x = |src0.x - src1.x| + src2.x

  dst.y = |src0.y - src1.y| + src2.y

  dst.z = |src0.z - src1.z| + src2.z

  dst.w = |src0.w - src1.w| + src2.w


.. opcode:: TXF - Texel Fetch

  As per NV_gpu_shader4, extracts a single texel from a specified texture
  image. The source sampler may not be a CUBE or SHADOW.

  ``src0`` is a four-component signed integer vector used to identify the
  single texel accessed: three components plus the level.

  ``src1`` is a three-component constant signed integer vector in which each
  component is limited to the range -8..+8. Hardware only seems to deal with
  this range, although the interface allows for up to unsigned int.

  The form is ``TXF(uint_vec coord, int_vec offset)``.


.. opcode:: TXQ - Texture Size Query

  As per NV_gpu_program4, retrieves the dimensions of the texture, depending
  on the target: 1D (width), 2D/RECT/CUBE (width, height), 3D (width, height,
  depth), 1D array (width, layers), 2D array (width, height, layers).

.. math::

  lod = src0

  dst.x = texture_width(unit, lod)

  dst.y = texture_height(unit, lod)

  dst.z = texture_depth(unit, lod)


.. opcode:: CONT - Continue

  TBD

.. note::

   Support for CONT is determined by a special capability bit,
   ``TGSI_CONT_SUPPORTED``. See :ref:`Screen` for more information.


Geometry ISA
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These opcodes are only supported in geometry shaders; they have no meaning
in any other type of shader.

.. opcode:: EMIT - Emit

  TBD


.. opcode:: ENDPRIM - End Primitive

  TBD


GLSL ISA
^^^^^^^^^^

These opcodes are part of :term:`GLSL`'s opcode set. Support for these
opcodes is determined by a special capability bit, ``GLSL``.

.. opcode:: BGNLOOP - Begin a Loop

  TBD


.. opcode:: BGNSUB - Begin Subroutine

  TBD


.. opcode:: ENDLOOP - End a Loop

  TBD


.. opcode:: ENDSUB - End Subroutine

  TBD


.. opcode:: NOP - No Operation

  Do nothing.


.. opcode:: NRM4 - 4-component Vector Normalise

This instruction replicates its result.

.. math::

  dst = \frac{src.x}{\sqrt{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}}


ps_2_x
^^^^^^^^^^^^

XXX wait what

.. opcode:: CALLNZ - Subroutine Call If Not Zero

  TBD


.. opcode:: IFC - If

  TBD


.. opcode:: BREAKC - Break Conditional

  TBD

.. _doubleopcodes:

Double ISA
^^^^^^^^^^^^^^^

The double-precision opcodes reinterpret four-component vectors into
two-component vectors with doubled precision in each component.

Support for these opcodes is XXX undecided. :T
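
The reinterpretation of a four-component register as two doubles can be
pictured with Python's ``struct`` module. This is only a sketch: it assumes
32-bit components and a little-endian layout in which ``xy`` holds the first
double and ``zw`` the second. The text does not specify which half of a double
lands in which component, so that ordering is an assumption, and the helper
names are hypothetical.

.. code-block:: python

   import struct


   def doubles_to_words(d_xy, d_zw):
       """Pack two doubles into four 32-bit words (x, y, z, w)."""
       return struct.unpack("<4I", struct.pack("<2d", d_xy, d_zw))


   def words_to_doubles(words):
       """Reinterpret four 32-bit words (x, y, z, w) as two doubles (xy, zw)."""
       return struct.unpack("<2d", struct.pack("<4I", *words))


   words = doubles_to_words(1.5, -2.25)
   print(len(words))               # 4
   print(words_to_doubles(words))  # (1.5, -2.25)
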

.. opcode:: DADD - Add

.. math::

  dst.xy = src0.xy + src1.xy

  dst.zw = src0.zw + src1.zw


.. opcode:: DDIV - Divide

.. math::

  dst.xy = src0.xy / src1.xy

  dst.zw = src0.zw / src1.zw

.. opcode:: DSEQ - Set on Equal

.. math::

  dst.xy = src0.xy == src1.xy ? 1.0F : 0.0F

  dst.zw = src0.zw == src1.zw ? 1.0F : 0.0F

.. opcode:: DSLT - Set on Less than

.. math::

  dst.xy = src0.xy < src1.xy ? 1.0F : 0.0F

  dst.zw = src0.zw < src1.zw ? 1.0F : 0.0F

.. opcode:: DFRAC - Fraction

.. math::

  dst.xy = src.xy - \lfloor src.xy\rfloor

  dst.zw = src.zw - \lfloor src.zw\rfloor


.. opcode:: DFRACEXP - Convert Number to Fractional and Integral Components

Like the ``frexp()`` routine in many math libraries, this opcode stores the
exponent of its source to ``dst0``, and the significand to ``dst1``, such that
:math:`dst1 \times 2^{dst0} = src` .

.. math::

  dst0.xy = exp(src.xy)

  dst1.xy = frac(src.xy)

  dst0.zw = exp(src.zw)

  dst1.zw = frac(src.zw)

.. opcode:: DLDEXP - Multiply Number by Integral Power of 2

This opcode is the inverse of :opcode:`DFRACEXP`.

.. math::

  dst.xy = src0.xy \times 2^{src1.xy}

  dst.zw = src0.zw \times 2^{src1.zw}
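
The relationship between :opcode:`DFRACEXP` and DLDEXP mirrors that of
``frexp()`` and ``ldexp()``, which Python exposes in its ``math`` module. The
sketch below handles a single double rather than the ``xy``/``zw`` pair, and
``dfracexp`` and ``dldexp`` are hypothetical wrappers named after the opcodes.

.. code-block:: python

   import math


   def dfracexp(value):
       """Return (dst0, dst1) = (exponent, significand) for one double."""
       significand, exponent = math.frexp(value)
       return exponent, significand


   def dldexp(significand, exponent):
       """Return significand * 2**exponent, the inverse of dfracexp."""
       return math.ldexp(significand, exponent)


   dst0, dst1 = dfracexp(8.0)
   print(dst0, dst1)          # 4 0.5
   print(dldexp(dst1, dst0))  # 8.0
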

.. opcode:: DMIN - Minimum

.. math::

  dst.xy = min(src0.xy, src1.xy)

  dst.zw = min(src0.zw, src1.zw)

.. opcode:: DMAX - Maximum

.. math::

  dst.xy = max(src0.xy, src1.xy)

  dst.zw = max(src0.zw, src1.zw)

.. opcode:: DMUL - Multiply

.. math::

  dst.xy = src0.xy \times src1.xy

  dst.zw = src0.zw \times src1.zw


.. opcode:: DMAD - Multiply And Add

.. math::

  dst.xy = src0.xy \times src1.xy + src2.xy

  dst.zw = src0.zw \times src1.zw + src2.zw


.. opcode:: DRCP - Reciprocal

.. math::

  dst.xy = \frac{1}{src.xy}

  dst.zw = \frac{1}{src.zw}

.. opcode:: DSQRT - Square Root

.. math::

  dst.xy = \sqrt{src.xy}

  dst.zw = \sqrt{src.zw}


.. _samplingopcodes:

Resource Sampling Opcodes
^^^^^^^^^^^^^^^^^^^^^^^^^

These opcodes closely follow the semantics of the respective Direct3D
instructions. If in doubt, double-check the Direct3D documentation.

.. opcode:: SAMPLE - Sample

  Using the provided address, sample data from the specified texture using
  the filtering mode identified by the given sampler. The source data may
  come from any resource type other than buffers.

  Syntax: ``SAMPLE dst, address, sampler_view, sampler``

  Example: ``SAMPLE TEMP[0], TEMP[1], SVIEW[0], SAMP[0]``

.. opcode:: SAMPLE_I - Integer-Addressed Sample

  Simplified alternative to the SAMPLE instruction. Using the provided
  integer address, SAMPLE_I fetches data from the specified sampler view
  without any filtering. The source data may come from any resource type
  other than CUBE.

  Syntax: ``SAMPLE_I dst, address, sampler_view``

  Example: ``SAMPLE_I TEMP[0], TEMP[1], SVIEW[0]``

  The 'address' is specified as unsigned integers. If the 'address' is out
  of the range [0...(# texels - 1)], the result of the fetch is always 0 in
  all components. As such, the instruction does not honor address wrap
  modes; where that behavior is desirable, the SAMPLE instruction should be
  used instead.

  address.w always provides an unsigned integer mipmap level. If the value
  is out of range, the instruction always returns 0 in all components.

  address.yz are ignored for buffers and 1D textures. address.z is ignored
  for 1D texture arrays and 2D textures.

  For 1D texture arrays, address.y provides the array index (also as an
  unsigned integer). If the value is out of the range of available array
  indices [0...(array size - 1)], the opcode always returns 0 in all
  components.

  For 2D texture arrays, address.z provides the array index; otherwise it
  exhibits the same behavior as for 1D texture arrays.

  The exact semantics of the source address are presented in the table
  below:

  +-----------------------+-----+-----+-----+---------+
  | Resource type         | X   | Y   | Z   | W       |
  +=======================+=====+=====+=====+=========+
  | PIPE_BUFFER           | x   |     |     | ignored |
  +-----------------------+-----+-----+-----+---------+
  | PIPE_TEXTURE_1D       | x   |     |     | mpl     |
  +-----------------------+-----+-----+-----+---------+
  | PIPE_TEXTURE_2D       | x   | y   |     | mpl     |
  +-----------------------+-----+-----+-----+---------+
  | PIPE_TEXTURE_3D       | x   | y   | z   | mpl     |
  +-----------------------+-----+-----+-----+---------+
  | PIPE_TEXTURE_RECT     | x   | y   |     | mpl     |
  +-----------------------+-----+-----+-----+---------+
  | PIPE_TEXTURE_CUBE     | not allowed as source     |
  +-----------------------+-----+-----+-----+---------+
  | PIPE_TEXTURE_1D_ARRAY | x   | idx |     | mpl     |
  +-----------------------+-----+-----+-----+---------+
  | PIPE_TEXTURE_2D_ARRAY | x   | y   | idx | mpl     |
  +-----------------------+-----+-----+-----+---------+

  Here 'mpl' is the mipmap level and 'idx' is the array index.
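
The out-of-range rule for SAMPLE_I (return 0 in all components rather than
wrapping) can be sketched for the 1D texture case as follows. The data layout,
a list of mip levels each holding a list of texel tuples, and the names
``sample_i_1d`` and ``ZERO`` are assumptions made for this example.

.. code-block:: python

   ZERO = (0, 0, 0, 0)


   def sample_i_1d(mip_levels, address):
       """Model SAMPLE_I on a 1D texture: unfiltered fetch, 0 when out of range."""
       x, _y, _z, level = address  # y and z are ignored for 1D textures
       if not 0 <= level < len(mip_levels):
           return ZERO
       texels = mip_levels[level]
       if not 0 <= x < len(texels):
           return ZERO
       return texels[x]


   mips = [[(1, 1, 1, 1), (2, 2, 2, 2)], [(3, 3, 3, 3)]]
   print(sample_i_1d(mips, (1, 0, 0, 0)))  # (2, 2, 2, 2)
   print(sample_i_1d(mips, (5, 0, 0, 0)))  # (0, 0, 0, 0)
   print(sample_i_1d(mips, (0, 0, 0, 2)))  # (0, 0, 0, 0)
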
1383
Francisco Jereza5f44cc2012-05-01 02:38:51 +02001384.. opcode:: SAMPLE_I_MS - Just like SAMPLE_I but allows fetch data from
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001385 multi-sampled surfaces.
1386
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001387.. opcode:: SAMPLE_B - Just like the SAMPLE instruction with the
1388 exception that an additiona bias is applied to the
1389 level of detail computed as part of the instruction
1390 execution.
Francisco Jereza5f44cc2012-05-01 02:38:51 +02001391 SAMPLE_B dst, address, sampler_view, sampler, lod_bias
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001392 e.g.
Francisco Jereza5f44cc2012-05-01 02:38:51 +02001393 SAMPLE_B TEMP[0], TEMP[1], SVIEW[0], SAMP[0], TEMP[2].x
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001394
Zack Rusin3fa814d2011-01-24 21:45:37 -05001395.. opcode:: SAMPLE_C - Similar to the SAMPLE instruction but it
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001396 performs a comparison filter. The operands to SAMPLE_C
1397 are identical to SAMPLE, except that tere is an additional
1398 float32 operand, reference value, which must be a register
1399 with single-component, or a scalar literal.
1400 SAMPLE_C makes the hardware use the current samplers
1401 compare_func (in pipe_sampler_state) to compare
1402 reference value against the red component value for the
1403 surce resource at each texel that the currently configured
1404 texture filter covers based on the provided coordinates.
Francisco Jereza5f44cc2012-05-01 02:38:51 +02001405 SAMPLE_C dst, address, sampler_view.r, sampler, ref_value
Zack Rusinbdbe77f2011-01-24 17:47:10 -05001406 e.g.
Francisco Jereza5f44cc2012-05-01 02:38:51 +02001407 SAMPLE_C TEMP[0], TEMP[1], SVIEW[0].r, SAMP[0], TEMP[2].x

.. opcode:: SAMPLE_C_LZ - Same as SAMPLE_C, but LOD is 0 and derivatives are ignored

  Syntax: ``SAMPLE_C_LZ dst, address, sampler_view.r, sampler, ref_value``

  Example: ``SAMPLE_C_LZ TEMP[0], TEMP[1], SVIEW[0].r, SAMP[0], TEMP[2].x``

  The LZ stands for level-zero.


.. opcode:: SAMPLE_D - Identical to SAMPLE, except that derivatives are provided explicitly

  Syntax: ``SAMPLE_D dst, address, sampler_view, sampler, der_x, der_y``

  Example: ``SAMPLE_D TEMP[0], TEMP[1], SVIEW[0], SAMP[0], TEMP[2], TEMP[3]``

  SAMPLE_D is identical to the SAMPLE opcode except that the
  derivatives for the source address in the x direction and the y
  direction are provided by extra parameters.

.. opcode:: SAMPLE_L - Identical to SAMPLE, except that the LOD is provided directly

  Syntax: ``SAMPLE_L dst, address, sampler_view, sampler``

  Example: ``SAMPLE_L TEMP[0], TEMP[1], SVIEW[0], SAMP[0]``

  SAMPLE_L is identical to the SAMPLE opcode except that the LOD is
  provided directly as a scalar value, representing no anisotropy.
  The A channel of the source address is used as the LOD.

.. opcode:: GATHER4 - Gathers the four texels to be used in a bilinear filtering operation and packs them into a single register

  Only works with 2D, 2D array, cubemap, and cubemap array textures.
  For 2D textures, only the addressing modes of the sampler and the
  top level of any mip pyramid are used. Set W to zero.

  It behaves like the SAMPLE instruction, but a filtered sample is not
  generated. The four samples that contribute to filtering are placed
  into xyzw in counter-clockwise order, starting with the (u,v)
  texture coordinate delta at the following locations: (-, +), (+, +),
  (+, -), (-, -), where the magnitude of the deltas is half a texel.
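
The ordering of the four GATHER4 samples can be written down as a small
Python sketch. ``GATHER4_DELTAS`` and ``gather4_positions`` are
hypothetical names, coordinates are assumed to be in texel units, and
sampler addressing modes are deliberately left out.

.. code-block:: python

   # (u, v) deltas in texel units, in the order they fill dst.x, dst.y,
   # dst.z and dst.w: (-, +), (+, +), (+, -), (-, -).
   GATHER4_DELTAS = ((-0.5, +0.5), (+0.5, +0.5), (+0.5, -0.5), (-0.5, -0.5))

   def gather4_positions(u, v):
       """Return the four sample positions around (u, v), in xyzw order."""
       return [(u + du, v + dv) for du, dv in GATHER4_DELTAS]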


.. opcode:: SVIEWINFO - Query the dimensions of a given sampler view

  Syntax: ``SVIEWINFO dst, src_mip_level, sampler_view``

  Example: ``SVIEWINFO TEMP[0], TEMP[1].x, SVIEW[0]``

  dst receives width, height, depth or array size, and the number of
  mipmap levels. dst can have a writemask, which specifies which
  information the caller is interested in.

  src_mip_level is an unsigned integer scalar. If it is out of range,
  0 is returned for width, height and depth/array size, but the total
  number of mipmap levels is still returned correctly for the given
  sampler view.

  The returned width, height and depth values are for the mipmap
  level selected by src_mip_level and are in texels.

  For a 1D texture array, width is in dst.x, array size is in dst.y,
  and dst.zw are always 0.
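
The out-of-range rule for SVIEWINFO can be illustrated with a small
Python model of a 2D view. The ``sviewinfo_2d`` helper is hypothetical,
and the halving of each dimension per mipmap level (clamped to one
texel) is an assumption about how level sizes are derived; the text
above only fixes the out-of-range behaviour.

.. code-block:: python

   def sviewinfo_2d(width, height, num_levels, mip_level):
       """Return the sizes SVIEWINFO would report for a 2D sampler view."""
       if not 0 <= mip_level < num_levels:
           # Out of range: sizes are 0, the level count is still valid.
           return {"width": 0, "height": 0, "levels": num_levels}
       return {
           # Assumed minification rule: halve per level, never below 1.
           "width": max(1, width >> mip_level),
           "height": max(1, height >> mip_level),
           "levels": num_levels,
       }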

.. opcode:: SAMPLE_POS - query the position of a given sample

  dst receives float4 (x, y, 0, 0) indicating where the sample is
  located. If the resource is not a multi-sample resource and not a
  render target, the result is 0.

.. opcode:: SAMPLE_INFO - dst receives number of samples in x

  If the resource is not a multi-sample resource and not a render
  target, the result is 0.


.. _resourceopcodes:

Resource Access Opcodes
^^^^^^^^^^^^^^^^^^^^^^^

.. opcode:: LOAD - Fetch data from a shader resource

  Syntax: ``LOAD dst, resource, address``

  Example: ``LOAD TEMP[0], RES[0], TEMP[1]``

  Using the provided integer address, LOAD fetches data
  from the specified buffer or texture without any
  filtering.

  The 'address' is specified as a vector of unsigned
  integers. If the 'address' is out of range the result
  is unspecified.

  Only the first mipmap level of a resource can be read
  from using this instruction.

  For 1D or 2D texture arrays, the array index is
  provided as an unsigned integer in address.y or
  address.z, respectively. address.yz are ignored for
  buffers and 1D textures. address.z is ignored for 1D
  texture arrays and 2D textures. address.w is always
  ignored.
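
The rules about which address components are consumed can be summarised
in a lookup table. The following Python sketch only restates the
paragraph above; ``USED_ADDRESS_COMPONENTS`` and ``effective_address``
are hypothetical names, and targets the text does not mention are left
out.

.. code-block:: python

   # Address components that are actually read for each resource target.
   USED_ADDRESS_COMPONENTS = {
       "BUFFER": "x",
       "1D": "x",
       "1DArray": "xy",   # address.y is the array index
       "2D": "xy",
       "2DArray": "xyz",  # address.z is the array index
   }

   def effective_address(target, address):
       """Drop the components of a 4-vector address that are ignored."""
       used = USED_ADDRESS_COMPONENTS[target]
       return tuple(
           value for name, value in zip("xyzw", address) if name in used
       )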

.. opcode:: STORE - Write data to a shader resource

  Syntax: ``STORE resource, address, src``

  Example: ``STORE RES[0], TEMP[0], TEMP[1]``

  Using the provided integer address, STORE writes data
  to the specified buffer or texture.

  The 'address' is specified as a vector of unsigned
  integers. If the 'address' is out of range the result
  is unspecified.

  Only the first mipmap level of a resource can be
  written to using this instruction.

  For 1D or 2D texture arrays, the array index is
  provided as an unsigned integer in address.y or
  address.z, respectively. address.yz are ignored for
  buffers and 1D textures. address.z is ignored for 1D
  texture arrays and 2D textures. address.w is always
  ignored.


.. _threadsyncopcodes:

Inter-thread synchronization opcodes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These opcodes are intended for communication between threads running
within the same compute grid. For now they're only valid in compute
programs.

.. opcode:: MFENCE - Memory fence

  Syntax: ``MFENCE resource``

  Example: ``MFENCE RES[0]``

  This opcode forces strong ordering between any memory access
  operations that affect the specified resource. This means that
  previous loads and stores (and only those) will be performed and
  visible to other threads before the program execution continues.


.. opcode:: LFENCE - Load memory fence

  Syntax: ``LFENCE resource``

  Example: ``LFENCE RES[0]``

  Similar to MFENCE, but it only affects the ordering of memory loads.


.. opcode:: SFENCE - Store memory fence

  Syntax: ``SFENCE resource``

  Example: ``SFENCE RES[0]``

  Similar to MFENCE, but it only affects the ordering of memory stores.


.. opcode:: BARRIER - Thread group barrier

  Syntax: ``BARRIER``

  This opcode suspends the execution of the current thread until all
  the remaining threads in the working group reach the same point of
  the program. Results are unspecified if any of the remaining
  threads terminates or never reaches an executed BARRIER instruction.


.. _atomopcodes:

Atomic opcodes
^^^^^^^^^^^^^^

These opcodes provide atomic variants of some common arithmetic and
logical operations. In this context atomicity means that another
concurrent memory access operation that affects the same memory
location is guaranteed to be performed strictly before or after the
entire execution of the atomic operation.

For the moment they're only valid in compute programs.

.. opcode:: ATOMUADD - Atomic integer addition

  Syntax: ``ATOMUADD dst, resource, offset, src``

  Example: ``ATOMUADD TEMP[0], RES[0], TEMP[1], TEMP[2]``

  The following operation is performed atomically on each component:

.. math::

  dst_i = resource[offset]_i

  resource[offset]_i = dst_i + src_i


.. opcode:: ATOMXCHG - Atomic exchange

  Syntax: ``ATOMXCHG dst, resource, offset, src``

  Example: ``ATOMXCHG TEMP[0], RES[0], TEMP[1], TEMP[2]``

  The following operation is performed atomically on each component:

.. math::

  dst_i = resource[offset]_i

  resource[offset]_i = src_i


.. opcode:: ATOMCAS - Atomic compare-and-exchange

  Syntax: ``ATOMCAS dst, resource, offset, cmp, src``

  Example: ``ATOMCAS TEMP[0], RES[0], TEMP[1], TEMP[2], TEMP[3]``

  The following operation is performed atomically on each component:

.. math::

  dst_i = resource[offset]_i

  resource[offset]_i = (dst_i == cmp_i ? src_i : dst_i)


.. opcode:: ATOMAND - Atomic bitwise And

  Syntax: ``ATOMAND dst, resource, offset, src``

  Example: ``ATOMAND TEMP[0], RES[0], TEMP[1], TEMP[2]``

  The following operation is performed atomically on each component:

.. math::

  dst_i = resource[offset]_i

  resource[offset]_i = dst_i \& src_i


.. opcode:: ATOMOR - Atomic bitwise Or

  Syntax: ``ATOMOR dst, resource, offset, src``

  Example: ``ATOMOR TEMP[0], RES[0], TEMP[1], TEMP[2]``

  The following operation is performed atomically on each component:

.. math::

  dst_i = resource[offset]_i

  resource[offset]_i = dst_i | src_i


.. opcode:: ATOMXOR - Atomic bitwise Xor

  Syntax: ``ATOMXOR dst, resource, offset, src``

  Example: ``ATOMXOR TEMP[0], RES[0], TEMP[1], TEMP[2]``

  The following operation is performed atomically on each component:

.. math::

  dst_i = resource[offset]_i

  resource[offset]_i = dst_i \oplus src_i


.. opcode:: ATOMUMIN - Atomic unsigned minimum

  Syntax: ``ATOMUMIN dst, resource, offset, src``

  Example: ``ATOMUMIN TEMP[0], RES[0], TEMP[1], TEMP[2]``

  The following operation is performed atomically on each component:

.. math::

  dst_i = resource[offset]_i

  resource[offset]_i = (dst_i < src_i ? dst_i : src_i)


.. opcode:: ATOMUMAX - Atomic unsigned maximum

  Syntax: ``ATOMUMAX dst, resource, offset, src``

  Example: ``ATOMUMAX TEMP[0], RES[0], TEMP[1], TEMP[2]``

  The following operation is performed atomically on each component:

.. math::

  dst_i = resource[offset]_i

  resource[offset]_i = (dst_i > src_i ? dst_i : src_i)


.. opcode:: ATOMIMIN - Atomic signed minimum

  Syntax: ``ATOMIMIN dst, resource, offset, src``

  Example: ``ATOMIMIN TEMP[0], RES[0], TEMP[1], TEMP[2]``

  The following operation is performed atomically on each component:

.. math::

  dst_i = resource[offset]_i

  resource[offset]_i = (dst_i < src_i ? dst_i : src_i)


.. opcode:: ATOMIMAX - Atomic signed maximum

  Syntax: ``ATOMIMAX dst, resource, offset, src``

  Example: ``ATOMIMAX TEMP[0], RES[0], TEMP[1], TEMP[2]``

  The following operation is performed atomically on each component:

.. math::

  dst_i = resource[offset]_i

  resource[offset]_i = (dst_i > src_i ? dst_i : src_i)
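
The read-modify-write pattern shared by all atomic opcodes can be
modelled in Python with a lock standing in for the hardware guarantee.
This is only a sketch for a single component: the ``Resource`` class,
the ``ATOMIC_OPS`` table, the 32-bit word size and the wrap-around on
addition are assumptions, not part of the specification above.

.. code-block:: python

   import threading

   MASK32 = 0xFFFFFFFF  # assumed word size of one component

   def to_signed(word):
       """Reinterpret a 32-bit word as a two's complement integer."""
       return word - (1 << 32) if word & 0x80000000 else word

   # New value stored in the resource, given the old value and src.
   ATOMIC_OPS = {
       "ATOMUADD": lambda old, src: (old + src) & MASK32,
       "ATOMXCHG": lambda old, src: src,
       "ATOMAND": lambda old, src: old & src,
       "ATOMOR": lambda old, src: old | src,
       "ATOMXOR": lambda old, src: old ^ src,
       "ATOMUMIN": lambda old, src: old if old < src else src,
       "ATOMUMAX": lambda old, src: old if old > src else src,
       "ATOMIMIN": lambda old, src: old if to_signed(old) < to_signed(src) else src,
       "ATOMIMAX": lambda old, src: old if to_signed(old) > to_signed(src) else src,
   }

   class Resource:
       """A list of 32-bit words with lock-protected atomic updates."""

       def __init__(self, words):
           self.words = [word & MASK32 for word in words]
           self._lock = threading.Lock()

       def atomic(self, opcode, offset, src):
           """Apply opcode at offset and return the previous value (dst)."""
           with self._lock:
               old = self.words[offset]
               self.words[offset] = ATOMIC_OPS[opcode](old, src & MASK32)
               return old

       def atomcas(self, offset, cmp, src):
           """Store src only if the current value equals cmp; return dst."""
           with self._lock:
               old = self.words[offset]
               if old == cmp & MASK32:
                   self.words[offset] = src & MASK32
               return old

Note that the signed and unsigned minimum/maximum opcodes use the same
formula and differ only in how the stored word is interpreted.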



Explanation of symbols used
------------------------------


Functions
^^^^^^^^^^^^^^


  :math:`|x|`       Absolute value of `x`.

  :math:`\lceil x \rceil` Ceiling of `x`.

  clamp(x,y,z)      Clamp x between y and z.
                    (x < y) ? y : (x > z) ? z : x

  :math:`\lfloor x\rfloor` Floor of `x`.

  :math:`\log_2{x}` Logarithm of `x`, base 2.

  max(x,y)          Maximum of x and y.
                    (x > y) ? x : y

  min(x,y)          Minimum of x and y.
                    (x < y) ? x : y

  partialx(x)       Derivative of x relative to fragment's X.

  partialy(x)       Derivative of x relative to fragment's Y.

  pop()             Pop from stack.

  :math:`x^y`       `x` to the power `y`.

  push(x)           Push x on stack.

  round(x)          Round x.

  trunc(x)          Truncate x, i.e. drop the fraction bits.
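
The conditional expressions given for ``clamp``, ``max`` and ``min``
translate directly into code. The following Python sketch mirrors them
term by term; the ``tgsi_`` prefixes are hypothetical and only avoid
shadowing Python's built-ins, and ``trunc`` is assumed to return a
float.

.. code-block:: python

   import math

   def clamp(x, y, z):
       """(x < y) ? y : (x > z) ? z : x"""
       return y if x < y else (z if x > z else x)

   def tgsi_max(x, y):
       """(x > y) ? x : y"""
       return x if x > y else y

   def tgsi_min(x, y):
       """(x < y) ? x : y"""
       return x if x < y else y

   def trunc(x):
       """Drop the fraction bits, rounding toward zero."""
       return float(math.trunc(x))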


Keywords
^^^^^^^^^^^^^


  discard           Discard fragment.

  pc                Program counter.

  target            Label of target instruction.


Other tokens
---------------


Declaration
^^^^^^^^^^^


Declares a register that will be referenced as an operand in Instruction
tokens.

The File field contains the register file that is being declared and is
one of TGSI_FILE.

The UsageMask field specifies which of the register components can be
accessed and is one of TGSI_WRITEMASK.

The Local flag specifies that a given value isn't intended for
subroutine parameter passing and, as a result, the implementation
isn't required to give any guarantees of it being preserved across
subroutine boundaries. As it's merely a compiler hint, the
implementation is free to ignore it.

If the Dimension flag is set to 1, a Declaration Dimension token follows.

If the Semantic flag is set to 1, a Declaration Semantic token follows.

If the Interpolate flag is set to 1, a Declaration Interpolate token follows.

If the file is TGSI_FILE_RESOURCE, a Declaration Resource token follows.


Declaration Semantic
^^^^^^^^^^^^^^^^^^^^^^^^

  Vertex and fragment shader input and output registers may be labeled
  with semantic information consisting of a name and index.

  Follows Declaration token if Semantic bit is set.

  Since its purpose is to link a shader with other stages of the pipeline,
  it is valid to follow only those Declaration tokens that declare a register
  either in INPUT or OUTPUT file.

  SemanticName field contains the semantic name of the register being declared.
  There is no default value.

  SemanticIndex is an optional subscript that can be used to distinguish
  different register declarations with the same semantic name. The default value
  is 0.

  The meanings of the individual semantic names are explained in the following
  sections.

TGSI_SEMANTIC_POSITION
""""""""""""""""""""""

For vertex shaders, TGSI_SEMANTIC_POSITION indicates the vertex shader
output register which contains the homogeneous vertex position in the clip
space coordinate system. After clipping, the X, Y and Z components of the
vertex will be divided by the W value to get normalized device coordinates.

For fragment shaders, TGSI_SEMANTIC_POSITION is used to indicate that
fragment shader input contains the fragment's window position. The X
component starts at zero and always increases from left to right.
The Y component starts at zero and always increases but Y=0 may either
indicate the top of the window or the bottom depending on the fragment
coordinate origin convention (see TGSI_PROPERTY_FS_COORD_ORIGIN).
The Z coordinate ranges from 0 to 1 to represent depth from the front
to the back of the Z buffer. The W component contains the reciprocal
of the interpolated vertex position W component.

Fragment shaders may also declare an output register with
TGSI_SEMANTIC_POSITION. Only the Z component is writable. This allows
the fragment shader to change the fragment's Z position.
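
The two divisions mentioned for TGSI_SEMANTIC_POSITION can be shown
with a short Python sketch. ``clip_to_ndc`` and ``fragment_w`` are
hypothetical helper names, and clipping and the viewport transform are
omitted.

.. code-block:: python

   def clip_to_ndc(x, y, z, w):
       """Perspective divide: clip-space position to normalized device
       coordinates, applied after clipping."""
       return (x / w, y / w, z / w)

   def fragment_w(interpolated_clip_w):
       """W component seen by the fragment shader: the reciprocal of the
       interpolated vertex position W."""
       return 1.0 / interpolated_clip_w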



TGSI_SEMANTIC_COLOR
"""""""""""""""""""

For vertex shader outputs or fragment shader inputs/outputs, this
label indicates that the register contains an R,G,B,A color.

Several shader inputs/outputs may contain colors so the semantic index
is used to distinguish them. For example, color[0] may be the diffuse
color while color[1] may be the specular color.

This label is needed so that the flat/smooth shading can be applied
to the right interpolants during rasterization.



TGSI_SEMANTIC_BCOLOR
""""""""""""""""""""

Back-facing colors are only used for back-facing polygons, and are only valid
in vertex shader outputs. After rasterization, all polygons are front-facing
and COLOR and BCOLOR end up occupying the same slots in the fragment shader,
so all BCOLORs effectively become regular COLORs in the fragment shader.


TGSI_SEMANTIC_FOG
"""""""""""""""""

Vertex shader inputs and outputs and fragment shader inputs may be
labeled with TGSI_SEMANTIC_FOG to indicate that the register contains
a fog coordinate in the form (F, 0, 0, 1). Typically, the fragment
shader will use the fog coordinate to compute a fog blend factor which
is used to blend the normal fragment color with a constant fog color.

Only the first component matters when writing from the vertex shader;
the driver will ensure that the coordinate is in this format when used
as a fragment shader input.


TGSI_SEMANTIC_PSIZE
"""""""""""""""""""

Vertex shader input and output registers may be labeled with
TGSI_SEMANTIC_PSIZE to indicate that the register contains a point size
in the form (S, 0, 0, 1). The point size controls the width or diameter
of points for rasterization. This label cannot be used in fragment
shaders.

When using this semantic, be sure to set the appropriate state in the
:ref:`rasterizer` first.


TGSI_SEMANTIC_GENERIC
"""""""""""""""""""""

All vertex/fragment shader inputs/outputs not labeled with any other
semantic label can be considered to be generic attributes. Typical
uses of generic inputs/outputs are texcoords and user-defined values.


TGSI_SEMANTIC_NORMAL
""""""""""""""""""""

Indicates that a vertex shader input is a normal vector. This is
typically only used for legacy graphics APIs.


TGSI_SEMANTIC_FACE
""""""""""""""""""

This label applies to fragment shader inputs only and indicates that
the register contains front/back-face information of the form (F, 0,
0, 1). The first component will be positive when the fragment belongs
to a front-facing polygon, and negative when the fragment belongs to a
back-facing polygon.


TGSI_SEMANTIC_EDGEFLAG
""""""""""""""""""""""

For vertex shaders, this semantic label indicates that an input or
output is a boolean edge flag. The register layout is [F, x, x, x]
where F is 0.0 or 1.0 and x = don't care. Normally, the vertex shader
simply copies the edge flag input to the edgeflag output.

Edge flags are used to control which lines or points are actually
drawn when the polygon mode converts triangles/quads/polygons into
points or lines.

TGSI_SEMANTIC_STENCIL
"""""""""""""""""""""

For fragment shaders, this semantic label indicates that an output
is a writable stencil reference value. Only the Y component is writable.
This allows the fragment shader to change the fragment's stencil
reference value.


Declaration Interpolate
^^^^^^^^^^^^^^^^^^^^^^^

This token is only valid for fragment shader INPUT declarations.

The Interpolate field specifies the way input is being interpolated by
the rasteriser and is one of TGSI_INTERPOLATE_*.

The CylindricalWrap bitfield specifies which register components
should be subject to cylindrical wrapping when interpolating by the
rasteriser. If TGSI_CYLINDRICAL_WRAP_X is set to 1, the X component
should be interpolated according to cylindrical wrapping rules.


Declaration Sampler View
^^^^^^^^^^^^^^^^^^^^^^^^

  Follows Declaration token if file is TGSI_FILE_SAMPLER_VIEW.

  DCL SVIEW[#], resource, type(s)

  Declares a shader input sampler view and assigns it to a SVIEW[#]
  register.

  resource can be one of BUFFER, 1D, 2D, 3D, 1DArray and 2DArray.

  type must be 1 or 4 entries (if specifying on a per-component
  level) out of UNORM, SNORM, SINT, UINT and FLOAT.


Declaration Resource
^^^^^^^^^^^^^^^^^^^^

  Follows Declaration token if file is TGSI_FILE_RESOURCE.

  DCL RES[#], resource [, WR] [, RAW]

  Declares a shader input resource and assigns it to a RES[#]
  register.

  resource can be one of BUFFER, 1D, 2D, 3D, CUBE, 1DArray and
  2DArray.

  If the RAW keyword is not specified, the texture data will be
  subject to conversion, swizzling and scaling as required to yield
  the specified data type from the physical data format of the bound
  resource.

  If the RAW keyword is specified, no channel conversion will be
  performed: the values read for each of the channels (X,Y,Z,W) will
  correspond to consecutive words in the same order and format
  they're found in memory. No element-to-address conversion will be
  performed either: the value of the provided X coordinate will be
  interpreted in byte units instead of texel units. The result of
  accessing a misaligned address is undefined.

  Usage of the STORE opcode is only allowed if the WR (writable) flag
  is set.
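
The difference in how the X coordinate is interpreted with and without
RAW could be modelled as follows. This is a sketch for the
one-dimensional buffer case only; ``byte_offset`` and its
``texel_size`` parameter (bytes per element of the bound format) are
hypothetical names.

.. code-block:: python

   def byte_offset(x, texel_size, raw):
       """Byte offset addressed by the X coordinate of a buffer access.

       Without RAW, x counts texels and is scaled by the element size.
       With RAW, x is already a byte offset and is used unchanged.
       """
       return x if raw else x * texel_size

For a format with 16 bytes per texel, ``x = 4`` addresses byte 64 in
the typed case and byte 4 in the RAW case.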


Properties
^^^^^^^^^^^^^^^^^^^^^^^^


  Properties are general directives that apply to the whole TGSI program.

FS_COORD_ORIGIN
"""""""""""""""

Specifies the fragment shader TGSI_SEMANTIC_POSITION coordinate origin.
The default value is UPPER_LEFT.

If UPPER_LEFT, the position will be (0,0) at the upper left corner and
increase downward and rightward.
If LOWER_LEFT, the position will be (0,0) at the lower left corner and
increase upward and rightward.

OpenGL defaults to LOWER_LEFT, and is configurable with the
GL_ARB_fragment_coord_conventions extension.

DirectX 9/10 use UPPER_LEFT.

FS_COORD_PIXEL_CENTER
"""""""""""""""""""""

Specifies the fragment shader TGSI_SEMANTIC_POSITION pixel center convention.
The default value is HALF_INTEGER.

If HALF_INTEGER, the fractional part of the position will be 0.5.
If INTEGER, the fractional part of the position will be 0.0.

Note that this does not affect the set of fragments generated by
rasterization, which is instead controlled by gl_rasterization_rules in the
rasterizer.

OpenGL defaults to HALF_INTEGER, and is configurable with the
GL_ARB_fragment_coord_conventions extension.

DirectX 9 uses INTEGER.
DirectX 10 uses HALF_INTEGER.
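
The combined effect of FS_COORD_ORIGIN and FS_COORD_PIXEL_CENTER on the
fragment position can be sketched in Python. The ``fragment_position``
helper is hypothetical, and its inputs are assumed to be integer pixel
indices counted from the upper left corner of a window of
``window_height`` pixels.

.. code-block:: python

   def fragment_position(column, row, window_height,
                         origin="UPPER_LEFT", pixel_center="HALF_INTEGER"):
       """Return the (x, y) position reported for one pixel."""
       frac = 0.5 if pixel_center == "HALF_INTEGER" else 0.0
       x = column + frac
       if origin == "UPPER_LEFT":
           y = row + frac
       else:
           # LOWER_LEFT: count rows from the bottom of the window.
           y = (window_height - 1 - row) + frac
       return (x, y)

With the default properties, the top-left pixel of a 480-pixel-high
window is at (0.5, 0.5); with LOWER_LEFT, the same pixel is at
(0.5, 479.5).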

FS_COLOR0_WRITES_ALL_CBUFS
""""""""""""""""""""""""""

Specifies that writes to the fragment shader color 0 are replicated to all
bound cbufs. This facilitates OpenGL's fragColor output vs fragData[0] where
fragData is directed to a single color buffer, but fragColor is broadcast.

VS_PROHIBIT_UCPS
""""""""""""""""""""""""""

If this property is set on the program bound to the shader stage before the
fragment shader, user clip planes should have no effect (be disabled) even if
that shader does not write to any clip distance outputs and the rasterizer's
clip_plane_enable is non-zero.
This property is only supported by drivers that also support shader clip
distance outputs.
This is useful for APIs that don't have UCPs and where clip distances written
by a shader cannot be disabled.


Texture Sampling and Texture Formats
------------------------------------

This table shows how texture image components are returned as (x,y,z,w) tuples
by TGSI texture instructions, such as :opcode:`TEX`, :opcode:`TXD`, and
:opcode:`TXP`. For reference, OpenGL and Direct3D conventions are shown as
well.

+--------------------+--------------+--------------------+--------------+
| Texture Components | Gallium      | OpenGL             | Direct3D 9   |
+====================+==============+====================+==============+
| R                  | (r, 0, 0, 1) | (r, 0, 0, 1)       | (r, 1, 1, 1) |
+--------------------+--------------+--------------------+--------------+
| RG                 | (r, g, 0, 1) | (r, g, 0, 1)       | (r, g, 1, 1) |
+--------------------+--------------+--------------------+--------------+
| RGB                | (r, g, b, 1) | (r, g, b, 1)       | (r, g, b, 1) |
+--------------------+--------------+--------------------+--------------+
| RGBA               | (r, g, b, a) | (r, g, b, a)       | (r, g, b, a) |
+--------------------+--------------+--------------------+--------------+
| A                  | (0, 0, 0, a) | (0, 0, 0, a)       | (0, 0, 0, a) |
+--------------------+--------------+--------------------+--------------+
| L                  | (l, l, l, 1) | (l, l, l, 1)       | (l, l, l, 1) |
+--------------------+--------------+--------------------+--------------+
| LA                 | (l, l, l, a) | (l, l, l, a)       | (l, l, l, a) |
+--------------------+--------------+--------------------+--------------+
| I                  | (i, i, i, i) | (i, i, i, i)       | N/A          |
+--------------------+--------------+--------------------+--------------+
| UV                 | XXX TBD      | (0, 0, 0, 1)       | (u, v, 1, 1) |
|                    |              | [#envmap-bumpmap]_ |              |
+--------------------+--------------+--------------------+--------------+
| Z                  | XXX TBD      | (z, z, z, 1)       | (0, z, 0, 1) |
|                    |              | [#depth-tex-mode]_ |              |
+--------------------+--------------+--------------------+--------------+
| S                  | (s, s, s, s) | unknown            | unknown      |
+--------------------+--------------+--------------------+--------------+

.. [#envmap-bumpmap] http://www.opengl.org/registry/specs/ATI/envmap_bumpmap.txt
.. [#depth-tex-mode] the default is (z, z, z, 1) but may also be (0, 0, 0, z)
   or (z, z, z, z) depending on the value of GL_DEPTH_TEXTURE_MODE.
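
The Gallium column of the table can be expressed as a lookup that
expands stored channels into an (x, y, z, w) tuple. This Python sketch
only encodes the rows with a defined Gallium result (UV and Z are
marked TBD and are omitted); ``GALLIUM_SWIZZLES`` and ``expand`` are
hypothetical names.

.. code-block:: python

   # String entries name a stored channel; numbers are constants.
   GALLIUM_SWIZZLES = {
       "R": ("r", 0, 0, 1),
       "RG": ("r", "g", 0, 1),
       "RGB": ("r", "g", "b", 1),
       "RGBA": ("r", "g", "b", "a"),
       "A": (0, 0, 0, "a"),
       "L": ("l", "l", "l", 1),
       "LA": ("l", "l", "l", "a"),
       "I": ("i", "i", "i", "i"),
       "S": ("s", "s", "s", "s"),
   }

   def expand(components, **channels):
       """Return the (x, y, z, w) tuple for a texel of the given layout."""
       return tuple(
           channels[entry] if isinstance(entry, str) else float(entry)
           for entry in GALLIUM_SWIZZLES[components]
       )

For example, ``expand("RG", r=0.25, g=0.5)`` fills the missing blue
channel with 0 and the missing alpha channel with 1.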