Corbin Simpson | c686e17 | 2009-12-20 15:00:40 -0800 | [diff] [blame] | 1 | TGSI |
| 2 | ==== |
| 3 | |
| 4 | TGSI, Tungsten Graphics Shader Instructions, is an intermediate language |
| 5 | for describing shaders. Because Gallium relies on shaders throughout, they are |
| 6 | an important part of the API. TGSI is the only intermediate representation |
| 7 | used by all drivers. |
Keith Whitwell | a62aaa7 | 2009-12-21 23:25:15 +0000 | [diff] [blame] | 8 | |
| 9 | |
| 10 | TGSI Instruction Specification |
| 11 | ============================== |
| 12 | |
| 13 | |
| 14 | 1 Instruction Set Operations |
| 15 | ============================= |
| 16 | |
| 17 | |
| 18 | 1.1 GL_NV_vertex_program |
| 19 | ------------------------- |
| 20 | |
| 21 | |
| 22 | 1.1.1 ARL - Address Register Load |
| 23 | |
| 24 | dst.x = floor(src.x) |
| 25 | dst.y = floor(src.y) |
| 26 | dst.z = floor(src.z) |
| 27 | dst.w = floor(src.w) |
| 28 | |
| 29 | |
| 30 | 1.1.2 MOV - Move |
| 31 | |
| 32 | dst.x = src.x |
| 33 | dst.y = src.y |
| 34 | dst.z = src.z |
| 35 | dst.w = src.w |
| 36 | |
| 37 | |
| 38 | 1.1.3 LIT - Light Coefficients |
| 39 | |
| 40 | dst.x = 1.0 |
| 41 | dst.y = max(src.x, 0.0) |
| 42 | dst.z = (src.x > 0.0) ? pow(max(src.y, 0.0), clamp(src.w, -128.0, 128.0)) : 0.0 |
| 43 | dst.w = 1.0 |
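The LIT pseudocode above can be sketched in Python as follows. This is a minimal illustration, not driver code: the names `clamp` and `lit` are assumptions made for this example, and registers are modelled as plain 4-tuples of floats.

```python
def clamp(x, lo, hi):
    # Same expression as clamp(x,y,z) in section 2.1.
    return lo if x < lo else hi if x > hi else x


def lit(src):
    """Return (dst.x, dst.y, dst.z, dst.w) for LIT; src is an (x, y, z, w) tuple."""
    x, y, _z, w = src
    # The specular term is only evaluated when the diffuse term is positive.
    # Note: Python raises ZeroDivisionError for pow(0.0, negative exponent),
    # which is a property of this sketch, not of TGSI.
    specular = pow(max(y, 0.0), clamp(w, -128.0, 128.0)) if x > 0.0 else 0.0
    return (1.0, max(x, 0.0), specular, 1.0)
```

For example, `lit((0.5, 0.5, 0.0, 2.0))` yields `(1.0, 0.5, 0.25, 1.0)`, while any source with `src.x <= 0.0` yields zero in both `dst.y` and `dst.z`.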
| 44 | |
| 45 | |
| 46 | 1.1.4 RCP - Reciprocal |
| 47 | |
| 48 | dst.x = 1.0 / src.x |
| 49 | dst.y = 1.0 / src.x |
| 50 | dst.z = 1.0 / src.x |
| 51 | dst.w = 1.0 / src.x |
| 52 | |
| 53 | |
| 54 | 1.1.5 RSQ - Reciprocal Square Root |
| 55 | |
| 56 | dst.x = 1.0 / sqrt(abs(src.x)) |
| 57 | dst.y = 1.0 / sqrt(abs(src.x)) |
| 58 | dst.z = 1.0 / sqrt(abs(src.x)) |
| 59 | dst.w = 1.0 / sqrt(abs(src.x)) |
| 60 | |
| 61 | |
| 62 | 1.1.6 EXP - Approximate Exponential Base 2 |
| 63 | |
| 64 | dst.x = pow(2.0, floor(src.x)) |
| 65 | dst.y = src.x - floor(src.x) |
| 66 | dst.z = pow(2.0, src.x) |
| 67 | dst.w = 1.0 |
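A short Python sketch of EXP, assuming a scalar float input; the function name `exp_tgsi` is made up for this example. It shows how the first two components split the exponent into an integer part and a fractional part, so that `dst.x * 2^dst.y` reproduces `dst.z`.

```python
import math


def exp_tgsi(src_x):
    """Return (dst.x, dst.y, dst.z, dst.w) for EXP applied to src.x."""
    whole = math.floor(src_x)
    return (
        2.0 ** whole,     # dst.x: 2 raised to the integer part
        src_x - whole,    # dst.y: fractional part, in [0, 1)
        2.0 ** src_x,     # dst.z: the full result
        1.0,              # dst.w
    )
```

With `src.x = 2.5` the first two components are `4.0` and `0.5`.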
| 68 | |
| 69 | |
| 70 | 1.1.7 LOG - Approximate Logarithm Base 2 |
| 71 | |
| 72 | dst.x = floor(lg2(abs(src.x))) |
| 73 | dst.y = abs(src.x) / pow(2.0, floor(lg2(abs(src.x)))) |
| 74 | dst.z = lg2(abs(src.x)) |
| 75 | dst.w = 1.0 |
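LOG mirrors EXP: it splits the base-2 logarithm into an integer exponent and a mantissa. The sketch below assumes a non-zero scalar float input, and the name `log_tgsi` is hypothetical. Python's `math.log2` raises `ValueError` for zero, which is a property of the sketch rather than of TGSI.

```python
import math


def log_tgsi(src_x):
    """Return (dst.x, dst.y, dst.z, dst.w) for LOG applied to src.x."""
    a = abs(src_x)
    lg = math.log2(a)
    whole = math.floor(lg)
    return (
        float(whole),      # dst.x: integer part of the logarithm
        a / 2.0 ** whole,  # dst.y: mantissa, in [1, 2)
        lg,                # dst.z: the full logarithm
        1.0,               # dst.w
    )
```

For `src.x = -10.0` the exponent is `3.0` and the mantissa is `1.25`, since `10 = 1.25 * 2^3`.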
| 76 | |
| 77 | |
| 78 | 1.1.8 MUL - Multiply |
| 79 | |
| 80 | dst.x = src0.x * src1.x |
| 81 | dst.y = src0.y * src1.y |
| 82 | dst.z = src0.z * src1.z |
| 83 | dst.w = src0.w * src1.w |
| 84 | |
| 85 | |
| 86 | 1.1.9 ADD - Add |
| 87 | |
| 88 | dst.x = src0.x + src1.x |
| 89 | dst.y = src0.y + src1.y |
| 90 | dst.z = src0.z + src1.z |
| 91 | dst.w = src0.w + src1.w |
| 92 | |
| 93 | |
| 94 | 1.1.10 DP3 - 3-component Dot Product |
| 95 | |
| 96 | dst.x = src0.x * src1.x + src0.y * src1.y + src0.z * src1.z |
| 97 | dst.y = src0.x * src1.x + src0.y * src1.y + src0.z * src1.z |
| 98 | dst.z = src0.x * src1.x + src0.y * src1.y + src0.z * src1.z |
| 99 | dst.w = src0.x * src1.x + src0.y * src1.y + src0.z * src1.z |
| 100 | |
| 101 | |
| 102 | 1.1.11 DP4 - 4-component Dot Product |
| 103 | |
| 104 | dst.x = src0.x * src1.x + src0.y * src1.y + src0.z * src1.z + src0.w * src1.w |
| 105 | dst.y = src0.x * src1.x + src0.y * src1.y + src0.z * src1.z + src0.w * src1.w |
| 106 | dst.z = src0.x * src1.x + src0.y * src1.y + src0.z * src1.z + src0.w * src1.w |
| 107 | dst.w = src0.x * src1.x + src0.y * src1.y + src0.z * src1.z + src0.w * src1.w |
| 108 | |
| 109 | |
| 110 | 1.1.12 DST - Distance Vector |
| 111 | |
| 112 | dst.x = 1.0 |
| 113 | dst.y = src0.y * src1.y |
| 114 | dst.z = src0.z |
| 115 | dst.w = src1.w |
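DST is easier to read with its intended inputs in mind. In the OpenGL vertex program extensions from which this opcode derives, the first source carries `d*d` in its y and z components and the second carries `1/d` in its y and w components, giving `(1, d, d*d, 1/d)`. The sketch below assumes that usage; the name `dst_vec` is made up for this example.

```python
def dst_vec(src0, src1):
    """Return the DST result for two (x, y, z, w) tuples."""
    return (1.0, src0[1] * src1[1], src0[2], src1[3])


# Hypothetical inputs for a distance d = 2.0.
d = 2.0
src0 = (0.0, d * d, d * d, 0.0)
src1 = (0.0, 1.0 / d, 0.0, 1.0 / d)
result = dst_vec(src0, src1)
```

Here `result` is `(1.0, 2.0, 4.0, 0.5)`.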
| 116 | |
| 117 | |
| 118 | 1.1.13 MIN - Minimum |
| 119 | |
| 120 | dst.x = min(src0.x, src1.x) |
| 121 | dst.y = min(src0.y, src1.y) |
| 122 | dst.z = min(src0.z, src1.z) |
| 123 | dst.w = min(src0.w, src1.w) |
| 124 | |
| 125 | |
| 126 | 1.1.14 MAX - Maximum |
| 127 | |
| 128 | dst.x = max(src0.x, src1.x) |
| 129 | dst.y = max(src0.y, src1.y) |
| 130 | dst.z = max(src0.z, src1.z) |
| 131 | dst.w = max(src0.w, src1.w) |
| 132 | |
| 133 | |
| 134 | 1.1.15 SLT - Set On Less Than |
| 135 | |
| 136 | dst.x = (src0.x < src1.x) ? 1.0 : 0.0 |
| 137 | dst.y = (src0.y < src1.y) ? 1.0 : 0.0 |
| 138 | dst.z = (src0.z < src1.z) ? 1.0 : 0.0 |
| 139 | dst.w = (src0.w < src1.w) ? 1.0 : 0.0 |
| 140 | |
| 141 | |
| 142 | 1.1.16 SGE - Set On Greater Than Or Equal |
| 143 | |
| 144 | dst.x = (src0.x >= src1.x) ? 1.0 : 0.0 |
| 145 | dst.y = (src0.y >= src1.y) ? 1.0 : 0.0 |
| 146 | dst.z = (src0.z >= src1.z) ? 1.0 : 0.0 |
| 147 | dst.w = (src0.w >= src1.w) ? 1.0 : 0.0 |
| 148 | |
| 149 | |
| 150 | 1.1.17 MAD - Multiply And Add |
| 151 | |
| 152 | dst.x = src0.x * src1.x + src2.x |
| 153 | dst.y = src0.y * src1.y + src2.y |
| 154 | dst.z = src0.z * src1.z + src2.z |
| 155 | dst.w = src0.w * src1.w + src2.w |
| 156 | |
| 157 | |
| 158 | 1.2.1 SUB - Subtract |
| 159 | |
| 160 | dst.x = src0.x - src1.x |
| 161 | dst.y = src0.y - src1.y |
| 162 | dst.z = src0.z - src1.z |
| 163 | dst.w = src0.w - src1.w |
| 164 | |
| 165 | |
Keith Whitwell | 14eacb0 | 2009-12-21 23:38:29 +0000 | [diff] [blame] | 166 | 1.2.4 LRP - Linear Interpolate |
| 167 | |
| 168 | dst.x = src0.x * (src1.x - src2.x) + src2.x |
| 169 | dst.y = src0.y * (src1.y - src2.y) + src2.y |
| 170 | dst.z = src0.z * (src1.z - src2.z) + src2.z |
| 171 | dst.w = src0.w * (src1.w - src2.w) + src2.w |
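The LRP expression is the usual linear blend written with one multiply: `src0 * (src1 - src2) + src2` equals `src0 * src1 + (1 - src0) * src2`, so `src0` is the weight, a weight of 1 selects `src1` and a weight of 0 selects `src2`. A minimal Python sketch (the name `lrp` and the tuple representation are assumptions for this example):

```python
def lrp(src0, src1, src2):
    """Per-component linear interpolation; all arguments are 4-tuples."""
    return tuple(a * (b - c) + c for a, b, c in zip(src0, src1, src2))


weights = (0.0, 1.0, 0.5, 0.25)
blended = lrp(weights, (10.0,) * 4, (2.0,) * 4)
```

Here `blended` is `(2.0, 10.0, 6.0, 4.0)`.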
| 172 | |
| 173 | |
| 174 | 1.2.5 CND - Condition |
| 175 | |
| 176 | dst.x = (src2.x > 0.5) ? src0.x : src1.x |
| 177 | dst.y = (src2.y > 0.5) ? src0.y : src1.y |
| 178 | dst.z = (src2.z > 0.5) ? src0.z : src1.z |
| 179 | dst.w = (src2.w > 0.5) ? src0.w : src1.w |
| 180 | |
| 181 | |
| 182 | 1.2.7 DP2A - 2-component Dot Product And Add |
| 183 | |
| 184 | dst.x = src0.x * src1.x + src0.y * src1.y + src2.x |
| 185 | dst.y = src0.x * src1.x + src0.y * src1.y + src2.x |
| 186 | dst.z = src0.x * src1.x + src0.y * src1.y + src2.x |
| 187 | dst.w = src0.x * src1.x + src0.y * src1.y + src2.x |
| 188 | |
| 189 | |
| 190 | 1.3.4 FRAC - Fraction |
| 191 | |
| 192 | dst.x = src.x - floor(src.x) |
| 193 | dst.y = src.y - floor(src.y) |
| 194 | dst.z = src.z - floor(src.z) |
| 195 | dst.w = src.w - floor(src.w) |
| 196 | |
| 197 | |
| 198 | 1.3.7 CLAMP - Clamp |
| 199 | |
| 200 | dst.x = clamp(src0.x, src1.x, src2.x) |
| 201 | dst.y = clamp(src0.y, src1.y, src2.y) |
| 202 | dst.z = clamp(src0.z, src1.z, src2.z) |
| 203 | dst.w = clamp(src0.w, src1.w, src2.w) |
| 204 | |
| 205 | |
| 206 | 1.3.8 FLR - Floor |
| 207 | |
| 208 | dst.x = floor(src.x) |
| 209 | dst.y = floor(src.y) |
| 210 | dst.z = floor(src.z) |
| 211 | dst.w = floor(src.w) |
| 212 | |
| 213 | |
| 214 | 1.3.9 ROUND - Round |
| 215 | |
| 216 | dst.x = round(src.x) |
| 217 | dst.y = round(src.y) |
| 218 | dst.z = round(src.z) |
| 219 | dst.w = round(src.w) |
| 220 | |
| 221 | |
| 222 | 1.3.10 EX2 - Exponential Base 2 |
| 223 | |
| 224 | dst.x = pow(2.0, src.x) |
| 225 | dst.y = pow(2.0, src.x) |
| 226 | dst.z = pow(2.0, src.x) |
| 227 | dst.w = pow(2.0, src.x) |
| 228 | |
| 229 | |
| 230 | 1.3.11 LG2 - Logarithm Base 2 |
| 231 | |
| 232 | dst.x = lg2(src.x) |
| 233 | dst.y = lg2(src.x) |
| 234 | dst.z = lg2(src.x) |
| 235 | dst.w = lg2(src.x) |
| 236 | |
| 237 | |
| 238 | 1.3.12 POW - Power |
| 239 | |
| 240 | dst.x = pow(src0.x, src1.x) |
| 241 | dst.y = pow(src0.x, src1.x) |
| 242 | dst.z = pow(src0.x, src1.x) |
| 243 | dst.w = pow(src0.x, src1.x) |
| 244 | |
| 245 | 1.3.15 XPD - Cross Product |
| 246 | |
| 247 | dst.x = src0.y * src1.z - src1.y * src0.z |
| 248 | dst.y = src0.z * src1.x - src1.z * src0.x |
| 249 | dst.z = src0.x * src1.y - src1.x * src0.y |
| 250 | dst.w = 1.0 |
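A direct Python transcription of XPD, useful for checking the component order against a known case. The name `xpd` and the 4-tuple representation are assumptions for this example; only the x, y and z components of the sources are read.

```python
def xpd(src0, src1):
    """Cross product of the xyz parts of two (x, y, z, w) tuples; dst.w is 1.0."""
    return (
        src0[1] * src1[2] - src1[1] * src0[2],
        src0[2] * src1[0] - src1[2] * src0[0],
        src0[0] * src1[1] - src1[0] * src0[1],
        1.0,
    )


x_axis = (1.0, 0.0, 0.0, 0.0)
y_axis = (0.0, 1.0, 0.0, 0.0)
z_axis = xpd(x_axis, y_axis)
```

As expected for a right-handed cross product, `z_axis` is `(0.0, 0.0, 1.0, 1.0)`.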
| 251 | |
| 252 | |
| 253 | 1.4.1 ABS - Absolute |
| 254 | |
| 255 | dst.x = abs(src.x) |
| 256 | dst.y = abs(src.y) |
| 257 | dst.z = abs(src.z) |
| 258 | dst.w = abs(src.w) |
| 259 | |
| 260 | |
| 261 | 1.4.2 RCC - Reciprocal Clamped |
| 262 | |
| 263 | dst.x = (1.0 / src.x) > 0.0 ? clamp(1.0 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1.0 / src.x, -1.884467e+019, -5.42101e-020) |
| 264 | dst.y = (1.0 / src.x) > 0.0 ? clamp(1.0 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1.0 / src.x, -1.884467e+019, -5.42101e-020) |
| 265 | dst.z = (1.0 / src.x) > 0.0 ? clamp(1.0 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1.0 / src.x, -1.884467e+019, -5.42101e-020) |
| 266 | dst.w = (1.0 / src.x) > 0.0 ? clamp(1.0 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1.0 / src.x, -1.884467e+019, -5.42101e-020) |
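RCC computes one reciprocal and clamps its magnitude, keeping the sign. The sketch below returns the single scalar that is replicated into all four destination components; `rcc` and `clamp` are names chosen for this example, and the constants are copied from the pseudocode above. Python raises `ZeroDivisionError` for a zero input, which is a limitation of the sketch.

```python
RCC_MIN = 5.42101e-20   # smallest magnitude, as given in the pseudocode
RCC_MAX = 1.884467e+19  # largest magnitude, as given in the pseudocode


def clamp(x, lo, hi):
    return lo if x < lo else hi if x > hi else x


def rcc(src_x):
    """Clamped reciprocal of src.x; the result is written to every component."""
    r = 1.0 / src_x
    if r > 0.0:
        return clamp(r, RCC_MIN, RCC_MAX)
    return clamp(r, -RCC_MAX, -RCC_MIN)
```

Ordinary inputs pass through unchanged (`rcc(2.0)` is `0.5`), while a very small input such as `1e-30` saturates at `RCC_MAX`.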
| 267 | |
| 268 | |
| 269 | 1.4.3 DPH - Homogeneous Dot Product |
| 270 | |
| 271 | dst.x = src0.x * src1.x + src0.y * src1.y + src0.z * src1.z + src1.w |
| 272 | dst.y = src0.x * src1.x + src0.y * src1.y + src0.z * src1.z + src1.w |
| 273 | dst.z = src0.x * src1.x + src0.y * src1.y + src0.z * src1.z + src1.w |
| 274 | dst.w = src0.x * src1.x + src0.y * src1.y + src0.z * src1.z + src1.w |
| 275 | |
| 276 | |
| 277 | 1.5.1 COS - Cosine |
| 278 | |
| 279 | dst.x = cos(src.x) |
| 280 | dst.y = cos(src.x) |
| 281 | dst.z = cos(src.x) |
| 282 | dst.w = cos(src.x) |
| 283 | |
| 284 | |
| 285 | 1.5.2 DDX - Derivative Relative To X |
| 286 | |
| 287 | dst.x = partialx(src.x) |
| 288 | dst.y = partialx(src.y) |
| 289 | dst.z = partialx(src.z) |
| 290 | dst.w = partialx(src.w) |
| 291 | |
| 292 | |
| 293 | 1.5.3 DDY - Derivative Relative To Y |
| 294 | |
| 295 | dst.x = partialy(src.x) |
| 296 | dst.y = partialy(src.y) |
| 297 | dst.z = partialy(src.z) |
| 298 | dst.w = partialy(src.w) |
| 299 | |
| 300 | |
| 301 | 1.5.7 KILP - Predicated Discard |
| 302 | |
| 303 | discard |
| 304 | |
| 305 | |
| 306 | 1.5.10 PK2H - Pack Two 16-bit Floats |
| 307 | |
| 308 | TBD |
| 309 | |
| 310 | |
| 311 | 1.5.11 PK2US - Pack Two Unsigned 16-bit Scalars |
| 312 | |
| 313 | TBD |
| 314 | |
| 315 | |
| 316 | 1.5.12 PK4B - Pack Four Signed 8-bit Scalars |
| 317 | |
| 318 | TBD |
| 319 | |
| 320 | |
| 321 | 1.5.13 PK4UB - Pack Four Unsigned 8-bit Scalars |
| 322 | |
| 323 | TBD |
| 324 | |
| 325 | |
| 326 | 1.5.15 RFL - Reflection Vector |
| 327 | |
| 328 | dst.x = 2.0 * (src0.x * src1.x + src0.y * src1.y + src0.z * src1.z) / (src0.x * src0.x + src0.y * src0.y + src0.z * src0.z) * src0.x - src1.x |
| 329 | dst.y = 2.0 * (src0.x * src1.x + src0.y * src1.y + src0.z * src1.z) / (src0.x * src0.x + src0.y * src0.y + src0.z * src0.z) * src0.y - src1.y |
| 330 | dst.z = 2.0 * (src0.x * src1.x + src0.y * src1.y + src0.z * src1.z) / (src0.x * src0.x + src0.y * src0.y + src0.z * src0.z) * src0.z - src1.z |
| 331 | dst.w = 1.0 |
| 332 | |
| 333 | Considered for removal. |
| 334 | |
| 335 | |
| 336 | 1.5.16 SEQ - Set On Equal |
| 337 | |
| 338 | dst.x = (src0.x == src1.x) ? 1.0 : 0.0 |
| 339 | dst.y = (src0.y == src1.y) ? 1.0 : 0.0 |
| 340 | dst.z = (src0.z == src1.z) ? 1.0 : 0.0 |
| 341 | dst.w = (src0.w == src1.w) ? 1.0 : 0.0 |
| 342 | |
| 343 | |
| 344 | 1.5.17 SFL - Set On False |
| 345 | |
| 346 | dst.x = 0.0 |
| 347 | dst.y = 0.0 |
| 348 | dst.z = 0.0 |
| 349 | dst.w = 0.0 |
| 350 | |
| 351 | Considered for removal. |
| 352 | |
| 353 | 1.5.18 SGT - Set On Greater Than |
| 354 | |
| 355 | dst.x = (src0.x > src1.x) ? 1.0 : 0.0 |
| 356 | dst.y = (src0.y > src1.y) ? 1.0 : 0.0 |
| 357 | dst.z = (src0.z > src1.z) ? 1.0 : 0.0 |
| 358 | dst.w = (src0.w > src1.w) ? 1.0 : 0.0 |
| 359 | |
| 360 | |
| 361 | 1.5.19 SIN - Sine |
| 362 | |
| 363 | dst.x = sin(src.x) |
| 364 | dst.y = sin(src.x) |
| 365 | dst.z = sin(src.x) |
| 366 | dst.w = sin(src.x) |
| 367 | |
| 368 | |
| 369 | 1.5.20 SLE - Set On Less Than Or Equal |
| 370 | |
| 371 | dst.x = (src0.x <= src1.x) ? 1.0 : 0.0 |
| 372 | dst.y = (src0.y <= src1.y) ? 1.0 : 0.0 |
| 373 | dst.z = (src0.z <= src1.z) ? 1.0 : 0.0 |
| 374 | dst.w = (src0.w <= src1.w) ? 1.0 : 0.0 |
| 375 | |
| 376 | |
| 377 | 1.5.21 SNE - Set On Not Equal |
| 378 | |
| 379 | dst.x = (src0.x != src1.x) ? 1.0 : 0.0 |
| 380 | dst.y = (src0.y != src1.y) ? 1.0 : 0.0 |
| 381 | dst.z = (src0.z != src1.z) ? 1.0 : 0.0 |
| 382 | dst.w = (src0.w != src1.w) ? 1.0 : 0.0 |
| 383 | |
| 384 | |
| 385 | 1.5.22 STR - Set On True |
| 386 | |
| 387 | dst.x = 1.0 |
| 388 | dst.y = 1.0 |
| 389 | dst.z = 1.0 |
| 390 | dst.w = 1.0 |
| 391 | |
| 392 | |
| 393 | 1.5.23 TEX - Texture Lookup |
| 394 | |
| 395 | TBD |
| 396 | |
| 397 | |
| 398 | 1.5.24 TXD - Texture Lookup with Derivatives |
| 399 | |
| 400 | TBD |
| 401 | |
| 402 | |
| 403 | 1.5.25 TXP - Projective Texture Lookup |
| 404 | |
| 405 | TBD |
| 406 | |
| 407 | |
| 408 | 1.5.26 UP2H - Unpack Two 16-Bit Floats |
| 409 | |
| 410 | TBD |
| 411 | |
| 412 | Considered for removal. |
| 413 | |
| 414 | 1.5.27 UP2US - Unpack Two Unsigned 16-Bit Scalars |
| 415 | |
| 416 | TBD |
| 417 | |
| 418 | Considered for removal. |
| 419 | |
| 420 | 1.5.28 UP4B - Unpack Four Signed 8-Bit Values |
| 421 | |
| 422 | TBD |
| 423 | |
| 424 | Considered for removal. |
| 425 | |
| 426 | 1.5.29 UP4UB - Unpack Four Unsigned 8-Bit Scalars |
| 427 | |
| 428 | TBD |
| 429 | |
| 430 | Considered for removal. |
| 431 | |
| 432 | 1.5.30 X2D - 2D Coordinate Transformation |
| 433 | |
| 434 | dst.x = src0.x + src1.x * src2.x + src1.y * src2.y |
| 435 | dst.y = src0.y + src1.x * src2.z + src1.y * src2.w |
| 436 | dst.z = src0.x + src1.x * src2.x + src1.y * src2.y |
| 437 | dst.w = src0.y + src1.x * src2.z + src1.y * src2.w |
| 438 | |
| 439 | Considered for removal. |
| 440 | |
| 441 | |
| 442 | 1.6 GL_NV_vertex_program2 |
| 443 | -------------------------- |
| 444 | |
| 445 | |
| 446 | 1.6.1 ARA - Address Register Add |
| 447 | |
| 448 | TBD |
| 449 | |
| 450 | Considered for removal. |
| 451 | |
| 452 | 1.6.2 ARR - Address Register Load With Round |
| 453 | |
| 454 | dst.x = round(src.x) |
| 455 | dst.y = round(src.y) |
| 456 | dst.z = round(src.z) |
| 457 | dst.w = round(src.w) |
| 458 | |
| 459 | |
| 460 | 1.6.3 BRA - Branch |
| 461 | |
| 462 | pc = target |
| 463 | |
| 464 | Considered for removal. |
| 465 | |
| 466 | 1.6.4 CAL - Subroutine Call |
| 467 | |
| 468 | push(pc) |
| 469 | pc = target |
| 470 | |
| 471 | |
| 472 | 1.6.5 RET - Subroutine Call Return |
| 473 | |
| 474 | pc = pop() |
| 475 | |
| 476 | Potential restrictions: |
| 477 | * Only occurs at end of function. |
| 478 | |
| 479 | 1.6.6 SSG - Set Sign |
| 480 | |
| 481 | dst.x = (src.x > 0.0) ? 1.0 : (src.x < 0.0) ? -1.0 : 0.0 |
| 482 | dst.y = (src.y > 0.0) ? 1.0 : (src.y < 0.0) ? -1.0 : 0.0 |
| 483 | dst.z = (src.z > 0.0) ? 1.0 : (src.z < 0.0) ? -1.0 : 0.0 |
| 484 | dst.w = (src.w > 0.0) ? 1.0 : (src.w < 0.0) ? -1.0 : 0.0 |
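The nested conditional in SSG maps each component to -1.0, 0.0 or 1.0. A minimal Python sketch, with `ssg` as a name assumed for this example:

```python
def ssg(src):
    """Per-component sign of an (x, y, z, w) tuple."""
    return tuple(1.0 if c > 0.0 else -1.0 if c < 0.0 else 0.0 for c in src)


signs = ssg((3.0, -0.5, 0.0, -7.0))
```

Here `signs` is `(1.0, -1.0, 0.0, -1.0)`.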
| 485 | |
| 486 | |
| 487 | 1.8.1 CMP - Compare |
| 488 | |
| 489 | dst.x = (src0.x < 0.0) ? src1.x : src2.x |
| 490 | dst.y = (src0.y < 0.0) ? src1.y : src2.y |
| 491 | dst.z = (src0.z < 0.0) ? src1.z : src2.z |
| 492 | dst.w = (src0.w < 0.0) ? src1.w : src2.w |
| 493 | |
| 494 | |
| 495 | 1.8.2 KIL - Conditional Discard |
| 496 | |
| 497 | if (src.x < 0.0 || src.y < 0.0 || src.z < 0.0 || src.w < 0.0) |
| 498 | discard |
| 499 | endif |
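KIL discards the fragment as soon as any one component of the source is negative. The sketch below models that test and, purely as an illustration, shows how an alpha test could be expressed with it. `kil` and `alpha_test_operand` are hypothetical names, not part of TGSI.

```python
def kil(src):
    """Return True when a fragment with this KIL operand would be discarded."""
    return any(c < 0.0 for c in src)


def alpha_test_operand(alpha, ref):
    # Hypothetical helper: the operand is negative exactly when alpha < ref.
    d = alpha - ref
    return (d, d, d, d)
```

With a reference value of `0.5`, a fragment with alpha `0.25` is discarded and one with alpha `0.75` is kept. An operand of all zeros does not discard, because the comparison is strict.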
| 500 | |
| 501 | |
| 502 | 1.8.3 SCS - Sine Cosine |
| 503 | |
| 504 | dst.x = cos(src.x) |
| 505 | dst.y = sin(src.x) |
| 506 | dst.z = 0.0 |
| 507 | dst.w = 1.0 |
| 508 | |
| 509 | |
| 510 | 1.8.4 TXB - Texture Lookup With Bias |
| 511 | |
| 512 | TBD |
| 513 | |
| 514 | |
| 515 | 1.9.1 NRM - 3-component Vector Normalise |
| 516 | |
| 517 | dst.x = src.x / sqrt(src.x * src.x + src.y * src.y + src.z * src.z) |
| 518 | dst.y = src.y / sqrt(src.x * src.x + src.y * src.y + src.z * src.z) |
| 519 | dst.z = src.z / sqrt(src.x * src.x + src.y * src.y + src.z * src.z) |
| 520 | dst.w = 1.0 |
| 521 | |
| 522 | |
| 523 | 1.9.2 DIV - Divide |
| 524 | |
| 525 | dst.x = src0.x / src1.x |
| 526 | dst.y = src0.y / src1.y |
| 527 | dst.z = src0.z / src1.z |
| 528 | dst.w = src0.w / src1.w |
| 529 | |
| 530 | |
| 531 | 1.9.3 DP2 - 2-component Dot Product |
| 532 | |
| 533 | dst.x = src0.x * src1.x + src0.y * src1.y |
| 534 | dst.y = src0.x * src1.x + src0.y * src1.y |
| 535 | dst.z = src0.x * src1.x + src0.y * src1.y |
| 536 | dst.w = src0.x * src1.x + src0.y * src1.y |
| 537 | |
| 538 | |
| 539 | 1.9.5 TXL - Texture Lookup With LOD |
| 540 | |
| 541 | TBD |
| 542 | |
| 543 | |
| 544 | 1.9.6 BRK - Break |
| 545 | |
| 546 | TBD |
| 547 | |
| 548 | |
| 549 | 1.9.7 IF - If |
| 550 | |
| 551 | TBD |
| 552 | |
| 553 | |
| 554 | 1.9.8 BGNFOR - Begin a For-Loop |
| 555 | |
| 556 | dst.x = floor(src.x) |
| 557 | dst.y = floor(src.y) |
| 558 | dst.z = floor(src.z) |
| 559 | |
| 560 | if (dst.y <= 0) |
| 561 | pc = [matching ENDFOR] + 1 |
| 562 | endif |
| 563 | |
| 564 | Note: The destination must be a loop register. |
| 565 | The source must be a constant register. |
| 566 | |
| 567 | Considered for cleanup / removal. |
| 568 | |
| 569 | |
| 570 | 1.9.9 REP - Repeat |
| 571 | |
| 572 | TBD |
| 573 | |
| 574 | |
| 575 | 1.9.10 ELSE - Else |
| 576 | |
| 577 | TBD |
| 578 | |
| 579 | |
| 580 | 1.9.11 ENDIF - End If |
| 581 | |
| 582 | TBD |
| 583 | |
| 584 | |
| 585 | 1.9.12 ENDFOR - End a For-Loop |
| 586 | |
| 587 | dst.x = dst.x + dst.z |
| 588 | dst.y = dst.y - 1.0 |
| 589 | |
| 590 | if (dst.y > 0) |
| 591 | pc = [matching BGNFOR instruction] + 1 |
| 592 | endif |
| 593 | |
| 594 | Note: The destination must be a loop register. |
| 595 | |
| 596 | Considered for cleanup / removal. |
| 597 | |
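Read together, the BGNFOR and ENDFOR pseudocode describes a counted loop: the loop register's x component acts as the running index, y as the number of iterations remaining, and z as the step. That reading is an interpretation of the pseudocode, and the sketch below models it in Python with a hypothetical `run_for_loop` helper that calls `body` once per iteration.

```python
import math


def run_for_loop(src, body):
    """Simulate BGNFOR ... ENDFOR; src is (start, count, step, unused)."""
    # BGNFOR: load the loop register from the source register.
    loop = [math.floor(src[0]), math.floor(src[1]), math.floor(src[2])]
    if loop[1] <= 0:
        return loop  # jump past the matching ENDFOR; the body never runs
    while True:
        body(loop[0])
        # ENDFOR: advance the index, decrement the remaining count.
        loop[0] += loop[2]
        loop[1] -= 1
        if not loop[1] > 0:
            break  # fall through; otherwise jump back to BGNFOR + 1
    return loop


seen = []
final = run_for_loop((0.0, 3.0, 2.0), seen.append)
```

After this call `seen` is `[0, 2, 4]` and the loop register ends as `[6, 0, 2]`. A count of zero skips the body entirely.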
| 598 | 1.9.13 ENDREP - End Repeat |
| 599 | |
| 600 | TBD |
| 601 | |
| 602 | |
| 603 | 1.10.1 PUSHA - Push Address Register On Stack |
| 604 | |
| 605 | push(src.x) |
| 606 | push(src.y) |
| 607 | push(src.z) |
| 608 | push(src.w) |
| 609 | |
| 610 | Considered for cleanup / removal. |
| 611 | |
| 612 | 1.10.2 POPA - Pop Address Register From Stack |
| 613 | |
| 614 | dst.w = pop() |
| 615 | dst.z = pop() |
| 616 | dst.y = pop() |
| 617 | dst.x = pop() |
| 618 | |
| 619 | Considered for cleanup / removal. |
| 620 | |
| 621 | |
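PUSHA pushes the components in x, y, z, w order and POPA pops them in w, z, y, x order, so a PUSHA followed by a POPA restores the register unchanged. A small Python sketch using a list as the stack (the names `pusha` and `popa` are assumptions for this example):

```python
def pusha(stack, src):
    """Push the x, y, z, w components of src, in that order."""
    for component in src:
        stack.append(component)


def popa(stack):
    """Pop four values in w, z, y, x order and return them as (x, y, z, w)."""
    w = stack.pop()
    z = stack.pop()
    y = stack.pop()
    x = stack.pop()
    return (x, y, z, w)


stack = []
pusha(stack, (1, 2, 3, 4))
restored = popa(stack)
```

Here `restored` is `(1, 2, 3, 4)` and the stack is empty again.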
| 622 | 1.11 GL_NV_gpu_program4 |
| 623 | ------------------------ |
| 624 | |
| 625 | Support for these opcodes indicated by a special pipe capability bit (TBD). |
| 626 | |
| 627 | 1.11.1 CEIL - Ceiling |
| 628 | |
| 629 | dst.x = ceil(src.x) |
| 630 | dst.y = ceil(src.y) |
| 631 | dst.z = ceil(src.z) |
| 632 | dst.w = ceil(src.w) |
| 633 | |
| 634 | |
| 635 | 1.11.2 I2F - Integer To Float |
| 636 | |
| 637 | dst.x = (float) src.x |
| 638 | dst.y = (float) src.y |
| 639 | dst.z = (float) src.z |
| 640 | dst.w = (float) src.w |
| 641 | |
| 642 | |
| 643 | 1.11.3 NOT - Bitwise Not |
| 644 | |
| 645 | dst.x = ~src.x |
| 646 | dst.y = ~src.y |
| 647 | dst.z = ~src.z |
| 648 | dst.w = ~src.w |
| 649 | |
| 650 | |
| 651 | 1.11.4 TRUNC - Truncate |
| 652 | |
| 653 | dst.x = trunc(src.x) |
| 654 | dst.y = trunc(src.y) |
| 655 | dst.z = trunc(src.z) |
| 656 | dst.w = trunc(src.w) |
| 657 | |
| 658 | |
| 659 | 1.11.5 SHL - Shift Left |
| 660 | |
| 661 | dst.x = src0.x << src1.x |
| 662 | dst.y = src0.y << src1.x |
| 663 | dst.z = src0.z << src1.x |
| 664 | dst.w = src0.w << src1.x |
| 665 | |
| 666 | |
| 667 | 1.11.6 SHR - Shift Right |
| 668 | |
| 669 | dst.x = src0.x >> src1.x |
| 670 | dst.y = src0.y >> src1.x |
| 671 | dst.z = src0.z >> src1.x |
| 672 | dst.w = src0.w >> src1.x |
| 673 | |
| 674 | |
| 675 | 1.11.7 AND - Bitwise And |
| 676 | |
| 677 | dst.x = src0.x & src1.x |
| 678 | dst.y = src0.y & src1.y |
| 679 | dst.z = src0.z & src1.z |
| 680 | dst.w = src0.w & src1.w |
| 681 | |
| 682 | |
| 683 | 1.11.8 OR - Bitwise Or |
| 684 | |
| 685 | dst.x = src0.x | src1.x |
| 686 | dst.y = src0.y | src1.y |
| 687 | dst.z = src0.z | src1.z |
| 688 | dst.w = src0.w | src1.w |
| 689 | |
| 690 | |
| 691 | 1.11.9 MOD - Modulus |
| 692 | |
| 693 | dst.x = src0.x % src1.x |
| 694 | dst.y = src0.y % src1.y |
| 695 | dst.z = src0.z % src1.z |
| 696 | dst.w = src0.w % src1.w |
| 697 | |
| 698 | |
| 699 | 1.11.10 XOR - Bitwise Xor |
| 700 | |
| 701 | dst.x = src0.x ^ src1.x |
| 702 | dst.y = src0.y ^ src1.y |
| 703 | dst.z = src0.z ^ src1.z |
| 704 | dst.w = src0.w ^ src1.w |
| 705 | |
| 706 | |
| 707 | 1.11.11 SAD - Sum Of Absolute Differences |
| 708 | |
| 709 | dst.x = abs(src0.x - src1.x) + src2.x |
| 710 | dst.y = abs(src0.y - src1.y) + src2.y |
| 711 | dst.z = abs(src0.z - src1.z) + src2.z |
| 712 | dst.w = abs(src0.w - src1.w) + src2.w |
| 713 | |
| 714 | |
| 715 | 1.11.12 TXF - Texel Fetch |
| 716 | |
| 717 | TBD |
| 718 | |
| 719 | |
| 720 | 1.11.13 TXQ - Texture Size Query |
| 721 | |
| 722 | TBD |
| 723 | |
| 724 | |
| 725 | 1.11.14 CONT - Continue |
| 726 | |
| 727 | TBD |
| 728 | |
| 729 | |
| 730 | 1.12 GL_NV_geometry_program4 |
| 731 | ----------------------------- |
| 732 | |
| 733 | |
| 734 | 1.12.1 EMIT - Emit |
| 735 | |
| 736 | TBD |
| 737 | |
| 738 | |
| 739 | 1.12.2 ENDPRIM - End Primitive |
| 740 | |
| 741 | TBD |
| 742 | |
| 743 | |
| 744 | 1.13 GLSL |
| 745 | ---------- |
| 746 | |
| 747 | |
| 748 | 1.13.1 BGNLOOP - Begin a Loop |
| 749 | |
| 750 | TBD |
| 751 | |
| 752 | |
| 753 | 1.13.2 BGNSUB - Begin Subroutine |
| 754 | |
| 755 | TBD |
| 756 | |
| 757 | |
| 758 | 1.13.3 ENDLOOP - End a Loop |
| 759 | |
| 760 | TBD |
| 761 | |
| 762 | |
| 763 | 1.13.4 ENDSUB - End Subroutine |
| 764 | |
| 765 | TBD |
| 766 | |
| 767 | |
| 768 | |
| 769 | 1.13.10 NOP - No Operation |
| 770 | |
| 771 | Do nothing. |
| 772 | |
| 773 | |
| 774 | |
| 775 | 1.16.7 NRM4 - 4-component Vector Normalise |
| 776 | |
| 777 | dst.x = src.x / sqrt(src.x * src.x + src.y * src.y + src.z * src.z + src.w * src.w) |
| 778 | dst.y = src.y / sqrt(src.x * src.x + src.y * src.y + src.z * src.z + src.w * src.w) |
| 779 | dst.z = src.z / sqrt(src.x * src.x + src.y * src.y + src.z * src.z + src.w * src.w) |
| 780 | dst.w = src.w / sqrt(src.x * src.x + src.y * src.y + src.z * src.z + src.w * src.w) |
| 781 | |
| 782 | |
| 783 | 1.17 ps_2_x |
| 784 | ------------ |
| 785 | |
| 786 | |
| 787 | 1.17.2 CALLNZ - Subroutine Call If Not Zero |
| 788 | |
| 789 | TBD |
| 790 | |
| 791 | |
| 792 | 1.17.3 IFC - If |
| 793 | |
| 794 | TBD |
| 795 | |
| 796 | |
| 797 | 1.17.5 BREAKC - Break Conditional |
| 798 | |
| 799 | TBD |
| 800 | |
| 801 | |
| 802 | 2 Explanation of symbols used |
| 803 | ============================== |
| 804 | |
| 805 | |
| 806 | 2.1 Functions |
| 807 | -------------- |
| 808 | |
| 809 | |
| 810 | abs(x) Absolute value of x. |
| 811 | '|x|' |
| 812 | (x < 0.0) ? -x : x |
| 813 | |
| 814 | ceil(x) Ceiling of x. |
| 815 | |
| 816 | clamp(x,y,z) Clamp x between y and z. |
| 817 | (x < y) ? y : (x > z) ? z : x |
| 818 | |
| 819 | cos(x) Cosine of x. |
| 820 | |
| 821 | floor(x) Floor of x. |
| 822 | |
| 823 | lg2(x) Logarithm base 2 of x. |
| 824 | |
| 825 | max(x,y) Maximum of x and y. |
| 826 | (x > y) ? x : y |
| 827 | |
| 828 | min(x,y) Minimum of x and y. |
| 829 | (x < y) ? x : y |
| 830 | |
| 831 | partialx(x) Derivative of x relative to fragment's X. |
| 832 | |
| 833 | partialy(x) Derivative of x relative to fragment's Y. |
| 834 | |
| 835 | pop() Pop from stack. |
| 836 | |
| 837 | pow(x,y) Raise x to power of y. |
| 838 | |
| 839 | push(x) Push x on stack. |
| 840 | |
| 841 | round(x) Round x. |
| 842 | |
| 843 | sin(x) Sine of x. |
| 844 | |
| 845 | sqrt(x) Square root of x. |
| 846 | |
| 847 | trunc(x) Truncate x. |
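The four functions above that come with an explicit conditional expression (abs, clamp, max, min) can be transcribed one-to-one. The Python sketch below does so with a `tgsi_` prefix, chosen here only to avoid shadowing Python's built-ins; the prefix is not part of TGSI.

```python
def tgsi_abs(x):
    return -x if x < 0.0 else x


def tgsi_clamp(x, y, z):
    # Clamp x between y (lower bound) and z (upper bound).
    return y if x < y else z if x > z else x


def tgsi_max(x, y):
    return x if x > y else y


def tgsi_min(x, y):
    return x if x < y else y
```

Because these are plain comparisons, the result for a NaN operand depends on argument order in this sketch: `tgsi_max(float("nan"), 1.0)` returns `1.0`, since every comparison against NaN is false. Whether hardware behaves the same way is not stated in this document.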
| 848 | |
| 849 | |
| 850 | 2.2 Keywords |
| 851 | ------------- |
| 852 | |
| 853 | |
| 854 | discard Discard fragment. |
| 855 | |
| 856 | dst First destination register. |
| 857 | |
| 858 | dst0 First destination register. |
| 859 | |
| 860 | pc Program counter. |
| 861 | |
| 862 | src First source register. |
| 863 | |
| 864 | src0 First source register. |
| 865 | |
| 866 | src1 Second source register. |
| 867 | |
| 868 | src2 Third source register. |
| 869 | |
| 870 | target Label of target instruction. |
| 871 | |
| 872 | |
| 873 | 3 Other tokens |
| 874 | =============== |
| 875 | |
| 876 | |
| 877 | 3.1 Declaration Semantic |
| 878 | ------------------------- |
| 879 | |
| 880 | |
| 881 | A Declaration Semantic token follows a Declaration token if the Semantic bit is set. |
| 882 | |
| 883 | Since its purpose is to link a shader with other stages of the pipeline, |
| 884 | it is valid only after Declaration tokens that declare a register |
| 885 | in either the INPUT or the OUTPUT file. |
| 886 | |
| 887 | SemanticName field contains the semantic name of the register being declared. |
| 888 | There is no default value. |
| 889 | |
| 890 | SemanticIndex is an optional subscript that can be used to distinguish |
| 891 | different register declarations with the same semantic name. The default value |
| 892 | is 0. |
| 893 | |
| 894 | The meanings of the individual semantic names are explained in the following |
| 895 | sections. |
| 896 | |
| 897 | |
| 898 | 3.1.1 FACE |
| 899 | |
| 900 | Valid only in a fragment shader INPUT declaration. |
| 901 | |
| 902 | FACE.x is negative when the primitive is back facing. FACE.x is positive |
| 903 | when the primitive is front facing. |