| <HTML> |
| <head> |
| |
| <TITLE>Shading Language Support</TITLE> |
| |
| <link rel="stylesheet" type="text/css" href="mesa.css"></head> |
| |
| <BODY> |
| |
| <H1>Shading Language Support</H1> |
| |
| <p> |
| This page describes the features and status of Mesa's support for the |
| <a href="http://opengl.org/documentation/glsl/" target="_parent"> |
| OpenGL Shading Language</a>. |
| </p> |
| |
| <p> |
| Last updated on 15 December 2008. |
| </p> |
| |
| <p> |
| Contents |
| </p> |
| <ul> |
| <li><a href="#120">GLSL 1.20 support</a> |
| <li><a href="#unsup">Unsupported Features</a> |
| <li><a href="#notes">Implementation Notes</a> |
| <li><a href="#hints">Programming Hints</a> |
| <li><a href="#standalone">Stand-alone GLSL Compiler</a> |
| <li><a href="#implementation">Compiler Implementation</a> |
| <li><a href="#validation">Compiler Validation</a> |
| </ul> |
| |
| |
| |
| <a name="120"></a> |
| <h2>GLSL 1.20 support</h2> |
| |
| <p> |
| GLSL version 1.20 is supported in Mesa 7.3. |
| The features and differences of GLSL 1.20 include: |
| </p> |
| <ul> |
| <li><code>mat2x3, mat2x4</code>, etc. types and functions |
| <li><code>transpose(), outerProduct(), matrixCompMult()</code> functions |
| (but untested) |
| <li>precision qualifiers (lowp, mediump, highp) |
| <li><code>invariant</code> qualifier |
| <li><code>array.length()</code> method |
| <li><code>float[5] a;</code> array syntax |
| <li><code>centroid</code> qualifier |
| <li>unsized array constructors |
| <li>initializers for uniforms |
| <li>const initializers calling built-in functions |
| </ul> |
| |
| |
| |
| <a name="unsup"></a> |
| <h2>Unsupported Features</h2> |
| |
| <p> |
| The following features of the shading language are not yet supported |
| in Mesa: |
| </p> |
| |
| <ul> |
| <li>Linking of multiple shaders |
| <li>gl_ClipVertex |
| <li>Perspective-correct interpolation of the gl_Color and gl_SecondaryColor |
| varying variables (they are currently interpolated without perspective correction) |
| </ul> |
| |
| <p> |
| All other major features of the shading language should function. |
| </p> |
| |
| |
| <a name="notes"></a> |
| <h2>Implementation Notes</h2> |
| |
| <ul> |
| <li>Shading language programs are compiled into low-level programs |
| very similar to those of GL_ARB_vertex/fragment_program. |
| <li>All vector types (vec2, vec3, vec4, bvec2, etc.) currently occupy full |
| float[4] registers. |
| <li>Float constants and variables are packed so that up to four floats |
| can occupy one program parameter/register. |
| <li>All function calls are inlined. |
| <li>Shaders which use too many registers will not compile. |
| <li>The quality of the generated code is pretty good; register usage is fair. |
| <li>Shader error detection and reporting (InfoLog) are not |
| very good yet. |
| <li>The ftransform() function doesn't necessarily match the results of |
| fixed-function transformation. |
| </ul> |
| |
| <p> |
| These issues will be addressed/resolved in the future. |
| </p> |
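| |
| <p> |
| The note about packing floats can be pictured with the following minimal |
| sketch. It is not Mesa's code: the param_list and param_slot types and the |
| add_float function are hypothetical names chosen for illustration, and the |
| sketch handles only scalar floats (vectors, as noted above, take a full |
| register each). |
| </p> |

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PARAMS 32   /* hypothetical register limit */

/* Hypothetical parameter list: each register holds four floats. */
struct param_list {
    float values[MAX_PARAMS][4];
    int used;               /* number of float components consumed so far */
};

/* Location of one packed float: register index and component (0..3 = x,y,z,w). */
struct param_slot {
    int reg;
    int comp;
};

/* Store one float in the next free component.
 * Returns reg == -1 when every register is full; a real compiler
 * would fail the compile at that point. */
struct param_slot add_float(struct param_list *list, float v)
{
    struct param_slot slot = { -1, -1 };

    assert(list != NULL);
    if (list->used >= MAX_PARAMS * 4)
        return slot;

    slot.reg = list->used / 4;    /* four floats share one register */
    slot.comp = list->used % 4;
    list->values[slot.reg][slot.comp] = v;
    list->used++;
    return slot;
}
```

| <p> |
| In this sketch the first four floats land in register 0 (components x, y, z, w) |
| and the fifth starts register 1; once all registers are used the function |
| reports failure, mirroring the note that shaders using too many registers |
| will not compile. |
| </p> |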
| |
| |
| <a name="hints"></a> |
| <h2>Programming Hints</h2> |
| |
| <ul> |
| <li>Declare <em>in</em> function parameters as <em>const</em> whenever possible. |
| This improves the efficiency of function inlining. |
| </li> |
| <br> |
| <li>To reduce register usage, declare variables within smaller scopes. |
| For example, the following code: |
| <pre> |
| void main() |
| { |
|    vec4 a1, a2, b1, b2; |
|    gl_Position = expression using a1, a2; |
|    gl_FrontColor = expression using b1, b2; |
| } |
| </pre> |
| Can be rewritten as follows to use half as many registers: |
| <pre> |
| void main() |
| { |
|    { |
|       vec4 a1, a2; |
|       gl_Position = expression using a1, a2; |
|    } |
|    { |
|       vec4 b1, b2; |
|       gl_FrontColor = expression using b1, b2; |
|    } |
| } |
| </pre> |
| Alternatively, rather than using several float variables, use |
| a vec4 instead. Use swizzling and writemasks to access the |
| components of the vec4 as floats. |
| </li> |
| <br> |
| <li>Use the built-in library functions whenever possible. |
| For example, instead of writing this: |
| <pre> |
| float x = 1.0 / sqrt(y); |
| </pre> |
| Write this: |
| <pre> |
| float x = inversesqrt(y); |
| </pre> |
| <li> |
| Use <code>++i</code> rather than <code>i++</code> when possible, as it is more efficient. |
| </li> |
| </ul> |
| |
| |
| <a name="standalone"></a> |
| <h2>Stand-alone GLSL Compiler</h2> |
| |
| <p> |
| A stand-alone GLSL compiler driver has been added to Mesa. |
| </p> |
| |
| <p> |
| The stand-alone compiler (like a conventional command-line compiler) |
| is a tool that accepts Shading Language programs and emits low-level |
| GPU programs. |
| </p> |
| |
| <p> |
| This tool is useful for: |
| </p> |
| <ul> |
| <li>Inspecting GPU code to gain insight into compilation |
| <li>Generating initial GPU code for subsequent hand-tuning |
| <li>Debugging the GLSL compiler itself |
| </ul> |
| |
| <p> |
| After building Mesa, the glslcompiler can be built by manually running: |
| </p> |
| <pre> |
| cd src/mesa/drivers/glslcompiler |
| make |
| </pre> |
| |
| |
| <p> |
| Here's an example of using the compiler to compile a fragment shader and |
| emit GL_ARB_fragment_program-style instructions: |
| </p> |
| <pre> |
| bin/glslcompiler --debug --numbers --fs progs/glsl/CH06-brick.frag.txt |
| </pre> |
| <p> |
| results in: |
| </p> |
| <pre> |
| # Fragment Program/Shader |
| 0: RCP TEMP[4].x, UNIFORM[2].xxxx; |
| 1: RCP TEMP[4].y, UNIFORM[2].yyyy; |
| 2: MUL TEMP[3].xy, VARYING[0], TEMP[4]; |
| 3: MOV TEMP[1], TEMP[3]; |
| 4: MUL TEMP[0].w, TEMP[1].yyyy, CONST[4].xxxx; |
| 5: FRC TEMP[1].z, TEMP[0].wwww; |
| 6: SGT.C TEMP[0].w, TEMP[1].zzzz, CONST[4].xxxx; |
| 7: IF (NE.wwww); # (if false, goto 9); |
| 8: ADD TEMP[1].x, TEMP[1].xxxx, CONST[4].xxxx; |
| 9: ENDIF; |
| 10: FRC TEMP[1].xy, TEMP[1]; |
| 11: SGT TEMP[2].xy, UNIFORM[3], TEMP[1]; |
| 12: MUL TEMP[1].z, TEMP[2].xxxx, TEMP[2].yyyy; |
| 13: LRP TEMP[0], TEMP[1].zzzz, UNIFORM[0], UNIFORM[1]; |
| 14: MUL TEMP[0].xyz, TEMP[0], VARYING[1].xxxx; |
| 15: MOV OUTPUT[0].xyz, TEMP[0]; |
| 16: MOV OUTPUT[0].w, CONST[4].yyyy; |
| 17: END |
| </pre> |
| |
| <p> |
| Note that some shading language constructs (such as uniform and varying |
| variables) aren't expressible in ARB or NV-style programs. |
| Therefore, the resulting output is not always legal by definition of |
| those program languages. |
| </p> |
| <p> |
| Also note that this compiler driver is still under development. |
| Over time, the correctness of the GPU programs, with respect to the ARB |
| and NV languages, should improve. |
| </p> |
| |
| |
| |
| <a name="implementation"></a> |
| <h2>Compiler Implementation</h2> |
| |
| <p> |
| The source code for Mesa's shading language compiler is in the |
| <code>src/mesa/shader/slang/</code> directory. |
| </p> |
| |
| <p> |
| The compiler follows a fairly standard design and basically works as follows: |
| </p> |
| <ul> |
| <li>The input string is tokenized (see grammar.c) and parsed |
| (see slang_compile_*.c) to produce an Abstract Syntax Tree (AST). |
| The nodes in this tree are slang_operation structures |
| (see slang_compile_operation.h). |
| The nodes are decorated with symbol table, scoping and datatype information. |
| <li>The AST is converted into an intermediate representation (IR) tree |
| (see the slang_codegen.c file). |
| The IR nodes represent basic GPU instructions, like add, dot product, |
| move, etc. |
| The IR tree is mostly a binary tree, but a few nodes have three or four |
| children. |
| In principle, the IR tree could be executed by doing an in-order traversal. |
| <li>The IR tree is traversed in-order to emit code (see slang_emit.c). |
| This is also when registers are allocated to store variables and temps. |
| <li>In the future, a pattern-matching code generator-generator may be |
| used for code generation. |
| Programs such as L-BURG (Bottom-Up Rewrite Generator) and Twig look for |
| patterns in IR trees, compute weights for subtrees and use the weights |
| to select the best instructions to represent the sub-tree. |
| <li>The emitted GPU instructions (see prog_instruction.h) are stored in a |
| gl_program object (see mtypes.h). |
| <li>When a fragment shader and vertex shader are linked (see slang_link.c), |
| the varying vars are matched up, uniforms are merged, and vertex |
| attributes are resolved (rewriting instructions as needed). |
| </ul> |
| |
| <p> |
| The final vertex and fragment programs may be interpreted in software |
| (see prog_execute.c) or translated into a specific hardware architecture |
| (see drivers/dri/i915/i915_fragprog.c for example). |
| </p> |
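| |
| <p> |
| The emission step described above can be pictured with the following minimal |
| sketch. The ir_node, instruction and program types and the emit_node function |
| are hypothetical stand-ins, not Mesa's slang IR or prog_instruction |
| structures. The sketch assumes a purely binary tree, emits each node's |
| children before the node itself, and allocates a fresh temporary register |
| for every result as it goes. |
| </p> |

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes and types for illustration only. */
enum opcode { OP_MOV_CONST, OP_ADD, OP_MUL };

struct ir_node {
    enum opcode op;
    float value;                  /* used by OP_MOV_CONST leaves only */
    const struct ir_node *left;   /* NULL for leaves */
    const struct ir_node *right;  /* NULL for leaves */
};

struct instruction {
    enum opcode op;
    int dst;          /* destination temporary */
    int src0, src1;   /* source temporaries, -1 if unused */
    float value;      /* immediate for OP_MOV_CONST */
};

#define MAX_INSTRUCTIONS 64

struct program {
    struct instruction code[MAX_INSTRUCTIONS];
    int count;       /* instructions emitted so far */
    int next_temp;   /* next unallocated temporary register */
};

/* Walk the tree, emitting children before their parent, and return the
 * temporary register that holds the subtree's result. */
int emit_node(struct program *prog, const struct ir_node *n)
{
    int src0 = -1, src1 = -1;
    struct instruction *inst;

    assert(prog != NULL && n != NULL);
    if (n->op != OP_MOV_CONST) {
        src0 = emit_node(prog, n->left);
        src1 = emit_node(prog, n->right);
    }

    assert(prog->count < MAX_INSTRUCTIONS);
    inst = &prog->code[prog->count++];
    inst->op = n->op;
    inst->dst = prog->next_temp++;   /* register allocated during emission */
    inst->src0 = src0;
    inst->src1 = src1;
    inst->value = n->value;
    return inst->dst;
}
```

| <p> |
| For a tree representing 2 + (3 * 4), this sketch emits three constant moves, |
| then the multiply, then the add, with the add reading the temporaries written |
| by the first move and by the multiply. |
| </p> |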
| |
| <h3>Code Generation Options</h3> |
| |
| <p> |
| Internally, there are several options that control the compiler's code |
| generation and instruction selection. |
| These options are fields of the gl_shader_state struct and may be set |
| by the device driver to indicate its preferences: |
| </p> |
| |
| <pre> |
| struct gl_shader_state |
| { |
| ... |
| /** Driver-selectable options: */ |
| GLboolean EmitHighLevelInstructions; |
| GLboolean EmitCondCodes; |
| GLboolean EmitComments; |
| }; |
| </pre> |
| |
| <ul> |
| <li>EmitHighLevelInstructions |
| <br> |
| This option controls instruction selection for loops and conditionals. |
| If the option is set, high-level IF/ELSE/ENDIF, LOOP/ENDLOOP, CONT/BRK |
| instructions will be emitted. |
| Otherwise, those constructs will be implemented with BRA instructions. |
| </li> |
| |
| <li>EmitCondCodes |
| <br> |
| If set, condition codes (as in GL_NV_fragment_program) will be used for |
| branching and looping. |
| Otherwise, ordinary registers will be used (the IF instruction will |
| examine the first operand's X component and do the if-part if non-zero). |
| This option is only relevant if EmitHighLevelInstructions is set. |
| </li> |
| |
| <li>EmitComments |
| <br> |
| If set, instructions will be annotated with comments to help with debugging. |
| Extra NOP instructions will also be inserted. |
| </li> |
| |
| </ul> |
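| |
| <p> |
| A driver might set these options along the following lines. This is only a |
| sketch: the struct is reduced to the three fields shown above, GLboolean is |
| redefined locally so the example stands alone, and set_codegen_options with |
| its hw_has_flow_control, hw_has_cond_codes and debug parameters is a |
| hypothetical helper, not a Mesa function. |
| </p> |

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins so the sketch compiles without the OpenGL headers. */
typedef unsigned char GLboolean;
#define GL_FALSE 0
#define GL_TRUE  1

/* Reduced to the three driver-selectable fields; the real struct has more. */
struct gl_shader_state {
    GLboolean EmitHighLevelInstructions;
    GLboolean EmitCondCodes;
    GLboolean EmitComments;
};

/* Hypothetical helper: choose code generation options from what the
 * target hardware (or software interpreter) can execute. */
void set_codegen_options(struct gl_shader_state *shader,
                         GLboolean hw_has_flow_control,
                         GLboolean hw_has_cond_codes,
                         GLboolean debug)
{
    assert(shader != NULL);

    /* IF/ELSE/ENDIF and LOOP/ENDLOOP when supported, BRA otherwise. */
    shader->EmitHighLevelInstructions = hw_has_flow_control;

    /* Condition codes only matter when high-level instructions are emitted. */
    shader->EmitCondCodes =
        (GLboolean)(hw_has_flow_control && hw_has_cond_codes);

    /* Comments and extra NOPs are useful only when debugging. */
    shader->EmitComments = debug;
}
```

| <p> |
| Clearing EmitCondCodes whenever EmitHighLevelInstructions is unset is a |
| choice made in this sketch to reflect that the former is only relevant when |
| the latter is set. |
| </p> |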
| |
| |
| <a name="validation"></a> |
| <h2>Compiler Validation</h2> |
| |
| <p> |
| A <a href="http://glean.sf.net" target="_parent">Glean</a> test has |
| been created to exercise the GLSL compiler. |
| </p> |
| <p> |
| The <em>glsl1</em> test runs over 170 sub-tests to check that the language |
| features and built-in functions work properly. |
| This test should be run frequently while working on the compiler to catch |
| regressions. |
| </p> |
| <p> |
| The test coverage is reasonably broad and complete, but additional tests |
| should be added. |
| </p> |
| |
| |
| </BODY> |
| </HTML> |