Corbin Simpson | e7d05f1 | 2010-06-16 16:52:52 -0700 | [diff] [blame] | 1 | .. _context: |
| 2 | |
Corbin Simpson | 8283e20 | 2009-12-20 15:28:00 -0800 | [diff] [blame] | 3 | Context |
| 4 | ======= |
| 5 | |
Brian Paul | 73e37d9 | 2011-02-03 12:30:19 -0700 | [diff] [blame] | 6 | A Gallium rendering context encapsulates the state which affects 3D |
| 7 | rendering such as blend state, depth/stencil state, texture samplers, |
| 8 | etc. |
| 9 | |
| 10 | Note that resource/texture allocation is not per-context but per-screen. |
| 11 | |
Corbin Simpson | 8283e20 | 2009-12-20 15:28:00 -0800 | [diff] [blame] | 12 | |
| 13 | Methods |
| 14 | ------- |
| 15 | |
Corbin Simpson | a524aab | 2009-12-20 19:41:50 -0800 | [diff] [blame] | 16 | CSO State |
| 17 | ^^^^^^^^^ |
| 18 | |
Brian Paul | 73e37d9 | 2011-02-03 12:30:19 -0700 | [diff] [blame] | 19 | All Constant State Object (CSO) state is created, bound, and destroyed |
| 20 | with triplets of methods that all follow a specific naming scheme. |
| 21 | For example, ``create_blend_state``, ``bind_blend_state``, and |
| 22 | ``destroy_blend_state``. |
Corbin Simpson | a524aab | 2009-12-20 19:41:50 -0800 | [diff] [blame] | 23 | |
| 24 | CSO objects handled by the context object: |
| 25 | |
| 26 | * :ref:`Blend`: ``*_blend_state`` |
Brian Paul | 73e37d9 | 2011-02-03 12:30:19 -0700 | [diff] [blame] | 27 | * :ref:`Sampler`: Texture sampler states are bound separately for fragment, |
| 28 | vertex and geometry samplers. Note that sampler states are set en masse. |
| 29 | If M is the max number of sampler units supported by the driver and N |
| 30 | samplers are bound with ``bind_fragment_sampler_states`` then sampler |
| 31 | units N..M-1 are considered disabled/NULL. |
Corbin Simpson | a524aab | 2009-12-20 19:41:50 -0800 | [diff] [blame] | 32 | * :ref:`Rasterizer`: ``*_rasterizer_state`` |
| 33 | * :ref:`Depth, Stencil, & Alpha`: ``*_depth_stencil_alpha_state`` |
Brian Paul | 73e37d9 | 2011-02-03 12:30:19 -0700 | [diff] [blame] | 34 | * :ref:`Shader`: These are create, bind and destroy methods for vertex, |
| 35 | fragment and geometry shaders. |
Roland Scheidegger | 8397c80 | 2010-03-01 18:42:47 +0100 | [diff] [blame] | 36 | * :ref:`Vertex Elements`: ``*_vertex_elements_state`` |
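
The create/bind/destroy triplet can be illustrated with a small stand-alone
mock. Everything below (``toy_context``, ``toy_blend_state`` and the
``toy_*_blend_state`` functions) is a hypothetical stand-in rather than the
actual Gallium declarations, and the rule that an object must be unbound
before it is destroyed is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the real Gallium types. */
struct toy_blend_template { int blend_enable; };
struct toy_blend_state    { int blend_enable; };
struct toy_context        { const struct toy_blend_state *bound_blend; };

/* create: copy the template into a driver-private state object. */
static struct toy_blend_state *
toy_create_blend_state(struct toy_context *ctx,
                       const struct toy_blend_template *templ)
{
   struct toy_blend_state *cso = malloc(sizeof(*cso));
   (void)ctx;
   if (cso)
      cso->blend_enable = templ->blend_enable;
   return cso;
}

/* bind: make a previously created object (or NULL) the current one. */
static void
toy_bind_blend_state(struct toy_context *ctx,
                     const struct toy_blend_state *cso)
{
   ctx->bound_blend = cso;
}

/* destroy: free the object; this sketch requires it to be unbound first. */
static void
toy_destroy_blend_state(struct toy_context *ctx, struct toy_blend_state *cso)
{
   assert(ctx->bound_blend != cso);
   free(cso);
}
```

The point of the split is that the expensive translation happens once at
create time, so that binding can stay cheap.
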
Corbin Simpson | a524aab | 2009-12-20 19:41:50 -0800 | [diff] [blame] | 37 | |
Keith Whitwell | f3347fe | 2009-12-21 23:44:32 +0000 | [diff] [blame] | 38 | |
| 39 | Resource Binding State |
| 40 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 41 | |
| 42 | This state describes how resources in various flavours (textures, |
| 43 | buffers, surfaces) are bound to the driver. |
| 44 | |
| 45 | |
Roland Scheidegger | bf575b6 | 2010-01-15 18:25:14 +0100 | [diff] [blame] | 46 | * ``set_constant_buffer`` sets a constant buffer to be used for a given shader |
| 47 | type. ``index`` indicates which buffer to set (some APIs may allow |
| 48 | multiple ones to be set, and binding a specific one later, though drivers |
| 49 | are mostly restricted to the first one right now). |
| 50 | |
Keith Whitwell | f3347fe | 2009-12-21 23:44:32 +0000 | [diff] [blame] | 51 | * ``set_framebuffer_state`` |
Michal Krol | e81caad | 2010-02-25 15:33:15 +0100 | [diff] [blame] | 52 | |
Keith Whitwell | f3347fe | 2009-12-21 23:44:32 +0000 | [diff] [blame] | 53 | * ``set_vertex_buffers`` |
| 54 | |
Chia-I Wu | e7f69c4 | 2010-07-17 22:00:04 +0800 | [diff] [blame] | 55 | * ``set_index_buffer`` |
Keith Whitwell | f3347fe | 2009-12-21 23:44:32 +0000 | [diff] [blame] | 56 | |
Brian Paul | 73e37d9 | 2011-02-03 12:30:19 -0700 | [diff] [blame] | 57 | |
Corbin Simpson | a524aab | 2009-12-20 19:41:50 -0800 | [diff] [blame] | 58 | Non-CSO State |
| 59 | ^^^^^^^^^^^^^ |
| 60 | |
| 61 | These pieces of state are too small, variable, and/or trivial to have CSO |
| 62 | objects. They all follow simple, one-method binding calls, e.g. |
Roland Scheidegger | 98f8c4d0 | 2010-02-09 21:48:43 +0100 | [diff] [blame] | 63 | ``set_blend_color``. |
Corbin Simpson | 8e1768c | 2010-03-19 00:07:55 -0700 | [diff] [blame] | 64 | |
Roland Scheidegger | 98f8c4d0 | 2010-02-09 21:48:43 +0100 | [diff] [blame] | 65 | * ``set_stencil_ref`` sets the stencil front and back reference values |
| 66 | which are used as comparison values in stencil test. |
Corbin Simpson | a524aab | 2009-12-20 19:41:50 -0800 | [diff] [blame] | 67 | * ``set_blend_color`` |
Roland Scheidegger | aac2cccc | 2010-04-26 19:50:57 +0200 | [diff] [blame] | 68 | * ``set_sample_mask`` |
Corbin Simpson | a524aab | 2009-12-20 19:41:50 -0800 | [diff] [blame] | 69 | * ``set_clip_state`` |
Corbin Simpson | a524aab | 2009-12-20 19:41:50 -0800 | [diff] [blame] | 70 | * ``set_polygon_stipple`` |
Corbin Simpson | 8cf1af4 | 2010-01-25 01:12:30 -0800 | [diff] [blame] | 71 | * ``set_scissor_state`` sets the bounds for the scissor test, which culls |
| 72 | pixels before blending to render targets. If the :ref:`Rasterizer` does |
| 73 | not have the scissor test enabled, then the scissor bounds never need to |
Keith Whitwell | bc3cff2 | 2010-08-20 11:38:33 +0100 | [diff] [blame] | 74 | be set since they will not be used. Note that scissor xmin and ymin are |
| 75 | inclusive, but xmax and ymax are exclusive. The inclusive ranges in x |
| 76 | and y would be [xmin..xmax-1] and [ymin..ymax-1]. |
Corbin Simpson | a524aab | 2009-12-20 19:41:50 -0800 | [diff] [blame] | 77 | * ``set_viewport_state`` |
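
The inclusive/exclusive scissor bounds can be written down as a small
predicate. ``toy_scissor`` and ``toy_scissor_test`` are hypothetical names
used only for this sketch; the field names follow the wording of the text
above, not necessarily the real structure.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the scissor bounds described above. */
struct toy_scissor { unsigned xmin, ymin, xmax, ymax; };

/* A pixel passes when it lies in [xmin, xmax) x [ymin, ymax). */
static bool
toy_scissor_test(const struct toy_scissor *s, unsigned x, unsigned y)
{
   assert(s->xmin <= s->xmax && s->ymin <= s->ymax);
   return x >= s->xmin && x < s->xmax &&
          y >= s->ymin && y < s->ymax;
}
```

With bounds (10, 20, 30, 40), pixel (29, 39) passes while (30, 39) and
(29, 40) are rejected.
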
Corbin Simpson | a524aab | 2009-12-20 19:41:50 -0800 | [diff] [blame] | 78 | |
Keith Whitwell | f3347fe | 2009-12-21 23:44:32 +0000 | [diff] [blame] | 79 | |
Michal Krol | e4b8a30 | 2010-03-16 10:58:33 +0100 | [diff] [blame] | 80 | Sampler Views |
| 81 | ^^^^^^^^^^^^^ |
| 82 | |
| 83 | These are the means to bind textures to shader stages. To create one, specify |
| 84 | its format, swizzle and LOD range in a sampler view template. |
| 85 | |
| 86 | If the texture format differs from the template format, the texture is said |
| 87 | to be cast to another format. Casting can be done only between compatible |
| 88 | formats, that is formats that have matching component order and sizes. |
| 89 | |
| 90 | Swizzle fields specify the way in which fetched texel components are placed |
Michal Krol | 980da4a | 2010-03-19 09:08:33 +0100 | [diff] [blame] | 91 | in the result register. For example, ``swizzle_r`` specifies what is going to be |
| 92 | placed in the first component of the result register. |
Michal Krol | e4b8a30 | 2010-03-16 10:58:33 +0100 | [diff] [blame] | 93 | |
Michal Krol | 980da4a | 2010-03-19 09:08:33 +0100 | [diff] [blame] | 94 | The ``first_level`` and ``last_level`` fields of sampler view template specify |
Roland Scheidegger | 4c70014 | 2010-12-02 04:33:43 +0100 | [diff] [blame] | 95 | the LOD range the texture is going to be constrained to. Note that these |
| 96 | values are in addition to the respective min_lod, max_lod values in the |
| 97 | pipe_sampler_state (that is if min_lod is 2.0, and first_level 3, the first mip |
| 98 | level used for sampling from the resource is effectively the fifth). |
| 99 | |
| 100 | The ``first_layer`` and ``last_layer`` fields specify the layer range the |
| 101 | texture is going to be constrained to. Similar to the LOD range, this is added |
| 102 | to the array index which is used for sampling. |
Michal Krol | e4b8a30 | 2010-03-16 10:58:33 +0100 | [diff] [blame] | 103 | |
| 104 | * ``set_fragment_sampler_views`` binds an array of sampler views to |
| 105 | fragment shader stage. Every binding point acquires a reference |
| 106 | to a respective sampler view and releases a reference to the previous |
Brian Paul | 73e37d9 | 2011-02-03 12:30:19 -0700 | [diff] [blame] | 107 | sampler view. If M is the maximum number of sampler units and N units |
| 108 | are passed to set_fragment_sampler_views, the driver should unbind the |
| 109 | sampler views for units N..M-1. |
Michal Krol | e4b8a30 | 2010-03-16 10:58:33 +0100 | [diff] [blame] | 110 | |
| 111 | * ``set_vertex_sampler_views`` binds an array of sampler views to vertex |
| 112 | shader stage. Every binding point acquires a reference to a respective |
| 113 | sampler view and releases a reference to the previous sampler view. |
| 114 | |
Michal Krol | 980da4a | 2010-03-19 09:08:33 +0100 | [diff] [blame] | 115 | * ``create_sampler_view`` creates a new sampler view. ``texture`` is associated |
Michal Krol | e4b8a30 | 2010-03-16 10:58:33 +0100 | [diff] [blame] | 116 | with the sampler view which results in sampler view holding a reference |
| 117 | to the texture. Format specified in template must be compatible |
| 118 | with texture format. |
| 119 | |
| 120 | * ``sampler_view_destroy`` destroys a sampler view and releases its reference |
| 121 | to associated texture. |
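
The reference handling and the "unbind units N..M-1" rule described for
``set_fragment_sampler_views`` can be sketched as follows. All ``toy_*``
names and the value of ``TOY_MAX_SAMPLERS`` are hypothetical; a real driver
would also destroy a view whose reference count drops to zero, which this
sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_SAMPLERS 4   /* "M" in the text; arbitrary for this sketch */

struct toy_sampler_view { int refcount; };

struct toy_context {
   struct toy_sampler_view *frag_views[TOY_MAX_SAMPLERS];
};

/* Point *slot at view: take a reference on the new view and release the
 * reference held on the previously bound one. */
static void
toy_view_reference(struct toy_sampler_view **slot,
                   struct toy_sampler_view *view)
{
   if (view)
      view->refcount++;
   if (*slot)
      (*slot)->refcount--;
   *slot = view;
}

/* Bind n views to units 0..n-1 and unbind units n..M-1. */
static void
toy_set_fragment_sampler_views(struct toy_context *ctx, unsigned n,
                               struct toy_sampler_view **views)
{
   unsigned i;

   assert(n <= TOY_MAX_SAMPLERS);
   for (i = 0; i < n; i++)
      toy_view_reference(&ctx->frag_views[i], views[i]);
   for (; i < TOY_MAX_SAMPLERS; i++)
      toy_view_reference(&ctx->frag_views[i], NULL);
}
```
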
| 122 | |
Francisco Jerez | 5f55cbc | 2012-05-01 02:47:03 +0200 | [diff] [blame] | 123 | Shader Resources |
| 124 | ^^^^^^^^^^^^^^^^ |
| 125 | |
| 126 | Shader resources are textures or buffers that may be read or written |
| 127 | from a shader without an associated sampler. This means that they |
| 128 | have no support for floating point coordinates, address wrap modes or |
| 129 | filtering. |
| 130 | |
| 131 | Shader resources are specified for all the shader stages at once using |
| 132 | the ``set_shader_resources`` method. When binding texture resources, |
| 133 | the ``level``, ``first_layer`` and ``last_layer`` pipe_surface fields |
| 134 | specify the mipmap level and the range of layers the texture will be |
| 135 | constrained to. In the case of buffers, ``first_element`` and |
| 136 | ``last_element`` specify the range within the buffer that will be used |
Francisco Jerez | b8e808f | 2012-04-30 20:20:29 +0200 | [diff] [blame] | 137 | by the shader resource. Writes to a shader resource are only allowed |
| 138 | when the ``writable`` flag is set. |
Francisco Jerez | 5f55cbc | 2012-05-01 02:47:03 +0200 | [diff] [blame] | 139 | |
Roland Scheidegger | 4c70014 | 2010-12-02 04:33:43 +0100 | [diff] [blame] | 140 | Surfaces |
| 141 | ^^^^^^^^ |
| 142 | |
| 143 | These are the means to use resources as color render targets or depth/stencil |
| 144 | attachments. To create one, specify the mip level, the range of layers, and |
| 145 | the bind flags (either PIPE_BIND_DEPTH_STENCIL or PIPE_BIND_RENDER_TARGET). |
| 146 | Note that layer values are in addition to what is indicated by the geometry |
| 147 | shader output variable XXX_FIXME (that is if first_layer is 3 and geometry |
| 148 | shader indicates index 2, the 5th layer of the resource will be used). These |
| 149 | first_layer and last_layer parameters will only be used for 1d array, 2d array, |
| 150 | cube, and 3d textures; otherwise they are 0. |
| 151 | |
| 152 | * ``create_surface`` creates a new surface. |
| 153 | |
| 154 | * ``surface_destroy`` destroys a surface and releases its reference to the |
| 155 | associated resource. |
Michal Krol | e4b8a30 | 2010-03-16 10:58:33 +0100 | [diff] [blame] | 156 | |
Marek Olšák | 861a029 | 2011-12-15 18:42:21 +0100 | [diff] [blame] | 157 | Stream output targets |
| 158 | ^^^^^^^^^^^^^^^^^^^^^ |
| 159 | |
| 160 | Stream output, also known as transform feedback, allows writing the primitives |
| 161 | produced by the vertex pipeline to buffers. This is done after the geometry |
| 162 | shader or vertex shader if no geometry shader is present. |
| 163 | |
| 164 | The stream output targets are views into buffer resources which can be bound |
| 165 | as stream outputs and specify a memory range where it's valid to write |
| 166 | primitives. The pipe driver must implement memory protection such that any |
| 167 | primitives written outside of the specified memory range are discarded. |
| 168 | |
| 169 | Two stream output targets can use the same resource at the same time, but |
| 170 | with a disjoint memory range. |
| 171 | |
| 172 | Additionally, the stream output target internally maintains the offset |
| 173 | into the buffer which is incremented every time something is written to it. |
| 174 | The internal offset is equal to how much data has already been written. |
| 175 | It can be stored in device memory and the CPU actually doesn't have to query |
| 176 | it. |
| 177 | |
| 178 | The stream output target can be used in a draw command to provide |
| 179 | the vertex count. The vertex count is derived from the internal offset |
| 180 | discussed above. |
| 181 | |
| 182 | * ``create_stream_output_target`` creates a new target. |
| 183 | |
| 184 | * ``stream_output_target_destroy`` destroys a target. Users of this should |
| 185 | use pipe_so_target_reference instead. |
| 186 | |
| 187 | * ``set_stream_output_targets`` binds stream output targets. The parameter |
| 188 | append_bitmask is a bitmask, where the i-th bit specifies whether new |
| 189 | primitives should be appended to the i-th buffer (writing starts at |
| 190 | the internal offset), or whether writing should start at the beginning |
| 191 | (the internal offset is effectively set to 0). |
| 192 | |
| 193 | NOTE: The currently-bound vertex or geometry shader must be compiled with |
| 194 | the properly-filled-in structure pipe_stream_output_info describing which |
| 195 | outputs should be written to buffers and how. The structure is part of |
| 196 | pipe_shader_state. |
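
The interaction between ``append_bitmask``, the internal offset and the
memory protection rule can be modelled in a few lines. The ``toy_*`` types
and functions are hypothetical, and the model tracks byte counts only; it
does not write any data.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_SO_TARGETS 4   /* arbitrary limit for this sketch */

struct toy_so_target {
   unsigned buffer_offset;    /* start of the writable range in the buffer */
   unsigned buffer_size;      /* size of the writable range, in bytes */
   unsigned internal_offset;  /* bytes already written into the range */
};

struct toy_context {
   struct toy_so_target *so[TOY_MAX_SO_TARGETS];
   unsigned num_so;
};

/* Bind n targets. Bit i of append_bitmask set: keep the internal offset of
 * target i (append). Bit i clear: restart writing at the beginning. */
static void
toy_set_stream_output_targets(struct toy_context *ctx, unsigned n,
                              struct toy_so_target **targets,
                              unsigned append_bitmask)
{
   unsigned i;

   assert(n <= TOY_MAX_SO_TARGETS);
   for (i = 0; i < n; i++) {
      ctx->so[i] = targets[i];
      if (targets[i] && !(append_bitmask & (1u << i)))
         targets[i]->internal_offset = 0;
   }
   ctx->num_so = n;
}

/* Account for one primitive of `size` bytes. Returns 1 if it fits, or 0 if
 * it would land outside the range and is therefore discarded. */
static int
toy_so_write(struct toy_so_target *t, unsigned size)
{
   if (size > t->buffer_size - t->internal_offset)
      return 0;
   t->internal_offset += size;
   return 1;
}
```
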
| 197 | |
Keith Whitwell | f3347fe | 2009-12-21 23:44:32 +0000 | [diff] [blame] | 198 | Clearing |
| 199 | ^^^^^^^^ |
| 200 | |
Roland Scheidegger | 0cd70b5 | 2010-05-28 23:57:47 +0200 | [diff] [blame] | 201 | Clear is one of the most difficult concepts to nail down to a single |
| 202 | interface (due to both different requirements from APIs and also driver/hw |
| 203 | specific differences). |
| 204 | |
Keith Whitwell | f3347fe | 2009-12-21 23:44:32 +0000 | [diff] [blame] | 205 | ``clear`` initializes some or all of the surfaces currently bound to |
| 206 | the framebuffer to particular RGBA, depth, or stencil values. |
Roland Scheidegger | 0cd70b5 | 2010-05-28 23:57:47 +0200 | [diff] [blame] | 207 | Currently, this does not take into account color or stencil write masks (as |
| 208 | used by GL), and always clears the whole surfaces (no scissoring as used by |
| 209 | GL clear or explicit rectangles like d3d9 uses). It can, however, also clear |
| 210 | only depth or stencil in a combined depth/stencil surface, if the driver |
| 211 | supports PIPE_CAP_DEPTHSTENCIL_CLEAR_SEPARATE. |
Roland Scheidegger | 4c70014 | 2010-12-02 04:33:43 +0100 | [diff] [blame] | 212 | If a surface includes several layers then all layers will be cleared. |
Keith Whitwell | f3347fe | 2009-12-21 23:44:32 +0000 | [diff] [blame] | 213 | |
Roland Scheidegger | a6e5c6c | 2010-06-03 16:33:25 +0200 | [diff] [blame] | 214 | ``clear_render_target`` clears a single color rendertarget with the specified |
| 215 | color value. While it is only possible to clear one surface at a time (which can |
Roland Scheidegger | 0cd70b5 | 2010-05-28 23:57:47 +0200 | [diff] [blame] | 216 | include several layers), this surface need not be bound to the framebuffer. |
| 217 | |
Corbin Simpson | 517a4fb | 2010-06-16 11:10:46 -0700 | [diff] [blame] | 218 | ``clear_depth_stencil`` clears a single depth, stencil or depth/stencil surface |
Roland Scheidegger | a6e5c6c | 2010-06-03 16:33:25 +0200 | [diff] [blame] | 219 | with the specified depth and stencil values (for combined depth/stencil buffers, |
Roland Scheidegger | 0cd70b5 | 2010-05-28 23:57:47 +0200 | [diff] [blame] | 220 | it is also possible to only clear one or the other part). While it is only |
| 221 | possible to clear one surface at a time (which can include several layers), |
| 222 | this surface need not be bound to the framebuffer. |
Keith Whitwell | f3347fe | 2009-12-21 23:44:32 +0000 | [diff] [blame] | 223 | |
| 224 | |
| 225 | Drawing |
Corbin Simpson | a524aab | 2009-12-20 19:41:50 -0800 | [diff] [blame] | 226 | ^^^^^^^ |
| 227 | |
Chia-I Wu | e7f69c4 | 2010-07-17 22:00:04 +0800 | [diff] [blame] | 228 | ``draw_vbo`` draws a specified primitive. The primitive mode and other |
| 229 | properties are described by ``pipe_draw_info``. |
Corbin Simpson | a524aab | 2009-12-20 19:41:50 -0800 | [diff] [blame] | 230 | |
Chia-I Wu | e7f69c4 | 2010-07-17 22:00:04 +0800 | [diff] [blame] | 231 | The ``mode``, ``start``, and ``count`` fields of ``pipe_draw_info`` specify the |
| 232 | mode of the primitive and the vertices to be fetched, in the range from |
| 233 | ``start`` to ``start``+``count``-1, inclusive. |
Michal Krol | ffd2848 | 2010-01-14 18:55:52 +0100 | [diff] [blame] | 234 | |
Chia-I Wu | e7f69c4 | 2010-07-17 22:00:04 +0800 | [diff] [blame] | 235 | Every instance with instanceID in the range between ``start_instance`` and |
| 236 | ``start_instance``+``instance_count``-1, inclusive, will be drawn. |
Michal Krol | ffd2848 | 2010-01-14 18:55:52 +0100 | [diff] [blame] | 237 | |
José Fonseca | bb78f6a | 2011-04-16 10:18:20 +0100 | [diff] [blame] | 238 | If there is an index buffer bound, and ``indexed`` field is true, all vertex |
| 239 | indices will be looked up in the index buffer. |
| 240 | |
| 241 | In indexed draw, ``min_index`` and ``max_index`` respectively provide a lower |
| 242 | and upper bound of the indices contained in the index buffer inside the range |
| 243 | from ``start`` to ``start``+``count``-1. This allows the driver to |
| 244 | determine which subset of vertices will be referenced during the draw call |
| 245 | without having to scan the index buffer. Providing an over-estimation of |
| 246 | the true bounds, for example, a ``min_index`` and ``max_index`` of 0 and |
| 247 | 0xffffffff respectively, must give exactly the same rendering, albeit with less |
| 248 | performance due to unreferenced vertex buffers being unnecessarily DMA'ed or |
| 249 | processed. Providing an underestimation of the true bounds will result in |
| 250 | undefined behavior, but should not result in program or system failure. |
| 251 | |
| 252 | In case of non-indexed draw, ``min_index`` should be set to |
Chia-I Wu | e7f69c4 | 2010-07-17 22:00:04 +0800 | [diff] [blame] | 253 | ``start`` and ``max_index`` should be set to ``start``+``count``-1. |
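
A caller that wants tight ``min_index``/``max_index`` values for an indexed
draw has to scan the relevant part of the index buffer itself. The helper
below is a hypothetical sketch of such a scan for 32-bit indices; it is not
part of the Gallium interface.

```c
#include <assert.h>
#include <stdint.h>

/* Scan indices[start .. start+count-1] and report the smallest and largest
 * index found. The caller must pass count > 0. */
static void
toy_index_bounds(const uint32_t *indices, unsigned start, unsigned count,
                 uint32_t *min_index, uint32_t *max_index)
{
   unsigned i;

   assert(count > 0);
   *min_index = *max_index = indices[start];
   for (i = start + 1; i < start + count; i++) {
      if (indices[i] < *min_index)
         *min_index = indices[i];
      if (indices[i] > *max_index)
         *max_index = indices[i];
   }
}
```

Skipping the scan and passing 0 and 0xffffffff instead remains valid, at the
performance cost described above.
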
Corbin Simpson | a524aab | 2009-12-20 19:41:50 -0800 | [diff] [blame] | 254 | |
José Fonseca | bb78f6a | 2011-04-16 10:18:20 +0100 | [diff] [blame] | 255 | ``index_bias`` is a value added to every vertex index after lookup and before |
| 256 | fetching vertex attributes. |
José Fonseca | 493a1bb | 2010-04-20 10:22:28 +0200 | [diff] [blame] | 257 | |
Brian Paul | adf35e8 | 2010-10-21 19:03:38 -0600 | [diff] [blame] | 258 | When drawing indexed primitives, the primitive restart index can be |
| 259 | used to draw disjoint primitive strips. For example, several separate |
| 260 | line strips can be drawn by designating a special index value as the |
| 261 | restart index. The ``primitive_restart`` flag enables/disables this |
| 262 | feature. The ``restart_index`` field specifies the restart index value. |
| 263 | |
| 264 | When primitive restart is in use, array indexes are compared to the |
| 265 | restart index before adding the index_bias offset. |
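
The order of the two steps matters: an index that only equals the restart
index after the bias has been added is an ordinary vertex. A minimal sketch,
with the hypothetical helper ``toy_resolve_index`` and with the result
widened to 64 bits purely to sidestep overflow in the example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Resolve one raw index from the index buffer. Returns false if the index
 * is a restart marker (the current strip ends here); otherwise stores the
 * biased vertex number in *vertex and returns true. */
static bool
toy_resolve_index(uint32_t raw, bool primitive_restart,
                  uint32_t restart_index, int32_t index_bias,
                  int64_t *vertex)
{
   assert(vertex != NULL);
   if (primitive_restart && raw == restart_index)
      return false;                      /* compared before biasing */
   *vertex = (int64_t)raw + index_bias;  /* bias applied after the lookup */
   return true;
}
```
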
| 266 | |
Michal Krol | ffd2848 | 2010-01-14 18:55:52 +0100 | [diff] [blame] | 267 | If a given vertex element has ``instance_divisor`` set to 0, it is said |
| 268 | it contains per-vertex data and effective vertex attribute address needs |
| 269 | to be recalculated for every index. |
| 270 | |
| 271 | attribAddr = ``stride`` * index + ``src_offset`` |
| 272 | |
| 273 | If a given vertex element has ``instance_divisor`` set to non-zero, |
| 274 | it is said it contains per-instance data and effective vertex attribute |
| 275 | address needs to be recalculated for every ``instance_divisor``-th instance. |
| 276 | |
| 277 | attribAddr = ``stride`` * instanceID / ``instance_divisor`` + ``src_offset`` |
| 278 | |
| 279 | In the above formulas, ``src_offset`` is taken from the given vertex element |
| 280 | and ``stride`` is taken from a vertex buffer associated with the given |
| 281 | vertex element. |
| 282 | |
| 283 | The calculated attribAddr is used as an offset into the vertex buffer to |
| 284 | fetch the attribute data. |
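
Both address formulas fit in one helper. The ``toy_*`` structures are
hypothetical and carry only the fields named above. The sketch reads the
per-instance formula as an integer division of ``instanceID`` by
``instance_divisor`` followed by the multiplication, which is what makes the
address change only every ``instance_divisor``-th instance; that reading is
an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal mirrors of the state involved. */
struct toy_vertex_element { unsigned src_offset; unsigned instance_divisor; };
struct toy_vertex_buffer  { unsigned stride; };

/* Byte offset into the vertex buffer for one attribute fetch. */
static unsigned
toy_attrib_addr(const struct toy_vertex_element *ve,
                const struct toy_vertex_buffer *vb,
                unsigned index, unsigned instance_id)
{
   assert(ve != NULL && vb != NULL);
   if (ve->instance_divisor == 0)        /* per-vertex data */
      return vb->stride * index + ve->src_offset;
   /* per-instance data: advances once every instance_divisor instances */
   return vb->stride * (instance_id / ve->instance_divisor) + ve->src_offset;
}
```
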
| 285 | |
| 286 | The value of ``instanceID`` can be read in a vertex shader through a system |
| 287 | value register declared with INSTANCEID semantic name. |
| 288 | |
Keith Whitwell | f3347fe | 2009-12-21 23:44:32 +0000 | [diff] [blame] | 289 | |
| 290 | Queries |
| 291 | ^^^^^^^ |
| 292 | |
| 293 | Queries gather some statistic from the 3D pipeline over one or more |
Christoph Bumiller | 10f67c0 | 2011-10-20 18:03:23 +0200 | [diff] [blame] | 294 | draws. Queries may be nested, though only d3d1x currently exercises this. |
Keith Whitwell | f3347fe | 2009-12-21 23:44:32 +0000 | [diff] [blame] | 295 | |
| 296 | Queries can be created with ``create_query`` and deleted with |
Brian Paul | 98f3f1c | 2010-01-29 12:36:26 -0700 | [diff] [blame] | 297 | ``destroy_query``. To start a query, use ``begin_query``, and when finished, |
| 298 | use ``end_query`` to end the query. |
| 299 | |
| 300 | ``get_query_result`` is used to retrieve the results of a query. If |
| 301 | the ``wait`` parameter is TRUE, then the ``get_query_result`` call |
| 302 | will block until the results of the query are ready (and TRUE will be |
| 303 | returned). Otherwise, if the ``wait`` parameter is FALSE, the call |
| 304 | will not block and the return value will be TRUE if the query has |
| 305 | completed or FALSE otherwise. |
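
The blocking and non-blocking behaviour of ``get_query_result`` can be
mimicked with a toy query that becomes ready after a number of simulated GPU
steps. All names below are hypothetical, and the busy loop merely stands in
for whatever synchronization a real driver uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_query {
   unsigned pending_work;  /* simulated GPU steps left before completion */
   uint64_t value;         /* result that becomes visible once complete */
};

/* Simulate the GPU making progress on the query. */
static void
toy_gpu_step(struct toy_query *q)
{
   if (q->pending_work)
      q->pending_work--;
}

/* Returns true and stores the result if the query has completed. With
 * wait == true the call blocks (here: loops) until completion; with
 * wait == false it returns false immediately if the query is not done. */
static bool
toy_get_query_result(struct toy_query *q, bool wait, uint64_t *result)
{
   assert(q != NULL && result != NULL);
   if (wait) {
      while (q->pending_work)
         toy_gpu_step(q);
   }
   if (q->pending_work)
      return false;
   *result = q->value;
   return true;
}
```
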
Keith Whitwell | f3347fe | 2009-12-21 23:44:32 +0000 | [diff] [blame] | 306 | |
Christoph Bumiller | 10f67c0 | 2011-10-20 18:03:23 +0200 | [diff] [blame] | 307 | The interface currently includes the following types of queries: |
| 308 | |
| 309 | ``PIPE_QUERY_OCCLUSION_COUNTER`` counts the number of fragments which |
Corbin Simpson | f1cf6b0 | 2010-05-17 12:00:59 -0700 | [diff] [blame] | 310 | are written to the framebuffer without being culled by |
| 311 | :ref:`Depth, Stencil, & Alpha` testing or shader KILL instructions. |
Brian Paul | 34613c6 | 2011-01-18 16:34:22 -0700 | [diff] [blame] | 312 | The result is an unsigned 64-bit integer. |
Christoph Bumiller | 10f67c0 | 2011-10-20 18:03:23 +0200 | [diff] [blame] | 313 | This query can be used with ``render_condition``. |
| 314 | |
Zack Rusin | 0657fc0 | 2011-01-26 00:01:51 -0500 | [diff] [blame] | 315 | In cases where a boolean result of an occlusion query is enough, |
| 316 | ``PIPE_QUERY_OCCLUSION_PREDICATE`` should be used. It is just like |
| 317 | ``PIPE_QUERY_OCCLUSION_COUNTER`` except that the result is a boolean |
| 318 | value of FALSE for cases where COUNTER would result in 0 and TRUE |
| 319 | for all other cases. |
Christoph Bumiller | 10f67c0 | 2011-10-20 18:03:23 +0200 | [diff] [blame] | 320 | This query can be used with ``render_condition``. |
Corbin Simpson | f1cf6b0 | 2010-05-17 12:00:59 -0700 | [diff] [blame] | 321 | |
Christoph Bumiller | 10f67c0 | 2011-10-20 18:03:23 +0200 | [diff] [blame] | 322 | ``PIPE_QUERY_TIME_ELAPSED`` returns the amount of time, in nanoseconds, |
| 323 | the context takes to perform operations. |
Brian Paul | 34613c6 | 2011-01-18 16:34:22 -0700 | [diff] [blame] | 324 | The result is an unsigned 64-bit integer. |
Corbin Simpson | f1cf6b0 | 2010-05-17 12:00:59 -0700 | [diff] [blame] | 325 | |
Christoph Bumiller | 10f67c0 | 2011-10-20 18:03:23 +0200 | [diff] [blame] | 326 | ``PIPE_QUERY_TIMESTAMP`` returns a device/driver internal timestamp, |
| 327 | scaled to nanoseconds, recorded after all commands issued prior to |
| 328 | ``end_query`` have been processed. |
| 329 | This query does not require a call to ``begin_query``. |
| 330 | The result is an unsigned 64-bit integer. |
| 331 | |
| 332 | ``PIPE_QUERY_TIMESTAMP_DISJOINT`` can be used to check whether the |
| 333 | internal timer resolution is good enough to distinguish between the |
| 334 | events at ``begin_query`` and ``end_query``. |
| 335 | The result is a 64-bit integer specifying the timer resolution in Hz, |
| 336 | followed by a boolean value indicating whether the timer has incremented. |
| 337 | |
| 338 | ``PIPE_QUERY_PRIMITIVES_GENERATED`` returns a 64-bit integer indicating |
| 339 | the number of primitives processed by the pipeline. |
| 340 | |
| 341 | ``PIPE_QUERY_PRIMITIVES_EMITTED`` returns a 64-bit integer indicating |
| 342 | the number of primitives written to stream output buffers. |
| 343 | |
| 344 | ``PIPE_QUERY_SO_STATISTICS`` returns 2 64-bit integers corresponding to |
| 345 | the results of |
| 346 | ``PIPE_QUERY_PRIMITIVES_EMITTED`` and |
| 347 | ``PIPE_QUERY_PRIMITIVES_GENERATED``, in this order. |
| 348 | |
| 349 | ``PIPE_QUERY_SO_OVERFLOW_PREDICATE`` returns a boolean value indicating |
| 350 | whether the stream output targets have overflowed as a result of the |
| 351 | commands issued between ``begin_query`` and ``end_query``. |
| 352 | This query can be used with ``render_condition``. |
| 353 | |
| 354 | ``PIPE_QUERY_GPU_FINISHED`` returns a boolean value indicating whether |
| 355 | all commands issued before ``end_query`` have completed. However, this |
| 356 | does not imply serialization. |
| 357 | This query does not require a call to ``begin_query``. |
| 358 | |
| 359 | ``PIPE_QUERY_PIPELINE_STATISTICS`` returns an array of the following |
| 360 | 64-bit integers: |
| 361 | Number of vertices read from vertex buffers. |
| 362 | Number of primitives read from vertex buffers. |
| 363 | Number of vertex shader threads launched. |
| 364 | Number of geometry shader threads launched. |
| 365 | Number of primitives generated by geometry shaders. |
| 366 | Number of primitives forwarded to the rasterizer. |
| 367 | Number of primitives rasterized. |
| 368 | Number of fragment shader threads launched. |
| 369 | Number of tessellation control shader threads launched. |
| 370 | Number of tessellation evaluation shader threads launched. |
| 371 | If a shader type is not supported by the device/driver, |
| 372 | the corresponding values should be set to 0. |
| 373 | |
Corbin Simpson | f1cf6b0 | 2010-05-17 12:00:59 -0700 | [diff] [blame] | 374 | Gallium does not guarantee the availability of any query types; one must |
| 375 | always check the capabilities of the :ref:`Screen` first. |
Brian Paul | 6c1549a | 2010-01-21 11:52:36 -0700 | [diff] [blame] | 376 | |
| 377 | |
| 378 | Conditional Rendering |
| 379 | ^^^^^^^^^^^^^^^^^^^^^ |
| 380 | |
| 381 | A drawing command can be skipped depending on the outcome of a query |
| 382 | (typically an occlusion query). The ``render_condition`` function specifies |
| 383 | the query which should be checked prior to rendering anything. |
| 384 | |
| 385 | If ``render_condition`` is called with ``query`` = NULL, conditional |
| 386 | rendering is disabled and drawing takes place normally. |
| 387 | |
| 388 | If ``render_condition`` is called with a non-null ``query`` subsequent |
| 389 | drawing commands will be predicated on the outcome of the query. If |
| 390 | the query result is zero subsequent drawing commands will be skipped. |
| 391 | |
| 392 | If ``mode`` is PIPE_RENDER_COND_WAIT the driver will wait for the |
| 393 | query to complete before deciding whether to render. |
| 394 | |
| 395 | If ``mode`` is PIPE_RENDER_COND_NO_WAIT and the query has not yet |
| 396 | completed, the drawing command will be executed normally. If the query |
| 397 | has completed, drawing will be predicated on the outcome of the query. |
| 398 | |
| 399 | If ``mode`` is PIPE_RENDER_COND_BY_REGION_WAIT or |
| 400 | PIPE_RENDER_COND_BY_REGION_NO_WAIT rendering will be predicated as above |
| 401 | for the non-REGION modes but in the case that an occlusion query returns |
| 402 | a non-zero result, regions which were occluded may be omitted by subsequent |
| 403 | drawing commands. This can result in better performance with some GPUs. |
| 404 | Normally, if the occlusion query returned a non-zero result subsequent |
| 405 | drawing happens normally so fragments may be generated, shaded and |
| 406 | processed even where they're known to be obscured. |
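
The decision rules above reduce to a short predicate. The enum, the query
structure and ``toy_wait_for_query`` are hypothetical stand-ins; the sketch
decides only whether a draw is skipped as a whole and does not model the
per-region omission that the BY_REGION modes additionally permit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum toy_render_cond {
   TOY_RENDER_COND_WAIT,
   TOY_RENDER_COND_NO_WAIT,
   TOY_RENDER_COND_BY_REGION_WAIT,
   TOY_RENDER_COND_BY_REGION_NO_WAIT
};

struct toy_query {
   bool ready;       /* has the query completed? */
   uint64_t result;  /* query result, valid once ready */
};

/* Stand-in for stalling until the GPU has produced the result. */
static void
toy_wait_for_query(struct toy_query *q)
{
   assert(q != NULL);
   q->ready = true;
}

/* Decide whether a draw command executes under the given condition. */
static bool
toy_should_draw(struct toy_query *query, enum toy_render_cond mode)
{
   bool no_wait = mode == TOY_RENDER_COND_NO_WAIT ||
                  mode == TOY_RENDER_COND_BY_REGION_NO_WAIT;

   if (query == NULL)
      return true;               /* conditional rendering disabled */
   if (!query->ready) {
      if (no_wait)
         return true;            /* result unknown: draw normally */
      toy_wait_for_query(query);
   }
   return query->result != 0;    /* zero result: skip the draw */
}
```
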
| 407 | |
| 408 | |
Keith Whitwell | f3347fe | 2009-12-21 23:44:32 +0000 | [diff] [blame] | 409 | Flushing |
| 410 | ^^^^^^^^ |
| 411 | |
Corbin Simpson | a524aab | 2009-12-20 19:41:50 -0800 | [diff] [blame] | 412 | ``flush`` |
| 413 | |
Keith Whitwell | f3347fe | 2009-12-21 23:44:32 +0000 | [diff] [blame] | 414 | |
| 415 | Resource Busy Queries |
| 416 | ^^^^^^^^^^^^^^^^^^^^^ |
| 417 | |
Keith Whitwell | 287c94e | 2010-04-10 16:05:54 +0100 | [diff] [blame] | 418 | ``is_resource_referenced`` |
Keith Whitwell | f3347fe | 2009-12-21 23:44:32 +0000 | [diff] [blame] | 419 | |
| 420 | |
| 421 | |
| 422 | Blitting |
| 423 | ^^^^^^^^ |
Corbin Simpson | a524aab | 2009-12-20 19:41:50 -0800 | [diff] [blame] | 424 | |
Roland Scheidegger | 379db6a | 2010-05-17 21:02:24 +0200 | [diff] [blame] | 425 | These methods emulate classic blitter controls. |
Corbin Simpson | a524aab | 2009-12-20 19:41:50 -0800 | [diff] [blame] | 426 | |
Roland Scheidegger | aac2cccc | 2010-04-26 19:50:57 +0200 | [diff] [blame] | 427 | These methods operate directly on ``pipe_resource`` objects, and stand |
Keith Whitwell | f3347fe | 2009-12-21 23:44:32 +0000 | [diff] [blame] | 428 | apart from any 3D state in the context. Blitting functionality may be |
| 429 | moved to a separate abstraction at some point in the future. |
| 430 | |
Roland Scheidegger | 4c70014 | 2010-12-02 04:33:43 +0100 | [diff] [blame] | 431 | ``resource_copy_region`` blits a region of a resource to a region of another |
| 432 | resource, provided that both resources have the same format, or compatible |
| 433 | formats, i.e., formats for which copying the bytes from the source resource |
| 434 | unmodified to the destination resource will achieve the same effect as a |
| 435 | textured quad blitter. The source and destination may be the same resource, |
| 436 | but overlapping blits are not permitted. |
Marek Olšák | c4df2e3 | 2012-09-12 01:36:31 +0200 | [diff] [blame] | 437 | This can be considered the equivalent of a CPU memcpy. |
| 438 | |
| 439 | ``blit`` blits a region of a resource to a region of another resource, including |
| 440 | scaling, format conversion, and up-/downsampling, as well as |
| 441 | a destination clip rectangle (scissors). |
| 442 | As opposed to manually drawing a textured quad, this lets the pipe driver choose |
| 443 | the optimal method for blitting (like using a special 2D engine), and usually |
| 444 | offers, for example, accelerated stencil-only copies even where |
| 445 | PIPE_CAP_SHADER_STENCIL_EXPORT is not available. |
Roland Scheidegger | aac2cccc | 2010-04-26 19:50:57 +0200 | [diff] [blame] | 446 | |
Keith Whitwell | f3347fe | 2009-12-21 23:44:32 +0000 | [diff] [blame] | 447 | |
Keith Whitwell | 287c94e | 2010-04-10 16:05:54 +0100 | [diff] [blame] | 448 | Transfers |
| 449 | ^^^^^^^^^ |
| 450 | |
| 451 | These methods are used to get data to/from a resource. |
| 452 | |
Marek Olšák | 369e468 | 2012-10-08 04:06:42 +0200 | [diff] [blame^] | 453 | ``transfer_map`` creates a memory mapping and the transfer object |
| 454 | associated with it. |
| 455 | The returned pointer points to the start of the mapped range according to |
| 456 | the box region, not the beginning of the resource. If transfer_map fails, |
| 457 | the returned pointer to the buffer memory is NULL, and the pointer |
| 458 | to the transfer object remains unchanged (i.e. it can be non-NULL). |
Keith Whitwell | 287c94e | 2010-04-10 16:05:54 +0100 | [diff] [blame] | 459 | |
Marek Olšák | 369e468 | 2012-10-08 04:06:42 +0200 | [diff] [blame^] | 460 | ``transfer_unmap`` removes the memory mapping for and destroys |
| 461 | the transfer object. The pointer into the resource should be considered |
| 462 | invalid and discarded. |
Keith Whitwell | 287c94e | 2010-04-10 16:05:54 +0100 | [diff] [blame] | 463 | |
| 464 | ``transfer_inline_write`` performs a simplified transfer for simple writes. |
Marek Olšák | 369e468 | 2012-10-08 04:06:42 +0200 | [diff] [blame^] | 465 | Basically transfer_map, data write, and transfer_unmap all in one. |
Keith Whitwell | 287c94e | 2010-04-10 16:05:54 +0100 | [diff] [blame] | 466 | |
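The mapping rules above can be illustrated with a small, self-contained model.
This is only a sketch under stated assumptions: ``toy_box``, ``toy_resource``,
``toy_transfer``, ``toy_transfer_map`` and ``toy_transfer_unmap`` are
hypothetical names invented for this example and are not part of the Gallium
interface. A linear byte buffer stands in for a resource and the box is
reduced to one dimension.

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>
   #include <stdlib.h>

   /* Hypothetical stand-ins; none of these types exist in Gallium. */
   struct toy_box { unsigned x, width; };            /* 1D region, in bytes */
   struct toy_resource { unsigned char *data; unsigned size; };
   struct toy_transfer { struct toy_resource *res; struct toy_box box; };

   /* Returns a pointer to the start of the boxed range, or NULL on failure.
    * On failure *out_transfer is left untouched, mirroring the rule above. */
   static void *
   toy_transfer_map(struct toy_resource *res, const struct toy_box *box,
                    struct toy_transfer **out_transfer)
   {
      struct toy_transfer *t;

      assert(res && box && out_transfer);
      if (box->x > res->size || box->width > res->size - box->x)
         return NULL;                 /* box does not fit in the resource */

      t = malloc(sizeof(*t));
      if (!t)
         return NULL;

      t->res = res;
      t->box = *box;
      *out_transfer = t;
      return res->data + box->x;      /* start of the box, not of the resource */
   }

   /* Destroys the transfer; pointers returned by the map are invalid afterwards. */
   static void
   toy_transfer_unmap(struct toy_transfer *t)
   {
      free(t);
   }

Because a failed map leaves the transfer pointer as it was, a caller of this
sketch has to test the returned data pointer rather than the transfer pointer
to detect failure.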

The box parameter to some of these functions defines a 1D, 2D or 3D
region of pixels. This is self-explanatory for 1D, 2D and 3D texture
targets.

For PIPE_TEXTURE_1D_ARRAY, the box::y and box::height fields refer to the
array dimension of the texture.

For PIPE_TEXTURE_2D_ARRAY, the box::z and box::depth fields refer to the
array dimension of the texture.

For PIPE_TEXTURE_CUBE, the box::z and box::depth fields refer to the
faces of the cube map (z + depth <= 6).

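The cube map constraint can be written as a small check. The following is a
sketch only: ``toy_box`` is a hypothetical mirror of the box fields named
above, and both helper functions are invented for illustration. The bound on
array layers in the second helper is an assumption made by analogy with the
cube map rule; the text above does not state it.

.. code-block:: c

   #include <assert.h>
   #include <stdbool.h>

   /* Hypothetical mirror of the box fields named in the text. */
   struct toy_box { unsigned x, y, z, width, height, depth; };

   /* A cube map has six faces, selected through z and depth:
    * the box is accepted when z + depth <= 6. */
   static bool
   toy_cube_box_is_valid(const struct toy_box *box)
   {
      assert(box);
      return box->z <= 6 && box->depth <= 6 - box->z;
   }

   /* Assumption: for a 2D array texture, z and depth select array layers
    * and must stay within the array size. */
   static bool
   toy_2d_array_box_is_valid(const struct toy_box *box, unsigned array_size)
   {
      assert(box);
      return box->z <= array_size && box->depth <= array_size - box->z;
   }

The comparisons are arranged so that ``z + depth`` is never computed directly,
which avoids unsigned wrap-around for very large values.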


.. _transfer_flush_region:

transfer_flush_region
%%%%%%%%%%%%%%%%%%%%%

If a transfer was created with ``FLUSH_EXPLICIT``, it will not automatically
be flushed on write or unmap. Flushes must be requested with
``transfer_flush_region``. Flush ranges are relative to the mapped range, not
the beginning of the resource.

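Since flush ranges are relative to the mapped range, converting one into a
position within the resource means adding the offset of the mapping. The
sketch below shows that arithmetic for a one-dimensional byte range.
``toy_range`` and ``toy_flush_range_in_resource`` are hypothetical names, and
the requirement that the flush range lies inside the mapped range is an
assumption of this example.

.. code-block:: c

   #include <assert.h>

   /* Hypothetical 1D byte range; not a Gallium type. */
   struct toy_range { unsigned offset, size; };

   /* Converts a flush range, given relative to the start of the mapped
    * range, into a range measured from the beginning of the resource. */
   static struct toy_range
   toy_flush_range_in_resource(struct toy_range mapped, struct toy_range flush)
   {
      struct toy_range abs_range;

      /* Assumption: the flush range must lie inside the mapped range. */
      assert(flush.offset <= mapped.size);
      assert(flush.size <= mapped.size - flush.offset);

      abs_range.offset = mapped.offset + flush.offset;
      abs_range.size = flush.size;
      return abs_range;
   }

For a mapping that starts 256 bytes into the resource, a flush of 8 bytes at
relative offset 16 covers bytes 272 to 279 of the resource.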


.. _texture_barrier:

texture_barrier
%%%%%%%%%%%%%%%

This function flushes all pending writes to the currently-set surfaces and
invalidates all read caches of the currently-set samplers.



.. _pipe_transfer:

PIPE_TRANSFER
^^^^^^^^^^^^^

These flags control the behavior of a transfer object.

``PIPE_TRANSFER_READ``
  Resource contents are read back (or accessed directly) at transfer create
  time.

``PIPE_TRANSFER_WRITE``
  Resource contents will be written back at ``transfer_unmap`` time (or
  modified as a result of being accessed directly).

``PIPE_TRANSFER_MAP_DIRECTLY``
  The transfer should directly map the resource. May return NULL if not
  supported.

``PIPE_TRANSFER_DISCARD_RANGE``
  The memory within the mapped region is discarded. Cannot be used with
  ``PIPE_TRANSFER_READ``.

``PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE``
  Discards all memory backing the resource. It should not be used with
  ``PIPE_TRANSFER_READ``.

``PIPE_TRANSFER_DONTBLOCK``
  Fail if the resource cannot be mapped immediately.

``PIPE_TRANSFER_UNSYNCHRONIZED``
  Do not synchronize pending operations on the resource when mapping. The
  interaction of any writes to the map and any operations pending on the
  resource is undefined. Cannot be used with ``PIPE_TRANSFER_READ``.

``PIPE_TRANSFER_FLUSH_EXPLICIT``
  Written ranges will be notified later with :ref:`transfer_flush_region`.
  Cannot be used with ``PIPE_TRANSFER_READ``.

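The restrictions on combining flags with ``PIPE_TRANSFER_READ`` can be
collected into a single check. This is a minimal sketch: the ``TOY_TRANSFER_*``
constants and their bit values are hypothetical and chosen for this example
only, not taken from the Gallium headers. ``DISCARD_WHOLE_RESOURCE`` is
documented above as "should not" rather than "cannot"; the sketch rejects it
together with the others for simplicity.

.. code-block:: c

   #include <stdbool.h>

   /* Hypothetical bit values chosen for this example only. */
   enum {
      TOY_TRANSFER_READ                   = 1 << 0,
      TOY_TRANSFER_WRITE                  = 1 << 1,
      TOY_TRANSFER_DISCARD_RANGE          = 1 << 2,
      TOY_TRANSFER_DISCARD_WHOLE_RESOURCE = 1 << 3,
      TOY_TRANSFER_UNSYNCHRONIZED         = 1 << 4,
      TOY_TRANSFER_FLUSH_EXPLICIT         = 1 << 5
   };

   /* Returns false when a flag documented as incompatible with reads is
    * combined with the read flag, and true otherwise. */
   static bool
   toy_transfer_flags_are_valid(unsigned flags)
   {
      const unsigned no_read = TOY_TRANSFER_DISCARD_RANGE |
                               TOY_TRANSFER_DISCARD_WHOLE_RESOURCE |
                               TOY_TRANSFER_UNSYNCHRONIZED |
                               TOY_TRANSFER_FLUSH_EXPLICIT;

      if ((flags & TOY_TRANSFER_READ) && (flags & no_read))
         return false;
      return true;
   }

A plain read/write mapping passes this check, while a read combined with a
range discard does not.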

Compute kernel execution
^^^^^^^^^^^^^^^^^^^^^^^^

A compute program can be defined, bound or destroyed using
``create_compute_state``, ``bind_compute_state`` or
``destroy_compute_state`` respectively.

Any of the subroutines contained within the compute program can be
executed on the device using the ``launch_grid`` method. This method
executes as many instances of the program as there are elements in the
specified N-dimensional grid, ideally in parallel.

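The number of program instances implied by a grid is the product of its
extents. The helper below is a sketch of that count only.
``toy_grid_instance_count`` is a hypothetical name, and how ``launch_grid``
actually receives the grid dimensions is not described here, so the array
argument is an assumption of this example.

.. code-block:: c

   #include <stddef.h>

   /* Number of program instances for an N-dimensional grid: the product
    * of the grid's extents. A grid with zero dimensions counts as one. */
   static unsigned long
   toy_grid_instance_count(const unsigned *grid, size_t dims)
   {
      unsigned long count = 1;
      size_t i;

      for (i = 0; i < dims; i++)
         count *= grid[i];
      return count;
   }

A 4 x 2 x 3 grid therefore corresponds to 24 instances.
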
The compute program has access to four special resources:

* ``GLOBAL`` represents a memory space shared among all the threads
  running on the device. An arbitrary buffer created with the
  ``PIPE_BIND_GLOBAL`` flag can be mapped into it using the
  ``set_global_binding`` method.

* ``LOCAL`` represents a memory space shared among all the threads
  running in the same working group. The initial contents of this
  resource are undefined.

* ``PRIVATE`` represents a memory space local to a single thread.
  The initial contents of this resource are undefined.

* ``INPUT`` represents a read-only memory space that can be
  initialized at ``launch_grid`` time.

These resources use a byte-based addressing scheme, and they can be
accessed from the compute program by means of the LOAD/STORE TGSI
opcodes. Additional resources to be accessed using the same opcodes
may be specified by the user with the ``set_compute_resources``
method.

In addition, normal texture sampling is allowed from the compute
program: ``bind_compute_sampler_states`` may be used to set up texture
samplers for the compute stage and ``set_compute_sampler_views`` may
be used to bind a number of sampler views to it.