| 1 | page.title=Implementing the Hardware Composer HAL |
| 2 | @jd:body |
| 3 | |
| 4 | <!-- |
| 5 | Copyright 2016 The Android Open Source Project |
| 6 | |
| 7 | Licensed under the Apache License, Version 2.0 (the "License"); |
| 8 | you may not use this file except in compliance with the License. |
| 9 | You may obtain a copy of the License at |
| 10 | |
| 11 | http://www.apache.org/licenses/LICENSE-2.0 |
| 12 | |
| 13 | Unless required by applicable law or agreed to in writing, software |
| 14 | distributed under the License is distributed on an "AS IS" BASIS, |
| 15 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| 16 | See the License for the specific language governing permissions and |
| 17 | limitations under the License. |
| 18 | --> |
| 19 | |
| 20 | <div id="qv-wrapper"> |
| 21 | <div id="qv"> |
| 22 | <h2>In this document</h2> |
| 23 | <ol id="auto-toc"> |
| 24 | </ol> |
| 25 | </div> |
| 26 | </div> |
| 27 | |
| 28 | |
| 29 | <p>The Hardware Composer HAL (HWC) is used by SurfaceFlinger to composite |
| 30 | surfaces to the screen. The HWC abstracts objects such as overlays and 2D |
| 31 | blitters and helps offload some work that would normally be done with OpenGL.</p> |
| 32 | |
| 33 | <p>Android 7.0 includes a new version of HWC (HWC2) used by SurfaceFlinger to |
| 34 | talk to specialized window composition hardware. SurfaceFlinger contains a |
| 35 | fallback path that uses the 3D graphics processor (GPU) to perform the task of |
| 36 | window composition, but this path is not ideal for a couple of reasons:</p> |
| 37 | |
| 38 | <ul> |
| 39 | <li>Typically, GPUs are not optimized for this use case and may use more power |
| 40 | than necessary to perform composition.</li> |
| 41 | <li>While SurfaceFlinger is using the GPU for composition, applications |
| 42 | cannot use the GPU for their own rendering, so it is |
| 43 | preferable to use specialized hardware for composition instead of the GPU |
| 44 | whenever possible.</li> |
| 45 | </ul> |
| 46 | |
| 47 | <h2 id="guidance">General guidance</h2> |
| 48 | |
| 49 | <p>As the physical display hardware behind the Hardware Composer abstraction |
| 50 | layer can vary from device to device, it's difficult to give recommendations on |
| 51 | specific features. In general, use the following guidance:</p> |
| 52 | |
| 53 | <ul> |
| 54 | <li>The HWC should support at least four overlays (status bar, system bar, |
| 55 | application, and wallpaper/background).</li> |
| 56 | <li>Layers can be bigger than the screen, so the HWC should be able to handle |
| 57 | layers that are larger than the display (for example, a wallpaper).</li> |
| 58 | <li>Pre-multiplied per-pixel alpha blending and per-plane alpha blending |
| 59 | should be supported at the same time.</li> |
| 60 | <li>The HWC should be able to consume the same buffers the GPU, camera, and |
| 61 | video decoder are producing, so supporting some of the following |
| 62 | properties is helpful: |
| 63 | <ul> |
| 64 | <li>RGBA packing order</li> |
| 65 | <li>YUV formats</li> |
| 66 | <li>Tiling, swizzling, and stride properties</li> |
| 67 | </ul></li> |
| 68 | <li>To support protected content, a hardware path for protected video playback |
| 69 | must be present.</li> |
| 70 | </ul> |
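<p>To illustrate how per-pixel and per-plane alpha can apply at the same time,
the sketch below scales a premultiplied source pixel by a plane alpha and then
blends it over a destination pixel with the standard "source over" operator.
The <code>pixel_t</code> type and <code>blend_premultiplied</code> helper are
hypothetical names for this example, and the arithmetic is an assumption about
the intended behavior rather than a description of any particular hardware.</p>

```c
#include <assert.h>

/* Hypothetical pixel type: premultiplied RGBA, each channel in [0.0, 1.0]. */
typedef struct { float r, g, b, a; } pixel_t;

/* Blend a premultiplied source pixel over a destination pixel after scaling
 * the whole source layer by plane_alpha. */
pixel_t blend_premultiplied(pixel_t src, pixel_t dst, float plane_alpha) {
    assert(plane_alpha >= 0.0f && plane_alpha <= 1.0f);

    /* Color is premultiplied, so plane alpha scales color and alpha alike. */
    src.r *= plane_alpha;
    src.g *= plane_alpha;
    src.b *= plane_alpha;
    src.a *= plane_alpha;

    /* Source over: the destination shows through wherever the source is
     * not opaque. */
    float inv = 1.0f - src.a;
    pixel_t out = {
        src.r + dst.r * inv,
        src.g + dst.g * inv,
        src.b + dst.b * inv,
        src.a + dst.a * inv,
    };
    return out;
}
```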
| 71 | |
| 72 | <p>The general recommendation is to implement a non-operational HWC first; after |
| 73 | the structure is complete, implement a simple algorithm to delegate composition |
| 74 | to the HWC (for example, delegate only the first three or four surfaces to the |
| 75 | overlay hardware of the HWC).</p> |
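<p>A minimal sketch of the simple delegation algorithm described above: the
first few layers go to overlay hardware and the rest fall back to GPU (client)
composition. The types and the <code>assign_composition</code> function are
hypothetical stand-ins for this example; a real implementation would use the
composition types declared in the HWC2 header.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for this sketch only. */
typedef enum { COMPOSITION_CLIENT, COMPOSITION_DEVICE } composition_t;

typedef struct {
    composition_t composition; /* filled in by assign_composition() */
} sketch_layer_t;

/* Send the first max_overlays layers to overlay hardware and leave the
 * remainder for client composition. Returns the number delegated. */
size_t assign_composition(sketch_layer_t *layers, size_t count,
                          size_t max_overlays) {
    assert(layers != NULL || count == 0);
    size_t delegated = 0;
    for (size_t i = 0; i < count; ++i) {
        if (delegated < max_overlays) {
            layers[i].composition = COMPOSITION_DEVICE;
            ++delegated;
        } else {
            layers[i].composition = COMPOSITION_CLIENT;
        }
    }
    return delegated;
}
```

<p>With six layers and four overlays, for instance, the first four layers are
marked for the device and the last two for the client.</p>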
| 76 | |
| 77 | <p>Then focus on optimization, such as intelligently selecting which surfaces |
| 78 | to send to the overlay hardware to maximize the load taken off the GPU. Another |
| 79 | optimization is to detect whether the screen is updating; if it isn't, delegate |
| 80 | composition to OpenGL instead of the HWC to save power. When the screen updates |
| 81 | again, continue to offload composition to the HWC.</p> |
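<p>One possible way to detect an idle screen is to count consecutive frames
whose content did not change. The sketch below assumes a hypothetical
<code>idle_tracker_t</code> and an arbitrary threshold of three frames; both
are illustrative choices, not values taken from the HWC interface.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define IDLE_FRAME_THRESHOLD 3u /* assumed value for illustration */

/* Hypothetical tracker for consecutive unchanged frames. */
typedef struct { unsigned unchanged_frames; } idle_tracker_t;

/* Record one frame. Returns true when the screen has been static long enough
 * that composition could fall back to OpenGL, false while it is updating. */
bool should_use_gpu_fallback(idle_tracker_t *tracker, bool content_changed) {
    assert(tracker != NULL);
    if (content_changed) {
        tracker->unchanged_frames = 0; /* screen is updating again */
        return false;
    }
    if (tracker->unchanged_frames < IDLE_FRAME_THRESHOLD) {
        ++tracker->unchanged_frames; /* saturate at the threshold */
    }
    return tracker->unchanged_frames >= IDLE_FRAME_THRESHOLD;
}
```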
| 82 | |
| 83 | <p>Prepare for common use cases, such as:</p> |
| 84 | |
| 85 | <ul> |
| 86 | <li>Full-screen games in portrait and landscape mode</li> |
| 87 | <li>Full-screen video with closed captioning and playback control</li> |
| 88 | <li>The home screen (compositing the status bar, system bar, application |
| 89 | window, and live wallpapers)</li> |
| 90 | <li>Protected video playback</li> |
| 91 | <li>Multiple display support</li> |
| 92 | </ul> |
| 93 | |
| 94 | <p>These use cases should address regular, predictable uses rather than edge |
| 95 | cases that are rarely encountered (otherwise, optimizations will have little |
| 96 | benefit). Implementations must balance two competing goals: animation smoothness |
| 97 | and interaction latency.</p> |
| 98 | |
| 99 | |
| 100 | <h2 id="interface_activities">HWC2 interface activities</h2> |
| 101 | |
| 102 | <p>HWC2 provides a few primitives (layer, display) to represent composition work |
| 103 | and its interaction with the display hardware.</p> |
| 104 | <p>A <em>layer</em> is the most important unit of composition; every layer has a |
| 105 | set of properties that define how it interacts with other layers. Property |
| 106 | categories include the following:</p> |
| 107 | |
| 108 | <ul> |
| 109 | <li><strong>Positional</strong>. Defines where the layer appears on its display. |
| 110 | Includes information such as the positions of a layer's edges and its <em>Z |
| 111 | order</em> relative to other layers (whether it should be in front of or behind |
| 112 | other layers).</li> |
| 113 | <li><strong>Content</strong>. Defines how content displayed on the layer should |
| 114 | be presented within the bounds defined by the positional properties. Includes |
| 115 | information such as crop (to expand a portion of the content to fill the bounds |
| 116 | of the layer) and transform (to show rotated or flipped content).</li> |
| 117 | <li><strong>Composition</strong>. Defines how the layer should be composited |
| 118 | with other layers. Includes information such as blending mode and a layer-wide |
| 119 | alpha value for |
| 120 | <a href="https://en.wikipedia.org/wiki/Alpha_compositing#Alpha_blending">alpha |
| 121 | compositing</a>.</li> |
| 122 | <li><strong>Optimization</strong>. Provides information not strictly necessary |
| 123 | to correctly composite the layer, but which can be used by the HWC device to |
| 124 | optimize how it performs composition. Includes information such as the visible |
| 125 | region of the layer and which portion of the layer has been updated since the |
| 126 | previous frame.</li> |
| 127 | </ul> |
| 128 | |
| 129 | <p>A <em>display</em> is another important unit of composition. A layer can |
| 130 | be present on only one display. A system can have multiple displays, and |
| 131 | displays can be added or removed during normal system operations. This |
| 132 | addition/removal can come at the request of the HWC device (typically in |
| 133 | response to an external display being plugged into or removed from the device, |
| 134 | called <em>hotplugging</em>), or at the request of the client, which permits the |
| 135 | creation of <em>virtual displays</em> whose contents are rendered into an |
| 136 | off-screen buffer instead of to a physical display.</p> |
| 137 | <p>HWC2 provides functions to determine the properties of a given display, to |
| 138 | switch between different configurations (e.g., 4k or 1080p resolution) and color |
| 139 | modes (e.g., native color or true sRGB), and to turn the display on, off, or |
| 140 | into a low-power mode if supported.</p> |
| 141 | <p>In addition to layers and displays, HWC2 also provides control over the |
| 142 | hardware vertical sync (VSYNC) signal along with a callback into the client to |
| 143 | notify it when a VSYNC event has occurred.</p> |
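<p>To make the hotplugging behavior concrete, the sketch below shows how a
client might track the set of connected displays as hotplug events arrive.
The <code>display_registry_t</code> type, the <code>on_hotplug</code> function,
and the limit of four displays are all hypothetical; the actual hotplug
callback signature is defined in the HWC2 header.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DISPLAYS 4 /* arbitrary limit for this sketch */

typedef uint64_t display_handle_t; /* stand-in for an opaque display handle */

typedef struct {
    display_handle_t displays[MAX_DISPLAYS];
    size_t count;
} display_registry_t;

/* Hypothetical hotplug handler: adds a display on connect and removes it on
 * disconnect. Repeated connects for a known display are ignored. */
void on_hotplug(display_registry_t *reg, display_handle_t display,
                bool connected) {
    assert(reg != NULL);
    for (size_t i = 0; i < reg->count; ++i) {
        if (reg->displays[i] == display) {
            if (!connected) {
                /* Fill the hole with the last entry to keep the list dense. */
                reg->count -= 1;
                reg->displays[i] = reg->displays[reg->count];
            }
            return;
        }
    }
    if (connected && reg->count < MAX_DISPLAYS) {
        reg->displays[reg->count] = display;
        reg->count += 1;
    }
}
```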
| 144 | |
| 145 | <h3 id="func_pointers">Function pointers</h3> |
| 146 | <p>In this section and in HWC2 header comments, HWC interface functions are |
| 147 | referred to by lowerCamelCase names that do not actually exist in the interface |
| 148 | as named fields. Instead, almost every function is loaded by requesting a |
| 149 | function pointer using <code>getFunction</code> provided by |
| 150 | <code>hwc2_device_t</code>. For example, the function <code>createLayer</code> |
| 151 | is a function pointer of type <code>HWC2_PFN_CREATE_LAYER</code>, which is |
| 152 | returned when the enumerated value <code>HWC2_FUNCTION_CREATE_LAYER</code> is |
| 153 | passed into <code>getFunction</code>.</p> |
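<p>The lookup described above can be sketched as follows. The declarations are
simplified local stand-ins that reuse the names from the text so the example
compiles on its own; the enum values, the <code>toy_*</code> functions, and the
exact signatures are assumptions, and the HWC2 header remains the authoritative
source.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins named after identifiers in the text (assumed shapes). */
typedef uint64_t hwc2_display_t;
typedef uint64_t hwc2_layer_t;
typedef void (*hwc2_function_pointer_t)(void);
enum { HWC2_FUNCTION_INVALID = 0, HWC2_FUNCTION_CREATE_LAYER }; /* toy values */

typedef struct hwc2_device hwc2_device_t;
struct hwc2_device {
    hwc2_function_pointer_t (*getFunction)(hwc2_device_t *device,
                                           int32_t descriptor);
};

typedef int32_t (*HWC2_PFN_CREATE_LAYER)(hwc2_device_t *device,
                                         hwc2_display_t display,
                                         hwc2_layer_t *outLayer);

/* Toy device side: hands out increasing layer handles. */
static hwc2_layer_t next_layer = 1;

int32_t toy_create_layer(hwc2_device_t *device, hwc2_display_t display,
                         hwc2_layer_t *outLayer) {
    (void)device;
    (void)display;
    *outLayer = next_layer++;
    return 0; /* 0 stands for "no error" in this sketch */
}

hwc2_function_pointer_t toy_get_function(hwc2_device_t *device,
                                         int32_t descriptor) {
    (void)device;
    if (descriptor == HWC2_FUNCTION_CREATE_LAYER) {
        return (hwc2_function_pointer_t)toy_create_layer;
    }
    return NULL; /* function not supported */
}

/* Client side: request the pointer, cast it to its typed form, then call. */
int32_t create_layer_via_get_function(hwc2_device_t *device,
                                      hwc2_display_t display,
                                      hwc2_layer_t *outLayer) {
    HWC2_PFN_CREATE_LAYER createLayer = (HWC2_PFN_CREATE_LAYER)
        device->getFunction(device, HWC2_FUNCTION_CREATE_LAYER);
    assert(createLayer != NULL);
    return createLayer(device, display, outLayer);
}
```

<p>A real client would typically load each function pointer once at startup
and cache it rather than calling <code>getFunction</code> on every use.</p>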
| 154 | <p>For detailed documentation on functions (including functions required for |
| 155 | every HWC2 implementation), refer to the |
| 156 | <a href="{@docRoot}devices/halref/hwcomposer2_8h.html">HWC2 header</a>.</p> |
| 157 | |
| 158 | <h3 id="layer_display_handles">Layer and display handles</h3> |
| 159 | <p>Layers and displays are manipulated by opaque handles.</p> |
| 160 | <p>When SurfaceFlinger wants to create a new layer, it calls the |
| 161 | <code>createLayer</code> function, which then returns an opaque handle of type |
| 162 | <code>hwc2_layer_t</code>. From that point on, any time SurfaceFlinger wants to |
| 163 | modify a property of that layer, it passes that <code>hwc2_layer_t</code> value |
| 164 | into the appropriate modification function, along with any other information |
| 165 | needed to make the modification. The <code>hwc2_layer_t</code> type was made |
| 166 | large enough to be able to hold either a pointer or an index, and it will be |
| 167 | treated as opaque by SurfaceFlinger to provide HWC implementers maximum |
| 168 | flexibility.</p> |
| 169 | <p>Most of the above also applies to display handles, though handles are created |
| 170 | differently depending on whether they are hotplugged (where the handle is passed |
| 171 | through the hotplug callback) or requested by the client as a virtual display |
| 172 | (where the handle is returned from <code>createVirtualDisplay</code>).</p> |
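<p>The sketch below shows the two handle strategies mentioned above: storing a
pointer in the handle, or storing an index into a table owned by the
implementation. It assumes <code>hwc2_layer_t</code> is a 64-bit integer, and
the <code>layer_state_t</code> type and helper names are hypothetical.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t hwc2_layer_t; /* stand-in; assumed 64 bits wide here */

/* Hypothetical per-layer state kept by an HWC implementation. */
typedef struct { int z_order; } layer_state_t;

/* Option 1: the handle carries a pointer to the layer state. */
hwc2_layer_t handle_from_pointer(layer_state_t *state) {
    return (hwc2_layer_t)(uintptr_t)state;
}

layer_state_t *pointer_from_handle(hwc2_layer_t handle) {
    return (layer_state_t *)(uintptr_t)handle;
}

/* Option 2: the handle carries an index into an implementation-owned table. */
#define MAX_LAYERS 16
static layer_state_t layer_table[MAX_LAYERS];

hwc2_layer_t handle_from_index(uint32_t index) {
    assert(index < MAX_LAYERS);
    return (hwc2_layer_t)index;
}

layer_state_t *lookup_by_index(hwc2_layer_t handle) {
    return handle < MAX_LAYERS ? &layer_table[handle] : NULL;
}
```

<p>Because the client treats the handle as opaque, the implementation can pick
either strategy. An index is easier to validate against stale or bogus handles,
while a pointer avoids the table lookup.</p>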
| 173 | |
| 174 | <h2 id="display_comp_ops">Display composition operations</h2> |
| 175 | <p>Once per hardware vsync, SurfaceFlinger wakes if it has new content to |
| 176 | composite. This new content could be new image buffers from applications or just |
| 177 | a change in the properties of one or more layers. When it wakes, it performs the |
| 178 | following steps:</p> |
| 179 | |
| 180 | <ol> |
| 181 | <li>Apply transactions, if present. Includes changes in the properties of layers |
| 182 | specified by the window manager but not changes in the contents of layers (i.e., |
| 183 | graphic buffers from applications).</li> |
| 184 | <li>Latch new graphic buffers (acquire their handles from their respective |
| 185 | applications), if present.</li> |
| 186 | <li>If step 1 or 2 resulted in a change to the display contents, perform a new |
| 187 | composition (described below).</li> |
| 188 | </ol> |
| 189 | |
| 190 | <p>Steps 1 and 2 have some nuances (such as deferred transactions and |
| 191 | presentation timestamps) that are outside the scope of this section. However, |
| 192 | step 3 involves the HWC interface and is detailed below.</p> |
| 193 | <p>At the beginning of the composition process, SurfaceFlinger will create and |
| 194 | destroy layers or modify layer state as applicable. It will also update the |
| 195 | layers with their current contents, using calls such as |
| 196 | <code>setLayerBuffer</code> or <code>setLayerColor</code>. After all layers have |
| 197 | been updated, it will call <code>validateDisplay</code>, which tells the device |
| 198 | to examine the state of the various layers and determine how composition will |
| 199 | proceed. By default, SurfaceFlinger usually attempts to configure every layer |
| 200 | such that it will be composited by the device, though in some circumstances |
| 201 | SurfaceFlinger will mandate that a layer be composited by the client.</p> |
| 202 | <p>After the call to <code>validateDisplay</code>, SurfaceFlinger will follow up |
| 203 | with a call to <code>getChangedCompositionTypes</code> to see if the device |
| 204 | wants any of the layers' composition types changed before performing the actual |
| 205 | composition. SurfaceFlinger may choose to:</p> |
| 206 | |
| 207 | <ul> |
| 208 | <li>Change some of the layer composition types and re-validate the display.</li> |
| 209 | </ul> |
| 210 | |
| 211 | <blockquote><strong><em>OR</em></strong></blockquote> |
| 212 | |
| 213 | <ul> |
| 214 | <li>Call <code>acceptDisplayChanges</code>, which has the same effect as |
| 215 | changing the composition types as requested by the device and re-validating |
| 216 | without actually having to call <code>validateDisplay</code> again.</li> |
| 217 | </ul> |
| 218 | |
| 219 | <p>In practice, SurfaceFlinger always takes the latter path (calling |
| 220 | <code>acceptDisplayChanges</code>) though this behavior may change in the |
| 221 | future.</p> |
| 222 | <p>At this point, the behavior differs depending on whether any of the layers |
| 223 | have been marked for client composition. If any (or all) layers have been marked |
| 224 | for client composition, SurfaceFlinger will now composite all of those layers |
| 225 | into the client target buffer. This buffer will be provided to the device using |
| 226 | the <code>setClientTarget</code> call so that it may be either displayed |
| 227 | directly on the screen or further composited with layers that have not been |
| 228 | marked for client composition. If no layers have been marked for client |
| 229 | composition, then the client composition step is bypassed.</p> |
| 230 | <p>Finally, after all of the state has been validated and client composition has |
| 231 | been performed if needed, SurfaceFlinger will call <code>presentDisplay</code>. |
| 232 | This is the HWC device's cue to complete the composition process and display the |
| 233 | final result.</p> |
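<p>The call order described in this section can be sketched with a toy device
that logs each call. The <code>toy_*</code> functions stand in for
<code>validateDisplay</code>, <code>getChangedCompositionTypes</code>,
<code>acceptDisplayChanges</code>, <code>setClientTarget</code>, and
<code>presentDisplay</code>; their single-argument signatures are invented for
this example, and the sketch assumes no layer was forced to client composition
before validation.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy device for this sketch only. It logs each call so the order is visible;
 * the real functions take more parameters (display handle, buffers, fences). */
typedef struct {
    size_t client_layers; /* layers the device asks the client to composite */
    char trace[128];      /* semicolon-separated call log */
} toy_device_t;

static void log_call(toy_device_t *dev, const char *name) {
    strncat(dev->trace, name, sizeof(dev->trace) - strlen(dev->trace) - 1);
    strncat(dev->trace, ";", sizeof(dev->trace) - strlen(dev->trace) - 1);
}

void toy_validate(toy_device_t *dev) { log_call(dev, "validate"); }

size_t toy_get_changed(toy_device_t *dev) {
    log_call(dev, "getChanged");
    return dev->client_layers;
}

void toy_accept(toy_device_t *dev) { log_call(dev, "accept"); }
void toy_set_client_target(toy_device_t *dev) { log_call(dev, "setClientTarget"); }
void toy_present(toy_device_t *dev) { log_call(dev, "present"); }

/* One composition pass, assuming layer state has already been updated. */
void compose_frame(toy_device_t *dev) {
    assert(dev != NULL);
    toy_validate(dev);
    size_t client_layers = toy_get_changed(dev);
    toy_accept(dev); /* take the device's requested changes as-is */
    if (client_layers > 0) {
        /* GPU composition of the client layers would happen here. */
        toy_set_client_target(dev);
    }
    toy_present(dev);
}
```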
| 234 | |
| 235 | <h2 id="multiple_displays">Multiple displays in Android N</h2> |
| 236 | <p>While the HWC2 interface is quite flexible when it comes to the number of |
| 237 | displays in the system, the rest of the Android framework is not yet as |
| 238 | flexible. When designing an HWC2 implementation intended for use on Android N, |
| 239 | there are some additional restrictions not present in the HWC definition itself: |
| 240 | </p> |
| 241 | |
| 242 | <ul> |
| 243 | <li>It is assumed that there is exactly one <em>primary</em> display; that is, |
| 244 | that there is one physical display that will be hotplugged immediately during |
| 245 | the initialization of the device (specifically after the hotplug callback is |
| 246 | registered).</li> |
| 247 | <li>In addition to the primary display, exactly one <em>external</em> display |
| 248 | may be hotplugged during normal operation of the device.</li> |
| 249 | </ul> |
| 250 | |
| 251 | <p>While the SurfaceFlinger operations described above are performed per-display |
| 252 | (the eventual goal is to composite displays independently of each other), |
| 253 | they are currently performed sequentially for all active displays, even if only |
| 254 | the contents of one display are updated.</p> |
| 255 | <p>For example, if only the external display is updated, the sequence is:</p> |
| 256 | |
| 257 | <pre> |
| 258 | // Update state for internal display |
| 259 | // Update state for external display |
| 260 | validateDisplay(&lt;internal display&gt;) |
| 261 | validateDisplay(&lt;external display&gt;) |
| 262 | presentDisplay(&lt;internal display&gt;) |
| 263 | presentDisplay(&lt;external display&gt;) |
| 264 | </pre> |
| 265 | |
| 266 | |
| 267 | <h2 id="sync_fences">Synchronization fences</h2> |
| 268 | <p>Synchronization (sync) fences are a crucial aspect of the Android graphics |
| 269 | system. Fences allow CPU work to proceed independently from concurrent GPU work, |
| 270 | blocking only when there is a true dependency.</p> |
| 271 | <p>For example, when an application submits a buffer that is being produced on |
| 272 | the GPU, it will also submit a fence object; this fence signals only when the |
| 273 | GPU has finished writing into the buffer. Since the only part of the system that |
| 274 | truly needs the GPU write to have finished is the display hardware (the hardware |
| 275 | abstracted by the HWC HAL), the graphics pipeline is able to pass this fence |
| 276 | along with the buffer through SurfaceFlinger to the HWC device. Only immediately |
| 277 | before that buffer would be displayed does the device need to actually check |
| 278 | that the fence has signaled.</p> |
| 279 | <p>Sync fences are integrated tightly into HWC2 and organized in the following |
| 280 | categories:</p> |
| 281 | |
| 282 | <ol> |
| 283 | <li>Acquire fences are passed along with input buffers to the |
| 284 | <code>setLayerBuffer</code> and <code>setClientTarget</code> calls. These |
| 285 | represent a pending write into the buffer and must signal before the HWC client |
| 286 | or device attempts to read from the associated buffer to perform composition. |
| 287 | </li> |
| 288 | <li>Release fences are retrieved after the call to <code>presentDisplay</code> |
| 289 | using the <code>getReleaseFences</code> call and are passed back to the |
| 290 | application along with buffers that will be replaced during the next |
| 291 | composition. These represent a pending read from the buffer, and must signal |
| 292 | before the application attempts to write new contents into the buffer.</li> |
| 293 | <li>Retire fences are returned, one per frame, as part of the call to |
| 294 | <code>presentDisplay</code> and represent when the composition of this frame |
| 295 | has completed, or alternatively, when the composition result of the prior frame is |
| 296 | no longer needed. For physical displays, this is when the current frame appears |
| 297 | on the screen and can also be interpreted as the time after which it is safe to |
| 298 | write to the client target buffer again (if applicable). For virtual displays, |
| 299 | this is the time when it is safe to read from the output buffer.</li> |
| 300 | </ol> |
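<p>On Linux-based systems a sync fence is commonly represented as a file
descriptor that becomes readable once the fence signals, so waiting on an
acquire fence before reading a buffer can be sketched with
<code>poll()</code>. The helper names below are hypothetical, and treating a
negative descriptor as "no fence" is an assumption of this sketch. Production
code would normally rely on the platform's sync library instead.</p>

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Wait until fence_fd becomes readable or timeout_ms elapses.
 * Returns 0 once the fence has signaled, -1 on timeout or error. */
int wait_for_fence(int fence_fd, int timeout_ms) {
    if (fence_fd < 0) {
        return 0; /* assumed convention: negative fd means nothing to wait for */
    }
    struct pollfd pfd = { .fd = fence_fd, .events = POLLIN };
    int ret;
    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR); /* retry if interrupted by a signal */
    if (ret <= 0) {
        return -1; /* timed out or failed */
    }
    return (pfd.revents & (POLLERR | POLLNVAL)) ? -1 : 0;
}

/* Wait on an acquire fence and then close it, on the assumption that the
 * consumer owns the descriptor it was handed. */
int consume_acquire_fence(int fence_fd, int timeout_ms) {
    int result = wait_for_fence(fence_fd, timeout_ms);
    if (fence_fd >= 0) {
        close(fence_fd);
    }
    return result;
}
```

<p>A pipe makes a convenient stand-in for a fence when experimenting: the read
end is not readable until something is written to the write end, which plays
the role of the producer signaling completion.</p>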
| 301 | |
| 302 | <h3 id="hwc2_changes">Changes in HWC2</h3> |
| 303 | <p>The meaning of sync fences in HWC 2.0 has changed significantly relative to |
| 304 | previous versions of the HAL.</p> |
| 305 | <p>In HWC v1.x, the release and retire fences were speculative. A release fence |
| 306 | for a buffer or a retire fence for the display retrieved in frame N would not |
| 307 | signal any sooner than frame N + 1. In other words, the meaning of the fence |
| 308 | was "the content of the buffer you provided for frame N is no longer needed." |
| 309 | This is speculative because in theory SurfaceFlinger may not run again after |
| 310 | frame N for an indeterminate period of time, which would leave those fences |
| 311 | unsignaled for the same period.</p> |
| 312 | <p>In HWC 2.0, release and retire fences are non-speculative. A release or |
| 313 | retire fence retrieved in frame N will signal as soon as the content of the |
| 314 | associated buffers replaces the contents of the buffers from frame N - 1, or in |
| 315 | other words, the meaning of the fence is "the content of the buffer you provided |
| 316 | for frame N has now replaced the previous content." This is non-speculative |
| 317 | because the fence should signal shortly after <code>presentDisplay</code> is |
| 318 | called, as soon as the hardware presents this frame's content.</p> |
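<p>The difference between the two fence semantics can be captured in a small
model. The functions below are purely illustrative (none of them exist in the
HAL): they return the frame at which a release or retire fence retrieved in
frame N can first signal, and whether it ever signals if SurfaceFlinger stops
presenting after a given frame.</p>

```c
#include <assert.h>
#include <stdbool.h>

/* HWC v1.x model: a fence from frame n signals no sooner than frame n + 1,
 * when the content provided for frame n is no longer needed. */
unsigned earliest_signal_frame_v1(unsigned n) {
    return n + 1;
}

/* HWC 2.0 model: a fence from frame n signals when frame n's content
 * replaces the content from frame n - 1, that is, at frame n itself. */
unsigned earliest_signal_frame_v2(unsigned n) {
    return n;
}

/* Does a fence ever signal if last_presented is the final frame shown? */
bool fence_signals(unsigned signal_frame, unsigned last_presented) {
    assert(last_presented > 0);
    return signal_frame <= last_presented;
}
```

<p>In this model, if frame 5 is the last frame presented, the v1.x fence for
frame 5 stays unsignaled while the 2.0 fence for the same frame signals, which
is the speculative behavior the text describes.</p>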
| 319 | <p>For implementation details, refer to the |
| 320 | <a href="{@docRoot}devices/halref/hwcomposer2_8h.html">HWC2 header</a>.</p> |