page.title=Graphics architecture
@jd:body

<!--
    Copyright 2014 The Android Open Source Project

    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
-->
<div id="qv-wrapper">
  <div id="qv">
    <h2>In this document</h2>
    <ol id="auto-toc">
    </ol>
  </div>
</div>


<p><em>What every developer should know about Surface, SurfaceHolder, EGLSurface,
SurfaceView, GLSurfaceView, SurfaceTexture, TextureView, and SurfaceFlinger</em>
</p>
<p>This document describes the essential elements of Android's "system-level"
  graphics architecture, and how it is used by the application framework and
  multimedia system.  The focus is on how buffers of graphical data move through
  the system.  If you've ever wondered why SurfaceView and TextureView behave the
  way they do, or how Surface and EGLSurface interact, you've come to the right
place.</p>

<p>Some familiarity with Android devices and application development is assumed.
You don't need detailed knowledge of the app framework, and very few API calls
will be mentioned, but the material herein doesn't overlap much with other
public documentation.  The goal here is to provide a sense for the significant
events involved in rendering a frame for output, so that you can make informed
choices when designing an application.  To achieve this, we work from the bottom
up, describing how the UI classes work rather than how they can be used.</p>

<p>Early sections contain background material used in later sections, so it's a
good idea to read straight through rather than skipping to a section that sounds
interesting.  We start with an explanation of Android's graphics buffers,
describe the composition and display mechanism, and then proceed to the
higher-level mechanisms that supply the compositor with data.</p>

<p>This document is chiefly concerned with the system as it exists in Android 4.4
("KitKat").  Earlier versions of the system worked differently, and future
versions will likely be different as well.  Version-specific features are called
out in a few places.</p>

<p>At various points I will refer to source code from the AOSP sources or from
Grafika.  Grafika is a Google open-source project for testing; it can be found at
<a
href="https://github.com/google/grafika">https://github.com/google/grafika</a>.
It's more "quick hack" than solid example code, but it will suffice.</p>
<h2 id="BufferQueue">BufferQueue and gralloc</h2>

<p>To understand how Android's graphics system works, we have to start behind the
scenes.  At the heart of everything graphical in Android is a class called
BufferQueue.  Its role is simple enough: connect something that generates
buffers of graphical data (the "producer") to something that accepts the data
for display or further processing (the "consumer").  The producer and consumer
can live in different processes.  Nearly everything that moves buffers of
graphical data through the system relies on BufferQueue.</p>

<p>The basic usage is straightforward.  The producer requests a free buffer
(<code>dequeueBuffer()</code>), specifying a set of characteristics including width,
height, pixel format, and usage flags.  The producer populates the buffer and
returns it to the queue (<code>queueBuffer()</code>).  Some time later, the consumer
acquires the buffer (<code>acquireBuffer()</code>) and makes use of the buffer contents.
When the consumer is done, it returns the buffer to the queue
(<code>releaseBuffer()</code>).</p>

<p>Most recent Android devices support the "sync framework".  This allows the
system to do some nifty things when combined with hardware components that can
manipulate graphics data asynchronously.  For example, a producer can submit a
series of OpenGL ES drawing commands and then enqueue the output buffer before
rendering completes.  The buffer is accompanied by a fence that signals when the
contents are ready.  A second fence accompanies the buffer when it is returned
to the free list, so that the consumer can release the buffer while the contents
are still in use.  This approach improves latency and throughput as the buffers
move through the system.</p>

<p>Some characteristics of the queue, such as the maximum number of buffers it can
hold, are determined jointly by the producer and the consumer.</p>

<p>The BufferQueue is responsible for allocating buffers as it needs them.  Buffers
are retained unless the characteristics change; for example, if the producer
starts requesting buffers with a different size, the old buffers will be freed
and new buffers will be allocated on demand.</p>

<p>The data structure is currently always created and "owned" by the consumer.  In
Android 4.3 only the producer side was "binderized", i.e. the producer could be
in a remote process but the consumer had to live in the process where the queue
was created.  This evolved a bit in 4.4, moving toward a more general
implementation.</p>

<p>Buffer contents are never copied by BufferQueue.  Moving that much data around
would be very inefficient.  Instead, buffers are always passed by handle.</p>

<h3 id="gralloc_HAL">gralloc HAL</h3>

<p>The actual buffer allocations are performed through a memory allocator called
"gralloc", which is implemented through a vendor-specific HAL interface (see
<a
href="https://android.googlesource.com/platform/hardware/libhardware/+/kitkat-release/include/hardware/gralloc.h">hardware/libhardware/include/hardware/gralloc.h</a>).
The <code>alloc()</code> function takes the arguments you'd expect -- width,
height, pixel format -- as well as a set of usage flags.  Those flags merit
closer attention.</p>

<p>The gralloc allocator is not just another way to allocate memory on the native
heap.  In some situations, the allocated memory may not be cache-coherent, or
could be totally inaccessible from user space.  The nature of the allocation is
determined by the usage flags, which include attributes like:</p>

<ul>
<li>how often the memory will be accessed from software (CPU)</li>
<li>how often the memory will be accessed from hardware (GPU)</li>
<li>whether the memory will be used as an OpenGL ES ("GLES") texture</li>
<li>whether the memory will be used by a video encoder</li>
</ul>

<p>For example, if your format specifies RGBA 8888 pixels, and you indicate
the buffer will be accessed from software -- meaning your application will touch
pixels directly -- then the allocator needs to create a buffer with 4 bytes per
pixel in R-G-B-A order.  If instead you say the buffer will only be
accessed from hardware and as a GLES texture, the allocator can do anything the
GLES driver wants -- BGRA ordering, non-linear "swizzled" layouts, alternative
color formats, etc.  Allowing the hardware to use its preferred format can
improve performance.</p>

<p>Some values cannot be combined on certain platforms.  For example, the "video
encoder" flag may require YUV pixels, so adding "software access" and specifying
RGBA 8888 would fail.</p>

<p>The handle returned by the gralloc allocator can be passed between processes
through Binder.</p>

<h2 id="SurfaceFlinger">SurfaceFlinger and Hardware Composer</h2>

<p>Having buffers of graphical data is wonderful, but life is even better when you
get to see them on your device's screen.  That's where SurfaceFlinger and the
Hardware Composer HAL come in.</p>

<p>SurfaceFlinger's role is to accept buffers of data from multiple sources,
composite them, and send them to the display.  Once upon a time this was done
with software blitting to a hardware framebuffer (e.g.
<code>/dev/graphics/fb0</code>), but those days are long gone.</p>

<p>When an app comes to the foreground, the WindowManager service asks
SurfaceFlinger for a drawing surface.  SurfaceFlinger creates a "layer" - the
primary component of which is a BufferQueue - for which SurfaceFlinger acts as
the consumer.  A Binder object for the producer side is passed through the
WindowManager to the app, which can then start sending frames directly to
SurfaceFlinger.  (Note: The WindowManager uses the term "window" instead of
"layer" for this and uses "layer" to mean something else.  We're going to use the
SurfaceFlinger terminology.  It can be argued that SurfaceFlinger should really
be called LayerFlinger.)</p>

<p>For most apps, there will be three layers on screen at any time: the "status
bar" at the top of the screen, the "navigation bar" at the bottom or side, and
the application's UI.  Some apps will have more or fewer, e.g. the default home app has a
separate layer for the wallpaper, while a full-screen game might hide the status
bar.  Each layer can be updated independently.  The status and navigation bars
are rendered by a system process, while the app layers are rendered by the app,
with no coordination between the two.</p>

<p>Device displays refresh at a certain rate, typically 60 frames per second on
phones and tablets.  If the display contents are updated mid-refresh, "tearing"
will be visible; so it's important to update the contents only between cycles.
The system receives a signal from the display when it's safe to update the
contents.  For historical reasons we'll call this the VSYNC signal.</p>

<p>The refresh rate may vary over time, e.g. some mobile devices will range from 58
to 62fps depending on current conditions.  For an HDMI-attached television, this
could theoretically dip to 24 or 48Hz to match a video.  Because we can update
the screen only once per refresh cycle, submitting buffers for display at
200fps would be a waste of effort as most of the frames would never be seen.
Instead of taking action whenever an app submits a buffer, SurfaceFlinger wakes
up when the display is ready for something new.</p>

<p>When the VSYNC signal arrives, SurfaceFlinger walks through its list of layers
looking for new buffers.  If it finds a new one, it acquires it; if not, it
continues to use the previously-acquired buffer.  SurfaceFlinger always wants to
have something to display, so it will hang on to one buffer.  If no buffers have
ever been submitted on a layer, the layer is ignored.</p>

<p>Once SurfaceFlinger has collected all of the buffers for visible layers, it
asks the Hardware Composer how composition should be performed.</p>

<h3 id="hwcomposer">Hardware Composer</h3>

<p>The Hardware Composer HAL ("HWC") was first introduced in Android 3.0
("Honeycomb") and has evolved steadily over the years.  Its primary purpose is
to determine the most efficient way to composite buffers with the available
hardware.  As a HAL, its implementation is device-specific and usually
implemented by the display hardware OEM.</p>

<p>The value of this approach is easy to recognize when you consider "overlay
planes."  The purpose of overlay planes is to composite multiple buffers
together, but in the display hardware rather than the GPU.  For example, suppose
you have a typical Android phone in portrait orientation, with the status bar on
top and navigation bar at the bottom, and app content everywhere else.  The contents
for each layer are in separate buffers.  You could handle composition by
rendering the app content into a scratch buffer, then rendering the status bar
over it, then rendering the navigation bar on top of that, and finally passing the
scratch buffer to the display hardware.  Or, you could pass all three buffers to
the display hardware, and tell it to read data from different buffers for
different parts of the screen.  The latter approach can be significantly more
efficient.</p>

<p>As you might expect, the capabilities of different display processors vary
significantly.  The number of overlays, whether layers can be rotated or
blended, and restrictions on positioning and overlap can be difficult to express
through an API.  So, the HWC works like this:</p>

<ol>
<li>SurfaceFlinger provides the HWC with a full list of layers, and asks, "how do
you want to handle this?"</li>
<li>The HWC responds by marking each layer as "overlay" or "GLES composition."</li>
<li>SurfaceFlinger takes care of any GLES composition, passing the output buffer
to HWC, and lets HWC handle the rest.</li>
</ol>

<p>Since the decision-making code can be custom tailored by the hardware vendor,
it's possible to get the best performance out of every device.</p>

<p>Overlay planes may be less efficient than GL composition when nothing on the
screen is changing.  This is particularly true when the overlay contents have
transparent pixels, and overlapping layers are being blended together.  In such
cases, the HWC can choose to request GLES composition for some or all layers
and retain the composited buffer.  If SurfaceFlinger comes back again asking to
composite the same set of buffers, the HWC can just continue to show the
previously-composited scratch buffer.  This can improve the battery life of an
idle device.</p>

<p>Devices shipping with Android 4.4 ("KitKat") typically support four overlay
planes.  Attempting to composite more layers than there are overlays will cause
the system to use GLES composition for some of them; so the number of layers
used by an application can have a measurable impact on power consumption and
performance.</p>

<p>You can see exactly what SurfaceFlinger is up to with the command <code>adb shell
dumpsys SurfaceFlinger</code>.  The output is verbose.  The part most relevant to our
current discussion is the HWC summary that appears near the bottom of the
output:</p>

<pre>
    type    |          source crop              |           frame           name
------------+-----------------------------------+--------------------------------
        HWC | [    0.0,    0.0,  320.0,  240.0] | [   48,  411, 1032, 1149] SurfaceView
        HWC | [    0.0,   75.0, 1080.0, 1776.0] | [    0,   75, 1080, 1776] com.android.grafika/com.android.grafika.PlayMovieSurfaceActivity
        HWC | [    0.0,    0.0, 1080.0,   75.0] | [    0,    0, 1080,   75] StatusBar
        HWC | [    0.0,    0.0, 1080.0,  144.0] | [    0, 1776, 1080, 1920] NavigationBar
  FB TARGET | [    0.0,    0.0, 1080.0, 1920.0] | [    0,    0, 1080, 1920] HWC_FRAMEBUFFER_TARGET
</pre>

<p>This tells you what layers are on screen, whether they're being handled with
overlays ("HWC") or OpenGL ES composition ("GLES"), and gives you a bunch of
other facts you probably won't care about ("handle" and "hints" and "flags" and
other stuff that we've trimmed out of the snippet above).  The "source crop" and
"frame" values will be examined more closely later on.</p>

<p>The FB_TARGET layer is where GLES composition output goes.  Since all layers
shown above are using overlays, FB_TARGET isn’t being used for this frame. The
layer's name is indicative of its original role: On a device with
<code>/dev/graphics/fb0</code> and no overlays, all composition would be done
with GLES, and the output would be written to the framebuffer.  On recent devices there
generally is no simple framebuffer, so the FB_TARGET layer is a scratch buffer.
(Note: This is why screen grabbers written for old versions of Android no
longer work: They're trying to read from The Framebuffer, but there is no such
thing.)</p>

<p>The overlay planes have another important role: they're the only way to display
DRM content.  DRM-protected buffers cannot be accessed by SurfaceFlinger or the
GLES driver, which means that your video will disappear if HWC switches to GLES
composition.</p>

<h3 id="triple-buffering">The Need for Triple-Buffering</h3>

<p>To avoid tearing on the display, the system needs to be double-buffered: the
front buffer is displayed while the back buffer is being prepared.  At VSYNC, if
the back buffer is ready, you quickly switch them.  This works reasonably well
in a system where you're drawing directly into the framebuffer, but there's a
hitch in the flow when a composition step is added.  Because of the way
SurfaceFlinger is triggered, our double-buffered pipeline will have a bubble.</p>

<p>Suppose frame N is being displayed, and frame N+1 has been acquired by
SurfaceFlinger for display on the next VSYNC.  (Assume frame N is composited
with an overlay, so we can't alter the buffer contents until the display is done
with it.)  When VSYNC arrives, HWC flips the buffers.  While the app is starting
to render frame N+2 into the buffer that used to hold frame N, SurfaceFlinger is
scanning the layer list, looking for updates.  SurfaceFlinger won't find any new
buffers, so it prepares to show frame N+1 again after the next VSYNC.  A little
while later, the app finishes rendering frame N+2 and queues it for
SurfaceFlinger, but it's too late.  This has effectively cut our maximum frame
rate in half.</p>

<p>We can fix this with triple-buffering.  Just before VSYNC, frame N is being
displayed, frame N+1 has been composited (or scheduled for an overlay) and is
ready to be displayed, and frame N+2 is queued up and ready to be acquired by
SurfaceFlinger.  When the screen flips, the buffers rotate through the stages
with no bubble.  The app has just less than a full VSYNC period (16.7ms at 60fps) to
do its rendering and queue the buffer. And SurfaceFlinger / HWC has a full VSYNC
period to figure out the composition before the next flip.  The downside is
that it takes at least two VSYNC periods for anything that the app does to
appear on the screen.  As the latency increases, the device feels less
responsive to touch input.</p>

<img src="images/surfaceflinger_bufferqueue.png" alt="SurfaceFlinger with BufferQueue" />

<p class="img-caption">
  <strong>Figure 1.</strong> SurfaceFlinger + BufferQueue
</p>

<p>The diagram above depicts the flow of SurfaceFlinger and BufferQueue.  During
a frame:</p>

<ol>
<li>red buffer fills up, then slides into BufferQueue</li>
<li>after red buffer leaves app, blue buffer slides in, replacing it</li>
<li>green buffer and systemUI* shadow-slide into HWC (showing that SurfaceFlinger
still has the buffers, but now HWC has prepared them for display via overlay on
the next VSYNC).</li>
</ol>

<p>The blue buffer is referenced by both the display and the BufferQueue.  The
app is not allowed to render to it until the associated sync fence signals.</p>

<p>On VSYNC, all of these happen at once:</p>

<ul>
<li>red buffer leaps into SurfaceFlinger, replacing green buffer</li>
<li>green buffer leaps into Display, replacing blue buffer, and a dotted-line
green twin appears in the BufferQueue</li>
<li>the blue buffer’s fence is signaled, and the blue buffer in App empties**</li>
<li>display rect changes from &lt;blue + SystemUI&gt; to &lt;green +
SystemUI&gt;</li>
</ul>

<p><strong>*</strong> - The System UI process is providing the status and nav
bars, which for our purposes here aren’t changing, so SurfaceFlinger keeps using
the previously-acquired buffer.  In practice there would be two separate
buffers, one for the status bar at the top, one for the navigation bar at the
bottom, and they would be sized to fit their contents.  Each would arrive on its
own BufferQueue.</p>

<p><strong>**</strong> - The buffer doesn’t actually “empty”; if you submit it
without drawing on it you’ll get that same blue again.  The emptying is the
result of clearing the buffer contents, which the app should do before it starts
drawing.</p>

<p>We can reduce the latency by noting that layer composition should not require a
full VSYNC period.  If composition is performed by overlays, it takes essentially
zero CPU and GPU time. But we can't count on that, so we need to allow a little
time.  If the app starts rendering halfway between VSYNC signals, and
SurfaceFlinger defers the HWC setup until a few milliseconds before the signal
is due to arrive, we can cut the latency from 2 frames to perhaps 1.5.  In
theory you could render and composite in a single period, allowing a return to
double-buffering; but getting it down that far is difficult on current devices.
Minor fluctuations in rendering and composition time, and switching from
overlays to GLES composition, can cause us to miss a swap deadline and repeat
the previous frame.</p>

<p>SurfaceFlinger's buffer handling demonstrates the fence-based buffer
management mentioned earlier.  If we're animating at full speed, we need to
have an acquired buffer for the display ("front") and an acquired buffer for
the next flip ("back").  If we're showing the buffer on an overlay, the
contents are being accessed directly by the display and must not be touched.
But if you look at an active layer's BufferQueue state in the <code>dumpsys
SurfaceFlinger</code> output, you'll see one acquired buffer, one queued buffer, and
one free buffer.  That's because, when SurfaceFlinger acquires the new "back"
buffer, it releases the current "front" buffer to the queue.  The "front"
buffer is still in use by the display, so anything that dequeues it must wait
for the fence to signal before drawing on it.  So long as everybody follows
the fencing rules, all of the queue-management IPC requests can happen in
parallel with the display.</p>

<h3 id="virtual-displays">Virtual Displays</h3>

<p>SurfaceFlinger supports a "primary" display, i.e. what's built into your phone
or tablet, and an "external" display, such as a television connected through
HDMI.  It also supports a number of "virtual" displays, which make composited
output available within the system.  Virtual displays can be used to record the
screen or send it over a network.</p>

<p>Virtual displays may share the same set of layers as the main display
(the "layer stack") or have their own set.  There is no VSYNC for a virtual
display, so the VSYNC for the primary display is used to trigger composition for
all displays.</p>

<p>In the past, virtual displays were always composited with GLES.  The Hardware
Composer managed composition for only the primary display.  In Android 4.4, the
Hardware Composer gained the ability to participate in virtual display
composition.</p>

<p>As you might expect, the frames generated for a virtual display are written to a
BufferQueue.</p>
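
<p>At the app level, the public entry point for this is
<code>DisplayManager#createVirtualDisplay()</code> (API 19).  A minimal sketch;
<code>context</code> is assumed to be an Activity or other Context, and
<code>surface</code> is the producer side of whatever BufferQueue should receive
the composited output:</p>

<pre>
import android.content.Context;
import android.hardware.display.DisplayManager;
import android.hardware.display.VirtualDisplay;
import android.view.Surface;

// Sketch: create a private virtual display whose composited output goes to
// "surface" (e.g. a MediaCodec input Surface).  Release it when done.
VirtualDisplay startVirtualDisplay(Context context, Surface surface) {
    DisplayManager dm =
            (DisplayManager) context.getSystemService(Context.DISPLAY_SERVICE);
    return dm.createVirtualDisplay("example", 1280, 720,
            320 /* densityDpi */, surface, 0 /* flags */);
}
</pre>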

<h3 id="screenrecord">Case study: screenrecord</h3>

<p>Now that we've established some background on BufferQueue and SurfaceFlinger,
it's useful to examine a practical use case.</p>

<p>The <a href="https://android.googlesource.com/platform/frameworks/av/+/kitkat-release/cmds/screenrecord/">screenrecord
command</a>,
introduced in Android 4.4, allows you to record everything that appears on the
screen as an .mp4 file on disk.  To implement this, we have to receive composited
frames from SurfaceFlinger, write them to the video encoder, and then write the
encoded video data to a file.  The video codecs are managed by a separate
process - called "mediaserver" - so we have to move large graphics buffers around
the system.  To make it more challenging, we're trying to record 60fps video at
full resolution.  The key to making this work efficiently is BufferQueue.</p>

<p>The MediaCodec class allows an app to provide data as raw bytes in buffers, or
through a Surface.  We'll discuss Surface in more detail later, but for now just
think of it as a wrapper around the producer end of a BufferQueue.  When
screenrecord requests access to a video encoder, mediaserver creates a
BufferQueue and connects itself to the consumer side, and then passes the
producer side back to screenrecord as a Surface.</p>
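
<p>The same handoff is visible in the public MediaCodec API.  The sketch below
requests an AVC encoder and receives its input Surface; the resolution, bit
rate, and frame rate are placeholder values:</p>

<pre>
import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;
import android.view.Surface;
import java.io.IOException;

// Sketch: configure a video encoder that takes its input from a Surface.
Surface createEncoderSurface() throws IOException {
    MediaFormat format = MediaFormat.createVideoFormat("video/avc", 1280, 720);
    format.setInteger(MediaFormat.KEY_COLOR_FORMAT,
            MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface);
    format.setInteger(MediaFormat.KEY_BIT_RATE, 6000000);
    format.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
    format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1);

    MediaCodec encoder = MediaCodec.createEncoderByType("video/avc");
    encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
    Surface inputSurface = encoder.createInputSurface();  // producer side
    encoder.start();
    return inputSurface;    // anything rendered here now feeds the encoder
}
</pre>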

<p>The screenrecord command then asks SurfaceFlinger to create a virtual display
that mirrors the main display (i.e. it has all of the same layers), and directs
it to send output to the Surface that came from mediaserver.  Note that, in this
case, SurfaceFlinger is the producer of buffers rather than the consumer.</p>

<p>Once the configuration is complete, screenrecord can just sit and wait for
encoded data to appear.  As apps draw, their buffers travel to SurfaceFlinger,
which composites them into a single buffer that gets sent directly to the video
encoder in mediaserver.  The full frames are never even seen by the screenrecord
process.  Internally, mediaserver has its own way of moving buffers around that
also passes data by handle, minimizing overhead.</p>

<h3 id="simulate-secondary">Case study: Simulate Secondary Displays</h3>

<p>The WindowManager can ask SurfaceFlinger to create a visible layer for which
SurfaceFlinger will act as the BufferQueue consumer.  It's also possible to ask
SurfaceFlinger to create a virtual display, for which SurfaceFlinger will act as
the BufferQueue producer.  What happens if you connect them, configuring a
virtual display that renders to a visible layer?</p>

<p>You create a closed loop, where the composited screen appears in a window.  Of
course, that window is now part of the composited output, so on the next refresh
the composited image inside the window will show the window contents as well.
It's turtles all the way down.  You can see this in action by enabling
"<a href="http://developer.android.com/tools/index.html">Developer options</a>" in
settings, selecting "Simulate secondary displays", and enabling a window.  For
bonus points, use screenrecord to capture the act of enabling the display, then
play it back frame-by-frame.</p>

<h2 id="surface">Surface and SurfaceHolder</h2>

<p>The <a
href="http://developer.android.com/reference/android/view/Surface.html">Surface</a>
class has been part of the public API since 1.0.  Its description simply says,
"Handle onto a raw buffer that is being managed by the screen compositor."  The
statement was accurate when initially written but falls well short of the mark
on a modern system.</p>

<p>The Surface represents the producer side of a buffer queue that is often (but
not always!) consumed by SurfaceFlinger.  When you render onto a Surface, the
result ends up in a buffer that gets shipped to the consumer.  A Surface is not
simply a raw chunk of memory you can scribble on.</p>

<p>The BufferQueue for a display Surface is typically configured for
triple-buffering; but buffers are allocated on demand.  So if the producer
generates buffers slowly enough -- maybe it's animating at 30fps on a 60fps
display -- there might only be two allocated buffers in the queue.  This helps
minimize memory consumption.  You can see a summary of the buffers associated
with every layer in the <code>dumpsys SurfaceFlinger</code> output.</p>

<h3 id="canvas">Canvas Rendering</h3>

<p>Once upon a time, all rendering was done in software, and you can still do this
today.  The low-level implementation is provided by the Skia graphics library.
If you want to draw a rectangle, you make a library call, and it sets bytes in a
buffer appropriately.  To ensure that a buffer isn't updated by two clients at
once, or written to while being displayed, you have to lock the buffer to access
it.  <code>lockCanvas()</code> locks the buffer and returns a Canvas to use for drawing,
and <code>unlockCanvasAndPost()</code> unlocks the buffer and sends it to the compositor.</p>
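
<p>For example, a software-rendered frame might look like the sketch below.
Here <code>holder</code> is assumed to be a SurfaceHolder obtained from a
SurfaceView (both are covered later in this document):</p>

<pre>
import android.graphics.Canvas;
import android.graphics.Color;
import android.graphics.Paint;
import android.view.SurfaceHolder;

// Sketch: lock a buffer, draw into it with Skia, and post it to the compositor.
void drawFrame(SurfaceHolder holder) {
    Canvas canvas = holder.lockCanvas();        // dequeues and locks a buffer
    if (canvas == null) {
        return;                                 // Surface isn't ready yet
    }
    Paint paint = new Paint();
    paint.setColor(Color.RED);
    canvas.drawColor(Color.BLACK);              // old contents may still be there
    canvas.drawRect(100, 100, 300, 300, paint);
    holder.unlockCanvasAndPost(canvas);         // queues the buffer to the consumer
}
</pre>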

<p>As time went on, and devices with general-purpose 3D engines appeared, Android
reoriented itself around OpenGL ES.  However, it was important to keep the old
API working, for apps as well as app framework code, so an effort was made to
hardware-accelerate the Canvas API.  As you can see from the charts on the
<a href="http://developer.android.com/guide/topics/graphics/hardware-accel.html">Hardware
Acceleration</a>
page, this was a bit of a bumpy ride.  Note in particular that while the Canvas
provided to a View's <code>onDraw()</code> method may be hardware-accelerated, the Canvas
obtained when an app locks a Surface directly with <code>lockCanvas()</code> never is.</p>

<p>When you lock a Surface for Canvas access, the "CPU renderer" connects to the
producer side of the BufferQueue and does not disconnect until the Surface is
destroyed.  Most other producers (like GLES) can be disconnected and reconnected
to a Surface, but the Canvas-based "CPU renderer" cannot.  This means you can't
draw on a surface with GLES or send it frames from a video decoder if you've
ever locked it for a Canvas.</p>

<p>The first time the producer requests a buffer from a BufferQueue, it is
allocated and initialized to zeroes.  Initialization is necessary to avoid
inadvertently sharing data between processes.  When you re-use a buffer,
however, the previous contents will still be present.  If you repeatedly call
<code>lockCanvas()</code> and <code>unlockCanvasAndPost()</code> without
drawing anything, you'll cycle between previously-rendered frames.</p>

<p>The Surface lock/unlock code keeps a reference to the previously-rendered
buffer.  If you specify a dirty region when locking the Surface, it will copy
the non-dirty pixels from the previous buffer.  There's a fair chance the buffer
will be handled by SurfaceFlinger or HWC; but since we only need to read from
it, there's no need to wait for exclusive access.</p>

<p>The main non-Canvas way for an application to draw directly on a Surface is
through OpenGL ES.  That's described in the <a href="#eglsurface">EGLSurface and
OpenGL ES</a> section.</p>

<h3 id="surfaceholder">SurfaceHolder</h3>

<p>Some things that work with Surfaces want a SurfaceHolder, notably SurfaceView.
The original idea was that Surface represented the raw compositor-managed
buffer, while SurfaceHolder was managed by the app and kept track of
higher-level information like the dimensions and format.  The Java-language
definition mirrors the underlying native implementation.  It's arguably no
longer useful to split it this way, but it has long been part of the public API.</p>

<p>Generally speaking, anything having to do with a View will involve a
SurfaceHolder.  Some other APIs, such as MediaCodec, will operate on the Surface
itself.  You can easily get the Surface from the SurfaceHolder, so hang on to
the latter when you have it.</p>

<p>APIs to get and set Surface parameters, such as the size and format, are
implemented through SurfaceHolder.</p>
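
<p>A short sketch of that split: the parameters go through the SurfaceHolder,
while APIs like MediaCodec take the Surface itself.</p>

<pre>
import android.graphics.PixelFormat;
import android.view.Surface;
import android.view.SurfaceHolder;

// Sketch: configure buffer parameters on the holder, pass the Surface around.
void configure(SurfaceHolder holder) {
    holder.setFormat(PixelFormat.RGBA_8888);    // pixel format of the buffers
    Surface surface = holder.getSurface();      // hand this to MediaCodec, etc.
}
</pre>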

<h2 id="eglsurface">EGLSurface and OpenGL ES</h2>

<p>OpenGL ES defines an API for rendering graphics.  It does not define a windowing
system.  To allow GLES to work on a variety of platforms, it is designed to be
combined with a library that knows how to create and access windows through the
operating system.  The library used for Android is called EGL.  If you want to
draw textured polygons, you use GLES calls; if you want to put your rendering on
the screen, you use EGL calls.</p>

<p>Before you can do anything with GLES, you need to create a GL context.  In EGL,
this means creating an EGLContext and an EGLSurface.  GLES operations apply to
the current context, which is accessed through thread-local storage rather than
passed around as an argument.  This means you have to be careful about which
thread your rendering code executes on, and which context is current on that
thread.</p>

<p>The EGLSurface can be an off-screen buffer allocated by EGL (called a "pbuffer")
or a window allocated by the operating system.  EGL window surfaces are created
with the <code>eglCreateWindowSurface()</code> call.  It takes a "window object" as an
argument, which on Android can be a SurfaceView, a SurfaceTexture, a
SurfaceHolder, or a Surface -- all of which have a BufferQueue underneath.  When
you make this call, EGL creates a new EGLSurface object, and connects it to the
producer interface of the window object's BufferQueue.  From that point onward,
rendering to that EGLSurface results in a buffer being dequeued, rendered into,
and queued for use by the consumer.  (The term "window" is indicative of the
expected use, but bear in mind the output might not be destined to appear
on the display.)</p>

<p>EGL does not provide lock/unlock calls.  Instead, you issue drawing commands and
then call <code>eglSwapBuffers()</code> to submit the current frame.  The
method name comes from the traditional swap of front and back buffers, but the actual
implementation may be very different.</p>
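
<p>From the Java language these calls are exposed by the
<code>android.opengl.EGL14</code> class (API 17+).  A compressed sketch of the
usual setup, with all error checking omitted; the Surface passed in is assumed
to be one of the window objects listed above:</p>

<pre>
import android.opengl.EGL14;
import android.opengl.EGLConfig;
import android.opengl.EGLContext;
import android.opengl.EGLDisplay;
import android.opengl.EGLSurface;
import android.view.Surface;

// Sketch: create a context and a window surface, draw, and submit a frame.
void setupAndDraw(Surface surface) {
    EGLDisplay display = EGL14.eglGetDisplay(EGL14.EGL_DEFAULT_DISPLAY);
    int[] version = new int[2];
    EGL14.eglInitialize(display, version, 0, version, 1);

    int[] configAttribs = {
            EGL14.EGL_RENDERABLE_TYPE, EGL14.EGL_OPENGL_ES2_BIT,
            EGL14.EGL_NONE };
    EGLConfig[] configs = new EGLConfig[1];
    int[] numConfigs = new int[1];
    EGL14.eglChooseConfig(display, configAttribs, 0, configs, 0, 1, numConfigs, 0);

    int[] contextAttribs = { EGL14.EGL_CONTEXT_CLIENT_VERSION, 2, EGL14.EGL_NONE };
    EGLContext context = EGL14.eglCreateContext(display, configs[0],
            EGL14.EGL_NO_CONTEXT, contextAttribs, 0);

    int[] surfaceAttribs = { EGL14.EGL_NONE };
    EGLSurface eglSurface = EGL14.eglCreateWindowSurface(display, configs[0],
            surface, surfaceAttribs, 0);   // connects to the window's BufferQueue
    EGL14.eglMakeCurrent(display, eglSurface, eglSurface, context);

    // ... issue GLES drawing commands here ...

    EGL14.eglSwapBuffers(display, eglSurface);   // queues the rendered buffer
}
</pre>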

<p>Only one EGLSurface can be associated with a Surface at a time -- you can have
only one producer connected to a BufferQueue -- but if you destroy the
EGLSurface it will disconnect from the BufferQueue and allow something else to
connect.</p>

<p>A given thread can switch between multiple EGLSurfaces by changing what's
"current."  An EGLSurface must be current on only one thread at a time.</p>

<p>The most common mistake when thinking about EGLSurface is assuming that it is
just another aspect of Surface (like SurfaceHolder).  It's a related but
independent concept.  You can draw on an EGLSurface that isn't backed by a
Surface, and you can use a Surface without EGL.  EGLSurface just gives GLES a
place to draw.</p>

<h3 id="anativewindow">ANativeWindow</h3>

<p>The public Surface class is implemented in the Java programming language.  The
equivalent in C/C++ is the ANativeWindow class, semi-exposed by the <a
href="https://developer.android.com/tools/sdk/ndk/index.html">Android NDK</a>.  You
can get the ANativeWindow from a Surface with the <code>ANativeWindow_fromSurface()</code>
call.  Just like its Java-language cousin, you can lock it, render in software,
and unlock-and-post.</p>

<p>To create an EGL window surface from native code, you pass an instance of
EGLNativeWindowType to <code>eglCreateWindowSurface()</code>.  EGLNativeWindowType is just
a synonym for ANativeWindow, so you can freely cast one to the other.</p>

<p>The fact that the basic "native window" type just wraps the producer side of a
BufferQueue should not come as a surprise.</p>

<h2 id="surfaceview">SurfaceView and GLSurfaceView</h2>

<p>Now that we've explored the lower-level components, it's time to see how they
fit into the higher-level components that apps are built from.</p>

<p>The Android app framework UI is based on a hierarchy of objects that start with
View.  Most of the details don't matter for this discussion, but it's helpful to
understand that UI elements go through a complicated measurement and layout
process that fits them into a rectangular area.  All visible View objects are
rendered to a SurfaceFlinger-created Surface that was set up by the
WindowManager when the app was brought to the foreground.  The layout and
rendering is performed on the app's UI thread.</p>

<p>Regardless of how many Layouts and Views you have, everything gets rendered into
a single buffer.  This is true whether or not the Views are hardware-accelerated.</p>

<p>A SurfaceView takes the same sorts of parameters as other views, so you can give
it a position and size, and fit other elements around it.  When it comes time to
render, however, the contents are completely transparent.  The View part of a
SurfaceView is just a see-through placeholder.</p>

<p>When the SurfaceView's View component is about to become visible, the framework
asks the WindowManager to ask SurfaceFlinger to create a new Surface.  (This
doesn't happen synchronously, which is why you should provide a callback that
notifies you when the Surface creation finishes.)  By default, the new Surface
is placed behind the app UI Surface, but the default "Z-ordering" can be
overridden to put the Surface on top.</p>
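
<p>The callback in question is <code>SurfaceHolder.Callback</code>.  A minimal
sketch; <code>R.id.surface_view</code> is a hypothetical layout ID, and the code
is assumed to run in an Activity's <code>onCreate()</code>:</p>

<pre>
import android.view.SurfaceHolder;
import android.view.SurfaceView;

// R.id.surface_view is a placeholder layout ID.
SurfaceView surfaceView = (SurfaceView) findViewById(R.id.surface_view);
surfaceView.getHolder().addCallback(new SurfaceHolder.Callback() {
    @Override
    public void surfaceCreated(SurfaceHolder holder) {
        // The Surface now exists; safe to start a renderer or decoder.
    }
    @Override
    public void surfaceChanged(SurfaceHolder holder, int format, int w, int h) {
        // Size or format changed; adjust if necessary.
    }
    @Override
    public void surfaceDestroyed(SurfaceHolder holder) {
        // Stop rendering; the Surface is going away.
    }
});
</pre>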

<p>Whatever you render onto this Surface will be composited by SurfaceFlinger, not
by the app.  This is the real power of SurfaceView: the Surface you get can be
rendered by a separate thread or a separate process, isolated from any rendering
performed by the app UI, and the buffers go directly to SurfaceFlinger.  You
can't totally ignore the UI thread -- you still have to coordinate with the
Activity lifecycle, and you may need to adjust something if the size or position
of the View changes -- but you have a whole Surface all to yourself, and
blending with the app UI and other layers is handled by the Hardware Composer.</p>

<p>It's worth taking a moment to note that this new Surface is the producer side of
a BufferQueue whose consumer is a SurfaceFlinger layer.  You can update the
Surface with any mechanism that can feed a BufferQueue: you can use the
Surface-supplied Canvas functions, attach an EGLSurface and draw on it
with GLES, or configure a MediaCodec video decoder to write to it.</p>

<h3 id="composition">Composition and the Hardware Scaler</h3>

<p>Now that we have a bit more context, it's useful to go back and look at a couple
of fields from <code>dumpsys SurfaceFlinger</code> that we skipped over earlier
on.  Back in the <a href="#hwcomposer">Hardware Composer</a> discussion, we
looked at some output like this:</p>

<pre>
    type    |          source crop              |           frame           name
------------+-----------------------------------+--------------------------------
        HWC | [    0.0,    0.0,  320.0,  240.0] | [   48,  411, 1032, 1149] SurfaceView
        HWC | [    0.0,   75.0, 1080.0, 1776.0] | [    0,   75, 1080, 1776] com.android.grafika/com.android.grafika.PlayMovieSurfaceActivity
        HWC | [    0.0,    0.0, 1080.0,   75.0] | [    0,    0, 1080,   75] StatusBar
        HWC | [    0.0,    0.0, 1080.0,  144.0] | [    0, 1776, 1080, 1920] NavigationBar
  FB TARGET | [    0.0,    0.0, 1080.0, 1920.0] | [    0,    0, 1080, 1920] HWC_FRAMEBUFFER_TARGET
</pre>

<p>This was taken while playing a movie in Grafika's "Play video (SurfaceView)"
activity, on a Nexus 5 in portrait orientation.  Note that the list is ordered
from back to front: the SurfaceView's Surface is in the back, the app UI layer
sits on top of that, followed by the status and navigation bars that are above
everything else.  The video is QVGA (320x240).</p>

<p>The "source crop" indicates the portion of the Surface's buffer that
SurfaceFlinger is going to display.  The app UI was given a Surface equal to the
full size of the display (1080x1920), but there's no point rendering and
compositing pixels that will be obscured by the status and navigation bars, so
the source is cropped to a rectangle that starts 75 pixels from the top, and
ends 144 pixels from the bottom.  The status and navigation bars have smaller
Surfaces, and the source crop describes a rectangle that begins at the top
left (0,0) and spans their content.</p>

<p>The "frame" is the rectangle where the pixels end up on the display.  For the
app UI layer, the frame matches the source crop, because we're copying (or
overlaying) a portion of a display-sized layer to the same location in another
display-sized layer.  For the status and navigation bars, the size of the frame
rectangle is the same, but the position is adjusted so that the navigation bar
appears at the bottom of the screen.</p>

<p>Now consider the layer labeled "SurfaceView", which holds our video content.
The source crop matches the video size, which SurfaceFlinger knows because the
MediaCodec decoder (the buffer producer) is dequeuing buffers that size.  The
frame rectangle has a completely different size -- 984x738.</p>

<p>SurfaceFlinger handles size differences by scaling the buffer contents to fill
the frame rectangle, upscaling or downscaling as needed.  This particular size
was chosen because it has the same aspect ratio as the video (4:3), and is as
wide as possible given the constraints of the View layout (which includes some
padding at the edges of the screen for aesthetic reasons).</p>

<p>If you started playing a different video on the same Surface, the underlying
BufferQueue would reallocate buffers to the new size automatically, and
SurfaceFlinger would adjust the source crop.  If the aspect ratio of the new
video is different, the app would need to force a re-layout of the View to match
it, which causes the WindowManager to tell SurfaceFlinger to update the frame
rectangle.</p>

<p>If you're rendering on the Surface through some other means, perhaps GLES, you
can set the Surface size using the <code>SurfaceHolder#setFixedSize()</code>
call.  You could, for example, configure a game to always render at 1280x720,
which would significantly reduce the number of pixels that must be touched to
fill the screen on a 2560x1440 tablet or 4K television.  The display processor
handles the scaling.  If you don't want to letter- or pillar-box your game, you
could adjust the game's aspect ratio by setting the size so that the narrow
dimension is 720 pixels, but the long dimension is set to maintain the aspect
ratio of the physical display (e.g. 1152x720 to match a 2560x1600 display).
You can see an example of this approach in Grafika's "Hardware scaler
exerciser" activity.</p>
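
<p>The call itself is a one-liner.  A sketch, typically made from the
<code>surfaceCreated()</code> or <code>surfaceChanged()</code> callback:</p>

<pre>
import android.view.SurfaceHolder;

// Sketch: always render at 1280x720 and let the display processor scale it.
void setRenderSize(SurfaceHolder holder) {
    holder.setFixedSize(1280, 720);
}
</pre>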

<h3 id="glsurfaceview">GLSurfaceView</h3>

<p>The GLSurfaceView class provides some helper classes that help manage EGL
contexts, inter-thread communication, and interaction with the Activity
lifecycle.  That's it.  You do not need to use a GLSurfaceView to use GLES.</p>

<p>For example, GLSurfaceView creates a thread for rendering and configures an EGL
context there.  The state is cleaned up automatically when the activity pauses.
Most apps won't need to know anything about EGL to use GLES with GLSurfaceView.</p>

<p>In most cases, GLSurfaceView is very helpful and can make working with GLES
easier.  In some situations, it can get in the way.  Use it if it helps, don't
if it doesn't.</p>
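
<p>The basic pattern is to hand GLSurfaceView a Renderer and let it own the
render thread and the EGL state.  A minimal sketch, assumed to run in an
Activity's <code>onCreate()</code>:</p>

<pre>
import android.opengl.GLES20;
import android.opengl.GLSurfaceView;
import javax.microedition.khronos.egl.EGLConfig;
import javax.microedition.khronos.opengles.GL10;

// "this" is assumed to be the Activity.
GLSurfaceView glView = new GLSurfaceView(this);
glView.setEGLContextClientVersion(2);
glView.setRenderer(new GLSurfaceView.Renderer() {
    @Override
    public void onSurfaceCreated(GL10 gl, EGLConfig config) { }
    @Override
    public void onSurfaceChanged(GL10 gl, int width, int height) {
        GLES20.glViewport(0, 0, width, height);
    }
    @Override
    public void onDrawFrame(GL10 gl) {       // runs on GLSurfaceView's thread
        GLES20.glClearColor(0.0f, 0.0f, 0.5f, 1.0f);
        GLES20.glClear(GLES20.GL_COLOR_BUFFER_BIT);
    }
});
setContentView(glView);
</pre>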

<h2 id="surfacetexture">SurfaceTexture</h2>

<p>The SurfaceTexture class is a relative newcomer, added in Android 3.0
("Honeycomb").  Just as SurfaceView is the combination of a Surface and a View,
SurfaceTexture is the combination of a Surface and a GLES texture.  Sort of.</p>

<p>When you create a SurfaceTexture, you are creating a BufferQueue for which your
app is the consumer.  When a new buffer is queued by the producer, your app is
notified via callback (<code>onFrameAvailable()</code>).  Your app calls
<code>updateTexImage()</code>, which releases the previously-held buffer,
acquires the new buffer from the queue, and makes some EGL calls to make the
buffer available to GLES as an "external" texture.</p>
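
<p>A sketch of the consumer-side setup; it assumes an EGL context is already
current on the calling thread, and <code>updateTexImage()</code> must later be
called on the thread that owns that context:</p>

<pre>
import android.graphics.SurfaceTexture;
import android.opengl.GLES11Ext;
import android.opengl.GLES20;

// Sketch: create an "external" texture and a SurfaceTexture that feeds it.
SurfaceTexture createConsumer() {
    int[] textures = new int[1];
    GLES20.glGenTextures(1, textures, 0);
    GLES20.glBindTexture(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, textures[0]);

    SurfaceTexture surfaceTexture = new SurfaceTexture(textures[0]);
    surfaceTexture.setOnFrameAvailableListener(new SurfaceTexture.OnFrameAvailableListener() {
        @Override
        public void onFrameAvailable(SurfaceTexture st) {
            // The producer queued a buffer; arrange for the EGL thread to call
            // st.updateTexImage(), which releases the old buffer and acquires
            // the new one.
        }
    });
    return surfaceTexture;
}
</pre>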

<p>External textures (<code>GL_TEXTURE_EXTERNAL_OES</code>) are not quite the
same as textures created by GLES (<code>GL_TEXTURE_2D</code>).  You have to
configure your renderer a bit differently, and there are things you can't do
with them. But the key point is this: You can render textured polygons directly
from the data received by your BufferQueue.</p>

<p>You may be wondering how we can guarantee the format of the data in the
buffer is something GLES can recognize -- gralloc supports a wide variety
of formats.  When SurfaceTexture created the BufferQueue, it set the consumer's
usage flags to <code>GRALLOC_USAGE_HW_TEXTURE</code>, ensuring that any buffer
created by gralloc would be usable by GLES.</p>

<p>Because SurfaceTexture interacts with an EGL context, you have to be careful to
call its methods from the correct thread.  This is spelled out in the class
documentation.</p>

<p>If you look deeper into the class documentation, you will see a couple of odd
calls.  One retrieves a timestamp, the other a transformation matrix, the value
of each having been set by the previous call to <code>updateTexImage()</code>.
It turns out that BufferQueue passes more than just a buffer handle to the consumer.
Each buffer is accompanied by a timestamp and transformation parameters.</p>

<p>The transformation is provided for efficiency.  In some cases, the source data
might be in the "wrong" orientation for the consumer; but instead of rotating
the data before sending it, we can send the data in its current orientation with
a transform that corrects it.  The transformation matrix can be merged with
other transformations at the point the data is used, minimizing overhead.</p>

<p>The timestamp is useful for certain buffer sources.  For example, suppose you
connect the producer interface to the output of the camera (with
<code>setPreviewTexture()</code>).  If you want to create a video, you need to
set the presentation time stamp for each frame; but you want to base that on the time
when the frame was captured, not the time when the buffer was received by your
app.  The timestamp provided with the buffer is set by the camera code,
resulting in a more consistent series of timestamps.</p>
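
<p>A sketch of that camera connection, using the <code>android.hardware.Camera</code>
API of this era; <code>surfaceTexture</code> is assumed to be the instance
created earlier:</p>

<pre>
import android.graphics.SurfaceTexture;
import android.hardware.Camera;
import java.io.IOException;

// Sketch: make the camera the producer for our SurfaceTexture.
Camera startPreview(SurfaceTexture surfaceTexture) throws IOException {
    Camera camera = Camera.open();
    camera.setPreviewTexture(surfaceTexture);   // camera is now the producer
    camera.startPreview();
    return camera;
}
// Each frame latched with updateTexImage() carries the capture timestamp
// (SurfaceTexture#getTimestamp()) and a transform (getTransformMatrix()).
</pre>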

<h3 id="surfacet">SurfaceTexture and Surface</h3>

<p>If you look closely at the API you'll see the only way for an application
to create a plain Surface is through a constructor that takes a SurfaceTexture
as the sole argument.  (Prior to API 11, there was no public constructor for
Surface at all.)  This might seem a bit backward if you view SurfaceTexture as a
combination of a Surface and a texture.</p>

<p>Under the hood, SurfaceTexture is called GLConsumer, which more accurately
reflects its role as the owner and consumer of a BufferQueue.  When you create a
Surface from a SurfaceTexture, what you're doing is creating an object that
represents the producer side of the SurfaceTexture's BufferQueue.</p>
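
<p>In code it is a single line; the resulting Surface can be handed to anything
that wants to produce frames, such as a camera or a MediaCodec decoder:</p>

<pre>
import android.graphics.SurfaceTexture;
import android.view.Surface;

// "surfaceTexture" owns the BufferQueue; "producer" is its producer interface.
Surface producer = new Surface(surfaceTexture);
</pre>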
 | 789 |  | 
 | 790 | <h3 id="continuous-capture">Case Study: Grafika's "Continuous Capture" Activity</h3> | 
 | 791 |  | 
 | 792 | <p>The camera can provide a stream of frames suitable for recording as a movie.  If | 
 | 793 | you want to display it on screen, you create a SurfaceView, pass the Surface to | 
 | 794 | <code>setPreviewDisplay()</code>, and let the producer (camera) and consumer | 
 | 795 | (SurfaceFlinger) do all the work.  If you want to record the video, you create a | 
 | 796 | Surface with MediaCodec's <code>createInputSurface()</code>, pass that to the | 
 | 797 | camera, and again you sit back and relax.  If you want to show the video and | 
 | 798 | record it at the same time, you have to get more involved.</p> | 
 | 799 |  | 
 | 800 | <p>The "Continuous capture" activity displays video from the camera as it's being | 
 | 801 | recorded.  In this case, encoded video is written to a circular buffer in memory | 
 | 802 | that can be saved to disk at any time.  It's straightforward to implement so | 
 | 803 | long as you keep track of where everything is.</p> | 
 | 804 |  | 
 | 805 | <p>There are three BufferQueues involved.  The app uses a SurfaceTexture to receive | 
 | 806 | frames from Camera, converting them to an external GLES texture.  The app | 
 | 807 | declares a SurfaceView, which we use to display the frames, and we configure a | 
 | 808 | MediaCodec encoder with an input Surface to create the video.  So one | 
 | 809 | BufferQueue is created by the app, one by SurfaceFlinger, and one by | 
 | 810 | mediaserver.</p> | 
 | 811 |  | 
 | 812 | <img src="images/continuous_capture_activity.png" alt="Grafika continuous | 
 | 813 | capture activity" /> | 
 | 814 |  | 
 | 815 | <p class="img-caption"> | 
 | 816 |   <strong>Figure 2.</strong>Grafika's continuous capture activity | 
 | 817 | </p> | 
 | 818 |  | 
 | 819 | <p>In the diagram above, the arrows show the propagation of the data from the | 
 | 820 | camera.  BufferQueues are in color (purple producer, cyan consumer).  Note | 
 | 821 | “Camera” actually lives in the mediaserver process.</p> | 
 | 822 |  | 
 | 823 | <p>Encoded H.264 video goes to a circular buffer in RAM in the app process, and is | 
 | 824 | written to an MP4 file on disk using the MediaMuxer class when the “capture” | 
 | 825 | button is hit.</p> | 
 | 826 |  | 
 | 827 | <p>All three of the BufferQueues are handled with a single EGL context in the | 
 | 828 | app, and the GLES operations are performed on the UI thread.  Doing the | 
 | 829 | SurfaceView rendering on the UI thread is generally discouraged, but since we're | 
 | 830 | doing simple operations that are handled asynchronously by the GLES driver we | 
 | 831 | should be fine.  (If the video encoder locks up and we block trying to dequeue a | 
 | 832 | buffer, the app will become unresponsive. But at that point, we're probably | 
 | 833 | failing anyway.)  The handling of the encoded data -- managing the circular | 
 | 834 | buffer and writing it to disk -- is performed on a separate thread.</p> | 
 | 835 |  | 
 | 836 | <p>The bulk of the configuration happens in the SurfaceView's <code>surfaceCreated()</code> | 
 | 837 | callback.  The EGLContext is created, and EGLSurfaces are created for the | 
 | 838 | display and for the video encoder.  When a new frame arrives, we tell | 
 | 839 | SurfaceTexture to acquire it and make it available as a GLES texture, then | 
 | 840 | render it with GLES commands on each EGLSurface (forwarding the transform and | 
 | 841 | timestamp from SurfaceTexture).  The encoder thread pulls the encoded output | 
 | 842 | from MediaCodec and stashes it in memory.</p> | 
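
<p>The per-frame handling looks roughly like the sketch below.  This is not
Grafika's exact code: <code>mDisplaySurface</code> and <code>mEncoderSurface</code>
stand in for wrappers around the two EGLSurfaces (the presentation-time call would
ultimately go through <code>eglPresentationTimeANDROID()</code>), and
<code>drawTexturedQuad()</code> is a hypothetical helper that renders the external
texture.</p>

<pre>
mSurfaceTexture.updateTexImage();                  // latch the latest camera frame
mSurfaceTexture.getTransformMatrix(mTmpMatrix);    // per-frame texture transform

// Render to the display EGLSurface (the SurfaceView's Surface).
mDisplaySurface.makeCurrent();
drawTexturedQuad(mTextureId, mTmpMatrix);
mDisplaySurface.swapBuffers();

// Render the same frame to the video encoder's input EGLSurface.
mEncoderSurface.makeCurrent();
drawTexturedQuad(mTextureId, mTmpMatrix);
mEncoderSurface.setPresentationTime(mSurfaceTexture.getTimestamp());
mEncoderSurface.swapBuffers();
</pre>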
 | 843 |  | 
 | 844 | <h2 id="texture">TextureView</h2> | 
 | 845 |  | 
 | 846 | <p>The TextureView class was | 
 | 847 | <a href="http://android-developers.blogspot.com/2011/11/android-40-graphics-and-animations.html">introduced</a> | 
 | 848 | in Android 4.0 ("Ice Cream Sandwich").  It's the most complex of the View | 
 | 849 | objects discussed here, combining a View with a SurfaceTexture.</p> | 
 | 850 |  | 
 | 851 | <p>Recall that the SurfaceTexture is a "GL consumer", consuming buffers of graphics | 
 | 852 | data and making them available as textures.  TextureView wraps a SurfaceTexture, | 
 | 853 | taking over the responsibility of responding to the callbacks and acquiring new | 
 | 854 | buffers.  The arrival of new buffers causes TextureView to issue a View | 
 | 855 | invalidate request.  When asked to draw, the TextureView uses the contents of | 
 | 856 | the most recently received buffer as its data source, rendering wherever and | 
 | 857 | however the View state indicates it should.</p> | 
 | 858 |  | 
<p>You can render on a TextureView with GLES just as you would on a SurfaceView.  Just
 | 860 | pass the SurfaceTexture to the EGL window creation call.  However, doing so | 
 | 861 | exposes a potential problem.</p> | 
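
<p>For example (a sketch; it assumes you've already initialized
<code>eglDisplay</code> and chosen an <code>eglConfig</code>):</p>

<pre>
// EGL14.eglCreateWindowSurface() accepts the SurfaceTexture directly
// as the native window argument.
SurfaceTexture st = textureView.getSurfaceTexture();
int[] surfaceAttribs = { EGL14.EGL_NONE };
EGLSurface eglSurface = EGL14.eglCreateWindowSurface(
        eglDisplay, eglConfig, st, surfaceAttribs, 0);
</pre>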
 | 862 |  | 
 | 863 | <p>In most of what we've looked at, the BufferQueues have passed buffers between | 
 | 864 | different processes.  When rendering to a TextureView with GLES, both producer | 
 | 865 | and consumer are in the same process, and they might even be handled on a single | 
 | 866 | thread.  Suppose we submit several buffers in quick succession from the UI | 
 | 867 | thread.  The EGL buffer swap call will need to dequeue a buffer from the | 
 | 868 | BufferQueue, and it will stall until one is available.  There won't be any | 
 | 869 | available until the consumer acquires one for rendering, but that also happens | 
 | 870 | on the UI thread… so we're stuck.</p> | 
 | 871 |  | 
 | 872 | <p>The solution is to have BufferQueue ensure there is always a buffer | 
 | 873 | available to be dequeued, so the buffer swap never stalls.  One way to guarantee | 
 | 874 | this is to have BufferQueue discard the contents of the previously-queued buffer | 
 | 875 | when a new buffer is queued, and to place restrictions on minimum buffer counts | 
 | 876 | and maximum acquired buffer counts.  (If your queue has three buffers, and all | 
 | 877 | three buffers are acquired by the consumer, then there's nothing to dequeue and | 
 | 878 | the buffer swap call must hang or fail.  So we need to prevent the consumer from | 
 | 879 | acquiring more than two buffers at once.)  Dropping buffers is usually | 
 | 880 | undesirable, so it's only enabled in specific situations, such as when the | 
 | 881 | producer and consumer are in the same process.</p> | 
 | 882 |  | 
 | 883 | <h3 id="surface-or-texture">SurfaceView or TextureView?</h3> | 
<p>SurfaceView and TextureView fill similar roles, but have very different
 | 885 | implementations.  To decide which is best requires an understanding of the | 
 | 886 | trade-offs.</p> | 
 | 887 |  | 
 | 888 | <p>Because TextureView is a proper citizen of the View hierarchy, it behaves like | 
 | 889 | any other View, and can overlap or be overlapped by other elements.  You can | 
 | 890 | perform arbitrary transformations and retrieve the contents as a bitmap with | 
 | 891 | simple API calls.</p> | 
 | 892 |  | 
 | 893 | <p>The main strike against TextureView is the performance of the composition step. | 
 | 894 | With SurfaceView, the content is written to a separate layer that SurfaceFlinger | 
 | 895 | composites, ideally with an overlay.  With TextureView, the View composition is | 
 | 896 | always performed with GLES, and updates to its contents may cause other View | 
 | 897 | elements to redraw as well (e.g. if they're positioned on top of the | 
 | 898 | TextureView).  After the View rendering completes, the app UI layer must then be | 
 | 899 | composited with other layers by SurfaceFlinger, so you're effectively | 
 | 900 | compositing every visible pixel twice.  For a full-screen video player, or any | 
 | 901 | other application that is effectively just UI elements layered on top of video, | 
 | 902 | SurfaceView offers much better performance.</p> | 
 | 903 |  | 
 | 904 | <p>As noted earlier, DRM-protected video can be presented only on an overlay plane. | 
 | 905 |  Video players that support protected content must be implemented with | 
 | 906 | SurfaceView.</p> | 
 | 907 |  | 
 | 908 | <h3 id="grafika">Case Study: Grafika's Play Video (TextureView)</h3> | 
 | 909 |  | 
 | 910 | <p>Grafika includes a pair of video players, one implemented with TextureView, the | 
 | 911 | other with SurfaceView.  The video decoding portion, which just sends frames | 
 | 912 | from MediaCodec to a Surface, is the same for both.  The most interesting | 
 | 913 | differences between the implementations are the steps required to present the | 
 | 914 | correct aspect ratio.</p> | 
 | 915 |  | 
<p>While resizing the SurfaceView requires a custom FrameLayout implementation,
resizing the TextureView's content is a simple matter of configuring a
transformation matrix with <code>TextureView#setTransform()</code>.  For the
former, you're sending new
 | 919 | window position and size values to SurfaceFlinger through WindowManager; for | 
 | 920 | the latter, you're just rendering it differently.</p> | 
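
<p>On the TextureView side, the letterboxing looks roughly like this (a sketch in
the spirit of Grafika's aspect-ratio adjustment, not its exact code):</p>

<pre>
private void adjustAspectRatio(int videoWidth, int videoHeight) {
    int viewWidth = mTextureView.getWidth();
    int viewHeight = mTextureView.getHeight();
    double aspect = (double) videoHeight / videoWidth;

    int newWidth, newHeight;
    if (viewHeight > (int) (viewWidth * aspect)) {
        newWidth = viewWidth;                      // limited by width; shrink height
        newHeight = (int) (viewWidth * aspect);
    } else {
        newWidth = (int) (viewHeight / aspect);    // limited by height; shrink width
        newHeight = viewHeight;
    }
    Matrix txform = new Matrix();
    txform.setScale((float) newWidth / viewWidth, (float) newHeight / viewHeight);
    txform.postTranslate((viewWidth - newWidth) / 2f, (viewHeight - newHeight) / 2f);
    mTextureView.setTransform(txform);
}
</pre>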
 | 921 |  | 
 | 922 | <p>Otherwise, both implementations follow the same pattern.  Once the Surface has | 
 | 923 | been created, playback is enabled.  When "play" is hit, a video decoding thread | 
 | 924 | is started, with the Surface as the output target.  After that, the app code | 
 | 925 | doesn't have to do anything -- composition and display will either be handled by | 
 | 926 | SurfaceFlinger (for the SurfaceView) or by TextureView.</p> | 
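
<p>The decoder setup is the same either way.  A minimal sketch, where
<code>mimeType</code>, <code>format</code>, and <code>outputSurface</code> come
from the movie file and whichever View you're using:</p>

<pre>
MediaCodec decoder = MediaCodec.createDecoderByType(mimeType);
decoder.configure(format, outputSurface, null, 0);
decoder.start();
// ...feed encoded input buffers; for each decoded frame, the "render"
// flag sends the buffer to the Surface:
decoder.releaseOutputBuffer(outputBufferIndex, true);
</pre>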
 | 927 |  | 
 | 928 | <h3 id="decode">Case Study: Grafika's Double Decode</h3> | 
 | 929 |  | 
 | 930 | <p>This activity demonstrates manipulation of the SurfaceTexture inside a | 
 | 931 | TextureView.</p> | 
 | 932 |  | 
 | 933 | <p>The basic structure of this activity is a pair of TextureViews that show two | 
 | 934 | different videos playing side-by-side.  To simulate the needs of a | 
 | 935 | videoconferencing app, we want to keep the MediaCodec decoders alive when the | 
 | 936 | activity is paused and resumed for an orientation change.  The trick is that you | 
 | 937 | can't change the Surface that a MediaCodec decoder uses without fully | 
 | 938 | reconfiguring it, which is a fairly expensive operation; so we want to keep the | 
 | 939 | Surface alive.  The Surface is just a handle to the producer interface in the | 
 | 940 | SurfaceTexture's BufferQueue, and the SurfaceTexture is managed by the | 
TextureView, so we also need to keep the SurfaceTexture alive.  How do we deal
 | 942 | with the TextureView getting torn down?</p> | 
 | 943 |  | 
 | 944 | <p>It just so happens TextureView provides a <code>setSurfaceTexture()</code> call | 
 | 945 | that does exactly what we want.  We obtain references to the SurfaceTextures | 
 | 946 | from the TextureViews and save them in a static field.  When the activity is | 
 | 947 | shut down, we return "false" from the <code>onSurfaceTextureDestroyed()</code> | 
 | 948 | callback to prevent destruction of the SurfaceTexture.  When the activity is | 
 | 949 | restarted, we stuff the old SurfaceTexture into the new TextureView.  The | 
 | 950 | TextureView class takes care of creating and destroying the EGL contexts.</p> | 
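
<p>The essential pieces look roughly like this (field and helper names are
hypothetical; Grafika's actual code has more bookkeeping):</p>

<pre>
private static SurfaceTexture sSavedSurfaceTexture;

@Override
public void onSurfaceTextureAvailable(SurfaceTexture st, int width, int height) {
    if (sSavedSurfaceTexture == null) {
        sSavedSurfaceTexture = st;          // first time through: remember it
        startDecoder(new Surface(st));      // hypothetical
    } else {
        mTextureView.setSurfaceTexture(sSavedSurfaceTexture);   // re-attach the old one
    }
}

@Override
public boolean onSurfaceTextureDestroyed(SurfaceTexture st) {
    return false;    // we keep ownership; TextureView won't release the SurfaceTexture
}
</pre>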
 | 951 |  | 
 | 952 | <p>Each video decoder is driven from a separate thread.  At first glance it might | 
 | 953 | seem like we need EGL contexts local to each thread; but remember the buffers | 
 | 954 | with decoded output are actually being sent from mediaserver to our | 
 | 955 | BufferQueue consumers (the SurfaceTextures).  The TextureViews take care of the | 
 | 956 | rendering for us, and they execute on the UI thread.</p> | 
 | 957 |  | 
 | 958 | <p>Implementing this activity with SurfaceView would be a bit harder.  We can't | 
 | 959 | just create a pair of SurfaceViews and direct the output to them, because the | 
 | 960 | Surfaces would be destroyed during an orientation change.  Besides, that would | 
 | 961 | add two layers, and limitations on the number of available overlays strongly | 
 | 962 | motivate us to keep the number of layers to a minimum.  Instead, we'd want to | 
 | 963 | create a pair of SurfaceTextures to receive the output from the video decoders, | 
 | 964 | and then perform the rendering in the app, using GLES to render two textured | 
 | 965 | quads onto the SurfaceView's Surface.</p> | 
 | 966 |  | 
 | 967 | <h2 id="notes">Conclusion</h2> | 
 | 968 |  | 
 | 969 | <p>We hope this page has provided useful insights into the way Android handles | 
 | 970 | graphics at the system level.</p> | 
 | 971 |  | 
 | 972 | <p>Some information and advice on related topics can be found in the appendices | 
 | 973 | that follow.</p> | 
 | 974 |  | 
 | 975 | <h2 id="loops">Appendix A: Game Loops</h2> | 
 | 976 |  | 
 | 977 | <p>A very popular way to implement a game loop looks like this:</p> | 
 | 978 |  | 
 | 979 | <pre> | 
 | 980 | while (playing) { | 
 | 981 |     advance state by one frame | 
 | 982 |     render the new frame | 
 | 983 |     sleep until it’s time to do the next frame | 
 | 984 | } | 
 | 985 | </pre> | 
 | 986 |  | 
 | 987 | <p>There are a few problems with this, the most fundamental being the idea that the | 
 | 988 | game can define what a "frame" is.  Different displays will refresh at different | 
 | 989 | rates, and that rate may vary over time.  If you generate frames faster than the | 
 | 990 | display can show them, you will have to drop one occasionally.  If you generate | 
 | 991 | them too slowly, SurfaceFlinger will periodically fail to find a new buffer to | 
 | 992 | acquire and will re-show the previous frame.  Both of these situations can | 
 | 993 | cause visible glitches.</p> | 
 | 994 |  | 
 | 995 | <p>What you need to do is match the display's frame rate, and advance game state | 
 | 996 | according to how much time has elapsed since the previous frame.  There are two | 
 | 997 | ways to go about this: (1) stuff the BufferQueue full and rely on the "swap | 
 | 998 | buffers" back-pressure; (2) use Choreographer (API 16+).</p> | 
 | 999 |  | 
 | 1000 | <h3 id="stuffing">Queue Stuffing</h3> | 
 | 1001 |  | 
 | 1002 | <p>This is very easy to implement: just swap buffers as fast as you can.  In early | 
 | 1003 | versions of Android this could actually result in a penalty where | 
 | 1004 | <code>SurfaceView#lockCanvas()</code> would put you to sleep for 100ms.  Now | 
 | 1005 | it's paced by the BufferQueue, and the BufferQueue is emptied as quickly as | 
 | 1006 | SurfaceFlinger is able.</p> | 
 | 1007 |  | 
 | 1008 | <p>One example of this approach can be seen in <a | 
 | 1009 | href="https://code.google.com/p/android-breakout/">Android Breakout</a>.  It | 
 | 1010 | uses GLSurfaceView, which runs in a loop that calls the application's | 
 | 1011 | onDrawFrame() callback and then swaps the buffer.  If the BufferQueue is full, | 
 | 1012 | the <code>eglSwapBuffers()</code> call will wait until a buffer is available. | 
 | 1013 | Buffers become available when SurfaceFlinger releases them, which it does after | 
 | 1014 | acquiring a new one for display.  Because this happens on VSYNC, your draw loop | 
 | 1015 | timing will match the refresh rate.  Mostly.</p> | 
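
<p>In a GLSurfaceView.Renderer the pattern looks roughly like the following
(<code>mPrevFrameNsec</code> and the update/draw helpers are hypothetical):</p>

<pre>
@Override
public void onDrawFrame(GL10 unused) {
    long nowNsec = System.nanoTime();
    if (mPrevFrameNsec != 0) {
        // eglSwapBuffers() paces us, so this is roughly one refresh period.
        updateGameState((nowNsec - mPrevFrameNsec) / 1000000000.0);
    }
    mPrevFrameNsec = nowNsec;
    drawScene();
}
</pre>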
 | 1016 |  | 
 | 1017 | <p>There are a couple of problems with this approach.  First, the app is tied to | 
 | 1018 | SurfaceFlinger activity, which is going to take different amounts of time | 
 | 1019 | depending on how much work there is to do and whether it's fighting for CPU time | 
 | 1020 | with other processes.  Since your game state advances according to the time | 
 | 1021 | between buffer swaps, your animation won't update at a consistent rate.  When | 
 | 1022 | running at 60fps with the inconsistencies averaged out over time, though, you | 
 | 1023 | probably won't notice the bumps.</p> | 
 | 1024 |  | 
 | 1025 | <p>Second, the first couple of buffer swaps are going to happen very quickly | 
 | 1026 | because the BufferQueue isn't full yet.  The computed time between frames will | 
 | 1027 | be near zero, so the game will generate a few frames in which nothing happens. | 
 | 1028 | In a game like Breakout, which updates the screen on every refresh, the queue is | 
 | 1029 | always full except when a game is first starting (or un-paused), so the effect | 
 | 1030 | isn't noticeable.  A game that pauses animation occasionally and then returns to | 
 | 1031 | as-fast-as-possible mode might see odd hiccups.</p> | 
 | 1032 |  | 
 | 1033 | <h3 id="choreographer">Choreographer</h3> | 
 | 1034 |  | 
 | 1035 | <p>Choreographer allows you to set a callback that fires on the next VSYNC.  The | 
 | 1036 | actual VSYNC time is passed in as an argument.  So even if your app doesn't wake | 
 | 1037 | up right away, you still have an accurate picture of when the display refresh | 
 | 1038 | period began.  Using this value, rather than the current time, yields a | 
 | 1039 | consistent time source for your game state update logic.</p> | 
 | 1040 |  | 
 | 1041 | <p>Unfortunately, the fact that you get a callback after every VSYNC does not | 
 | 1042 | guarantee that your callback will be executed in a timely fashion or that you | 
 | 1043 | will be able to act upon it sufficiently swiftly.  Your app will need to detect | 
 | 1044 | situations where it's falling behind and drop frames manually.</p> | 
 | 1045 |  | 
 | 1046 | <p>The "Record GL app" activity in Grafika provides an example of this.  On some | 
 | 1047 | devices (e.g. Nexus 4 and Nexus 5), the activity will start dropping frames if | 
 | 1048 | you just sit and watch.  The GL rendering is trivial, but occasionally the View | 
 | 1049 | elements get redrawn, and the measure/layout pass can take a very long time if | 
 | 1050 | the device has dropped into a reduced-power mode.  (According to systrace, it | 
 | 1051 | takes 28ms instead of 6ms after the clocks slow on Android 4.4.  If you drag | 
 | 1052 | your finger around the screen, it thinks you're interacting with the activity, | 
 | 1053 | so the clock speeds stay high and you'll never drop a frame.)</p> | 
 | 1054 |  | 
 | 1055 | <p>The simple fix was to drop a frame in the Choreographer callback if the current | 
 | 1056 | time is more than N milliseconds after the VSYNC time.  Ideally the value of N | 
 | 1057 | is determined based on previously observed VSYNC intervals.  For example, if the | 
 | 1058 | refresh period is 16.7ms (60fps), you might drop a frame if you're running more | 
 | 1059 | than 15ms late.</p> | 
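
<p>A Choreographer.FrameCallback that does this looks roughly like the sketch
below; <code>DROP_THRESHOLD_NSEC</code> and the update/draw helpers are
placeholders, not Grafika's actual names:</p>

<pre>
@Override
public void doFrame(long frameTimeNanos) {
    Choreographer.getInstance().postFrameCallback(this);   // schedule the next one

    if (System.nanoTime() - frameTimeNanos > DROP_THRESHOLD_NSEC) {
        return;                        // we woke up too late; skip this frame
    }
    updateGameState(frameTimeNanos);   // advance based on the VSYNC timestamp
    drawFrame();
}
</pre>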
 | 1060 |  | 
 | 1061 | <p>If you watch "Record GL app" run, you will see the dropped-frame counter | 
 | 1062 | increase, and even see a flash of red in the border when frames drop.  Unless | 
 | 1063 | your eyes are very good, though, you won't see the animation stutter.  At 60fps, | 
 | 1064 | the app can drop the occasional frame without anyone noticing so long as the | 
 | 1065 | animation continues to advance at a constant rate.  How much you can get away | 
 | 1066 | with depends to some extent on what you're drawing, the characteristics of the | 
 | 1067 | display, and how good the person using the app is at detecting jank.</p> | 
 | 1068 |  | 
 | 1069 | <h3 id="thread">Thread Management</h3> | 
 | 1070 |  | 
 | 1071 | <p>Generally speaking, if you're rendering onto a SurfaceView, GLSurfaceView, or | 
 | 1072 | TextureView, you want to do that rendering in a dedicated thread.  Never do any | 
 | 1073 | "heavy lifting" or anything that takes an indeterminate amount of time on the | 
 | 1074 | UI thread.</p> | 
 | 1075 |  | 
 | 1076 | <p>Breakout and "Record GL app" use dedicated renderer threads, and they also | 
 | 1077 | update animation state on that thread.  This is a reasonable approach so long as | 
 | 1078 | game state can be updated quickly.</p> | 
 | 1079 |  | 
 | 1080 | <p>Other games separate the game logic and rendering completely.  If you had a | 
 | 1081 | simple game that did nothing but move a block every 100ms, you could have a | 
 | 1082 | dedicated thread that just did this:</p> | 
 | 1083 |  | 
<pre>
    public void run() {
        while (playing) {
            try {
                Thread.sleep(100);
            } catch (InterruptedException ie) {
                return;      // shutting down
            }
            synchronized (mLock) {
                moveBlock();
            }
        }
    }
</pre>
 | 1092 |  | 
 | 1093 | <p>(You may want to base the sleep time off of a fixed clock to prevent drift -- | 
 | 1094 | sleep() isn't perfectly consistent, and moveBlock() takes a nonzero amount of | 
 | 1095 | time -- but you get the idea.)</p> | 
 | 1096 |  | 
 | 1097 | <p>When the draw code wakes up, it just grabs the lock, gets the current position | 
 | 1098 | of the block, releases the lock, and draws.  Instead of doing fractional | 
 | 1099 | movement based on inter-frame delta times, you just have one thread that moves | 
 | 1100 | things along and another thread that draws things wherever they happen to be | 
 | 1101 | when the drawing starts.</p> | 
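
<p>The draw side of that arrangement is equally small (a sketch; names are
hypothetical):</p>

<pre>
// In the renderer thread's draw pass: copy the state under the lock,
// then render outside it.
float blockX;
synchronized (mLock) {
    blockX = mBlockX;
}
drawBlock(blockX);   // hypothetical GLES draw call
</pre>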
 | 1102 |  | 
 | 1103 | <p>For a scene with any complexity you'd want to create a list of upcoming events | 
 | 1104 | sorted by wake time, and sleep until the next event is due, but it's the same | 
 | 1105 | idea.</p> | 
 | 1106 |  | 
 | 1107 | <h2 id="activity">Appendix B: SurfaceView and the Activity Lifecycle</h2> | 
 | 1108 |  | 
 | 1109 | <p>When using a SurfaceView, it's considered good practice to render the Surface | 
 | 1110 | from a thread other than the main UI thread.  This raises some questions about | 
 | 1111 | the interaction between that thread and the Activity lifecycle.</p> | 
 | 1112 |  | 
 | 1113 | <p>First, a little background.  For an Activity with a SurfaceView, there are two | 
 | 1114 | separate but interdependent state machines:</p> | 
 | 1115 |  | 
 | 1116 | <ol> | 
<li>Activity onCreate / onResume / onPause</li>
 | 1118 | <li>Surface created / changed / destroyed</li> | 
 | 1119 | </ol> | 
 | 1120 |  | 
 | 1121 | <p>When the Activity starts, you get callbacks in this order:</p> | 
 | 1122 |  | 
 | 1123 | <ul> | 
 | 1124 | <li>onCreate</li> | 
 | 1125 | <li>onResume</li> | 
 | 1126 | <li>surfaceCreated</li> | 
 | 1127 | <li>surfaceChanged</li> | 
 | 1128 | </ul> | 
 | 1129 |  | 
 | 1130 | <p>If you hit "back" you get:</p> | 
 | 1131 |  | 
 | 1132 | <ul> | 
 | 1133 | <li>onPause</li> | 
 | 1134 | <li>surfaceDestroyed (called just before the Surface goes away)</li> | 
 | 1135 | </ul> | 
 | 1136 |  | 
 | 1137 | <p>If you rotate the screen, the Activity is torn down and recreated, so you | 
 | 1138 | get the full cycle.  If it matters, you can tell that it's a "quick" restart by | 
checking <code>isFinishing()</code>.  (It might be possible to start / stop an
Activity so quickly that <code>surfaceCreated()</code> actually happens after <code>onPause()</code>.)</p>
 | 1141 |  | 
 | 1142 | <p>If you tap the power button to blank the screen, you only get | 
 | 1143 | <code>onPause()</code> -- no <code>surfaceDestroyed()</code>.  The Surface | 
 | 1144 | remains alive, and rendering can continue.  You can even keep getting | 
 | 1145 | Choreographer events if you continue to request them.  If you have a lock | 
 | 1146 | screen that forces a different orientation, your Activity may be restarted when | 
 | 1147 | the device is unblanked; but if not, you can come out of screen-blank with the | 
 | 1148 | same Surface you had before.</p> | 
 | 1149 |  | 
 | 1150 | <p>This raises a fundamental question when using a separate renderer thread with | 
 | 1151 | SurfaceView: Should the lifespan of the thread be tied to that of the Surface or | 
 | 1152 | the Activity?  The answer depends on what you want to have happen when the | 
 | 1153 | screen goes blank. There are two basic approaches: (1) start/stop the thread on | 
 | 1154 | Activity start/stop; (2) start/stop the thread on Surface create/destroy.</p> | 
 | 1155 |  | 
 | 1156 | <p>#1 interacts well with the app lifecycle. We start the renderer thread in | 
 | 1157 | <code>onResume()</code> and stop it in <code>onPause()</code>. It gets a bit | 
 | 1158 | awkward when creating and configuring the thread because sometimes the Surface | 
 | 1159 | will already exist and sometimes it won't (e.g. it's still alive after toggling | 
 | 1160 | the screen with the power button).  We have to wait for the surface to be | 
 | 1161 | created before we do some initialization in the thread, but we can't simply do | 
 | 1162 | it in the <code>surfaceCreated()</code> callback because that won't fire again | 
 | 1163 | if the Surface didn't get recreated.  So we need to query or cache the Surface | 
 | 1164 | state, and forward it to the renderer thread. Note we have to be a little | 
 | 1165 | careful here passing objects between threads -- it is best to pass the Surface or | 
 | 1166 | SurfaceHolder through a Handler message, rather than just stuffing it into the | 
 | 1167 | thread, to avoid issues on multi-core systems (cf. the <a | 
 | 1168 | href="http://developer.android.com/training/articles/smp.html">Android SMP | 
 | 1169 | Primer</a>).</p> | 
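
<p>A sketch of that hand-off (the message constant and <code>prepareGl()</code>
are hypothetical):</p>

<pre>
// On the UI thread, in surfaceCreated():
mRenderHandler.sendMessage(
        mRenderHandler.obtainMessage(MSG_SURFACE_CREATED, holder));

// On the renderer thread, in its Handler:
@Override
public void handleMessage(Message msg) {
    switch (msg.what) {
        case MSG_SURFACE_CREATED:
            prepareGl(((SurfaceHolder) msg.obj).getSurface());
            break;
        // ...other messages...
    }
}
</pre>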
 | 1170 |  | 
 | 1171 | <p>#2 has a certain appeal because the Surface and the renderer are logically | 
 | 1172 | intertwined. We start the thread after the Surface has been created, which | 
 | 1173 | avoids some inter-thread communication concerns.  Surface created / changed | 
 | 1174 | messages are simply forwarded.  We need to make sure rendering stops when the | 
 | 1175 | screen goes blank, and resumes when it un-blanks; this could be a simple matter | 
 | 1176 | of telling Choreographer to stop invoking the frame draw callback.  Our | 
 | 1177 | <code>onResume()</code> will need to resume the callbacks if and only if the | 
 | 1178 | renderer thread is running.  It may not be so trivial though -- if we animate | 
 | 1179 | based on elapsed time between frames, we could have a very large gap when the | 
 | 1180 | next event arrives; so an explicit pause/resume message may be desirable.</p> | 
 | 1181 |  | 
 | 1182 | <p>The above is primarily concerned with how the renderer thread is configured and | 
 | 1183 | whether it's executing. A related concern is extracting state from the thread | 
 | 1184 | when the Activity is killed (in <code>onPause()</code> or <code>onSaveInstanceState()</code>). | 
 | 1185 | Approach #1 will work best for that, because once the renderer thread has been | 
 | 1186 | joined its state can be accessed without synchronization primitives.</p> | 
 | 1187 |  | 
 | 1188 | <p>You can see an example of approach #2 in Grafika's "Hardware scaler exerciser."</p> | 
 | 1189 |  | 
 | 1190 | <h2 id="tracking">Appendix C: Tracking BufferQueue with systrace</h2> | 
 | 1191 |  | 
 | 1192 | <p>If you really want to understand how graphics buffers move around, you need to | 
 | 1193 | use systrace.  The system-level graphics code is well instrumented, as is much | 
 | 1194 | of the relevant app framework code.  Enable the "gfx" and "view" tags, and | 
 | 1195 | generally "sched" as well.</p> | 
 | 1196 |  | 
 | 1197 | <p>A full description of how to use systrace effectively would fill a rather long | 
 | 1198 | document.  One noteworthy item is the presence of BufferQueues in the trace.  If | 
 | 1199 | you've used systrace before, you've probably seen them, but maybe weren't sure | 
 | 1200 | what they were.  As an example, if you grab a trace while Grafika's "Play video | 
 | 1201 | (SurfaceView)" is running, you will see a row labeled: "SurfaceView"  This row | 
 | 1202 | tells you how many buffers were queued up at any given time.</p> | 
 | 1203 |  | 
 | 1204 | <p>You'll notice the value increments while the app is active -- triggering | 
 | 1205 | the rendering of frames by the MediaCodec decoder -- and decrements while | 
 | 1206 | SurfaceFlinger is doing work, consuming buffers.  If you're showing video at | 
 | 1207 | 30fps, the queue's value will vary from 0 to 1, because the ~60fps display can | 
 | 1208 | easily keep up with the source.  (You'll also notice that SurfaceFlinger is only | 
 | 1209 | waking up when there's work to be done, not 60 times per second.  The system tries | 
 | 1210 | very hard to avoid work and will disable VSYNC entirely if nothing is updating | 
 | 1211 | the screen.)</p> | 
 | 1212 |  | 
 | 1213 | <p>If you switch to "Play video (TextureView)" and grab a new trace, you'll see a | 
 | 1214 | row with a much longer name | 
 | 1215 | ("com.android.grafika/com.android.grafika.PlayMovieActivity").  This is the | 
 | 1216 | main UI layer, which is of course just another BufferQueue.  Because TextureView | 
 | 1217 | renders into the UI layer, rather than a separate layer, you'll see all of the | 
 | 1218 | video-driven updates here.</p> | 
 | 1219 |  | 
 | 1220 | <p>For more information about systrace, see the <a | 
 | 1221 | href="http://developer.android.com/tools/help/systrace.html">Android | 
 | 1222 | documentation</a> for the tool.</p> |