page.title=Graphics architecture
@jd:body

<!--
    Copyright 2014 The Android Open Source Project

    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
-->
<div id="qv-wrapper">
  <div id="qv">
    <h2>In this document</h2>
    <ol id="auto-toc">
    </ol>
  </div>
</div>

| 28 | <p><em>What every developer should know about Surface, SurfaceHolder, EGLSurface, |
| 29 | SurfaceView, GLSurfaceView, SurfaceTexture, TextureView, and SurfaceFlinger</em> |
| 30 | </p> |
Heidi von Markham | 2684c46 | 2016-06-24 13:46:53 -0700 | [diff] [blame^] | 31 | <p>This page describes the essential elements of system-level graphics |
| 32 | architecture in Android N and how it is used by the application framework and |
| 33 | multimedia system. The focus is on how buffers of graphical data move through |
| 34 | the system. If you've ever wondered why SurfaceView and TextureView behave the |
| 35 | way they do, or how Surface and EGLSurface interact, you are in the correct |
Clay Murphy | ccf3037 | 2014-04-07 16:13:19 -0700 | [diff] [blame] | 36 | place.</p> |
| 37 | |
<p>Some familiarity with Android devices and application development is assumed.
You don't need detailed knowledge of the app framework, and very few API calls
are mentioned, but the material here doesn't overlap much with other public
documentation. The goal is to provide details on the significant events
involved in rendering a frame for output so you can make informed choices
when designing an application. To achieve this, we work from the bottom up,
describing how the UI classes work rather than how they can be used.</p>

<p>Early sections contain background material used in later sections, so it's a
good idea to read straight through rather than skipping to a section that sounds
interesting. We start with an explanation of Android's graphics buffers,
describe the composition and display mechanism, and then proceed to the
higher-level mechanisms that supply the compositor with data.</p>

<p class="note">This page includes references to AOSP source code and
<a href="https://github.com/google/grafika">Grafika</a>, a Google open source
project for testing.</p>

<h2 id="BufferQueue">BufferQueue and gralloc</h2>

<p>To understand how Android's graphics system works, we must start behind the
scenes. At the heart of everything graphical in Android is a class called
BufferQueue. Its role is simple: connect something that generates buffers of
graphical data (the <em>producer</em>) to something that accepts the data for
display or further processing (the <em>consumer</em>). The producer and consumer
can live in different processes. Nearly everything that moves buffers of
graphical data through the system relies on BufferQueue.</p>

<p>Basic usage is straightforward: The producer requests a free buffer
(<code>dequeueBuffer()</code>), specifying a set of characteristics including
width, height, pixel format, and usage flags. The producer populates the buffer
and returns it to the queue (<code>queueBuffer()</code>). Some time later, the
consumer acquires the buffer (<code>acquireBuffer()</code>) and makes use of the
buffer contents. When the consumer is done, it returns the buffer to the queue
(<code>releaseBuffer()</code>).</p>
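
<p>BufferQueue itself is not a public API, but the same pattern is visible from
app code. The following is a minimal sketch (not taken from AOSP source) using
ImageReader, which owns the consumer side of a BufferQueue and hands out its
producer side as a Surface; the <code>handler</code> is assumed to exist:</p>

<pre>
// ImageReader is the consumer; the Surface it returns is the producer side.
ImageReader reader = ImageReader.newInstance(640, 480, PixelFormat.RGBA_8888, 2);
reader.setOnImageAvailableListener(new ImageReader.OnImageAvailableListener() {
    @Override
    public void onImageAvailable(ImageReader r) {
        Image image = r.acquireLatestImage();  // consumer-side acquire
        if (image != null) {
            // ...examine image.getPlanes()...
            image.close();                     // consumer-side release
        }
    }
}, handler);
Surface producerSurface = reader.getSurface(); // hand to camera, decoder, etc.
</pre>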

<p>Recent Android devices support the <em>sync framework</em>, which enables the
system to do nifty things when combined with hardware components that can
manipulate graphics data asynchronously. For example, a producer can submit a
series of OpenGL ES drawing commands and then enqueue the output buffer before
rendering completes. The buffer is accompanied by a fence that signals when the
contents are ready. A second fence accompanies the buffer when it is returned
to the free list, so the consumer can release the buffer while the contents are
still in use. This approach improves latency and throughput as the buffers
move through the system.</p>

<p>Some characteristics of the queue, such as the maximum number of buffers it
can hold, are determined jointly by the producer and the consumer.</p>

<p>The BufferQueue is responsible for allocating buffers as it needs them.
Buffers are retained unless the characteristics change; for example, if the
producer requests buffers with a different size, old buffers are freed and new
buffers are allocated on demand.</p>

<p>Currently, the consumer always creates and owns the data structure. In
Android 4.3, only the producer side was binderized (i.e., the producer could be
in a remote process but the consumer had to live in the process where the queue
was created). Android 4.4 and later releases moved toward a more general
implementation.</p>

<p>Buffer contents are never copied by BufferQueue (moving that much data around
would be very inefficient). Instead, buffers are always passed by handle.</p>

<h3 id="gralloc_HAL">gralloc HAL</h3>

<p>Buffer allocations are performed through the <em>gralloc</em> memory
allocator, which is implemented through a vendor-specific HAL interface (for
details, refer to <code>hardware/libhardware/include/hardware/gralloc.h</code>).
The <code>alloc()</code> function takes the expected arguments (width, height,
pixel format) as well as a set of usage flags that merit closer attention.</p>

<p>The gralloc allocator is not just another way to allocate memory on the
native heap; in some situations, the allocated memory may not be cache-coherent
or could be totally inaccessible from user space. The nature of the allocation
is determined by the usage flags, which include attributes such as:</p>

<ul>
<li>How often the memory will be accessed from software (CPU)</li>
<li>How often the memory will be accessed from hardware (GPU)</li>
<li>Whether the memory will be used as an OpenGL ES (GLES) texture</li>
<li>Whether the memory will be used by a video encoder</li>
</ul>

<p>For example, if your format specifies RGBA 8888 pixels, and you indicate the
buffer will be accessed from software (meaning your application will touch
pixels directly), then the allocator must create a buffer with 4 bytes per pixel
in R-G-B-A order. If instead you say the buffer will only be accessed from
hardware and as a GLES texture, the allocator can do anything the GLES driver
wants: BGRA ordering, non-linear swizzled layouts, alternative color formats,
and so on. Allowing the hardware to use its preferred format can improve
performance.</p>

<p>Some values cannot be combined on certain platforms. For example, the video
encoder flag may require YUV pixels, so adding software access and specifying
RGBA 8888 would fail.</p>

<p>The handle returned by the gralloc allocator can be passed between processes
through Binder.</p>

<h2 id="SurfaceFlinger">SurfaceFlinger and Hardware Composer</h2>

<p>Having buffers of graphical data is wonderful, but life is even better when
you get to see them on your device's screen. That's where SurfaceFlinger and the
Hardware Composer HAL come in.</p>

<p>SurfaceFlinger's role is to accept buffers of data from multiple sources,
composite them, and send them to the display. Once upon a time this was done
with software blitting to a hardware framebuffer (e.g.
<code>/dev/graphics/fb0</code>), but those days are long gone.</p>

<p>When an app comes to the foreground, the WindowManager service asks
SurfaceFlinger for a drawing surface. SurfaceFlinger creates a layer (the
primary component of which is a BufferQueue) for which SurfaceFlinger acts as
the consumer. A Binder object for the producer side is passed through the
WindowManager to the app, which can then start sending frames directly to
SurfaceFlinger.</p>

<p class="note"><strong>Note:</strong> While this section uses SurfaceFlinger
terminology, WindowManager uses the term <em>window</em> instead of
<em>layer</em>…and uses layer to mean something else. (It can be argued
that SurfaceFlinger should really be called LayerFlinger.)</p>

<p>Most applications have three layers on screen at any time: the status bar at
the top of the screen, the navigation bar at the bottom or side, and the
application UI. Some apps have more, some less (e.g. the default home app has a
separate layer for the wallpaper, while a full-screen game might hide the status
bar). Each layer can be updated independently. The status and navigation bars
are rendered by a system process, while the app layers are rendered by the app,
with no coordination between the two.</p>

<p>Device displays refresh at a certain rate, typically 60 frames per second on
phones and tablets. If the display contents are updated mid-refresh, tearing
will be visible; so it's important to update the contents only between cycles.
The system receives a signal from the display when it's safe to update the
contents. For historical reasons we'll call this the VSYNC signal.</p>

<p>The refresh rate may vary over time, e.g. some mobile devices will range from 58
to 62fps depending on current conditions. For an HDMI-attached television, this
could theoretically dip to 24 or 48Hz to match a video. Because we can update
the screen only once per refresh cycle, submitting buffers for display at 200fps
would be a waste of effort as most of the frames would never be seen. Instead of
taking action whenever an app submits a buffer, SurfaceFlinger wakes up when the
display is ready for something new.</p>
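
<p>An app can align its own rendering with VSYNC the same way. Here is a
minimal sketch (not part of the original text) using the public Choreographer
API; <code>drawFrame()</code> is a hypothetical app method:</p>

<pre>
// Choreographer delivers one callback per display refresh, so the app renders
// at most once per VSYNC instead of submitting frames blindly.
Choreographer.FrameCallback callback = new Choreographer.FrameCallback() {
    @Override
    public void doFrame(long frameTimeNanos) {
        drawFrame(frameTimeNanos);  // hypothetical rendering method
        Choreographer.getInstance().postFrameCallback(this);
    }
};
Choreographer.getInstance().postFrameCallback(callback);
</pre>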

<p>When the VSYNC signal arrives, SurfaceFlinger walks through its list of
layers looking for new buffers. If it finds a new one, it acquires it; if not,
it continues to use the previously-acquired buffer. SurfaceFlinger always wants
to have something to display, so it will hang on to one buffer. If no buffers
have ever been submitted on a layer, the layer is ignored.</p>

<p>After SurfaceFlinger has collected all buffers for visible layers, it asks
the Hardware Composer how composition should be performed.</p>

<h3 id="hwcomposer">Hardware Composer</h3>

<p>The Hardware Composer HAL (HWC) was introduced in Android 3.0 and has evolved
steadily over the years. Its primary purpose is to determine the most efficient
way to composite buffers with the available hardware. As a HAL, its
implementation is device-specific and usually done by the display hardware OEM.</p>

<p>The value of this approach is easy to recognize when you consider <em>overlay
planes</em>, the purpose of which is to composite multiple buffers together in
the display hardware rather than the GPU. For example, consider a typical
Android phone in portrait orientation, with the status bar on top, navigation
bar at the bottom, and app content everywhere else. The contents for each layer
are in separate buffers. You could handle composition using either of the
following methods:</p>

<ul>
<li>Rendering the app content into a scratch buffer, then rendering the status
bar over it, the navigation bar on top of that, and finally passing the scratch
buffer to the display hardware.</li>
<li>Passing all three buffers to the display hardware and telling it to read data
from different buffers for different parts of the screen.</li>
</ul>

<p>The latter approach can be significantly more efficient.</p>

<p>Display processor capabilities vary significantly. The number of overlays,
whether layers can be rotated or blended, and restrictions on positioning and
overlap can be difficult to express through an API. The HWC attempts to
accommodate such diversity through a series of decisions:</p>

<ol>
<li>SurfaceFlinger provides HWC with a full list of layers and asks, "How do
you want to handle this?"</li>
<li>HWC responds by marking each layer as overlay or GLES composition.</li>
<li>SurfaceFlinger takes care of any GLES composition, passing the output buffer
to HWC, and lets HWC handle the rest.</li>
</ol>

<p>Since the decision-making code can be custom tailored by the hardware vendor,
it's possible to get the best performance out of every device.</p>

<p>Overlay planes may be less efficient than GL composition when nothing on the
screen is changing. This is particularly true when overlay contents have
transparent pixels and overlapping layers are blended together. In such cases,
the HWC can choose to request GLES composition for some or all layers and retain
the composited buffer. If SurfaceFlinger comes back asking to composite the same
set of buffers, the HWC can continue to show the previously-composited scratch
buffer. This can improve the battery life of an idle device.</p>

<p>Devices running Android 4.4 and later typically support four overlay planes.
Attempting to composite more layers than overlays causes the system to use GLES
composition for some of them, meaning the number of layers used by an app can
have a measurable impact on power consumption and performance.</p>

<p>You can see exactly what SurfaceFlinger is up to with the command <code>adb
shell dumpsys SurfaceFlinger</code>. The output is verbose; the relevant section
is the HWC summary, which appears near the bottom of the output:</p>

<pre>
    type    |          source crop              |           frame           name
------------+-----------------------------------+--------------------------------
        HWC | [    0.0,    0.0,  320.0,  240.0] | [   48,  411, 1032, 1149] SurfaceView
        HWC | [    0.0,   75.0, 1080.0, 1776.0] | [    0,   75, 1080, 1776] com.android.grafika/com.android.grafika.PlayMovieSurfaceActivity
        HWC | [    0.0,    0.0, 1080.0,   75.0] | [    0,    0, 1080,   75] StatusBar
        HWC | [    0.0,    0.0, 1080.0,  144.0] | [    0, 1776, 1080, 1920] NavigationBar
  FB TARGET | [    0.0,    0.0, 1080.0, 1920.0] | [    0,    0, 1080, 1920] HWC_FRAMEBUFFER_TARGET
</pre>

<p>The summary shows which layers are on screen and whether they are handled
with overlays (HWC) or OpenGL ES composition (GLES). It also includes other data
you likely don't care about (handle, hints, flags, etc.) that has been trimmed
from the snippet above; the source crop and frame values will be examined more
closely later on.</p>

<p>The FB_TARGET layer is where GLES composition output goes. Since all layers
shown above are using overlays, FB_TARGET isn't being used for this frame. The
layer's name is indicative of its original role: on a device with
<code>/dev/graphics/fb0</code> and no overlays, all composition would be done
with GLES, and the output would be written to the framebuffer. On newer devices
there generally is no simple framebuffer, so the FB_TARGET layer is a scratch
buffer.</p>

<p class="note"><strong>Note:</strong> This is why screen grabbers written for
older versions of Android no longer work: they are trying to read from the
framebuffer, but there is no such thing.</p>

<p>The overlay planes have another important role: They're the only way to
display DRM content. DRM-protected buffers cannot be accessed by SurfaceFlinger
or the GLES driver, which means your video will disappear if HWC switches to
GLES composition.</p>

<h3 id="triple-buffering">Triple-Buffering</h3>

<p>To avoid tearing on the display, the system needs to be double-buffered: the
front buffer is displayed while the back buffer is being prepared. At VSYNC, if
the back buffer is ready, you quickly switch them. This works reasonably well
in a system where you're drawing directly into the framebuffer, but there's a
hitch in the flow when a composition step is added. Because of the way
SurfaceFlinger is triggered, our double-buffered pipeline will have a bubble.</p>

<p>Suppose frame N is being displayed, and frame N+1 has been acquired by
SurfaceFlinger for display on the next VSYNC. (Assume frame N is composited
with an overlay, so we can't alter the buffer contents until the display is done
with it.) When VSYNC arrives, HWC flips the buffers. While the app is starting
to render frame N+2 into the buffer that used to hold frame N, SurfaceFlinger is
scanning the layer list, looking for updates. SurfaceFlinger won't find any new
buffers, so it prepares to show frame N+1 again after the next VSYNC. A little
while later, the app finishes rendering frame N+2 and queues it for
SurfaceFlinger, but it's too late. This has effectively cut our maximum frame
rate in half.</p>

<p>We can fix this with triple-buffering. Just before VSYNC, frame N is being
displayed, frame N+1 has been composited (or scheduled for an overlay) and is
ready to be displayed, and frame N+2 is queued up and ready to be acquired by
SurfaceFlinger. When the screen flips, the buffers rotate through the stages
with no bubble. The app has just under a full VSYNC period (16.7ms at 60fps) to
do its rendering and queue the buffer, and SurfaceFlinger / HWC has a full VSYNC
period to figure out the composition before the next flip. The downside is
that it takes at least two VSYNC periods for anything that the app does to
appear on the screen. As the latency increases, the device feels less
responsive to touch input.</p>

<img src="images/surfaceflinger_bufferqueue.png" alt="SurfaceFlinger with BufferQueue" />

<p class="img-caption"><strong>Figure 1.</strong> SurfaceFlinger + BufferQueue</p>

<p>The diagram above depicts the flow of SurfaceFlinger and BufferQueue. During
a frame:</p>

<ol>
<li>red buffer fills up, then slides into BufferQueue</li>
<li>after red buffer leaves app, blue buffer slides in, replacing it</li>
<li>green buffer and systemUI* shadow-slide into HWC (showing that SurfaceFlinger
still has the buffers, but now HWC has prepared them for display via overlay on
the next VSYNC).</li>
</ol>

<p>The blue buffer is referenced by both the display and the BufferQueue. The
app is not allowed to render to it until the associated sync fence signals.</p>

<p>On VSYNC, all of these happen at once:</p>

<ul>
<li>red buffer leaps into SurfaceFlinger, replacing green buffer</li>
<li>green buffer leaps into Display, replacing blue buffer, and a dotted-line
green twin appears in the BufferQueue</li>
<li>the blue buffer’s fence is signaled, and the blue buffer in App empties**</li>
<li>display rect changes from &lt;blue + SystemUI&gt; to &lt;green +
SystemUI&gt;</li>
</ul>

<p><strong>*</strong> - The System UI process is providing the status and nav
bars, which for our purposes here aren’t changing, so SurfaceFlinger keeps using
the previously-acquired buffer. In practice there would be two separate
buffers, one for the status bar at the top, one for the navigation bar at the
bottom, and they would be sized to fit their contents. Each would arrive on its
own BufferQueue.</p>

<p><strong>**</strong> - The buffer doesn’t actually “empty”; if you submit it
without drawing on it you’ll get that same blue again. The emptying is the
result of clearing the buffer contents, which the app should do before it starts
drawing.</p>

<p>We can reduce the latency by noting that layer composition should not require a
full VSYNC period. If composition is performed by overlays, it takes essentially
zero CPU and GPU time. But we can't count on that, so we need to allow a little
time. If the app starts rendering halfway between VSYNC signals, and
SurfaceFlinger defers the HWC setup until a few milliseconds before the signal
is due to arrive, we can cut the latency from 2 frames to perhaps 1.5. In
theory you could render and composite in a single period, allowing a return to
double-buffering; but getting it down that far is difficult on current devices.
Minor fluctuations in rendering and composition time, and switching from
overlays to GLES composition, can cause us to miss a swap deadline and repeat
the previous frame.</p>

<p>SurfaceFlinger's buffer handling demonstrates the fence-based buffer
management mentioned earlier. If we're animating at full speed, we need to
have an acquired buffer for the display ("front") and an acquired buffer for
the next flip ("back"). If we're showing the buffer on an overlay, the
contents are being accessed directly by the display and must not be touched.
But if you look at an active layer's BufferQueue state in the <code>dumpsys
SurfaceFlinger</code> output, you'll see one acquired buffer, one queued buffer, and
one free buffer. That's because, when SurfaceFlinger acquires the new "back"
buffer, it releases the current "front" buffer to the queue. The "front"
buffer is still in use by the display, so anything that dequeues it must wait
for the fence to signal before drawing on it. So long as everybody follows
the fencing rules, all of the queue-management IPC requests can happen in
parallel with the display.</p>

<h3 id="virtual-displays">Virtual Displays</h3>

<p>SurfaceFlinger supports a "primary" display, i.e. what's built into your phone
or tablet, and an "external" display, such as a television connected through
HDMI. It also supports a number of "virtual" displays, which make composited
output available within the system. Virtual displays can be used to record the
screen or send it over a network.</p>

<p>Virtual displays may share the same set of layers as the main display
(the "layer stack") or have their own set. There is no VSYNC for a virtual
display, so the VSYNC for the primary display is used to trigger composition for
all displays.</p>

<p>In the past, virtual displays were always composited with GLES. The Hardware
Composer managed composition for only the primary display. In Android 4.4, the
Hardware Composer gained the ability to participate in virtual display
composition.</p>

<p>As you might expect, the frames generated for a virtual display are written to a
BufferQueue.</p>

<h3 id="screenrecord">Case study: screenrecord</h3>

<p>Now that we've established some background on BufferQueue and SurfaceFlinger,
it's useful to examine a practical use case.</p>

<p>The <a href="https://android.googlesource.com/platform/frameworks/av/+/kitkat-release/cmds/screenrecord/">screenrecord
command</a>,
introduced in Android 4.4, allows you to record everything that appears on the
screen as an .mp4 file on disk. To implement this, we have to receive composited
frames from SurfaceFlinger, write them to the video encoder, and then write the
encoded video data to a file. The video codecs are managed by a separate
process - called "mediaserver" - so we have to move large graphics buffers around
the system. To make it more challenging, we're trying to record 60fps video at
full resolution. The key to making this work efficiently is BufferQueue.</p>

<p>The MediaCodec class allows an app to provide data as raw bytes in buffers, or
through a Surface. We'll discuss Surface in more detail later, but for now just
think of it as a wrapper around the producer end of a BufferQueue. When
screenrecord requests access to a video encoder, mediaserver creates a
BufferQueue and connects itself to the consumer side, and then passes the
producer side back to screenrecord as a Surface.</p>

<p>The screenrecord command then asks SurfaceFlinger to create a virtual display
that mirrors the main display (i.e. it has all of the same layers), and directs
it to send output to the Surface that came from mediaserver. Note that, in this
case, SurfaceFlinger is the producer of buffers rather than the consumer.</p>
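
<p>screenrecord itself talks to SurfaceFlinger through private interfaces, but
the same pattern is visible in public APIs. A rough sketch, assuming a prepared
<code>videoFormat</code>, known display dimensions, and a
<code>mediaProjection</code> obtained through the MediaProjection permission
flow:</p>

<pre>
// The encoder's input Surface is the producer side of a BufferQueue whose
// consumer is the video encoder.
MediaCodec encoder = MediaCodec.createEncoderByType("video/avc");
encoder.configure(videoFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
Surface inputSurface = encoder.createInputSurface();
encoder.start();

// Frames composited into the mirroring virtual display land in the encoder.
VirtualDisplay display = mediaProjection.createVirtualDisplay(
        "recording", width, height, densityDpi,
        DisplayManager.VIRTUAL_DISPLAY_FLAG_AUTO_MIRROR,
        inputSurface, null /*callback*/, null /*handler*/);
</pre>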

<p>Once the configuration is complete, screenrecord can just sit and wait for
encoded data to appear. As apps draw, their buffers travel to SurfaceFlinger,
which composites them into a single buffer that gets sent directly to the video
encoder in mediaserver. The full frames are never even seen by the screenrecord
process. Internally, mediaserver has its own way of moving buffers around that
also passes data by handle, minimizing overhead.</p>

<h3 id="simulate-secondary">Case study: Simulate Secondary Displays</h3>

<p>The WindowManager can ask SurfaceFlinger to create a visible layer for which
SurfaceFlinger will act as the BufferQueue consumer. It's also possible to ask
SurfaceFlinger to create a virtual display, for which SurfaceFlinger will act as
the BufferQueue producer. What happens if you connect them, configuring a
virtual display that renders to a visible layer?</p>

<p>You create a closed loop, where the composited screen appears in a window. Of
course, that window is now part of the composited output, so on the next refresh
the composited image inside the window will show the window contents as well.
It's turtles all the way down. You can see this in action by enabling
"<a href="http://developer.android.com/tools/index.html">Developer options</a>" in
settings, selecting "Simulate secondary displays", and enabling a window. For
bonus points, use screenrecord to capture the act of enabling the display, then
play it back frame-by-frame.</p>

<h2 id="surface">Surface and SurfaceHolder</h2>

<p>The <a
href="http://developer.android.com/reference/android/view/Surface.html">Surface</a>
class has been part of the public API since 1.0. Its description simply says,
"Handle onto a raw buffer that is being managed by the screen compositor." The
statement was accurate when initially written but falls well short of the mark
on a modern system.</p>

<p>The Surface represents the producer side of a buffer queue that is often (but
not always!) consumed by SurfaceFlinger. When you render onto a Surface, the
result ends up in a buffer that gets shipped to the consumer. A Surface is not
simply a raw chunk of memory you can scribble on.</p>

<p>The BufferQueue for a display Surface is typically configured for
triple-buffering; but buffers are allocated on demand. So if the producer
generates buffers slowly enough -- maybe it's animating at 30fps on a 60fps
display -- there might only be two allocated buffers in the queue. This helps
minimize memory consumption. You can see a summary of the buffers associated
with every layer in the <code>dumpsys SurfaceFlinger</code> output.</p>

<h3 id="canvas">Canvas Rendering</h3>

<p>Once upon a time, all rendering was done in software, and you can still do this
today. The low-level implementation is provided by the Skia graphics library.
If you want to draw a rectangle, you make a library call, and it sets bytes in a
buffer appropriately. To ensure that a buffer isn't updated by two clients at
once, or written to while being displayed, you have to lock the buffer to access
it. <code>lockCanvas()</code> locks the buffer and returns a Canvas to use for drawing,
and <code>unlockCanvasAndPost()</code> unlocks the buffer and sends it to the compositor.</p>
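
<p>As a concrete illustration, here is a minimal sketch of the lock/draw/unlock
cycle; <code>holder</code> is assumed to be the SurfaceHolder of a SurfaceView
whose Surface already exists, and <code>paint</code> a pre-built Paint:</p>

<pre>
Canvas canvas = holder.lockCanvas();        // locks the dequeued buffer
if (canvas != null) {
    try {
        canvas.drawColor(Color.BLACK);      // clear; old contents may remain
        canvas.drawRect(50, 50, 200, 200, paint);
    } finally {
        holder.unlockCanvasAndPost(canvas); // queues the buffer for the
    }                                       // compositor
}
</pre>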

<p>As time went on, and devices with general-purpose 3D engines appeared, Android
reoriented itself around OpenGL ES. However, it was important to keep the old
API working, for apps as well as app framework code, so an effort was made to
hardware-accelerate the Canvas API. As you can see from the charts on the
<a href="http://developer.android.com/guide/topics/graphics/hardware-accel.html">Hardware
Acceleration</a>
page, this was a bit of a bumpy ride. Note in particular that while the Canvas
provided to a View's <code>onDraw()</code> method may be hardware-accelerated, the Canvas
obtained when an app locks a Surface directly with <code>lockCanvas()</code> never is.</p>

<p>When you lock a Surface for Canvas access, the "CPU renderer" connects to the
producer side of the BufferQueue and does not disconnect until the Surface is
destroyed. Most other producers (like GLES) can be disconnected and reconnected
to a Surface, but the Canvas-based "CPU renderer" cannot. This means you can't
draw on a surface with GLES or send it frames from a video decoder if you've
ever locked it for a Canvas.</p>

<p>The first time the producer requests a buffer from a BufferQueue, it is
allocated and initialized to zeroes. Initialization is necessary to avoid
inadvertently sharing data between processes. When you re-use a buffer,
however, the previous contents will still be present. If you repeatedly call
<code>lockCanvas()</code> and <code>unlockCanvasAndPost()</code> without
drawing anything, you'll cycle between previously-rendered frames.</p>

<p>The Surface lock/unlock code keeps a reference to the previously-rendered
buffer. If you specify a dirty region when locking the Surface, it will copy
the non-dirty pixels from the previous buffer. There's a fair chance the buffer
will be handled by SurfaceFlinger or HWC; but since we only need to read from
it, there's no need to wait for exclusive access.</p>

<p>The main non-Canvas way for an application to draw directly on a Surface is
through OpenGL ES. That's described in the <a href="#eglsurface">EGLSurface and
OpenGL ES</a> section.</p>

<h3 id="surfaceholder">SurfaceHolder</h3>

<p>Some things that work with Surfaces want a SurfaceHolder, notably SurfaceView.
The original idea was that Surface represented the raw compositor-managed
buffer, while SurfaceHolder was managed by the app and kept track of
higher-level information like the dimensions and format. The Java-language
definition mirrors the underlying native implementation. It's arguably no
longer useful to split it this way, but it has long been part of the public API.</p>

<p>Generally speaking, anything having to do with a View will involve a
SurfaceHolder. Some other APIs, such as MediaCodec, will operate on the Surface
itself. You can easily get the Surface from the SurfaceHolder, so hang on to
the latter when you have it.</p>

<p>APIs to get and set Surface parameters, such as the size and format, are
implemented through SurfaceHolder.</p>

<h2 id="eglsurface">EGLSurface and OpenGL ES</h2>

<p>OpenGL ES defines an API for rendering graphics. It does not define a windowing
system. To allow GLES to work on a variety of platforms, it is designed to be
combined with a library that knows how to create and access windows through the
operating system. The library used for Android is called EGL. If you want to
draw textured polygons, you use GLES calls; if you want to put your rendering on
the screen, you use EGL calls.</p>

<p>Before you can do anything with GLES, you need to create a GL context. In EGL,
this means creating an EGLContext and an EGLSurface. GLES operations apply to
the current context, which is accessed through thread-local storage rather than
passed around as an argument. This means you have to be careful about which
thread your rendering code executes on, and which context is current on that
thread.</p>

<p>The EGLSurface can be an off-screen buffer allocated by EGL (called a "pbuffer")
or a window allocated by the operating system. EGL window surfaces are created
with the <code>eglCreateWindowSurface()</code> call. It takes a "window object" as an
argument, which on Android can be a SurfaceView, a SurfaceTexture, a
SurfaceHolder, or a Surface -- all of which have a BufferQueue underneath. When
you make this call, EGL creates a new EGLSurface object, and connects it to the
producer interface of the window object's BufferQueue. From that point onward,
rendering to that EGLSurface results in a buffer being dequeued, rendered into,
and queued for use by the consumer. (The term "window" is indicative of the
expected use, but bear in mind the output might not be destined to appear
on the display.)</p>

<p>EGL does not provide lock/unlock calls. Instead, you issue drawing commands and
then call <code>eglSwapBuffers()</code> to submit the current frame. The
method name comes from the traditional swap of front and back buffers, but the actual
implementation may be very different.</p>
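
<p>To make the sequence concrete, here is a minimal sketch using the Java
EGL14 bindings. Error checking is omitted, and <code>surface</code> is assumed
to be a Surface obtained elsewhere:</p>

<pre>
EGLDisplay dpy = EGL14.eglGetDisplay(EGL14.EGL_DEFAULT_DISPLAY);
int[] version = new int[2];
EGL14.eglInitialize(dpy, version, 0, version, 1);

int[] configAttribs = {
        EGL14.EGL_RENDERABLE_TYPE, EGL14.EGL_OPENGL_ES2_BIT,
        EGL14.EGL_NONE };
EGLConfig[] configs = new EGLConfig[1];
int[] numConfigs = new int[1];
EGL14.eglChooseConfig(dpy, configAttribs, 0, configs, 0, 1, numConfigs, 0);

int[] ctxAttribs = { EGL14.EGL_CONTEXT_CLIENT_VERSION, 2, EGL14.EGL_NONE };
EGLContext ctx = EGL14.eglCreateContext(dpy, configs[0],
        EGL14.EGL_NO_CONTEXT, ctxAttribs, 0);

// Connects an EGLSurface to the producer side of the Surface's BufferQueue.
int[] surfAttribs = { EGL14.EGL_NONE };
EGLSurface eglSurface = EGL14.eglCreateWindowSurface(dpy, configs[0],
        surface, surfAttribs, 0);
EGL14.eglMakeCurrent(dpy, eglSurface, eglSurface, ctx);

// ...issue GLES drawing commands...
EGL14.eglSwapBuffers(dpy, eglSurface);  // queues the rendered buffer
</pre>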

<p>Only one EGLSurface can be associated with a Surface at a time -- you can have
only one producer connected to a BufferQueue -- but if you destroy the
EGLSurface it will disconnect from the BufferQueue and allow something else to
connect.</p>

<p>A given thread can switch between multiple EGLSurfaces by changing what's
"current." An EGLSurface must be current on only one thread at a time.</p>

<p>The most common mistake when thinking about EGLSurface is assuming that it is
just another aspect of Surface (like SurfaceHolder). It's a related but
independent concept. You can draw on an EGLSurface that isn't backed by a
Surface, and you can use a Surface without EGL. EGLSurface just gives GLES a
place to draw.</p>

<h3 id="anativewindow">ANativeWindow</h3>

<p>The public Surface class is implemented in the Java programming language. The
equivalent in C/C++ is the ANativeWindow class, semi-exposed by the <a
href="https://developer.android.com/tools/sdk/ndk/index.html">Android NDK</a>. You
can get the ANativeWindow from a Surface with the <code>ANativeWindow_fromSurface()</code>
call. Just like its Java-language cousin, you can lock it, render in software,
and unlock-and-post.</p>

<p>To create an EGL window surface from native code, you pass an instance of
EGLNativeWindowType to <code>eglCreateWindowSurface()</code>. EGLNativeWindowType is just
a synonym for ANativeWindow, so you can freely cast one to the other.</p>

<p>The fact that the basic "native window" type just wraps the producer side of a
BufferQueue should not come as a surprise.</p>

<h2 id="surfaceview">SurfaceView and GLSurfaceView</h2>

<p>Now that we've explored the lower-level components, it's time to see how they
fit into the higher-level components that apps are built from.</p>

<p>The Android app framework UI is based on a hierarchy of objects that start with
View. Most of the details don't matter for this discussion, but it's helpful to
understand that UI elements go through a complicated measurement and layout
process that fits them into a rectangular area. All visible View objects are
rendered to a SurfaceFlinger-created Surface that was set up by the
WindowManager when the app was brought to the foreground. The layout and
rendering is performed on the app's UI thread.</p>

<p>Regardless of how many Layouts and Views you have, everything gets rendered into
a single buffer. This is true whether or not the Views are hardware-accelerated.</p>

<p>A SurfaceView takes the same sorts of parameters as other views, so you can give
it a position and size, and fit other elements around it. When it comes time to
render, however, the contents are completely transparent. The View part of a
SurfaceView is just a see-through placeholder.</p>

<p>When the SurfaceView's View component is about to become visible, the framework
asks the WindowManager to ask SurfaceFlinger to create a new Surface. (This
doesn't happen synchronously, which is why you should provide a callback that
notifies you when the Surface creation finishes.) By default, the new Surface
is placed behind the app UI Surface, but the default "Z-ordering" can be
overridden to put the Surface on top.</p>
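
<p>The callback in question is SurfaceHolder.Callback. A minimal sketch (not
from the original text; <code>context</code> is assumed to exist):</p>

<pre>
SurfaceView surfaceView = new SurfaceView(context);
surfaceView.getHolder().addCallback(new SurfaceHolder.Callback() {
    @Override
    public void surfaceCreated(SurfaceHolder holder) {
        // Surface exists; safe to start a render thread or decoder here.
    }
    @Override
    public void surfaceChanged(SurfaceHolder holder, int format, int w, int h) {
        // Size or format changed; adjust rendering if necessary.
    }
    @Override
    public void surfaceDestroyed(SurfaceHolder holder) {
        // Surface is going away; stop rendering before returning.
    }
});
</pre>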

<p>Whatever you render onto this Surface will be composited by SurfaceFlinger, not
by the app. This is the real power of SurfaceView: the Surface you get can be
rendered by a separate thread or a separate process, isolated from any rendering
performed by the app UI, and the buffers go directly to SurfaceFlinger. You
can't totally ignore the UI thread -- you still have to coordinate with the
Activity lifecycle, and you may need to adjust something if the size or position
of the View changes -- but you have a whole Surface all to yourself, and
blending with the app UI and other layers is handled by the Hardware Composer.</p>

<p>It's worth taking a moment to note that this new Surface is the producer side of
a BufferQueue whose consumer is a SurfaceFlinger layer. You can update the
Surface with any mechanism that can feed a BufferQueue: you can use the
Surface-supplied Canvas functions, attach an EGLSurface and draw on it
with GLES, or configure a MediaCodec video decoder to write to it.</p>

<h3 id="composition">Composition and the Hardware Scaler</h3>

<p>Now that we have a bit more context, it's useful to go back and look at a couple
of fields from <code>dumpsys SurfaceFlinger</code> that we skipped over earlier
on. Back in the <a href="#hwcomposer">Hardware Composer</a> discussion, we
looked at some output like this:</p>

<pre>
    type    |          source crop              |           frame           name
------------+-----------------------------------+--------------------------------
        HWC | [    0.0,    0.0,  320.0,  240.0] | [   48,  411, 1032, 1149] SurfaceView
        HWC | [    0.0,   75.0, 1080.0, 1776.0] | [    0,   75, 1080, 1776] com.android.grafika/com.android.grafika.PlayMovieSurfaceActivity
        HWC | [    0.0,    0.0, 1080.0,   75.0] | [    0,    0, 1080,   75] StatusBar
        HWC | [    0.0,    0.0, 1080.0,  144.0] | [    0, 1776, 1080, 1920] NavigationBar
  FB TARGET | [    0.0,    0.0, 1080.0, 1920.0] | [    0,    0, 1080, 1920] HWC_FRAMEBUFFER_TARGET
</pre>

<p>This was taken while playing a movie in Grafika's "Play video (SurfaceView)"
activity, on a Nexus 5 in portrait orientation. Note that the list is ordered
from back to front: the SurfaceView's Surface is in the back, the app UI layer
sits on top of that, followed by the status and navigation bars that are above
everything else. The video is QVGA (320x240).</p>

<p>The "source crop" indicates the portion of the Surface's buffer that
SurfaceFlinger is going to display. The app UI was given a Surface equal to the
full size of the display (1080x1920), but there's no point rendering and
compositing pixels that will be obscured by the status and navigation bars, so
the source is cropped to a rectangle that starts 75 pixels from the top and
ends 144 pixels from the bottom. The status and navigation bars have smaller
Surfaces, and the source crop describes a rectangle that begins at the top
left (0,0) and spans their content.</p>

<p>The "frame" is the rectangle where the pixels end up on the display. For the
app UI layer, the frame matches the source crop, because we're copying (or
overlaying) a portion of a display-sized layer to the same location in another
display-sized layer. For the status and navigation bars, the size of the frame
rectangle is the same, but the position is adjusted so that the navigation bar
appears at the bottom of the screen.</p>

<p>Now consider the layer labeled "SurfaceView", which holds our video content.
The source crop matches the video size, which SurfaceFlinger knows because the
MediaCodec decoder (the buffer producer) is dequeuing buffers that size. The
frame rectangle has a completely different size -- 984x738.</p>

<p>SurfaceFlinger handles size differences by scaling the buffer contents to fill
the frame rectangle, upscaling or downscaling as needed. This particular size
was chosen because it has the same aspect ratio as the video (4:3), and is as
wide as possible given the constraints of the View layout (which includes some
padding at the edges of the screen for aesthetic reasons).</p>

<p>If you started playing a different video on the same Surface, the underlying
BufferQueue would reallocate buffers to the new size automatically, and
SurfaceFlinger would adjust the source crop. If the aspect ratio of the new
video is different, the app would need to force a re-layout of the View to match
it, which causes the WindowManager to tell SurfaceFlinger to update the frame
rectangle.</p>

<p>If you're rendering on the Surface through some other means, perhaps GLES, you
can set the Surface size using the <code>SurfaceHolder#setFixedSize()</code>
call. You could, for example, configure a game to always render at 1280x720,
which would significantly reduce the number of pixels that must be touched to
fill the screen on a 2560x1440 tablet or 4K television. The display processor
handles the scaling. If you don't want to letter- or pillar-box your game, you
could adjust the game's aspect ratio by setting the size so that the narrow
dimension is 720 pixels, but the long dimension is set to maintain the aspect
ratio of the physical display (e.g. 1152x720 to match a 2560x1600 display).
You can see an example of this approach in Grafika's "Hardware scaler
exerciser" activity.</p>
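
<p>A minimal sketch of that calculation, assuming a landscape game where
<code>display</code> is the default Display and <code>holder</code> is the
SurfaceView's SurfaceHolder:</p>

<pre>
Point size = new Point();
display.getRealSize(size);                 // e.g. 2560x1600
int shortSide = Math.min(size.x, size.y);
int longSide = Math.max(size.x, size.y);
int scaledLong = Math.round(720f * longSide / shortSide);  // e.g. 1152
holder.setFixedSize(scaledLong, 720);      // the display processor upscales
</pre>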

<h3 id="glsurfaceview">GLSurfaceView</h3>

<p>The GLSurfaceView class provides some helper classes that help manage EGL
contexts, inter-thread communication, and interaction with the Activity
lifecycle. That's it. You do not need to use a GLSurfaceView to use GLES.</p>

<p>For example, GLSurfaceView creates a thread for rendering and configures an EGL
context there. The state is cleaned up automatically when the activity pauses.
Most apps won't need to know anything about EGL to use GLES with GLSurfaceView.</p>

<p>In most cases, GLSurfaceView is very helpful and can make working with GLES
easier. In some situations, it can get in the way. Use it if it helps, don't
if it doesn't.</p>
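
<p>A minimal sketch of typical GLSurfaceView setup (not from the original
text); <code>context</code> is assumed, the GL10 and EGLConfig parameters come
from the javax.microedition.khronos packages, and the activity should also
forward its lifecycle by calling <code>glView.onResume()</code> and
<code>glView.onPause()</code>:</p>

<pre>
GLSurfaceView glView = new GLSurfaceView(context);
glView.setEGLContextClientVersion(2);
glView.setRenderer(new GLSurfaceView.Renderer() {
    @Override   // runs on the render thread GLSurfaceView creates
    public void onSurfaceCreated(GL10 gl, EGLConfig config) {
        GLES20.glClearColor(0f, 0f, 0f, 1f);
    }
    @Override
    public void onSurfaceChanged(GL10 gl, int width, int height) {
        GLES20.glViewport(0, 0, width, height);
    }
    @Override
    public void onDrawFrame(GL10 gl) {
        GLES20.glClear(GLES20.GL_COLOR_BUFFER_BIT);
        // ...GLES drawing; eglSwapBuffers() is called for you afterward...
    }
});
</pre>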

<h2 id="surfacetexture">SurfaceTexture</h2>

<p>The SurfaceTexture class was introduced in Android 3.0. Just as SurfaceView
is the combination of a Surface and a View, SurfaceTexture is a rough
combination of a Surface and a GLES texture (with a few caveats).</p>

<p>When you create a SurfaceTexture, you are creating a BufferQueue for which
your app is the consumer. When a new buffer is queued by the producer, your app
is notified via callback (<code>onFrameAvailable()</code>). Your app calls
<code>updateTexImage()</code>, which releases the previously-held buffer,
acquires the new buffer from the queue, and makes some EGL calls to make the
buffer available to GLES as an external texture.</p>

<p>External textures (<code>GL_TEXTURE_EXTERNAL_OES</code>) are not quite the
same as textures created by GLES (<code>GL_TEXTURE_2D</code>): You have to
configure your renderer a bit differently, and there are things you can't do
with them. The key point is that you can render textured polygons directly
from the data received by your BufferQueue. gralloc supports a wide variety of
formats, so we need to guarantee the format of the data in the buffer is
something GLES can recognize. To do so, when SurfaceTexture creates the
BufferQueue, it sets the consumer usage flags to
<code>GRALLOC_USAGE_HW_TEXTURE</code>, ensuring that any buffer created by
gralloc would be usable by GLES.</p>

<p>Because SurfaceTexture interacts with an EGL context, you must be careful to
call its methods from the correct thread (this is detailed in the class
documentation).</p>

<p>If you look deeper into the class documentation, you will see a couple of odd
calls. One retrieves a timestamp, the other a transformation matrix, the value
of each having been set by the previous call to <code>updateTexImage()</code>.
It turns out that BufferQueue passes more than just a buffer handle to the consumer.
Each buffer is accompanied by a timestamp and transformation parameters.</p>

| 756 | <p>The transformation is provided for efficiency. In some cases, the source data |
| 757 | might be in the "wrong" orientation for the consumer; but instead of rotating |
| 758 | the data before sending it, we can send the data in its current orientation with |
| 759 | a transform that corrects it. The transformation matrix can be merged with |
| 760 | other transformations at the point the data is used, minimizing overhead.</p> |
| 761 | |
| 762 | <p>The timestamp is useful for certain buffer sources. For example, suppose you |
| 763 | connect the producer interface to the output of the camera (with |
| 764 | <code>setPreviewTexture()</code>). If you want to create a video, you need to |
set the presentation timestamp for each frame; but you want to base that on the time
| 766 | when the frame was captured, not the time when the buffer was received by your |
| 767 | app. The timestamp provided with the buffer is set by the camera code, |
| 768 | resulting in a more consistent series of timestamps.</p> |
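<p>In code, retrieving both looks something like this (the
<code>mSTMatrix</code> name is just a convention borrowed from common GLES
samples):</p>

<pre>
// Valid only after updateTexImage() has latched a buffer.
surfaceTexture.updateTexImage();

// 4x4 matrix to apply to texture coordinates when sampling.
float[] mSTMatrix = new float[16];
surfaceTexture.getTransformMatrix(mSTMatrix);

// Timestamp in nanoseconds, set by the producer (e.g. the camera).
long timestampNs = surfaceTexture.getTimestamp();
</pre>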
| 769 | |
| 770 | <h3 id="surfacet">SurfaceTexture and Surface</h3> |
| 771 | |
<p>If you look closely at the API you'll see that the only way for an
application to create a plain Surface is through a constructor that takes a
SurfaceTexture
| 774 | as the sole argument. (Prior to API 11, there was no public constructor for |
| 775 | Surface at all.) This might seem a bit backward if you view SurfaceTexture as a |
| 776 | combination of a Surface and a texture.</p> |
| 777 | |
| 778 | <p>Under the hood, SurfaceTexture is called GLConsumer, which more accurately |
| 779 | reflects its role as the owner and consumer of a BufferQueue. When you create a |
| 780 | Surface from a SurfaceTexture, what you're doing is creating an object that |
| 781 | represents the producer side of the SurfaceTexture's BufferQueue.</p> |
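<p>In other words, this one-liner hands you the producer end of the queue:</p>

<pre>
// surfaceTexture is the consumer; the Surface is the producer side
// of the same BufferQueue. Hand it to the camera, a MediaPlayer, etc.
Surface producerSurface = new Surface(surfaceTexture);
// ...
producerSurface.release();  // when you're done with the producer end
</pre>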
| 782 | |
| 783 | <h3 id="continuous-capture">Case Study: Grafika's "Continuous Capture" Activity</h3> |
| 784 | |
| 785 | <p>The camera can provide a stream of frames suitable for recording as a movie. If |
| 786 | you want to display it on screen, you create a SurfaceView, pass the Surface to |
| 787 | <code>setPreviewDisplay()</code>, and let the producer (camera) and consumer |
| 788 | (SurfaceFlinger) do all the work. If you want to record the video, you create a |
| 789 | Surface with MediaCodec's <code>createInputSurface()</code>, pass that to the |
| 790 | camera, and again you sit back and relax. If you want to show the video and |
| 791 | record it at the same time, you have to get more involved.</p> |
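<p>To make the first two cases concrete, here is a hedged sketch using the
original <code>android.hardware.Camera</code> and MediaCodec APIs of that era;
the format values are illustrative, and error handling (including the checked
IOExceptions) is omitted:</p>

<pre>
// Display only: SurfaceFlinger is the consumer.
Camera camera = Camera.open();
camera.setPreviewDisplay(surfaceView.getHolder());
camera.startPreview();

// Record only: a MediaCodec encoder is the consumer instead.
MediaFormat format = MediaFormat.createVideoFormat("video/avc", 640, 480);
format.setInteger(MediaFormat.KEY_COLOR_FORMAT,
        MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface);
format.setInteger(MediaFormat.KEY_BIT_RATE, 2000000);
format.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1);
MediaCodec encoder = MediaCodec.createEncoderByType("video/avc");
encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
Surface encoderSurface = encoder.createInputSurface();
// encoderSurface is now the producer end of the encoder's BufferQueue.
</pre>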
| 792 | |
| 793 | <p>The "Continuous capture" activity displays video from the camera as it's being |
| 794 | recorded. In this case, encoded video is written to a circular buffer in memory |
| 795 | that can be saved to disk at any time. It's straightforward to implement so |
| 796 | long as you keep track of where everything is.</p> |
| 797 | |
| 798 | <p>There are three BufferQueues involved. The app uses a SurfaceTexture to receive |
| 799 | frames from Camera, converting them to an external GLES texture. The app |
| 800 | declares a SurfaceView, which we use to display the frames, and we configure a |
| 801 | MediaCodec encoder with an input Surface to create the video. So one |
| 802 | BufferQueue is created by the app, one by SurfaceFlinger, and one by |
| 803 | mediaserver.</p> |
| 804 | |
| 805 | <img src="images/continuous_capture_activity.png" alt="Grafika continuous |
| 806 | capture activity" /> |
| 807 | |
| 808 | <p class="img-caption"> |
<strong>Figure 2.</strong> Grafika's continuous capture activity
| 810 | </p> |
| 811 | |
| 812 | <p>In the diagram above, the arrows show the propagation of the data from the |
camera. BufferQueues are in color (purple producer, cyan consumer). Note that
“Camera” actually lives in the mediaserver process.</p>
| 815 | |
| 816 | <p>Encoded H.264 video goes to a circular buffer in RAM in the app process, and is |
| 817 | written to an MP4 file on disk using the MediaMuxer class when the “capture” |
| 818 | button is hit.</p> |
| 819 | |
| 820 | <p>All three of the BufferQueues are handled with a single EGL context in the |
| 821 | app, and the GLES operations are performed on the UI thread. Doing the |
| 822 | SurfaceView rendering on the UI thread is generally discouraged, but since we're |
| 823 | doing simple operations that are handled asynchronously by the GLES driver we |
| 824 | should be fine. (If the video encoder locks up and we block trying to dequeue a |
| 825 | buffer, the app will become unresponsive. But at that point, we're probably |
| 826 | failing anyway.) The handling of the encoded data -- managing the circular |
| 827 | buffer and writing it to disk -- is performed on a separate thread.</p> |
| 828 | |
| 829 | <p>The bulk of the configuration happens in the SurfaceView's <code>surfaceCreated()</code> |
| 830 | callback. The EGLContext is created, and EGLSurfaces are created for the |
| 831 | display and for the video encoder. When a new frame arrives, we tell |
| 832 | SurfaceTexture to acquire it and make it available as a GLES texture, then |
| 833 | render it with GLES commands on each EGLSurface (forwarding the transform and |
| 834 | timestamp from SurfaceTexture). The encoder thread pulls the encoded output |
| 835 | from MediaCodec and stashes it in memory.</p> |
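<p>The per-frame work then looks roughly like this. The m-prefixed fields are
hypothetical: <code>mDisplaySurface</code> and <code>mEncoderSurface</code>
stand for wrappers around the two EGLSurfaces, and <code>drawFrame()</code>
stands in for the app's GLES drawing code:</p>

<pre>
// onFrameAvailable() fired; we're on the thread that owns the EGL context.
mSurfaceTexture.updateTexImage();
mSurfaceTexture.getTransformMatrix(mTmpMatrix);

// Render once to the SurfaceView's EGLSurface for display.
mDisplaySurface.makeCurrent();            // eglMakeCurrent(...)
drawFrame(mTextureId, mTmpMatrix);
mDisplaySurface.swapBuffers();            // eglSwapBuffers(...)

// Render again to the video encoder's EGLSurface, forwarding the
// camera's timestamp so the encoded frames are paced correctly.
mEncoderSurface.makeCurrent();
drawFrame(mTextureId, mTmpMatrix);
EGLExt.eglPresentationTimeANDROID(mEGLDisplay, mEncoderEGLSurface,
        mSurfaceTexture.getTimestamp());
mEncoderSurface.swapBuffers();
</pre>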
| 836 | |
Heidi von Markham | 2684c46 | 2016-06-24 13:46:53 -0700 | [diff] [blame^] | 837 | |
| 838 | <h3 id="secure-texture-video-playback">Secure Texture Video Playback</h3> |
| 839 | <p>Android N supports GPU post-processing of protected video content. This |
| 840 | allows using the GPU for complex non-linear video effects (such as warps), |
| 841 | mapping protected video content onto textures for use in general graphics scenes |
| 842 | (e.g., using OpenGL ES), and virtual reality (VR).</p> |
| 843 | |
| 844 | <img src="images/graphics_secure_texture_playback.png" alt="Secure Texture Video Playback" /> |
<p class="img-caption"><strong>Figure 3.</strong> Secure texture video playback</p>
| 846 | |
| 847 | <p>Support is enabled using the following two extensions:</p> |
| 848 | <ul> |
| 849 | <li><strong>EGL extension</strong> |
(<a href="https://www.khronos.org/registry/egl/extensions/EXT/EGL_EXT_protected_content.txt"><code>EGL_EXT_protected_content</code></a>).
| 851 | Allows the creation of protected GL contexts and surfaces, which can both |
| 852 | operate on protected content.</li> |
| 853 | <li><strong>GLES extension</strong> |
(<a href="https://www.khronos.org/registry/gles/extensions/EXT/EXT_protected_textures.txt"><code>GL_EXT_protected_textures</code></a>).
| 855 | Allows tagging textures as protected so they can be used as framebuffer texture |
| 856 | attachments.</li> |
| 857 | </ul> |
| 858 | |
| 859 | <p>Android N also updates SurfaceTexture and ACodec |
| 860 | (<code>libstagefright.so</code>) to allow protected content to be sent even if |
the window surface does not queue to the window composer (i.e., SurfaceFlinger)
| 862 | and provide a protected video surface for use within a protected context. This |
| 863 | is done by setting the correct protected consumer bits |
| 864 | (<code>GRALLOC_USAGE_PROTECTED</code>) on surfaces created in a protected |
| 865 | context (verified by ACodec).</p> |
| 866 | |
<p>These changes benefit app developers, who can create apps that perform
enhanced video effects or apply video textures using protected content in GL
(for example, in VR); end users, who can view high-value video content (such as
movies and TV shows) in a GL environment (for example, in VR); and OEMs, who
can achieve higher sales due to added device functionality (for example,
watching HD movies in VR). The new EGL and GLES extensions can be used by
system-on-chip (SoC) providers and other vendors, and are currently implemented
on the Qualcomm MSM8994 SoC used in the Nexus 6P.</p>
| 875 | |
| 876 | <p>Secure texture video playback sets the foundation for strong DRM |
| 877 | implementation in the OpenGL ES environment. Without a strong DRM implementation |
| 878 | such as Widevine Level 1, many content providers would not allow rendering of |
| 879 | their high-value content in the OpenGL ES environment, preventing important VR |
| 880 | use cases such as watching DRM protected content in VR.</p> |
| 881 | |
| 882 | <p>AOSP includes framework code for secure texture video playback; driver |
| 883 | support is up to the vendor. Partners must implement the |
<code>EGL_EXT_protected_content</code> and
<code>GL_EXT_protected_textures</code> extensions. When using your own codec
| 886 | library (to replace libstagefright), note the changes in |
| 887 | <code>/frameworks/av/media/libstagefright/SurfaceUtils.cpp</code> that allow |
| 888 | buffers marked with <code>GRALLOC_USAGE_PROTECTED</code> to be sent to |
| 889 | ANativeWindows (even if the ANativeWindow does not queue directly to the window |
| 890 | composer) as long as the consumer usage bits contain |
| 891 | <code>GRALLOC_USAGE_PROTECTED</code>. For detailed documentation on implementing |
| 892 | the extensions, refer to the Khronos Registry |
| 893 | (<a href="https://www.khronos.org/registry/egl/extensions/EXT/EGL_EXT_protected_content.txt">EGL_EXT_protected_content</a>, |
| 894 | <a href="https://www.khronos.org/registry/gles/extensions/EXT/EXT_protected_textures.txt">GL_EXT_protected_textures</a>).</p> |
| 895 | |
| 896 | <p>Partners may also need to make hardware changes to ensure that protected |
| 897 | memory mapped onto the GPU remains protected and unreadable by unprotected |
| 898 | code.</p> |
| 899 | |
Clay Murphy | ccf3037 | 2014-04-07 16:13:19 -0700 | [diff] [blame] | 900 | <h2 id="texture">TextureView</h2> |
| 901 | |
<p>The TextureView class, introduced in Android 4.0, is the most complex of
the View objects discussed here, combining a View with a SurfaceTexture.</p>
Clay Murphy | ccf3037 | 2014-04-07 16:13:19 -0700 | [diff] [blame] | 904 | |
| 905 | <p>Recall that the SurfaceTexture is a "GL consumer", consuming buffers of graphics |
| 906 | data and making them available as textures. TextureView wraps a SurfaceTexture, |
| 907 | taking over the responsibility of responding to the callbacks and acquiring new |
| 908 | buffers. The arrival of new buffers causes TextureView to issue a View |
| 909 | invalidate request. When asked to draw, the TextureView uses the contents of |
| 910 | the most recently received buffer as its data source, rendering wherever and |
| 911 | however the View state indicates it should.</p> |
| 912 | |
<p>You can render on a TextureView with GLES just as you would on a
SurfaceView: simply pass the SurfaceTexture to the EGL window creation call.
However, doing so exposes a potential problem, discussed below.</p>
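<p>With the EGL14 API, the window creation step looks something like this
(display, config, and context setup omitted):</p>

<pre>
// Requires textureView.isAvailable() to be true, i.e. the
// SurfaceTexture has already been created.
SurfaceTexture st = textureView.getSurfaceTexture();
int[] surfaceAttribs = { EGL14.EGL_NONE };
EGLSurface eglSurface = EGL14.eglCreateWindowSurface(
        mEGLDisplay, mEGLConfig, st, surfaceAttribs, 0);
EGL14.eglMakeCurrent(mEGLDisplay, eglSurface, eglSurface, mEGLContext);
// GLES rendering followed by eglSwapBuffers() now feeds the TextureView.
</pre>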
| 916 | |
| 917 | <p>In most of what we've looked at, the BufferQueues have passed buffers between |
| 918 | different processes. When rendering to a TextureView with GLES, both producer |
| 919 | and consumer are in the same process, and they might even be handled on a single |
| 920 | thread. Suppose we submit several buffers in quick succession from the UI |
| 921 | thread. The EGL buffer swap call will need to dequeue a buffer from the |
| 922 | BufferQueue, and it will stall until one is available. There won't be any |
| 923 | available until the consumer acquires one for rendering, but that also happens |
| 924 | on the UI thread… so we're stuck.</p> |
| 925 | |
| 926 | <p>The solution is to have BufferQueue ensure there is always a buffer |
| 927 | available to be dequeued, so the buffer swap never stalls. One way to guarantee |
| 928 | this is to have BufferQueue discard the contents of the previously-queued buffer |
| 929 | when a new buffer is queued, and to place restrictions on minimum buffer counts |
| 930 | and maximum acquired buffer counts. (If your queue has three buffers, and all |
| 931 | three buffers are acquired by the consumer, then there's nothing to dequeue and |
| 932 | the buffer swap call must hang or fail. So we need to prevent the consumer from |
| 933 | acquiring more than two buffers at once.) Dropping buffers is usually |
| 934 | undesirable, so it's only enabled in specific situations, such as when the |
| 935 | producer and consumer are in the same process.</p> |
| 936 | |
| 937 | <h3 id="surface-or-texture">SurfaceView or TextureView?</h3> |
<p>SurfaceView and TextureView fill similar roles, but have very different
| 939 | implementations. To decide which is best requires an understanding of the |
| 940 | trade-offs.</p> |
| 941 | |
| 942 | <p>Because TextureView is a proper citizen of the View hierarchy, it behaves like |
| 943 | any other View, and can overlap or be overlapped by other elements. You can |
| 944 | perform arbitrary transformations and retrieve the contents as a bitmap with |
| 945 | simple API calls.</p> |
| 946 | |
| 947 | <p>The main strike against TextureView is the performance of the composition step. |
| 948 | With SurfaceView, the content is written to a separate layer that SurfaceFlinger |
| 949 | composites, ideally with an overlay. With TextureView, the View composition is |
| 950 | always performed with GLES, and updates to its contents may cause other View |
| 951 | elements to redraw as well (e.g. if they're positioned on top of the |
| 952 | TextureView). After the View rendering completes, the app UI layer must then be |
| 953 | composited with other layers by SurfaceFlinger, so you're effectively |
| 954 | compositing every visible pixel twice. For a full-screen video player, or any |
| 955 | other application that is effectively just UI elements layered on top of video, |
| 956 | SurfaceView offers much better performance.</p> |
| 957 | |
| 958 | <p>As noted earlier, DRM-protected video can be presented only on an overlay plane. |
| 959 | Video players that support protected content must be implemented with |
| 960 | SurfaceView.</p> |
| 961 | |
| 962 | <h3 id="grafika">Case Study: Grafika's Play Video (TextureView)</h3> |
| 963 | |
| 964 | <p>Grafika includes a pair of video players, one implemented with TextureView, the |
| 965 | other with SurfaceView. The video decoding portion, which just sends frames |
| 966 | from MediaCodec to a Surface, is the same for both. The most interesting |
| 967 | differences between the implementations are the steps required to present the |
| 968 | correct aspect ratio.</p> |
| 969 | |
| 970 | <p>While SurfaceView requires a custom implementation of FrameLayout, resizing |
| 971 | SurfaceTexture is a simple matter of configuring a transformation matrix with |
| 972 | <code>TextureView#setTransform()</code>. For the former, you're sending new |
| 973 | window position and size values to SurfaceFlinger through WindowManager; for |
| 974 | the latter, you're just rendering it differently.</p> |
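<p>A sketch of the TextureView path, assuming the video and view dimensions
are already known; this produces a letterbox/pillarbox scale centered in the
view:</p>

<pre>
// Scale the content so it keeps its aspect ratio inside the view
// instead of being stretched to fill it.
float viewAspect = (float) viewWidth / viewHeight;
float videoAspect = (float) videoWidth / videoHeight;
float scaleX = 1.0f;
float scaleY = 1.0f;
if (videoAspect > viewAspect) {
    scaleY = viewAspect / videoAspect;   // bars on top and bottom
} else {
    scaleX = videoAspect / viewAspect;   // bars on left and right
}
Matrix matrix = new Matrix();
matrix.setScale(scaleX, scaleY, viewWidth / 2.0f, viewHeight / 2.0f);
textureView.setTransform(matrix);
</pre>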
| 975 | |
| 976 | <p>Otherwise, both implementations follow the same pattern. Once the Surface has |
| 977 | been created, playback is enabled. When "play" is hit, a video decoding thread |
| 978 | is started, with the Surface as the output target. After that, the app code |
| 979 | doesn't have to do anything -- composition and display will either be handled by |
| 980 | SurfaceFlinger (for the SurfaceView) or by TextureView.</p> |
| 981 | |
| 982 | <h3 id="decode">Case Study: Grafika's Double Decode</h3> |
| 983 | |
| 984 | <p>This activity demonstrates manipulation of the SurfaceTexture inside a |
| 985 | TextureView.</p> |
| 986 | |
| 987 | <p>The basic structure of this activity is a pair of TextureViews that show two |
| 988 | different videos playing side-by-side. To simulate the needs of a |
| 989 | videoconferencing app, we want to keep the MediaCodec decoders alive when the |
| 990 | activity is paused and resumed for an orientation change. The trick is that you |
| 991 | can't change the Surface that a MediaCodec decoder uses without fully |
| 992 | reconfiguring it, which is a fairly expensive operation; so we want to keep the |
| 993 | Surface alive. The Surface is just a handle to the producer interface in the |
| 994 | SurfaceTexture's BufferQueue, and the SurfaceTexture is managed by the |
TextureView, so we also need to keep the SurfaceTexture alive. So how do we deal
| 996 | with the TextureView getting torn down?</p> |
| 997 | |
| 998 | <p>It just so happens TextureView provides a <code>setSurfaceTexture()</code> call |
| 999 | that does exactly what we want. We obtain references to the SurfaceTextures |
| 1000 | from the TextureViews and save them in a static field. When the activity is |
| 1001 | shut down, we return "false" from the <code>onSurfaceTextureDestroyed()</code> |
| 1002 | callback to prevent destruction of the SurfaceTexture. When the activity is |
| 1003 | restarted, we stuff the old SurfaceTexture into the new TextureView. The |
| 1004 | TextureView class takes care of creating and destroying the EGL contexts.</p> |
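<p>In outline, with <code>sSavedSurfaceTexture</code> as the hypothetical
static field, <code>startDecoder()</code> as an app-specific helper, and the
other listener callbacks omitted:</p>

<pre>
// In the TextureView.SurfaceTextureListener:
@Override
public void onSurfaceTextureAvailable(SurfaceTexture st, int width, int height) {
    if (sSavedSurfaceTexture == null) {
        // First launch: remember the SurfaceTexture and start decoding.
        sSavedSurfaceTexture = st;
        startDecoder(new Surface(st));
    } else if (st != sSavedSurfaceTexture) {
        // TextureView was recreated; plug the old SurfaceTexture back in.
        mTextureView.setSurfaceTexture(sSavedSurfaceTexture);
    }
}

@Override
public boolean onSurfaceTextureDestroyed(SurfaceTexture st) {
    // Returning false tells TextureView not to release the
    // SurfaceTexture; we keep it for the next Activity instance.
    return false;
}
</pre>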
| 1005 | |
| 1006 | <p>Each video decoder is driven from a separate thread. At first glance it might |
| 1007 | seem like we need EGL contexts local to each thread; but remember the buffers |
| 1008 | with decoded output are actually being sent from mediaserver to our |
| 1009 | BufferQueue consumers (the SurfaceTextures). The TextureViews take care of the |
| 1010 | rendering for us, and they execute on the UI thread.</p> |
| 1011 | |
| 1012 | <p>Implementing this activity with SurfaceView would be a bit harder. We can't |
| 1013 | just create a pair of SurfaceViews and direct the output to them, because the |
| 1014 | Surfaces would be destroyed during an orientation change. Besides, that would |
| 1015 | add two layers, and limitations on the number of available overlays strongly |
| 1016 | motivate us to keep the number of layers to a minimum. Instead, we'd want to |
| 1017 | create a pair of SurfaceTextures to receive the output from the video decoders, |
| 1018 | and then perform the rendering in the app, using GLES to render two textured |
| 1019 | quads onto the SurfaceView's Surface.</p> |
| 1020 | |
| 1021 | <h2 id="notes">Conclusion</h2> |
| 1022 | |
| 1023 | <p>We hope this page has provided useful insights into the way Android handles |
| 1024 | graphics at the system level.</p> |
| 1025 | |
| 1026 | <p>Some information and advice on related topics can be found in the appendices |
| 1027 | that follow.</p> |
| 1028 | |
| 1029 | <h2 id="loops">Appendix A: Game Loops</h2> |
| 1030 | |
| 1031 | <p>A very popular way to implement a game loop looks like this:</p> |
| 1032 | |
| 1033 | <pre> |
| 1034 | while (playing) { |
| 1035 | advance state by one frame |
| 1036 | render the new frame |
| 1037 | sleep until it’s time to do the next frame |
| 1038 | } |
| 1039 | </pre> |
| 1040 | |
| 1041 | <p>There are a few problems with this, the most fundamental being the idea that the |
| 1042 | game can define what a "frame" is. Different displays will refresh at different |
| 1043 | rates, and that rate may vary over time. If you generate frames faster than the |
| 1044 | display can show them, you will have to drop one occasionally. If you generate |
| 1045 | them too slowly, SurfaceFlinger will periodically fail to find a new buffer to |
| 1046 | acquire and will re-show the previous frame. Both of these situations can |
| 1047 | cause visible glitches.</p> |
| 1048 | |
| 1049 | <p>What you need to do is match the display's frame rate, and advance game state |
| 1050 | according to how much time has elapsed since the previous frame. There are two |
| 1051 | ways to go about this: (1) stuff the BufferQueue full and rely on the "swap |
| 1052 | buffers" back-pressure; (2) use Choreographer (API 16+).</p> |
| 1053 | |
| 1054 | <h3 id="stuffing">Queue Stuffing</h3> |
| 1055 | |
| 1056 | <p>This is very easy to implement: just swap buffers as fast as you can. In early |
| 1057 | versions of Android this could actually result in a penalty where |
| 1058 | <code>SurfaceView#lockCanvas()</code> would put you to sleep for 100ms. Now |
| 1059 | it's paced by the BufferQueue, and the BufferQueue is emptied as quickly as |
| 1060 | SurfaceFlinger is able.</p> |
| 1061 | |
| 1062 | <p>One example of this approach can be seen in <a |
| 1063 | href="https://code.google.com/p/android-breakout/">Android Breakout</a>. It |
| 1064 | uses GLSurfaceView, which runs in a loop that calls the application's |
| 1065 | onDrawFrame() callback and then swaps the buffer. If the BufferQueue is full, |
| 1066 | the <code>eglSwapBuffers()</code> call will wait until a buffer is available. |
| 1067 | Buffers become available when SurfaceFlinger releases them, which it does after |
| 1068 | acquiring a new one for display. Because this happens on VSYNC, your draw loop |
| 1069 | timing will match the refresh rate. Mostly.</p> |
| 1070 | |
| 1071 | <p>There are a couple of problems with this approach. First, the app is tied to |
| 1072 | SurfaceFlinger activity, which is going to take different amounts of time |
| 1073 | depending on how much work there is to do and whether it's fighting for CPU time |
| 1074 | with other processes. Since your game state advances according to the time |
| 1075 | between buffer swaps, your animation won't update at a consistent rate. When |
| 1076 | running at 60fps with the inconsistencies averaged out over time, though, you |
| 1077 | probably won't notice the bumps.</p> |
| 1078 | |
| 1079 | <p>Second, the first couple of buffer swaps are going to happen very quickly |
| 1080 | because the BufferQueue isn't full yet. The computed time between frames will |
| 1081 | be near zero, so the game will generate a few frames in which nothing happens. |
| 1082 | In a game like Breakout, which updates the screen on every refresh, the queue is |
| 1083 | always full except when a game is first starting (or un-paused), so the effect |
| 1084 | isn't noticeable. A game that pauses animation occasionally and then returns to |
| 1085 | as-fast-as-possible mode might see odd hiccups.</p> |
| 1086 | |
| 1087 | <h3 id="choreographer">Choreographer</h3> |
| 1088 | |
| 1089 | <p>Choreographer allows you to set a callback that fires on the next VSYNC. The |
| 1090 | actual VSYNC time is passed in as an argument. So even if your app doesn't wake |
| 1091 | up right away, you still have an accurate picture of when the display refresh |
| 1092 | period began. Using this value, rather than the current time, yields a |
| 1093 | consistent time source for your game state update logic.</p> |
| 1094 | |
| 1095 | <p>Unfortunately, the fact that you get a callback after every VSYNC does not |
| 1096 | guarantee that your callback will be executed in a timely fashion or that you |
| 1097 | will be able to act upon it sufficiently swiftly. Your app will need to detect |
| 1098 | situations where it's falling behind and drop frames manually.</p> |
| 1099 | |
| 1100 | <p>The "Record GL app" activity in Grafika provides an example of this. On some |
| 1101 | devices (e.g. Nexus 4 and Nexus 5), the activity will start dropping frames if |
| 1102 | you just sit and watch. The GL rendering is trivial, but occasionally the View |
| 1103 | elements get redrawn, and the measure/layout pass can take a very long time if |
| 1104 | the device has dropped into a reduced-power mode. (According to systrace, it |
| 1105 | takes 28ms instead of 6ms after the clocks slow on Android 4.4. If you drag |
| 1106 | your finger around the screen, it thinks you're interacting with the activity, |
| 1107 | so the clock speeds stay high and you'll never drop a frame.)</p> |
| 1108 | |
| 1109 | <p>The simple fix was to drop a frame in the Choreographer callback if the current |
| 1110 | time is more than N milliseconds after the VSYNC time. Ideally the value of N |
| 1111 | is determined based on previously observed VSYNC intervals. For example, if the |
| 1112 | refresh period is 16.7ms (60fps), you might drop a frame if you're running more |
| 1113 | than 15ms late.</p> |
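<p>A sketch of such a callback (<code>FRAME_DROP_THRESHOLD_NS</code>,
<code>updateGameState()</code>, and <code>drawFrame()</code> are
hypothetical):</p>

<pre>
private final Choreographer.FrameCallback mFrameCallback =
        new Choreographer.FrameCallback() {
    @Override
    public void doFrame(long frameTimeNanos) {
        // Re-register first so we keep getting VSYNC callbacks.
        Choreographer.getInstance().postFrameCallback(this);

        long lateNs = System.nanoTime() - frameTimeNanos;
        if (lateNs > FRAME_DROP_THRESHOLD_NS) {
            return;  // running badly behind; drop this frame
        }
        updateGameState(frameTimeNanos);  // advance by VSYNC time, not "now"
        drawFrame();
    }
};
</pre>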
| 1114 | |
| 1115 | <p>If you watch "Record GL app" run, you will see the dropped-frame counter |
| 1116 | increase, and even see a flash of red in the border when frames drop. Unless |
| 1117 | your eyes are very good, though, you won't see the animation stutter. At 60fps, |
| 1118 | the app can drop the occasional frame without anyone noticing so long as the |
| 1119 | animation continues to advance at a constant rate. How much you can get away |
| 1120 | with depends to some extent on what you're drawing, the characteristics of the |
| 1121 | display, and how good the person using the app is at detecting jank.</p> |
| 1122 | |
| 1123 | <h3 id="thread">Thread Management</h3> |
| 1124 | |
| 1125 | <p>Generally speaking, if you're rendering onto a SurfaceView, GLSurfaceView, or |
| 1126 | TextureView, you want to do that rendering in a dedicated thread. Never do any |
| 1127 | "heavy lifting" or anything that takes an indeterminate amount of time on the |
| 1128 | UI thread.</p> |
| 1129 | |
| 1130 | <p>Breakout and "Record GL app" use dedicated renderer threads, and they also |
| 1131 | update animation state on that thread. This is a reasonable approach so long as |
| 1132 | game state can be updated quickly.</p> |
| 1133 | |
| 1134 | <p>Other games separate the game logic and rendering completely. If you had a |
| 1135 | simple game that did nothing but move a block every 100ms, you could have a |
| 1136 | dedicated thread that just did this:</p> |
| 1137 | |
<pre>
@Override
public void run() {
    // mRunning and mLock are fields of the surrounding class.
    while (mRunning) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException ie) {
            return;  // interrupted; shut the thread down
        }
        synchronized (mLock) {
            moveBlock();
        }
    }
}
</pre>
| 1146 | |
| 1147 | <p>(You may want to base the sleep time off of a fixed clock to prevent drift -- |
| 1148 | sleep() isn't perfectly consistent, and moveBlock() takes a nonzero amount of |
| 1149 | time -- but you get the idea.)</p> |
| 1150 | |
| 1151 | <p>When the draw code wakes up, it just grabs the lock, gets the current position |
| 1152 | of the block, releases the lock, and draws. Instead of doing fractional |
| 1153 | movement based on inter-frame delta times, you just have one thread that moves |
| 1154 | things along and another thread that draws things wherever they happen to be |
| 1155 | when the drawing starts.</p> |
| 1156 | |
| 1157 | <p>For a scene with any complexity you'd want to create a list of upcoming events |
| 1158 | sorted by wake time, and sleep until the next event is due, but it's the same |
| 1159 | idea.</p> |
| 1160 | |
| 1161 | <h2 id="activity">Appendix B: SurfaceView and the Activity Lifecycle</h2> |
| 1162 | |
| 1163 | <p>When using a SurfaceView, it's considered good practice to render the Surface |
| 1164 | from a thread other than the main UI thread. This raises some questions about |
| 1165 | the interaction between that thread and the Activity lifecycle.</p> |
| 1166 | |
| 1167 | <p>First, a little background. For an Activity with a SurfaceView, there are two |
| 1168 | separate but interdependent state machines:</p> |
| 1169 | |
| 1170 | <ol> |
| 1171 | <li>Application onCreate / onResume / onPause</li> |
| 1172 | <li>Surface created / changed / destroyed</li> |
| 1173 | </ol> |
| 1174 | |
| 1175 | <p>When the Activity starts, you get callbacks in this order:</p> |
| 1176 | |
| 1177 | <ul> |
| 1178 | <li>onCreate</li> |
| 1179 | <li>onResume</li> |
| 1180 | <li>surfaceCreated</li> |
| 1181 | <li>surfaceChanged</li> |
| 1182 | </ul> |
| 1183 | |
| 1184 | <p>If you hit "back" you get:</p> |
| 1185 | |
| 1186 | <ul> |
| 1187 | <li>onPause</li> |
| 1188 | <li>surfaceDestroyed (called just before the Surface goes away)</li> |
| 1189 | </ul> |
| 1190 | |
| 1191 | <p>If you rotate the screen, the Activity is torn down and recreated, so you |
| 1192 | get the full cycle. If it matters, you can tell that it's a "quick" restart by |
| 1193 | checking <code>isFinishing()</code>. (It might be possible to start / stop an |
| 1194 | Activity so quickly that surfaceCreated() might actually happen after onPause().)</p> |
| 1195 | |
| 1196 | <p>If you tap the power button to blank the screen, you only get |
| 1197 | <code>onPause()</code> -- no <code>surfaceDestroyed()</code>. The Surface |
| 1198 | remains alive, and rendering can continue. You can even keep getting |
| 1199 | Choreographer events if you continue to request them. If you have a lock |
| 1200 | screen that forces a different orientation, your Activity may be restarted when |
| 1201 | the device is unblanked; but if not, you can come out of screen-blank with the |
| 1202 | same Surface you had before.</p> |
| 1203 | |
| 1204 | <p>This raises a fundamental question when using a separate renderer thread with |
| 1205 | SurfaceView: Should the lifespan of the thread be tied to that of the Surface or |
| 1206 | the Activity? The answer depends on what you want to have happen when the |
| 1207 | screen goes blank. There are two basic approaches: (1) start/stop the thread on |
| 1208 | Activity start/stop; (2) start/stop the thread on Surface create/destroy.</p> |
| 1209 | |
| 1210 | <p>#1 interacts well with the app lifecycle. We start the renderer thread in |
| 1211 | <code>onResume()</code> and stop it in <code>onPause()</code>. It gets a bit |
| 1212 | awkward when creating and configuring the thread because sometimes the Surface |
| 1213 | will already exist and sometimes it won't (e.g. it's still alive after toggling |
| 1214 | the screen with the power button). We have to wait for the surface to be |
| 1215 | created before we do some initialization in the thread, but we can't simply do |
| 1216 | it in the <code>surfaceCreated()</code> callback because that won't fire again |
| 1217 | if the Surface didn't get recreated. So we need to query or cache the Surface |
| 1218 | state, and forward it to the renderer thread. Note we have to be a little |
| 1219 | careful here passing objects between threads -- it is best to pass the Surface or |
| 1220 | SurfaceHolder through a Handler message, rather than just stuffing it into the |
| 1221 | thread, to avoid issues on multi-core systems (cf. the <a |
| 1222 | href="http://developer.android.com/training/articles/smp.html">Android SMP |
| 1223 | Primer</a>).</p> |
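<p>For example, the UI thread might forward the SurfaceHolder like this
(<code>mRenderHandler</code>, <code>MSG_SURFACE_AVAILABLE</code>, and
<code>createEglSurface()</code> are hypothetical names):</p>

<pre>
// UI thread:
@Override
public void surfaceCreated(SurfaceHolder holder) {
    // Pass the object through the renderer thread's Handler rather
    // than stuffing it into a shared field.
    mRenderHandler.sendMessage(
            mRenderHandler.obtainMessage(MSG_SURFACE_AVAILABLE, holder));
}

// In the renderer thread's Handler:
@Override
public void handleMessage(Message msg) {
    switch (msg.what) {
        case MSG_SURFACE_AVAILABLE:
            createEglSurface((SurfaceHolder) msg.obj);
            break;
    }
}
</pre>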
| 1224 | |
| 1225 | <p>#2 has a certain appeal because the Surface and the renderer are logically |
| 1226 | intertwined. We start the thread after the Surface has been created, which |
| 1227 | avoids some inter-thread communication concerns. Surface created / changed |
| 1228 | messages are simply forwarded. We need to make sure rendering stops when the |
| 1229 | screen goes blank, and resumes when it un-blanks; this could be a simple matter |
| 1230 | of telling Choreographer to stop invoking the frame draw callback. Our |
| 1231 | <code>onResume()</code> will need to resume the callbacks if and only if the |
| 1232 | renderer thread is running. It may not be so trivial though -- if we animate |
| 1233 | based on elapsed time between frames, we could have a very large gap when the |
| 1234 | next event arrives; so an explicit pause/resume message may be desirable.</p> |
| 1235 | |
| 1236 | <p>The above is primarily concerned with how the renderer thread is configured and |
| 1237 | whether it's executing. A related concern is extracting state from the thread |
| 1238 | when the Activity is killed (in <code>onPause()</code> or <code>onSaveInstanceState()</code>). |
| 1239 | Approach #1 will work best for that, because once the renderer thread has been |
| 1240 | joined its state can be accessed without synchronization primitives.</p> |
| 1241 | |
| 1242 | <p>You can see an example of approach #2 in Grafika's "Hardware scaler exerciser."</p> |
| 1243 | |
| 1244 | <h2 id="tracking">Appendix C: Tracking BufferQueue with systrace</h2> |
| 1245 | |
| 1246 | <p>If you really want to understand how graphics buffers move around, you need to |
| 1247 | use systrace. The system-level graphics code is well instrumented, as is much |
| 1248 | of the relevant app framework code. Enable the "gfx" and "view" tags, and |
| 1249 | generally "sched" as well.</p> |
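<p>For example, a ten-second capture run from the SDK's systrace directory
might look like this (exact options vary between SDK versions):</p>

<pre>
python systrace.py --time=10 -o mytrace.html gfx view sched
</pre>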
| 1250 | |
| 1251 | <p>A full description of how to use systrace effectively would fill a rather long |
| 1252 | document. One noteworthy item is the presence of BufferQueues in the trace. If |
| 1253 | you've used systrace before, you've probably seen them, but maybe weren't sure |
| 1254 | what they were. As an example, if you grab a trace while Grafika's "Play video |
| 1255 | (SurfaceView)" is running, you will see a row labeled: "SurfaceView" This row |
| 1256 | tells you how many buffers were queued up at any given time.</p> |
| 1257 | |
| 1258 | <p>You'll notice the value increments while the app is active -- triggering |
| 1259 | the rendering of frames by the MediaCodec decoder -- and decrements while |
| 1260 | SurfaceFlinger is doing work, consuming buffers. If you're showing video at |
| 1261 | 30fps, the queue's value will vary from 0 to 1, because the ~60fps display can |
| 1262 | easily keep up with the source. (You'll also notice that SurfaceFlinger is only |
| 1263 | waking up when there's work to be done, not 60 times per second. The system tries |
| 1264 | very hard to avoid work and will disable VSYNC entirely if nothing is updating |
| 1265 | the screen.)</p> |
| 1266 | |
| 1267 | <p>If you switch to "Play video (TextureView)" and grab a new trace, you'll see a |
| 1268 | row with a much longer name |
| 1269 | ("com.android.grafika/com.android.grafika.PlayMovieActivity"). This is the |
| 1270 | main UI layer, which is of course just another BufferQueue. Because TextureView |
| 1271 | renders into the UI layer, rather than a separate layer, you'll see all of the |
| 1272 | video-driven updates here.</p> |
| 1273 | |
| 1274 | <p>For more information about systrace, see the <a |
| 1275 | href="http://developer.android.com/tools/help/systrace.html">Android |
| 1276 | documentation</a> for the tool.</p> |