page.title=Graphics architecture
@jd:body
3
4<!--
5 Copyright 2014 The Android Open Source Project
6
7 Licensed under the Apache License, Version 2.0 (the "License");
8 you may not use this file except in compliance with the License.
9 You may obtain a copy of the License at
10
11 http://www.apache.org/licenses/LICENSE-2.0
12
13 Unless required by applicable law or agreed to in writing, software
14 distributed under the License is distributed on an "AS IS" BASIS,
15 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
16 See the License for the specific language governing permissions and
17 limitations under the License.
18-->
19<div id="qv-wrapper">
20 <div id="qv">
21 <h2>In this document</h2>
22 <ol id="auto-toc">
23 </ol>
24 </div>
25</div>
26
27
28<p><em>What every developer should know about Surface, SurfaceHolder, EGLSurface,
29SurfaceView, GLSurfaceView, SurfaceTexture, TextureView, and SurfaceFlinger</em>
30</p>
31<p>This document describes the essential elements of Android's "system-level"
32 graphics architecture, and how it is used by the application framework and
33 multimedia system. The focus is on how buffers of graphical data move through
34 the system. If you've ever wondered why SurfaceView and TextureView behave the
35 way they do, or how Surface and EGLSurface interact, you've come to the right
36place.</p>
37
38<p>Some familiarity with Android devices and application development is assumed.
39You don't need detailed knowledge of the app framework, and very few API calls
40will be mentioned, but the material herein doesn't overlap much with other
41public documentation. The goal here is to provide a sense for the significant
42events involved in rendering a frame for output, so that you can make informed
43choices when designing an application. To achieve this, we work from the bottom
44up, describing how the UI classes work rather than how they can be used.</p>
45
46<p>Early sections contain background material used in later sections, so it's a
47good idea to read straight through rather than skipping to a section that sounds
48interesting. We start with an explanation of Android's graphics buffers,
49describe the composition and display mechanism, and then proceed to the
50higher-level mechanisms that supply the compositor with data.</p>
51
52<p>This document is chiefly concerned with the system as it exists in Android 4.4
53("KitKat"). Earlier versions of the system worked differently, and future
54versions will likely be different as well. Version-specific features are called
55out in a few places.</p>
56
<p>At various points I will refer to source code from the AOSP sources or from
Grafika. Grafika is a Google open source project for testing; it can be found at
<a
href="https://github.com/google/grafika">https://github.com/google/grafika</a>.
It's more "quick hack" than solid example code, but it will suffice.</p>
62<h2 id="BufferQueue">BufferQueue and gralloc</h2>
63
64<p>To understand how Android's graphics system works, we have to start behind the
65scenes. At the heart of everything graphical in Android is a class called
66BufferQueue. Its role is simple enough: connect something that generates
67buffers of graphical data (the "producer") to something that accepts the data
68for display or further processing (the "consumer"). The producer and consumer
69can live in different processes. Nearly everything that moves buffers of
70graphical data through the system relies on BufferQueue.</p>
71
72<p>The basic usage is straightforward. The producer requests a free buffer
73(<code>dequeueBuffer()</code>), specifying a set of characteristics including width,
74height, pixel format, and usage flags. The producer populates the buffer and
75returns it to the queue (<code>queueBuffer()</code>). Some time later, the consumer
76acquires the buffer (<code>acquireBuffer()</code>) and makes use of the buffer contents.
77When the consumer is done, it returns the buffer to the queue
78(<code>releaseBuffer()</code>).</p>
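
<p>BufferQueue itself is not a public API, but the same
dequeue/queue/acquire/release cycle is visible through its public wrappers. The
sketch below is illustrative only: it uses ImageReader (a BufferQueue consumer)
and the Surface it exposes (the producer side). The sizes, format, and the
<code>handler</code> variable are assumptions, and whether a device accepts CPU
rendering into this particular queue depends on the gralloc usage flags
described below.</p>

<pre>
// Consumer side: ImageReader owns a BufferQueue and hands out its producer Surface.
ImageReader reader = ImageReader.newInstance(640, 480, PixelFormat.RGBA_8888, 2);
reader.setOnImageAvailableListener(new ImageReader.OnImageAvailableListener() {
    @Override public void onImageAvailable(ImageReader r) {
        Image image = r.acquireLatestImage();   // roughly acquireBuffer()
        if (image != null) {
            // ... examine image.getPlanes() ...
            image.close();                      // roughly releaseBuffer()
        }
    }
}, handler);   // 'handler' is an android.os.Handler on whatever thread should consume

// Producer side: lockCanvas() dequeues a buffer, unlockCanvasAndPost() queues it.
Surface producer = reader.getSurface();
Canvas canvas = producer.lockCanvas(null);      // roughly dequeueBuffer()
canvas.drawColor(Color.GREEN);
producer.unlockCanvasAndPost(canvas);           // roughly queueBuffer()
</pre>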
79
80<p>Most recent Android devices support the "sync framework". This allows the
system to do some nifty things when combined with hardware components that can
82manipulate graphics data asynchronously. For example, a producer can submit a
83series of OpenGL ES drawing commands and then enqueue the output buffer before
84rendering completes. The buffer is accompanied by a fence that signals when the
85contents are ready. A second fence accompanies the buffer when it is returned
86to the free list, so that the consumer can release the buffer while the contents
87are still in use. This approach improves latency and throughput as the buffers
88move through the system.</p>
89
90<p>Some characteristics of the queue, such as the maximum number of buffers it can
91hold, are determined jointly by the producer and the consumer.</p>
92
93<p>The BufferQueue is responsible for allocating buffers as it needs them. Buffers
94are retained unless the characteristics change; for example, if the producer
95starts requesting buffers with a different size, the old buffers will be freed
96and new buffers will be allocated on demand.</p>
97
98<p>The data structure is currently always created and "owned" by the consumer. In
99Android 4.3 only the producer side was "binderized", i.e. the producer could be
100in a remote process but the consumer had to live in the process where the queue
101was created. This evolved a bit in 4.4, moving toward a more general
102implementation.</p>
103
104<p>Buffer contents are never copied by BufferQueue. Moving that much data around
105would be very inefficient. Instead, buffers are always passed by handle.</p>
106
107<h3 id="gralloc_HAL">gralloc HAL</h3>
108
109<p>The actual buffer allocations are performed through a memory allocator called
110"gralloc", which is implemented through a vendor-specific HAL interface (see
111<a
112href="https://android.googlesource.com/platform/hardware/libhardware/+/kitkat-release/include/hardware/gralloc.h">hardware/libhardware/include/hardware/gralloc.h</a>).
113The <code>alloc()</code> function takes the arguments you'd expect -- width,
114height, pixel format -- as well as a set of usage flags. Those flags merit
115closer attention.</p>
116
117<p>The gralloc allocator is not just another way to allocate memory on the native
118heap. In some situations, the allocated memory may not be cache-coherent, or
119could be totally inaccessible from user space. The nature of the allocation is
120determined by the usage flags, which include attributes like:</p>
121
122<ul>
123<li>how often the memory will be accessed from software (CPU)</li>
124<li>how often the memory will be accessed from hardware (GPU)</li>
125<li>whether the memory will be used as an OpenGL ES ("GLES") texture</li>
126<li>whether the memory will be used by a video encoder</li>
127</ul>
128
129<p>For example, if your format specifies RGBA 8888 pixels, and you indicate
130the buffer will be accessed from software -- meaning your application will touch
131pixels directly -- then the allocator needs to create a buffer with 4 bytes per
132pixel in R-G-B-A order. If instead you say the buffer will only be
133accessed from hardware and as a GLES texture, the allocator can do anything the
134GLES driver wants -- BGRA ordering, non-linear "swizzled" layouts, alternative
135color formats, etc. Allowing the hardware to use its preferred format can
136improve performance.</p>
137
138<p>Some values cannot be combined on certain platforms. For example, the "video
139encoder" flag may require YUV pixels, so adding "software access" and specifying
140RGBA 8888 would fail.</p>
141
142<p>The handle returned by the gralloc allocator can be passed between processes
143through Binder.</p>
144
145<h2 id="SurfaceFlinger">SurfaceFlinger and Hardware Composer</h2>
146
147<p>Having buffers of graphical data is wonderful, but life is even better when you
148get to see them on your device's screen. That's where SurfaceFlinger and the
149Hardware Composer HAL come in.</p>
150
151<p>SurfaceFlinger's role is to accept buffers of data from multiple sources,
152composite them, and send them to the display. Once upon a time this was done
153with software blitting to a hardware framebuffer (e.g.
154<code>/dev/graphics/fb0</code>), but those days are long gone.</p>
155
156<p>When an app comes to the foreground, the WindowManager service asks
157SurfaceFlinger for a drawing surface. SurfaceFlinger creates a "layer" - the
158primary component of which is a BufferQueue - for which SurfaceFlinger acts as
159the consumer. A Binder object for the producer side is passed through the
160WindowManager to the app, which can then start sending frames directly to
SurfaceFlinger.</p>
162
163<p class="note"><strong>Note:</strong> The WindowManager uses the term "window" instead of
Clay Murphyccf30372014-04-07 16:13:19 -0700164"layer" for this and uses "layer" to mean something else. We're going to use the
165SurfaceFlinger terminology. It can be argued that SurfaceFlinger should really
Bert McMeen3bb4b8f2015-05-06 17:21:27 -0700166be called LayerFlinger.</p>
Clay Murphyccf30372014-04-07 16:13:19 -0700167
168<p>For most apps, there will be three layers on screen at any time: the "status
169bar" at the top of the screen, the "navigation bar" at the bottom or side, and
the application's UI. Some apps will have more or fewer, e.g. the default home app has a
171separate layer for the wallpaper, while a full-screen game might hide the status
172bar. Each layer can be updated independently. The status and navigation bars
173are rendered by a system process, while the app layers are rendered by the app,
174with no coordination between the two.</p>
175
176<p>Device displays refresh at a certain rate, typically 60 frames per second on
177phones and tablets. If the display contents are updated mid-refresh, "tearing"
178will be visible; so it's important to update the contents only between cycles.
179The system receives a signal from the display when it's safe to update the
180contents. For historical reasons we'll call this the VSYNC signal.</p>
181
182<p>The refresh rate may vary over time, e.g. some mobile devices will range from 58
183to 62fps depending on current conditions. For an HDMI-attached television, this
184could theoretically dip to 24 or 48Hz to match a video. Because we can update
185the screen only once per refresh cycle, submitting buffers for display at
186200fps would be a waste of effort as most of the frames would never be seen.
187Instead of taking action whenever an app submits a buffer, SurfaceFlinger wakes
188up when the display is ready for something new.</p>
189
190<p>When the VSYNC signal arrives, SurfaceFlinger walks through its list of layers
191looking for new buffers. If it finds a new one, it acquires it; if not, it
192continues to use the previously-acquired buffer. SurfaceFlinger always wants to
193have something to display, so it will hang on to one buffer. If no buffers have
194ever been submitted on a layer, the layer is ignored.</p>
195
196<p>Once SurfaceFlinger has collected all of the buffers for visible layers, it
197asks the Hardware Composer how composition should be performed.</p>
198
199<h3 id="hwcomposer">Hardware Composer</h3>
200
201<p>The Hardware Composer HAL ("HWC") was first introduced in Android 3.0
202("Honeycomb") and has evolved steadily over the years. Its primary purpose is
203to determine the most efficient way to composite buffers with the available
204hardware. As a HAL, its implementation is device-specific and usually
205implemented by the display hardware OEM.</p>
206
207<p>The value of this approach is easy to recognize when you consider "overlay
208planes." The purpose of overlay planes is to composite multiple buffers
209together, but in the display hardware rather than the GPU. For example, suppose
210you have a typical Android phone in portrait orientation, with the status bar on
211top and navigation bar at the bottom, and app content everywhere else. The contents
212for each layer are in separate buffers. You could handle composition by
213rendering the app content into a scratch buffer, then rendering the status bar
214over it, then rendering the navigation bar on top of that, and finally passing the
215scratch buffer to the display hardware. Or, you could pass all three buffers to
216the display hardware, and tell it to read data from different buffers for
217different parts of the screen. The latter approach can be significantly more
218efficient.</p>
219
220<p>As you might expect, the capabilities of different display processors vary
221significantly. The number of overlays, whether layers can be rotated or
222blended, and restrictions on positioning and overlap can be difficult to express
223through an API. So, the HWC works like this:</p>
224
225<ol>
226<li>SurfaceFlinger provides the HWC with a full list of layers, and asks, "how do
227you want to handle this?"</li>
228<li>The HWC responds by marking each layer as "overlay" or "GLES composition."</li>
229<li>SurfaceFlinger takes care of any GLES composition, passing the output buffer
230to HWC, and lets HWC handle the rest.</li>
231</ol>
232
233<p>Since the decision-making code can be custom tailored by the hardware vendor,
234it's possible to get the best performance out of every device.</p>
235
236<p>Overlay planes may be less efficient than GL composition when nothing on the
237screen is changing. This is particularly true when the overlay contents have
238transparent pixels, and overlapping layers are being blended together. In such
239cases, the HWC can choose to request GLES composition for some or all layers
240and retain the composited buffer. If SurfaceFlinger comes back again asking to
241composite the same set of buffers, the HWC can just continue to show the
242previously-composited scratch buffer. This can improve the battery life of an
243idle device.</p>
244
245<p>Devices shipping with Android 4.4 ("KitKat") typically support four overlay
246planes. Attempting to composite more layers than there are overlays will cause
247the system to use GLES composition for some of them; so the number of layers
248used by an application can have a measurable impact on power consumption and
249performance.</p>
250
251<p>You can see exactly what SurfaceFlinger is up to with the command <code>adb shell
252dumpsys SurfaceFlinger</code>. The output is verbose. The part most relevant to our
253current discussion is the HWC summary that appears near the bottom of the
254output:</p>
255
256<pre>
257 type | source crop | frame name
258------------+-----------------------------------+--------------------------------
259 HWC | [ 0.0, 0.0, 320.0, 240.0] | [ 48, 411, 1032, 1149] SurfaceView
260 HWC | [ 0.0, 75.0, 1080.0, 1776.0] | [ 0, 75, 1080, 1776] com.android.grafika/com.android.grafika.PlayMovieSurfaceActivity
261 HWC | [ 0.0, 0.0, 1080.0, 75.0] | [ 0, 0, 1080, 75] StatusBar
262 HWC | [ 0.0, 0.0, 1080.0, 144.0] | [ 0, 1776, 1080, 1920] NavigationBar
263 FB TARGET | [ 0.0, 0.0, 1080.0, 1920.0] | [ 0, 0, 1080, 1920] HWC_FRAMEBUFFER_TARGET
264</pre>
265
266<p>This tells you what layers are on screen, whether they're being handled with
267overlays ("HWC") or OpenGL ES composition ("GLES"), and gives you a bunch of
268other facts you probably won't care about ("handle" and "hints" and "flags" and
269other stuff that we've trimmed out of the snippet above). The "source crop" and
270"frame" values will be examined more closely later on.</p>
271
272<p>The FB_TARGET layer is where GLES composition output goes. Since all layers
273shown above are using overlays, FB_TARGET isn’t being used for this frame. The
274layer's name is indicative of its original role: On a device with
275<code>/dev/graphics/fb0</code> and no overlays, all composition would be done
276with GLES, and the output would be written to the framebuffer. On recent devices there
generally is no simple framebuffer, so the FB_TARGET layer is a scratch buffer.</p>
278
279<p class="note"><strong>Note:</strong> This is why screen grabbers written for old versions of Android no
280longer work: They're trying to read from the Framebuffer, but there is no such
281thing.</p>
Clay Murphyccf30372014-04-07 16:13:19 -0700282
283<p>The overlay planes have another important role: they're the only way to display
284DRM content. DRM-protected buffers cannot be accessed by SurfaceFlinger or the
285GLES driver, which means that your video will disappear if HWC switches to GLES
286composition.</p>
287
288<h3 id="triple-buffering">The Need for Triple-Buffering</h3>
289
290<p>To avoid tearing on the display, the system needs to be double-buffered: the
291front buffer is displayed while the back buffer is being prepared. At VSYNC, if
292the back buffer is ready, you quickly switch them. This works reasonably well
293in a system where you're drawing directly into the framebuffer, but there's a
294hitch in the flow when a composition step is added. Because of the way
295SurfaceFlinger is triggered, our double-buffered pipeline will have a bubble.</p>
296
297<p>Suppose frame N is being displayed, and frame N+1 has been acquired by
298SurfaceFlinger for display on the next VSYNC. (Assume frame N is composited
299with an overlay, so we can't alter the buffer contents until the display is done
300with it.) When VSYNC arrives, HWC flips the buffers. While the app is starting
301to render frame N+2 into the buffer that used to hold frame N, SurfaceFlinger is
302scanning the layer list, looking for updates. SurfaceFlinger won't find any new
303buffers, so it prepares to show frame N+1 again after the next VSYNC. A little
304while later, the app finishes rendering frame N+2 and queues it for
305SurfaceFlinger, but it's too late. This has effectively cut our maximum frame
306rate in half.</p>
307
308<p>We can fix this with triple-buffering. Just before VSYNC, frame N is being
309displayed, frame N+1 has been composited (or scheduled for an overlay) and is
310ready to be displayed, and frame N+2 is queued up and ready to be acquired by
311SurfaceFlinger. When the screen flips, the buffers rotate through the stages
312with no bubble. The app has just less than a full VSYNC period (16.7ms at 60fps) to
313do its rendering and queue the buffer. And SurfaceFlinger / HWC has a full VSYNC
314period to figure out the composition before the next flip. The downside is
315that it takes at least two VSYNC periods for anything that the app does to
316appear on the screen. As the latency increases, the device feels less
317responsive to touch input.</p>
318
319<img src="images/surfaceflinger_bufferqueue.png" alt="SurfaceFlinger with BufferQueue" />
320
321<p class="img-caption">
322 <strong>Figure 1.</strong> SurfaceFlinger + BufferQueue
323</p>
324
325<p>The diagram above depicts the flow of SurfaceFlinger and BufferQueue. During
326frame:</p>
327
328<ol>
329<li>red buffer fills up, then slides into BufferQueue</li>
330<li>after red buffer leaves app, blue buffer slides in, replacing it</li>
331<li>green buffer and systemUI* shadow-slide into HWC (showing that SurfaceFlinger
332still has the buffers, but now HWC has prepared them for display via overlay on
333the next VSYNC).</li>
334</ol>
335
336<p>The blue buffer is referenced by both the display and the BufferQueue. The
337app is not allowed to render to it until the associated sync fence signals.</p>
338
339<p>On VSYNC, all of these happen at once:</p>
340
341<ul>
342<li>red buffer leaps into SurfaceFlinger, replacing green buffer</li>
343<li>green buffer leaps into Display, replacing blue buffer, and a dotted-line
344green twin appears in the BufferQueue</li>
345<li>the blue buffer’s fence is signaled, and the blue buffer in App empties**</li>
346<li>display rect changes from &lt;blue + SystemUI&gt; to &lt;green +
347SystemUI&gt;</li>
348</ul>
349
350<p><strong>*</strong> - The System UI process is providing the status and nav
351bars, which for our purposes here aren’t changing, so SurfaceFlinger keeps using
352the previously-acquired buffer. In practice there would be two separate
353buffers, one for the status bar at the top, one for the navigation bar at the
354bottom, and they would be sized to fit their contents. Each would arrive on its
355own BufferQueue.</p>
356
357<p><strong>**</strong> - The buffer doesn’t actually “empty”; if you submit it
358without drawing on it you’ll get that same blue again. The emptying is the
359result of clearing the buffer contents, which the app should do before it starts
360drawing.</p>
361
<p>We can reduce the latency by noting that layer composition should not require a
363full VSYNC period. If composition is performed by overlays, it takes essentially
364zero CPU and GPU time. But we can't count on that, so we need to allow a little
365time. If the app starts rendering halfway between VSYNC signals, and
366SurfaceFlinger defers the HWC setup until a few milliseconds before the signal
367is due to arrive, we can cut the latency from 2 frames to perhaps 1.5. In
368theory you could render and composite in a single period, allowing a return to
369double-buffering; but getting it down that far is difficult on current devices.
370Minor fluctuations in rendering and composition time, and switching from
371overlays to GLES composition, can cause us to miss a swap deadline and repeat
372the previous frame.</p>
373
374<p>SurfaceFlinger's buffer handling demonstrates the fence-based buffer
375management mentioned earlier. If we're animating at full speed, we need to
376have an acquired buffer for the display ("front") and an acquired buffer for
377the next flip ("back"). If we're showing the buffer on an overlay, the
378contents are being accessed directly by the display and must not be touched.
379But if you look at an active layer's BufferQueue state in the <code>dumpsys
380SurfaceFlinger</code> output, you'll see one acquired buffer, one queued buffer, and
381one free buffer. That's because, when SurfaceFlinger acquires the new "back"
382buffer, it releases the current "front" buffer to the queue. The "front"
383buffer is still in use by the display, so anything that dequeues it must wait
384for the fence to signal before drawing on it. So long as everybody follows
385the fencing rules, all of the queue-management IPC requests can happen in
386parallel with the display.</p>
387
388<h3 id="virtual-displays">Virtual Displays</h3>
389
390<p>SurfaceFlinger supports a "primary" display, i.e. what's built into your phone
391or tablet, and an "external" display, such as a television connected through
392HDMI. It also supports a number of "virtual" displays, which make composited
393output available within the system. Virtual displays can be used to record the
394screen or send it over a network.</p>
395
<p>A virtual display may share the same set of layers as the main display
397(the "layer stack") or have its own set. There is no VSYNC for a virtual
398display, so the VSYNC for the primary display is used to trigger composition for
399all displays.</p>
400
401<p>In the past, virtual displays were always composited with GLES. The Hardware
402Composer managed composition for only the primary display. In Android 4.4, the
403Hardware Composer gained the ability to participate in virtual display
404composition.</p>
405
406<p>As you might expect, the frames generated for a virtual display are written to a
407BufferQueue.</p>
408
409<h3 id="screenrecord">Case study: screenrecord</h3>
410
411<p>Now that we've established some background on BufferQueue and SurfaceFlinger,
412it's useful to examine a practical use case.</p>
413
414<p>The <a href="https://android.googlesource.com/platform/frameworks/av/+/kitkat-release/cmds/screenrecord/">screenrecord
415command</a>,
416introduced in Android 4.4, allows you to record everything that appears on the
417screen as an .mp4 file on disk. To implement this, we have to receive composited
418frames from SurfaceFlinger, write them to the video encoder, and then write the
419encoded video data to a file. The video codecs are managed by a separate
420process - called "mediaserver" - so we have to move large graphics buffers around
421the system. To make it more challenging, we're trying to record 60fps video at
422full resolution. The key to making this work efficiently is BufferQueue.</p>
423
424<p>The MediaCodec class allows an app to provide data as raw bytes in buffers, or
425through a Surface. We'll discuss Surface in more detail later, but for now just
426think of it as a wrapper around the producer end of a BufferQueue. When
427screenrecord requests access to a video encoder, mediaserver creates a
428BufferQueue and connects itself to the consumer side, and then passes the
429producer side back to screenrecord as a Surface.</p>
430
431<p>The screenrecord command then asks SurfaceFlinger to create a virtual display
432that mirrors the main display (i.e. it has all of the same layers), and directs
433it to send output to the Surface that came from mediaserver. Note that, in this
434case, SurfaceFlinger is the producer of buffers rather than the consumer.</p>
435
436<p>Once the configuration is complete, screenrecord can just sit and wait for
437encoded data to appear. As apps draw, their buffers travel to SurfaceFlinger,
438which composites them into a single buffer that gets sent directly to the video
439encoder in mediaserver. The full frames are never even seen by the screenrecord
440process. Internally, mediaserver has its own way of moving buffers around that
441also passes data by handle, minimizing overhead.</p>
442
443<h3 id="simulate-secondary">Case study: Simulate Secondary Displays</h3>
444
445<p>The WindowManager can ask SurfaceFlinger to create a visible layer for which
446SurfaceFlinger will act as the BufferQueue consumer. It's also possible to ask
447SurfaceFlinger to create a virtual display, for which SurfaceFlinger will act as
448the BufferQueue producer. What happens if you connect them, configuring a
449virtual display that renders to a visible layer?</p>
450
451<p>You create a closed loop, where the composited screen appears in a window. Of
452course, that window is now part of the composited output, so on the next refresh
453the composited image inside the window will show the window contents as well.
454It's turtles all the way down. You can see this in action by enabling
455"<a href="http://developer.android.com/tools/index.html">Developer options</a>" in
456settings, selecting "Simulate secondary displays", and enabling a window. For
457bonus points, use screenrecord to capture the act of enabling the display, then
458play it back frame-by-frame.</p>
459
460<h2 id="surface">Surface and SurfaceHolder</h2>
461
462<p>The <a
463href="http://developer.android.com/reference/android/view/Surface.html">Surface</a>
464class has been part of the public API since 1.0. Its description simply says,
465"Handle onto a raw buffer that is being managed by the screen compositor." The
466statement was accurate when initially written but falls well short of the mark
467on a modern system.</p>
468
469<p>The Surface represents the producer side of a buffer queue that is often (but
470not always!) consumed by SurfaceFlinger. When you render onto a Surface, the
471result ends up in a buffer that gets shipped to the consumer. A Surface is not
472simply a raw chunk of memory you can scribble on.</p>
473
474<p>The BufferQueue for a display Surface is typically configured for
475triple-buffering; but buffers are allocated on demand. So if the producer
476generates buffers slowly enough -- maybe it's animating at 30fps on a 60fps
477display -- there might only be two allocated buffers in the queue. This helps
478minimize memory consumption. You can see a summary of the buffers associated
479with every layer in the <code>dumpsys SurfaceFlinger</code> output.</p>
480
481<h3 id="canvas">Canvas Rendering</h3>
482
483<p>Once upon a time, all rendering was done in software, and you can still do this
484today. The low-level implementation is provided by the Skia graphics library.
485If you want to draw a rectangle, you make a library call, and it sets bytes in a
486buffer appropriately. To ensure that a buffer isn't updated by two clients at
487once, or written to while being displayed, you have to lock the buffer to access
488it. <code>lockCanvas()</code> locks the buffer and returns a Canvas to use for drawing,
489and <code>unlockCanvasAndPost()</code> unlocks the buffer and sends it to the compositor.</p>
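
<p>A minimal sketch of the lock/draw/unlock cycle, assuming <code>holder</code>
is the SurfaceHolder of a SurfaceView whose Surface has already been created:</p>

<pre>
Canvas canvas = holder.lockCanvas();            // dequeues and locks a buffer
if (canvas != null) {
    try {
        canvas.drawColor(Color.BLACK);          // clear the (possibly stale) contents
        Paint paint = new Paint();
        paint.setColor(Color.WHITE);
        canvas.drawRect(100, 100, 300, 300, paint);  // Skia sets bytes in the buffer
    } finally {
        holder.unlockCanvasAndPost(canvas);     // queues the buffer for the compositor
    }
}
</pre>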
490
491<p>As time went on, and devices with general-purpose 3D engines appeared, Android
492reoriented itself around OpenGL ES. However, it was important to keep the old
493API working, for apps as well as app framework code, so an effort was made to
494hardware-accelerate the Canvas API. As you can see from the charts on the
495<a href="http://developer.android.com/guide/topics/graphics/hardware-accel.html">Hardware
496Acceleration</a>
497page, this was a bit of a bumpy ride. Note in particular that while the Canvas
498provided to a View's <code>onDraw()</code> method may be hardware-accelerated, the Canvas
499obtained when an app locks a Surface directly with <code>lockCanvas()</code> never is.</p>
500
501<p>When you lock a Surface for Canvas access, the "CPU renderer" connects to the
502producer side of the BufferQueue and does not disconnect until the Surface is
503destroyed. Most other producers (like GLES) can be disconnected and reconnected
504to a Surface, but the Canvas-based "CPU renderer" cannot. This means you can't
505draw on a surface with GLES or send it frames from a video decoder if you've
506ever locked it for a Canvas.</p>
507
508<p>The first time the producer requests a buffer from a BufferQueue, it is
509allocated and initialized to zeroes. Initialization is necessary to avoid
510inadvertently sharing data between processes. When you re-use a buffer,
511however, the previous contents will still be present. If you repeatedly call
512<code>lockCanvas()</code> and <code>unlockCanvasAndPost()</code> without
513drawing anything, you'll cycle between previously-rendered frames.</p>
514
515<p>The Surface lock/unlock code keeps a reference to the previously-rendered
516buffer. If you specify a dirty region when locking the Surface, it will copy
517the non-dirty pixels from the previous buffer. There's a fair chance the buffer
will be handled by SurfaceFlinger or HWC; but since we only need to read from
519it, there's no need to wait for exclusive access.</p>
520
521<p>The main non-Canvas way for an application to draw directly on a Surface is
522through OpenGL ES. That's described in the <a href="#eglsurface">EGLSurface and
523OpenGL ES</a> section.</p>
524
525<h3 id="surfaceholder">SurfaceHolder</h3>
526
527<p>Some things that work with Surfaces want a SurfaceHolder, notably SurfaceView.
528The original idea was that Surface represented the raw compositor-managed
529buffer, while SurfaceHolder was managed by the app and kept track of
530higher-level information like the dimensions and format. The Java-language
531definition mirrors the underlying native implementation. It's arguably no
532longer useful to split it this way, but it has long been part of the public API.</p>
533
534<p>Generally speaking, anything having to do with a View will involve a
535SurfaceHolder. Some other APIs, such as MediaCodec, will operate on the Surface
536itself. You can easily get the Surface from the SurfaceHolder, so hang on to
537the latter when you have it.</p>
538
539<p>APIs to get and set Surface parameters, such as the size and format, are
540implemented through SurfaceHolder.</p>
541
542<h2 id="eglsurface">EGLSurface and OpenGL ES</h2>
543
544<p>OpenGL ES defines an API for rendering graphics. It does not define a windowing
545system. To allow GLES to work on a variety of platforms, it is designed to be
546combined with a library that knows how to create and access windows through the
547operating system. The library used for Android is called EGL. If you want to
548draw textured polygons, you use GLES calls; if you want to put your rendering on
549the screen, you use EGL calls.</p>
550
551<p>Before you can do anything with GLES, you need to create a GL context. In EGL,
552this means creating an EGLContext and an EGLSurface. GLES operations apply to
553the current context, which is accessed through thread-local storage rather than
554passed around as an argument. This means you have to be careful about which
555thread your rendering code executes on, and which context is current on that
556thread.</p>
557
558<p>The EGLSurface can be an off-screen buffer allocated by EGL (called a "pbuffer")
559or a window allocated by the operating system. EGL window surfaces are created
560with the <code>eglCreateWindowSurface()</code> call. It takes a "window object" as an
561argument, which on Android can be a SurfaceView, a SurfaceTexture, a
562SurfaceHolder, or a Surface -- all of which have a BufferQueue underneath. When
563you make this call, EGL creates a new EGLSurface object, and connects it to the
564producer interface of the window object's BufferQueue. From that point onward,
565rendering to that EGLSurface results in a buffer being dequeued, rendered into,
566and queued for use by the consumer. (The term "window" is indicative of the
567expected use, but bear in mind the output might not be destined to appear
568on the display.)</p>
569
570<p>EGL does not provide lock/unlock calls. Instead, you issue drawing commands and
571then call <code>eglSwapBuffers()</code> to submit the current frame. The
572method name comes from the traditional swap of front and back buffers, but the actual
573implementation may be very different.</p>
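
<p>For reference, here is a stripped-down version of that sequence using the
<code>EGL14</code> Java bindings (API 17+). Error checking is omitted, the
config attributes are a minimal assumption, and <code>surface</code> stands in
for whatever window object you are rendering to:</p>

<pre>
EGLDisplay display = EGL14.eglGetDisplay(EGL14.EGL_DEFAULT_DISPLAY);
int[] version = new int[2];
EGL14.eglInitialize(display, version, 0, version, 1);

int[] configAttribs = {
        EGL14.EGL_RED_SIZE, 8, EGL14.EGL_GREEN_SIZE, 8, EGL14.EGL_BLUE_SIZE, 8,
        EGL14.EGL_RENDERABLE_TYPE, EGL14.EGL_OPENGL_ES2_BIT,
        EGL14.EGL_NONE };
EGLConfig[] configs = new EGLConfig[1];
int[] numConfigs = new int[1];
EGL14.eglChooseConfig(display, configAttribs, 0, configs, 0, 1, numConfigs, 0);

int[] contextAttribs = { EGL14.EGL_CONTEXT_CLIENT_VERSION, 2, EGL14.EGL_NONE };
EGLContext context = EGL14.eglCreateContext(display, configs[0],
        EGL14.EGL_NO_CONTEXT, contextAttribs, 0);

// Connects an EGLSurface to the producer side of the window object's BufferQueue.
EGLSurface eglSurface = EGL14.eglCreateWindowSurface(display, configs[0],
        surface, new int[] { EGL14.EGL_NONE }, 0);

EGL14.eglMakeCurrent(display, eglSurface, eglSurface, context);
// ... issue GLES drawing commands on this thread ...
EGL14.eglSwapBuffers(display, eglSurface);   // queues the rendered buffer to the consumer
</pre>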
574
575<p>Only one EGLSurface can be associated with a Surface at a time -- you can have
576only one producer connected to a BufferQueue -- but if you destroy the
577EGLSurface it will disconnect from the BufferQueue and allow something else to
578connect.</p>
579
580<p>A given thread can switch between multiple EGLSurfaces by changing what's
581"current." An EGLSurface must be current on only one thread at a time.</p>
582
583<p>The most common mistake when thinking about EGLSurface is assuming that it is
584just another aspect of Surface (like SurfaceHolder). It's a related but
585independent concept. You can draw on an EGLSurface that isn't backed by a
586Surface, and you can use a Surface without EGL. EGLSurface just gives GLES a
587place to draw.</p>
588
589<h3 id="anativewindow">ANativeWindow</h3>
590
591<p>The public Surface class is implemented in the Java programming language. The
592equivalent in C/C++ is the ANativeWindow class, semi-exposed by the <a
593href="https://developer.android.com/tools/sdk/ndk/index.html">Android NDK</a>. You
594can get the ANativeWindow from a Surface with the <code>ANativeWindow_fromSurface()</code>
595call. Just like its Java-language cousin, you can lock it, render in software,
596and unlock-and-post.</p>
597
598<p>To create an EGL window surface from native code, you pass an instance of
599EGLNativeWindowType to <code>eglCreateWindowSurface()</code>. EGLNativeWindowType is just
600a synonym for ANativeWindow, so you can freely cast one to the other.</p>
601
602<p>The fact that the basic "native window" type just wraps the producer side of a
603BufferQueue should not come as a surprise.</p>
604
605<h2 id="surfaceview">SurfaceView and GLSurfaceView</h2>
606
607<p>Now that we've explored the lower-level components, it's time to see how they
608fit into the higher-level components that apps are built from.</p>
609
610<p>The Android app framework UI is based on a hierarchy of objects that start with
611View. Most of the details don't matter for this discussion, but it's helpful to
612understand that UI elements go through a complicated measurement and layout
613process that fits them into a rectangular area. All visible View objects are
614rendered to a SurfaceFlinger-created Surface that was set up by the
615WindowManager when the app was brought to the foreground. The layout and
616rendering is performed on the app's UI thread.</p>
617
618<p>Regardless of how many Layouts and Views you have, everything gets rendered into
619a single buffer. This is true whether or not the Views are hardware-accelerated.</p>
620
621<p>A SurfaceView takes the same sorts of parameters as other views, so you can give
622it a position and size, and fit other elements around it. When it comes time to
623render, however, the contents are completely transparent. The View part of a
624SurfaceView is just a see-through placeholder.</p>
625
626<p>When the SurfaceView's View component is about to become visible, the framework
627asks the WindowManager to ask SurfaceFlinger to create a new Surface. (This
628doesn't happen synchronously, which is why you should provide a callback that
629notifies you when the Surface creation finishes.) By default, the new Surface
630is placed behind the app UI Surface, but the default "Z-ordering" can be
631overridden to put the Surface on top.</p>
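
<p>A small sketch of that dance, assuming an Activity that builds the
SurfaceView in code (the comments mark where your own rendering logic would go):</p>

<pre>
SurfaceView surfaceView = new SurfaceView(context);
surfaceView.getHolder().addCallback(new SurfaceHolder.Callback() {
    @Override public void surfaceCreated(SurfaceHolder holder) {
        // The Surface now exists; it is safe to start rendering to
        // holder.getSurface(), from this thread or another one.
    }
    @Override public void surfaceChanged(SurfaceHolder holder, int format, int w, int h) { }
    @Override public void surfaceDestroyed(SurfaceHolder holder) {
        // Stop rendering; the Surface is about to go away.
    }
});
// Optional: place the SurfaceView's layer above the app UI layer instead of below it.
surfaceView.setZOrderOnTop(true);
</pre>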
632
633<p>Whatever you render onto this Surface will be composited by SurfaceFlinger, not
634by the app. This is the real power of SurfaceView: the Surface you get can be
635rendered by a separate thread or a separate process, isolated from any rendering
636performed by the app UI, and the buffers go directly to SurfaceFlinger. You
637can't totally ignore the UI thread -- you still have to coordinate with the
638Activity lifecycle, and you may need to adjust something if the size or position
639of the View changes -- but you have a whole Surface all to yourself, and
640blending with the app UI and other layers is handled by the Hardware Composer.</p>
641
642<p>It's worth taking a moment to note that this new Surface is the producer side of
643a BufferQueue whose consumer is a SurfaceFlinger layer. You can update the
Surface with any mechanism that can feed a BufferQueue: use the
Surface-supplied Canvas functions, attach an EGLSurface and draw on it
with GLES, or configure a MediaCodec video decoder to write to it.</p>
647
648<h3 id="composition">Composition and the Hardware Scaler</h3>
649
650<p>Now that we have a bit more context, it's useful to go back and look at a couple
651of fields from <code>dumpsys SurfaceFlinger</code> that we skipped over earlier
652on. Back in the <a href="#hwcomposer">Hardware Composer</a> discussion, we
653looked at some output like this:</p>
654
655<pre>
656 type | source crop | frame name
657------------+-----------------------------------+--------------------------------
658 HWC | [ 0.0, 0.0, 320.0, 240.0] | [ 48, 411, 1032, 1149] SurfaceView
659 HWC | [ 0.0, 75.0, 1080.0, 1776.0] | [ 0, 75, 1080, 1776] com.android.grafika/com.android.grafika.PlayMovieSurfaceActivity
660 HWC | [ 0.0, 0.0, 1080.0, 75.0] | [ 0, 0, 1080, 75] StatusBar
661 HWC | [ 0.0, 0.0, 1080.0, 144.0] | [ 0, 1776, 1080, 1920] NavigationBar
662 FB TARGET | [ 0.0, 0.0, 1080.0, 1920.0] | [ 0, 0, 1080, 1920] HWC_FRAMEBUFFER_TARGET
663</pre>
664
665<p>This was taken while playing a movie in Grafika's "Play video (SurfaceView)"
666activity, on a Nexus 5 in portrait orientation. Note that the list is ordered
667from back to front: the SurfaceView's Surface is in the back, the app UI layer
668sits on top of that, followed by the status and navigation bars that are above
669everything else. The video is QVGA (320x240).</p>
670
671<p>The "source crop" indicates the portion of the Surface's buffer that
672SurfaceFlinger is going to display. The app UI was given a Surface equal to the
673full size of the display (1080x1920), but there's no point rendering and
674compositing pixels that will be obscured by the status and navigation bars, so
675the source is cropped to a rectangle that starts 75 pixels from the top, and
676ends 144 pixels from the bottom. The status and navigation bars have smaller
Surfaces, and the source crop describes a rectangle that begins at the top
678left (0,0) and spans their content.</p>
679
680<p>The "frame" is the rectangle where the pixels end up on the display. For the
681app UI layer, the frame matches the source crop, because we're copying (or
682overlaying) a portion of a display-sized layer to the same location in another
683display-sized layer. For the status and navigation bars, the size of the frame
684rectangle is the same, but the position is adjusted so that the navigation bar
685appears at the bottom of the screen.</p>
686
687<p>Now consider the layer labeled "SurfaceView", which holds our video content.
688The source crop matches the video size, which SurfaceFlinger knows because the
689MediaCodec decoder (the buffer producer) is dequeuing buffers that size. The
690frame rectangle has a completely different size -- 984x738.</p>
691
692<p>SurfaceFlinger handles size differences by scaling the buffer contents to fill
693the frame rectangle, upscaling or downscaling as needed. This particular size
694was chosen because it has the same aspect ratio as the video (4:3), and is as
695wide as possible given the constraints of the View layout (which includes some
696padding at the edges of the screen for aesthetic reasons).</p>
697
698<p>If you started playing a different video on the same Surface, the underlying
699BufferQueue would reallocate buffers to the new size automatically, and
700SurfaceFlinger would adjust the source crop. If the aspect ratio of the new
701video is different, the app would need to force a re-layout of the View to match
702it, which causes the WindowManager to tell SurfaceFlinger to update the frame
703rectangle.</p>
704
705<p>If you're rendering on the Surface through some other means, perhaps GLES, you
706can set the Surface size using the <code>SurfaceHolder#setFixedSize()</code>
707call. You could, for example, configure a game to always render at 1280x720,
708which would significantly reduce the number of pixels that must be touched to
709fill the screen on a 2560x1440 tablet or 4K television. The display processor
710handles the scaling. If you don't want to letter- or pillar-box your game, you
711could adjust the game's aspect ratio by setting the size so that the narrow
712dimension is 720 pixels, but the long dimension is set to maintain the aspect
713ratio of the physical display (e.g. 1152x720 to match a 2560x1600 display).
714You can see an example of this approach in Grafika's "Hardware scaler
715exerciser" activity.</p>
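
<p>As a rough sketch, assuming an Activity hosting a SurfaceView on a
landscape display:</p>

<pre>
SurfaceHolder holder = surfaceView.getHolder();
holder.setFixedSize(1280, 720);   // render 720p; the display processor scales it up

// Or keep the physical aspect ratio with a 720-pixel narrow dimension:
Point size = new Point();
getWindowManager().getDefaultDisplay().getSize(size);
int scaledWidth = Math.round(720f * size.x / size.y);   // e.g. 1152 on a 2560x1600 panel
holder.setFixedSize(scaledWidth, 720);
</pre>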
716
717<h3 id="glsurfaceview">GLSurfaceView</h3>
718
719<p>The GLSurfaceView class provides some helper classes that help manage EGL
720contexts, inter-thread communication, and interaction with the Activity
721lifecycle. That's it. You do not need to use a GLSurfaceView to use GLES.</p>
722
723<p>For example, GLSurfaceView creates a thread for rendering and configures an EGL
724context there. The state is cleaned up automatically when the activity pauses.
725Most apps won't need to know anything about EGL to use GLES with GLSurfaceView.</p>
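
<p>A typical setup looks something like this (the renderer body is a
placeholder for your own GLES code):</p>

<pre>
GLSurfaceView glView = new GLSurfaceView(context);
glView.setEGLContextClientVersion(2);
glView.setRenderer(new GLSurfaceView.Renderer() {
    @Override public void onSurfaceCreated(GL10 gl, EGLConfig config) {
        // Called on the render thread with the EGL context current:
        // compile shaders, load textures, etc.
    }
    @Override public void onSurfaceChanged(GL10 gl, int width, int height) {
        GLES20.glViewport(0, 0, width, height);
    }
    @Override public void onDrawFrame(GL10 gl) {
        GLES20.glClear(GLES20.GL_COLOR_BUFFER_BIT);   // draw; the buffer swap happens after this returns
    }
});
// Forward the Activity lifecycle: call glView.onPause() / glView.onResume()
// from the Activity's own onPause() / onResume().
</pre>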
726
727<p>In most cases, GLSurfaceView is very helpful and can make working with GLES
728easier. In some situations, it can get in the way. Use it if it helps, don't
729if it doesn't.</p>
730
731<h2 id="surfacetexture">SurfaceTexture</h2>
732
733<p>The SurfaceTexture class is a relative newcomer, added in Android 3.0
734("Honeycomb"). Just as SurfaceView is the combination of a Surface and a View,
735SurfaceTexture is the combination of a Surface and a GLES texture. Sort of.</p>
736
737<p>When you create a SurfaceTexture, you are creating a BufferQueue for which your
738app is the consumer. When a new buffer is queued by the producer, your app is
739notified via callback (<code>onFrameAvailable()</code>). Your app calls
740<code>updateTexImage()</code>, which releases the previously-held buffer,
741acquires the new buffer from the queue, and makes some EGL calls to make the
742buffer available to GLES as an "external" texture.</p>
743
744<p>External textures (<code>GL_TEXTURE_EXTERNAL_OES</code>) are not quite the
745same as textures created by GLES (<code>GL_TEXTURE_2D</code>). You have to
746configure your renderer a bit differently, and there are things you can't do
747with them. But the key point is this: You can render textured polygons directly
748from the data received by your BufferQueue.</p>
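
<p>A condensed sketch of the app-as-consumer flow; the listener body, the
varying name in the shader, and the thread handling are placeholders:</p>

<pre>
// Wrap a GLES texture name in a SurfaceTexture; the app now owns a BufferQueue consumer.
int[] tex = new int[1];
GLES20.glGenTextures(1, tex, 0);
GLES20.glBindTexture(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, tex[0]);

final SurfaceTexture surfaceTexture = new SurfaceTexture(tex[0]);
surfaceTexture.setOnFrameAvailableListener(new SurfaceTexture.OnFrameAvailableListener() {
    @Override public void onFrameAvailable(SurfaceTexture st) {
        // Signal the GLES thread; only call updateTexImage() on the thread
        // that owns the EGL context.
    }
});

// Later, on the thread with the EGL context current:
surfaceTexture.updateTexImage();                 // release the old buffer, acquire the new one
float[] transform = new float[16];
surfaceTexture.getTransformMatrix(transform);    // per-buffer texture-coordinate transform
long timestampNs = surfaceTexture.getTimestamp();

// The external texture is sampled with samplerExternalOES in the fragment shader:
String fragmentShader =
        "#extension GL_OES_EGL_image_external : require\n" +
        "precision mediump float;\n" +
        "varying vec2 vTexCoord;\n" +
        "uniform samplerExternalOES sTexture;\n" +
        "void main() { gl_FragColor = texture2D(sTexture, vTexCoord); }\n";
</pre>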
749
750<p>You may be wondering how we can guarantee the format of the data in the
751buffer is something GLES can recognize -- gralloc supports a wide variety
752of formats. When SurfaceTexture created the BufferQueue, it set the consumer's
753usage flags to <code>GRALLOC_USAGE_HW_TEXTURE</code>, ensuring that any buffer
754created by gralloc would be usable by GLES.</p>
755
756<p>Because SurfaceTexture interacts with an EGL context, you have to be careful to
757call its methods from the correct thread. This is spelled out in the class
758documentation.</p>
759
760<p>If you look deeper into the class documentation, you will see a couple of odd
761calls. One retrieves a timestamp, the other a transformation matrix, the value
762of each having been set by the previous call to <code>updateTexImage()</code>.
763It turns out that BufferQueue passes more than just a buffer handle to the consumer.
764Each buffer is accompanied by a timestamp and transformation parameters.</p>
765
766<p>The transformation is provided for efficiency. In some cases, the source data
767might be in the "wrong" orientation for the consumer; but instead of rotating
768the data before sending it, we can send the data in its current orientation with
769a transform that corrects it. The transformation matrix can be merged with
770other transformations at the point the data is used, minimizing overhead.</p>
771
772<p>The timestamp is useful for certain buffer sources. For example, suppose you
773connect the producer interface to the output of the camera (with
774<code>setPreviewTexture()</code>). If you want to create a video, you need to
775set the presentation time stamp for each frame; but you want to base that on the time
776when the frame was captured, not the time when the buffer was received by your
777app. The timestamp provided with the buffer is set by the camera code,
778resulting in a more consistent series of timestamps.</p>
779
780<h3 id="surfacet">SurfaceTexture and Surface</h3>
781
782<p>If you look closely at the API you'll see the only way for an application
783to create a plain Surface is through a constructor that takes a SurfaceTexture
784as the sole argument. (Prior to API 11, there was no public constructor for
785Surface at all.) This might seem a bit backward if you view SurfaceTexture as a
786combination of a Surface and a texture.</p>
787
788<p>Under the hood, SurfaceTexture is called GLConsumer, which more accurately
789reflects its role as the owner and consumer of a BufferQueue. When you create a
790Surface from a SurfaceTexture, what you're doing is creating an object that
791represents the producer side of the SurfaceTexture's BufferQueue.</p>
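
<p>In code, that producer-side object is just a constructor call; handing it to
MediaPlayer below is only one example of a producer:</p>

<pre>
// The Surface constructed here is the producer end of the SurfaceTexture's BufferQueue.
Surface producerSurface = new Surface(surfaceTexture);

// Hand it to anything that generates frames, e.g. a video player.
MediaPlayer player = new MediaPlayer();
player.setSurface(producerSurface);
</pre>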
792
793<h3 id="continuous-capture">Case Study: Grafika's "Continuous Capture" Activity</h3>
794
795<p>The camera can provide a stream of frames suitable for recording as a movie. If
796you want to display it on screen, you create a SurfaceView, pass the Surface to
797<code>setPreviewDisplay()</code>, and let the producer (camera) and consumer
798(SurfaceFlinger) do all the work. If you want to record the video, you create a
799Surface with MediaCodec's <code>createInputSurface()</code>, pass that to the
800camera, and again you sit back and relax. If you want to show the video and
801record it at the same time, you have to get more involved.</p>
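
<p>The first two cases boil down to a few calls. This sketch uses the
android.hardware.Camera API of that era, and the encoder settings are
arbitrary placeholders:</p>

<pre>
try {
    // Display-only: the camera produces frames, SurfaceFlinger consumes them.
    camera.setPreviewDisplay(surfaceView.getHolder());

    // Record path: create an encoder input Surface; frames rendered into it
    // (for example with GLES, as in the next section) get encoded.
    MediaFormat format = MediaFormat.createVideoFormat("video/avc", 1280, 720);
    format.setInteger(MediaFormat.KEY_COLOR_FORMAT,
            MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface);
    format.setInteger(MediaFormat.KEY_BIT_RATE, 4000000);
    format.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
    format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1);

    MediaCodec encoder = MediaCodec.createEncoderByType("video/avc");
    encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
    Surface encoderInput = encoder.createInputSurface();   // producer end of the encoder's queue
    encoder.start();
} catch (IOException e) {
    // handle setup failure
}
</pre>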
802
803<p>The "Continuous capture" activity displays video from the camera as it's being
804recorded. In this case, encoded video is written to a circular buffer in memory
805that can be saved to disk at any time. It's straightforward to implement so
806long as you keep track of where everything is.</p>
807
808<p>There are three BufferQueues involved. The app uses a SurfaceTexture to receive
809frames from Camera, converting them to an external GLES texture. The app
810declares a SurfaceView, which we use to display the frames, and we configure a
811MediaCodec encoder with an input Surface to create the video. So one
812BufferQueue is created by the app, one by SurfaceFlinger, and one by
813mediaserver.</p>
814
815<img src="images/continuous_capture_activity.png" alt="Grafika continuous
816capture activity" />
817
818<p class="img-caption">
  <strong>Figure 2.</strong> Grafika's continuous capture activity
820</p>
821
822<p>In the diagram above, the arrows show the propagation of the data from the
823camera. BufferQueues are in color (purple producer, cyan consumer). Note
824“Camera” actually lives in the mediaserver process.</p>
825
826<p>Encoded H.264 video goes to a circular buffer in RAM in the app process, and is
827written to an MP4 file on disk using the MediaMuxer class when the “capture”
828button is hit.</p>
829
830<p>All three of the BufferQueues are handled with a single EGL context in the
831app, and the GLES operations are performed on the UI thread. Doing the
832SurfaceView rendering on the UI thread is generally discouraged, but since we're
833doing simple operations that are handled asynchronously by the GLES driver we
834should be fine. (If the video encoder locks up and we block trying to dequeue a
835buffer, the app will become unresponsive. But at that point, we're probably
836failing anyway.) The handling of the encoded data -- managing the circular
837buffer and writing it to disk -- is performed on a separate thread.</p>
838
839<p>The bulk of the configuration happens in the SurfaceView's <code>surfaceCreated()</code>
840callback. The EGLContext is created, and EGLSurfaces are created for the
841display and for the video encoder. When a new frame arrives, we tell
842SurfaceTexture to acquire it and make it available as a GLES texture, then
843render it with GLES commands on each EGLSurface (forwarding the transform and
844timestamp from SurfaceTexture). The encoder thread pulls the encoded output
845from MediaCodec and stashes it in memory.</p>
846
847<h2 id="texture">TextureView</h2>
848
849<p>The TextureView class was
850<a href="http://android-developers.blogspot.com/2011/11/android-40-graphics-and-animations.html">introduced</a>
851in Android 4.0 ("Ice Cream Sandwich"). It's the most complex of the View
852objects discussed here, combining a View with a SurfaceTexture.</p>
853
854<p>Recall that the SurfaceTexture is a "GL consumer", consuming buffers of graphics
855data and making them available as textures. TextureView wraps a SurfaceTexture,
856taking over the responsibility of responding to the callbacks and acquiring new
857buffers. The arrival of new buffers causes TextureView to issue a View
858invalidate request. When asked to draw, the TextureView uses the contents of
859the most recently received buffer as its data source, rendering wherever and
860however the View state indicates it should.</p>
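
<p>A minimal sketch of wiring one up (the listener bodies are placeholders):</p>

<pre>
TextureView textureView = new TextureView(context);
textureView.setSurfaceTextureListener(new TextureView.SurfaceTextureListener() {
    @Override public void onSurfaceTextureAvailable(SurfaceTexture st, int width, int height) {
        // Start feeding frames, e.g. hand new Surface(st) to a camera or video decoder.
    }
    @Override public void onSurfaceTextureSizeChanged(SurfaceTexture st, int width, int height) { }
    @Override public boolean onSurfaceTextureDestroyed(SurfaceTexture st) {
        return true;   // let TextureView release the SurfaceTexture
    }
    @Override public void onSurfaceTextureUpdated(SurfaceTexture st) {
        // A new buffer was acquired; TextureView will use it on its next draw.
    }
});
</pre>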
861
862<p>You can render on a TextureView with GLES just as you would SurfaceView. Just
863pass the SurfaceTexture to the EGL window creation call. However, doing so
864exposes a potential problem.</p>
865
866<p>In most of what we've looked at, the BufferQueues have passed buffers between
867different processes. When rendering to a TextureView with GLES, both producer
868and consumer are in the same process, and they might even be handled on a single
869thread. Suppose we submit several buffers in quick succession from the UI
870thread. The EGL buffer swap call will need to dequeue a buffer from the
871BufferQueue, and it will stall until one is available. There won't be any
872available until the consumer acquires one for rendering, but that also happens
873on the UI thread… so we're stuck.</p>
874
875<p>The solution is to have BufferQueue ensure there is always a buffer
876available to be dequeued, so the buffer swap never stalls. One way to guarantee
877this is to have BufferQueue discard the contents of the previously-queued buffer
878when a new buffer is queued, and to place restrictions on minimum buffer counts
879and maximum acquired buffer counts. (If your queue has three buffers, and all
880three buffers are acquired by the consumer, then there's nothing to dequeue and
881the buffer swap call must hang or fail. So we need to prevent the consumer from
882acquiring more than two buffers at once.) Dropping buffers is usually
883undesirable, so it's only enabled in specific situations, such as when the
884producer and consumer are in the same process.</p>
885
886<h3 id="surface-or-texture">SurfaceView or TextureView?</h3>
<p>SurfaceView and TextureView fill similar roles, but have very different
888implementations. To decide which is best requires an understanding of the
889trade-offs.</p>
890
891<p>Because TextureView is a proper citizen of the View hierarchy, it behaves like
892any other View, and can overlap or be overlapped by other elements. You can
893perform arbitrary transformations and retrieve the contents as a bitmap with
894simple API calls.</p>
895
896<p>The main strike against TextureView is the performance of the composition step.
897With SurfaceView, the content is written to a separate layer that SurfaceFlinger
898composites, ideally with an overlay. With TextureView, the View composition is
899always performed with GLES, and updates to its contents may cause other View
900elements to redraw as well (e.g. if they're positioned on top of the
901TextureView). After the View rendering completes, the app UI layer must then be
902composited with other layers by SurfaceFlinger, so you're effectively
903compositing every visible pixel twice. For a full-screen video player, or any
904other application that is effectively just UI elements layered on top of video,
905SurfaceView offers much better performance.</p>
906
907<p>As noted earlier, DRM-protected video can be presented only on an overlay plane.
908 Video players that support protected content must be implemented with
909SurfaceView.</p>
910
911<h3 id="grafika">Case Study: Grafika's Play Video (TextureView)</h3>
912
913<p>Grafika includes a pair of video players, one implemented with TextureView, the
914other with SurfaceView. The video decoding portion, which just sends frames
915from MediaCodec to a Surface, is the same for both. The most interesting
916differences between the implementations are the steps required to present the
917correct aspect ratio.</p>
918
919<p>While SurfaceView requires a custom implementation of FrameLayout, resizing
920SurfaceTexture is a simple matter of configuring a transformation matrix with
921<code>TextureView#setTransform()</code>. For the former, you're sending new
922window position and size values to SurfaceFlinger through WindowManager; for
923the latter, you're just rendering it differently.</p>
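
<p>A rough sketch of the TextureView side; the variable names are placeholders
and Grafika's actual math may differ slightly:</p>

<pre>
// Letterbox a videoWidth x videoHeight stream inside a viewWidth x viewHeight TextureView.
float videoAspect = (float) videoWidth / videoHeight;
float viewAspect = (float) viewWidth / viewHeight;
float scaleX = 1f, scaleY = 1f;
if (videoAspect > viewAspect) {
    scaleY = viewAspect / videoAspect;   // video is wider: shrink vertically
} else {
    scaleX = videoAspect / viewAspect;   // video is taller: shrink horizontally
}
Matrix matrix = new Matrix();
matrix.setScale(scaleX, scaleY, viewWidth / 2f, viewHeight / 2f);   // pivot at the center
textureView.setTransform(matrix);
</pre>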
924
925<p>Otherwise, both implementations follow the same pattern. Once the Surface has
926been created, playback is enabled. When "play" is hit, a video decoding thread
927is started, with the Surface as the output target. After that, the app code
928doesn't have to do anything -- composition and display will either be handled by
929SurfaceFlinger (for the SurfaceView) or by TextureView.</p>
930
931<h3 id="decode">Case Study: Grafika's Double Decode</h3>
932
933<p>This activity demonstrates manipulation of the SurfaceTexture inside a
934TextureView.</p>
935
936<p>The basic structure of this activity is a pair of TextureViews that show two
937different videos playing side-by-side. To simulate the needs of a
938videoconferencing app, we want to keep the MediaCodec decoders alive when the
939activity is paused and resumed for an orientation change. The trick is that you
940can't change the Surface that a MediaCodec decoder uses without fully
941reconfiguring it, which is a fairly expensive operation; so we want to keep the
942Surface alive. The Surface is just a handle to the producer interface in the
943SurfaceTexture's BufferQueue, and the SurfaceTexture is managed by the
TextureView, so we also need to keep the SurfaceTexture alive. So how do we deal
945with the TextureView getting torn down?</p>
946
947<p>It just so happens TextureView provides a <code>setSurfaceTexture()</code> call
948that does exactly what we want. We obtain references to the SurfaceTextures
949from the TextureViews and save them in a static field. When the activity is
950shut down, we return "false" from the <code>onSurfaceTextureDestroyed()</code>
951callback to prevent destruction of the SurfaceTexture. When the activity is
952restarted, we stuff the old SurfaceTexture into the new TextureView. The
953TextureView class takes care of creating and destroying the EGL contexts.</p>
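<p>A condensed sketch of that listener logic follows. The static field name here is
made up, and Grafika's actual code keeps more state per view, but the shape is the
same:</p>

<pre>
// Sketch: keep the SurfaceTexture alive across an Activity restart.
// sSavedSurfaceTexture is a hypothetical static field.
private static SurfaceTexture sSavedSurfaceTexture;

// in onCreate(), on each TextureView:
mTextureView.setSurfaceTextureListener(new TextureView.SurfaceTextureListener() {
    @Override
    public void onSurfaceTextureAvailable(SurfaceTexture st, int width, int height) {
        if (sSavedSurfaceTexture == null) {
            sSavedSurfaceTexture = st;          // first launch: remember it
            // ...start the decoder with a Surface wrapping "st"...
        } else {
            // restart: plug the saved SurfaceTexture into the new TextureView
            mTextureView.setSurfaceTexture(sSavedSurfaceTexture);
        }
    }
    @Override
    public boolean onSurfaceTextureDestroyed(SurfaceTexture st) {
        return false;   // false == don't release it; we're hanging onto it
    }
    @Override
    public void onSurfaceTextureSizeChanged(SurfaceTexture st, int width, int height) {}
    @Override
    public void onSurfaceTextureUpdated(SurfaceTexture st) {}
});
</pre>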
954
955<p>Each video decoder is driven from a separate thread. At first glance it might
956seem like we need EGL contexts local to each thread; but remember the buffers
957with decoded output are actually being sent from mediaserver to our
958BufferQueue consumers (the SurfaceTextures). The TextureViews take care of the
959rendering for us, and they execute on the UI thread.</p>
960
961<p>Implementing this activity with SurfaceView would be a bit harder. We can't
962just create a pair of SurfaceViews and direct the output to them, because the
963Surfaces would be destroyed during an orientation change. Besides, that would
964add two layers, and limitations on the number of available overlays strongly
965motivate us to keep the number of layers to a minimum. Instead, we'd want to
966create a pair of SurfaceTextures to receive the output from the video decoders,
967and then perform the rendering in the app, using GLES to render two textured
968quads onto the SurfaceView's Surface.</p>
969
970<h2 id="notes">Conclusion</h2>
971
972<p>We hope this page has provided useful insights into the way Android handles
973graphics at the system level.</p>
974
975<p>Some information and advice on related topics can be found in the appendices
976that follow.</p>
977
978<h2 id="loops">Appendix A: Game Loops</h2>
979
980<p>A very popular way to implement a game loop looks like this:</p>
981
982<pre>
983while (playing) {
984 advance state by one frame
985 render the new frame
986 sleep until it’s time to do the next frame
987}
988</pre>
989
990<p>There are a few problems with this, the most fundamental being the idea that the
991game can define what a "frame" is. Different displays will refresh at different
992rates, and that rate may vary over time. If you generate frames faster than the
993display can show them, you will have to drop one occasionally. If you generate
994them too slowly, SurfaceFlinger will periodically fail to find a new buffer to
995acquire and will re-show the previous frame. Both of these situations can
996cause visible glitches.</p>
997
998<p>What you need to do is match the display's frame rate, and advance game state
999according to how much time has elapsed since the previous frame. There are two
1000ways to go about this: (1) stuff the BufferQueue full and rely on the "swap
1001buffers" back-pressure; (2) use Choreographer (API 16+).</p>
1002
1003<h3 id="stuffing">Queue Stuffing</h3>
1004
1005<p>This is very easy to implement: just swap buffers as fast as you can. In early
1006versions of Android this could actually result in a penalty where
1007<code>SurfaceView#lockCanvas()</code> would put you to sleep for 100ms. Now
1008it's paced by the BufferQueue, and the BufferQueue is emptied as quickly as
1009SurfaceFlinger is able.</p>
1010
1011<p>One example of this approach can be seen in <a
1012href="https://code.google.com/p/android-breakout/">Android Breakout</a>. It
1013uses GLSurfaceView, which runs in a loop that calls the application's
1014onDrawFrame() callback and then swaps the buffer. If the BufferQueue is full,
1015the <code>eglSwapBuffers()</code> call will wait until a buffer is available.
1016Buffers become available when SurfaceFlinger releases them, which it does after
1017acquiring a new one for display. Because this happens on VSYNC, your draw loop
1018timing will match the refresh rate. Mostly.</p>
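<p>A sketch of what such a renderer can look like; <code>updateState()</code> and
<code>drawScene()</code> are placeholder game methods, not Breakout's actual names:</p>

<pre>
// Sketch: queue-stuffing with GLSurfaceView.  GLSurfaceView calls
// eglSwapBuffers() after onDrawFrame() returns, so a full BufferQueue is
// what paces this loop.
public class GameRenderer implements GLSurfaceView.Renderer {
    private long mPrevTimeNanos;

    @Override
    public void onDrawFrame(GL10 unused) {
        long nowNanos = System.nanoTime();
        if (mPrevTimeNanos != 0) {
            // advance the simulation by however long the last swap took
            updateState(nowNanos - mPrevTimeNanos);
        }
        mPrevTimeNanos = nowNanos;
        drawScene();
    }

    @Override
    public void onSurfaceCreated(GL10 unused, EGLConfig config) { /* create GLES resources */ }
    @Override
    public void onSurfaceChanged(GL10 unused, int width, int height) { /* set the viewport */ }

    private void updateState(long deltaNanos) { /* game logic goes here */ }
    private void drawScene() { /* GLES draw calls go here */ }
}
</pre>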
1019
1020<p>There are a couple of problems with this approach. First, the app is tied to
1021SurfaceFlinger activity, which is going to take different amounts of time
1022depending on how much work there is to do and whether it's fighting for CPU time
1023with other processes. Since your game state advances according to the time
1024between buffer swaps, your animation won't update at a consistent rate. When
1025running at 60fps with the inconsistencies averaged out over time, though, you
1026probably won't notice the bumps.</p>
1027
1028<p>Second, the first couple of buffer swaps are going to happen very quickly
1029because the BufferQueue isn't full yet. The computed time between frames will
1030be near zero, so the game will generate a few frames in which nothing happens.
1031In a game like Breakout, which updates the screen on every refresh, the queue is
1032always full except when a game is first starting (or un-paused), so the effect
1033isn't noticeable. A game that pauses animation occasionally and then returns to
1034as-fast-as-possible mode might see odd hiccups.</p>
1035
1036<h3 id="choreographer">Choreographer</h3>
1037
1038<p>Choreographer allows you to set a callback that fires on the next VSYNC. The
1039actual VSYNC time is passed in as an argument. So even if your app doesn't wake
1040up right away, you still have an accurate picture of when the display refresh
1041period began. Using this value, rather than the current time, yields a
1042consistent time source for your game state update logic.</p>
1043
1044<p>Unfortunately, the fact that you get a callback after every VSYNC does not
1045guarantee that your callback will be executed in a timely fashion or that you
1046will be able to act upon it sufficiently swiftly. Your app will need to detect
1047situations where it's falling behind and drop frames manually.</p>
1048
1049<p>The "Record GL app" activity in Grafika provides an example of this. On some
1050devices (e.g. Nexus 4 and Nexus 5), the activity will start dropping frames if
1051you just sit and watch. The GL rendering is trivial, but occasionally the View
1052elements get redrawn, and the measure/layout pass can take a very long time if
1053the device has dropped into a reduced-power mode. (According to systrace, it
1054takes 28ms instead of 6ms after the clocks slow on Android 4.4. If you drag
1055your finger around the screen, it thinks you're interacting with the activity,
1056so the clock speeds stay high and you'll never drop a frame.)</p>
1057
1058<p>The simple fix was to drop a frame in the Choreographer callback if the current
1059time is more than N milliseconds after the VSYNC time. Ideally the value of N
1060is determined based on previously observed VSYNC intervals. For example, if the
1061refresh period is 16.7ms (60fps), you might drop a frame if you're running more
1062than 15ms late.</p>
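<p>In code, the check is only a few lines. A minimal sketch, with the 15ms threshold
hard-coded and <code>updateState()</code> / <code>render()</code> standing in for your
own methods:</p>

<pre>
// Sketch: Choreographer-driven loop that drops a frame when it wakes up
// too long after VSYNC.
private static final long MAX_LATE_NANOS = 15000000L;   // ~15ms

private final Choreographer.FrameCallback mFrameCallback =
        new Choreographer.FrameCallback() {
    @Override
    public void doFrame(long frameTimeNanos) {
        Choreographer.getInstance().postFrameCallback(this);   // re-arm for the next VSYNC

        if (System.nanoTime() - frameTimeNanos > MAX_LATE_NANOS) {
            return;                        // running too far behind; skip this frame
        }
        updateState(frameTimeNanos);       // advance game state using the VSYNC timestamp
        render();
    }
};

// kick it off from a Looper thread, e.g. in onResume():
// Choreographer.getInstance().postFrameCallback(mFrameCallback);
</pre>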
1063
1064<p>If you watch "Record GL app" run, you will see the dropped-frame counter
1065increase, and even see a flash of red in the border when frames drop. Unless
1066your eyes are very good, though, you won't see the animation stutter. At 60fps,
1067the app can drop the occasional frame without anyone noticing so long as the
1068animation continues to advance at a constant rate. How much you can get away
1069with depends to some extent on what you're drawing, the characteristics of the
1070display, and how good the person using the app is at detecting jank.</p>
1071
1072<h3 id="thread">Thread Management</h3>
1073
1074<p>Generally speaking, if you're rendering onto a SurfaceView, GLSurfaceView, or
1075TextureView, you want to do that rendering in a dedicated thread. Never do any
1076"heavy lifting" or anything that takes an indeterminate amount of time on the
1077UI thread.</p>
1078
1079<p>Breakout and "Record GL app" use dedicated renderer threads, and they also
1080update animation state on that thread. This is a reasonable approach so long as
1081game state can be updated quickly.</p>
1082
1083<p>Other games separate the game logic and rendering completely. If you had a
1084simple game that did nothing but move a block every 100ms, you could have a
1085dedicated thread that just did this:</p>
1086
<pre>
    @Override public void run() {
        while (mRunning) {          // mRunning: flag cleared when the game shuts down
            try {
                Thread.sleep(100);
            } catch (InterruptedException ie) {
                return;
            }
            synchronized (mLock) {
                moveBlock();
            }
        }
    }
</pre>
1095
1096<p>(You may want to base the sleep time off of a fixed clock to prevent drift --
1097sleep() isn't perfectly consistent, and moveBlock() takes a nonzero amount of
1098time -- but you get the idea.)</p>
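<p>For completeness, one way to anchor the timing to a fixed clock, sketched under the
same assumptions (this replaces the body of the same <code>run()</code> method):</p>

<pre>
// Sketch: schedule each move off a fixed clock so that sleep() jitter and
// the time spent in moveBlock() don't accumulate as drift.
long nextTimeMillis = System.currentTimeMillis();
while (mRunning) {
    nextTimeMillis += 100;
    long sleepMillis = nextTimeMillis - System.currentTimeMillis();
    if (sleepMillis > 0) {
        try {
            Thread.sleep(sleepMillis);
        } catch (InterruptedException ie) {
            return;
        }
    }
    synchronized (mLock) {
        moveBlock();
    }
}
</pre>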
1099
1100<p>When the draw code wakes up, it just grabs the lock, gets the current position
1101of the block, releases the lock, and draws. Instead of doing fractional
1102movement based on inter-frame delta times, you just have one thread that moves
1103things along and another thread that draws things wherever they happen to be
1104when the drawing starts.</p>
1105
1106<p>For a scene with any complexity you'd want to create a list of upcoming events
1107sorted by wake time, and sleep until the next event is due, but it's the same
1108idea.</p>
1109
1110<h2 id="activity">Appendix B: SurfaceView and the Activity Lifecycle</h2>
1111
1112<p>When using a SurfaceView, it's considered good practice to render the Surface
1113from a thread other than the main UI thread. This raises some questions about
1114the interaction between that thread and the Activity lifecycle.</p>
1115
1116<p>First, a little background. For an Activity with a SurfaceView, there are two
1117separate but interdependent state machines:</p>
1118
1119<ol>
1120<li>Application onCreate / onResume / onPause</li>
1121<li>Surface created / changed / destroyed</li>
1122</ol>
1123
1124<p>When the Activity starts, you get callbacks in this order:</p>
1125
1126<ul>
1127<li>onCreate</li>
1128<li>onResume</li>
1129<li>surfaceCreated</li>
1130<li>surfaceChanged</li>
1131</ul>
1132
1133<p>If you hit "back" you get:</p>
1134
1135<ul>
1136<li>onPause</li>
1137<li>surfaceDestroyed (called just before the Surface goes away)</li>
1138</ul>
1139
1140<p>If you rotate the screen, the Activity is torn down and recreated, so you
1141get the full cycle. If it matters, you can tell that it's a "quick" restart by
1142checking <code>isFinishing()</code>. (It may be possible to start and stop an
1143Activity so quickly that <code>surfaceCreated()</code> actually happens after <code>onPause()</code>.)</p>
1144
1145<p>If you tap the power button to blank the screen, you only get
1146<code>onPause()</code> -- no <code>surfaceDestroyed()</code>. The Surface
1147remains alive, and rendering can continue. You can even keep getting
1148Choreographer events if you continue to request them. If you have a lock
1149screen that forces a different orientation, your Activity may be restarted when
1150the device is unblanked; but if not, you can come out of screen-blank with the
1151same Surface you had before.</p>
1152
1153<p>This raises a fundamental question when using a separate renderer thread with
1154SurfaceView: Should the lifespan of the thread be tied to that of the Surface or
1155the Activity? The answer depends on what you want to have happen when the
1156screen goes blank. There are two basic approaches: (1) start/stop the thread on
1157Activity start/stop; (2) start/stop the thread on Surface create/destroy.</p>
1158
1159<p>#1 interacts well with the app lifecycle. We start the renderer thread in
1160<code>onResume()</code> and stop it in <code>onPause()</code>. It gets a bit
1161awkward when creating and configuring the thread because sometimes the Surface
1162will already exist and sometimes it won't (e.g. it's still alive after toggling
1163the screen with the power button). We have to wait for the surface to be
1164created before we do some initialization in the thread, but we can't simply do
1165it in the <code>surfaceCreated()</code> callback because that won't fire again
1166if the Surface didn't get recreated. So we need to query or cache the Surface
1167state, and forward it to the renderer thread. Note we have to be a little
1168careful here passing objects between threads -- it is best to pass the Surface or
1169SurfaceHolder through a Handler message, rather than just stuffing it into the
1170thread, to avoid issues on multi-core systems (cf. the <a
1171href="http://developer.android.com/training/articles/smp.html">Android SMP
1172Primer</a>).</p>
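<p>A compressed sketch of approach #1 follows. The <code>RenderThread</code> class and
its <code>waitUntilReady()</code> / <code>sendSurfaceAvailable()</code> /
<code>sendShutdown()</code> methods are hypothetical names, loosely in the style of
Grafika's render threads, not framework APIs:</p>

<pre>
// Sketch: renderer thread lifetime tied to onResume()/onPause(); Surface
// state is forwarded to the thread through its Handler.  mSurfaceView,
// mSurfaceCreated, and mRenderThread are fields of the Activity, which
// also implements SurfaceHolder.Callback.
@Override
protected void onResume() {
    super.onResume();
    mRenderThread = new RenderThread(mSurfaceView.getHolder());
    mRenderThread.start();
    mRenderThread.waitUntilReady();        // block until the thread's Handler exists
    if (mSurfaceCreated) {
        // the Surface survived a screen blank; tell the new thread about it
        mRenderThread.getHandler().sendSurfaceAvailable(mSurfaceView.getHolder());
    }
}

@Override
protected void onPause() {
    super.onPause();
    mRenderThread.getHandler().sendShutdown();
    try {
        mRenderThread.join();              // after join(), thread state is safe to read
    } catch (InterruptedException ie) {
        throw new RuntimeException(ie);
    }
    mRenderThread = null;
}

@Override   // SurfaceHolder.Callback
public void surfaceCreated(SurfaceHolder holder) {
    mSurfaceCreated = true;
    if (mRenderThread != null) {
        mRenderThread.getHandler().sendSurfaceAvailable(holder);
    }
}
</pre>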
1173
1174<p>#2 has a certain appeal because the Surface and the renderer are logically
1175intertwined. We start the thread after the Surface has been created, which
1176avoids some inter-thread communication concerns. Surface created / changed
1177messages are simply forwarded. We need to make sure rendering stops when the
1178screen goes blank, and resumes when it un-blanks; this could be a simple matter
1179of telling Choreographer to stop invoking the frame draw callback. Our
1180<code>onResume()</code> will need to resume the callbacks if and only if the
1181renderer thread is running. It may not be so trivial though -- if we animate
1182based on elapsed time between frames, we could have a very large gap when the
1183next event arrives; so an explicit pause/resume message may be desirable.</p>
1184
1185<p>The above is primarily concerned with how the renderer thread is configured and
1186whether it's executing. A related concern is extracting state from the thread
1187when the Activity is killed (in <code>onPause()</code> or <code>onSaveInstanceState()</code>).
1188Approach #1 will work best for that, because once the renderer thread has been
1189joined its state can be accessed without synchronization primitives.</p>
1190
1191<p>You can see an example of approach #2 in Grafika's "Hardware scaler exerciser."</p>
1192
1193<h2 id="tracking">Appendix C: Tracking BufferQueue with systrace</h2>
1194
1195<p>If you really want to understand how graphics buffers move around, you need to
1196use systrace. The system-level graphics code is well instrumented, as is much
1197of the relevant app framework code. Enable the "gfx" and "view" tags, and
1198generally "sched" as well.</p>
1199
1200<p>A full description of how to use systrace effectively would fill a rather long
1201document. One noteworthy item is the presence of BufferQueues in the trace. If
1202you've used systrace before, you've probably seen them, but maybe weren't sure
1203what they were. As an example, if you grab a trace while Grafika's "Play video
1204(SurfaceView)" is running, you will see a row labeled "SurfaceView". This row
1205tells you how many buffers were queued up at any given time.</p>
1206
1207<p>You'll notice the value increments while the app is active -- triggering
1208the rendering of frames by the MediaCodec decoder -- and decrements while
1209SurfaceFlinger is doing work, consuming buffers. If you're showing video at
121030fps, the queue's value will vary from 0 to 1, because the ~60fps display can
1211easily keep up with the source. (You'll also notice that SurfaceFlinger is only
1212waking up when there's work to be done, not 60 times per second. The system tries
1213very hard to avoid work and will disable VSYNC entirely if nothing is updating
1214the screen.)</p>
1215
1216<p>If you switch to "Play video (TextureView)" and grab a new trace, you'll see a
1217row with a much longer name
1218("com.android.grafika/com.android.grafika.PlayMovieActivity"). This is the
1219main UI layer, which is of course just another BufferQueue. Because TextureView
1220renders into the UI layer, rather than a separate layer, you'll see all of the
1221video-driven updates here.</p>
1222
1223<p>For more information about systrace, see the <a
1224href="http://developer.android.com/tools/help/systrace.html">Android
1225documentation</a> for the tool.</p>