Robert Ly | 35f2fda | 2013-01-29 16:27:05 -0800 | [diff] [blame] | 1 | page.title=Audio Latency |
@jd:body

<!--
Copyright 2010 The Android Open Source Project

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<div id="qv-wrapper">
<div id="qv">
<h2>In this document</h2>
<ol id="auto-toc">
</ol>
</div>
</div>

<p>Audio latency is the time delay as an audio signal passes through a system.
For a complete description of audio latency for the purposes of Android
compatibility, see <em>Section 5.5 Audio Latency</em>
in the <a href="http://source.android.com/compatibility/index.html">Android CDD</a>.
See <a href="latency_design.html">Design For Reduced Latency</a> for an
understanding of Android's audio latency-reduction efforts.
</p>

<h2 id="contributors">Contributors to Latency</h2>

<p>
This section focuses on the contributors to output latency,
but a similar discussion applies to input latency.
</p>
<p>
Assuming the analog circuitry does not contribute significantly, the major
surface-level contributors to audio latency are the following:
</p>

<ul>
<li>Application</li>
<li>Total number of buffers in pipeline</li>
<li>Size of each buffer, in frames</li>
<li>Additional latency after the app processor, such as from a DSP</li>
</ul>

<p>
As accurate as the above list of contributors may be, it is also misleading.
The reason is that buffer count and buffer size are more of an
<em>effect</em> than a <em>cause</em>. What usually happens is that
a given buffer scheme is implemented and tested, but during testing, an audio
underrun is heard as a "click" or "pop." To compensate, the
system designer then increases buffer sizes or buffer counts.
This has the desired result of eliminating the underruns, but it also
has the undesired side effect of increasing latency.
</p>
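
<p>
To make the cost of larger or additional buffers concrete, the following minimal
sketch estimates the latency contributed by a chain of equally sized buffers.
The function name and the example buffer configurations are illustrative
assumptions, not values taken from any particular device.
</p>

```python
def buffer_latency_ms(buffer_count, frames_per_buffer, sample_rate_hz):
    """Estimate the latency, in milliseconds, of a chain of equal-size buffers.

    Assumes every buffer in the pipeline is full before its frames are played,
    which is a simplification of real pipelines.
    """
    total_frames = buffer_count * frames_per_buffer
    return 1000.0 * total_frames / sample_rate_hz


# Hypothetical configurations at a 48 kHz sample rate.
before = buffer_latency_ms(buffer_count=2, frames_per_buffer=240, sample_rate_hz=48000)
after = buffer_latency_ms(buffer_count=4, frames_per_buffer=480, sample_rate_hz=48000)
```

<p>
In this hypothetical case, doubling both the buffer count and the buffer size
to hide underruns quadruples the buffer latency.
</p>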

<p>
A better approach is to understand the causes of the
underruns and then correct those. This eliminates the
audible artifacts and may even permit smaller or fewer buffers
and thus reduce latency.
</p>

<p>
In our experience, the most common causes of underruns include:
</p>
<ul>
<li>Linux CFS (Completely Fair Scheduler)</li>
<li>high-priority threads with SCHED_FIFO scheduling</li>
<li>long scheduling latency</li>
<li>long-running interrupt handlers</li>
<li>long interrupt disable time</li>
</ul>

<h3>Linux CFS and SCHED_FIFO scheduling</h3>
<p>
The Linux CFS is designed to be fair to competing workloads sharing a common CPU
resource. This fairness is represented by a per-thread <em>nice</em> parameter.
The nice value ranges from -20 (least nice, or most CPU time allocated)
to 19 (nicest, or least CPU time allocated). In general, all threads with a given
nice value receive approximately equal CPU time and threads with a
numerically lower nice value should expect to
receive more CPU time. However, CFS is "fair" only over relatively long
periods of observation. Over short-term observation windows,
CFS may allocate the CPU resource in unexpected ways. For example, it
may take the CPU away from a thread with numerically low niceness
and give it to a thread with numerically high niceness. In the case of audio,
this can result in an underrun.
</p>

<p>
The obvious solution is to avoid CFS for high-performance audio
threads. Beginning with Android 4.1, such threads use the
<code>SCHED_FIFO</code> scheduling policy rather than the <code>SCHED_NORMAL</code> (also called
<code>SCHED_OTHER</code>) scheduling policy implemented by CFS.
</p>

<p>
Though the high-performance audio threads now use <code>SCHED_FIFO</code>, they
are still susceptible to other higher-priority <code>SCHED_FIFO</code> threads.
These are typically kernel worker threads, but there may also be a few
non-audio user threads with policy <code>SCHED_FIFO</code>. The available <code>SCHED_FIFO</code>
priorities range from 1 to 99. The audio threads run at priority
2 or 3. This leaves priority 1 available for lower-priority threads,
and priorities 4 to 99 for higher-priority threads. We recommend
you use priority 1 whenever possible, and reserve priorities 4 to 99 for
those threads that are guaranteed to complete within a bounded amount
of time and are known to not interfere with scheduling of audio threads.
</p>
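
<p>
The priority recommendation above can be sketched with the Linux scheduling
calls exposed by Python's <code>os</code> module. This is a minimal sketch that
assumes a Linux host. The constant and function names are hypothetical, and the
audio priorities are simply the values quoted in the text. Changing the
scheduling policy normally requires elevated privileges, so the sketch reports
failure instead of raising.
</p>

```python
import os

# Values quoted in the text above; the names are illustrative only.
AUDIO_PRIORITIES = (2, 3)
BACKGROUND_RT_PRIORITY = 1


def fifo_priority_range():
    """Return the (min, max) SCHED_FIFO priorities reported by the kernel."""
    return (os.sched_get_priority_min(os.SCHED_FIFO),
            os.sched_get_priority_max(os.SCHED_FIFO))


def outranks_audio(priority):
    """Return True if a SCHED_FIFO thread at this priority can preempt audio."""
    return priority > max(AUDIO_PRIORITIES)


def request_low_fifo_priority(pid=0):
    """Try to move a process (0 means the caller) to SCHED_FIFO priority 1.

    Returns True on success and False if the caller lacks permission.
    """
    try:
        os.sched_setscheduler(pid, os.SCHED_FIFO,
                              os.sched_param(BACKGROUND_RT_PRIORITY))
        return True
    except PermissionError:
        return False
```

<p>
A non-audio real-time worker would call <code>request_low_fifo_priority()</code>
once at startup. A worker that needs a priority for which
<code>outranks_audio()</code> returns true should first be shown to complete
within a bounded amount of time.
</p>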

<h3>Scheduling latency</h3>
<p>
Scheduling latency is the time between when a thread becomes
ready to run, and when the resulting context switch completes so that the
thread actually runs on a CPU. The shorter the latency the better, and
anything over two milliseconds causes problems for audio. Long scheduling
latency is most likely to occur during mode transitions, such as
bringing up or shutting down a CPU, switching between a security kernel
and the normal kernel, switching from full power to low-power mode,
or adjusting the CPU clock frequency and voltage.
</p>
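
<p>
As a rough user-space illustration, the sketch below measures how late a thread
wakes up after a timed sleep and counts wake-ups that exceed the two-millisecond
figure mentioned above. The overshoot also includes timer granularity and
interpreter overhead, so treat it as a crude approximation of scheduling
latency rather than a precise measurement. All names here are hypothetical.
</p>

```python
import time


def measure_wakeup_overshoot_ms(period_s=0.005, iterations=200):
    """Sleep repeatedly and return how late each wake-up was, in milliseconds."""
    samples = []
    for _ in range(iterations):
        start = time.monotonic()
        time.sleep(period_s)
        elapsed = time.monotonic() - start
        # Overshoot: time spent beyond the requested sleep period.
        samples.append((elapsed - period_s) * 1000.0)
    return samples


def count_over(samples_ms, threshold_ms=2.0):
    """Count wake-ups that were later than the threshold."""
    return sum(1 for s in samples_ms if s > threshold_ms)
```

<p>
Running the measurement while triggering the mode transitions listed above
(for example, while the CPU frequency changes) is one way to see whether late
wake-ups cluster around those events.
</p>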

<h3>Interrupts</h3>
<p>
In many designs, CPU 0 services all external interrupts. As a result, a
long-running interrupt handler may delay other interrupts, in particular
audio direct memory access (DMA) completion interrupts. Design interrupt handlers
to finish quickly and defer any lengthy work to a thread (preferably
a CFS thread or <code>SCHED_FIFO</code> thread of priority 1).
</p>

<p>
Similarly, disabling interrupts on CPU 0 for a long period
delays the servicing of audio interrupts.
Long interrupt disable times typically happen while waiting for a kernel
<i>spin lock</i>. Review these spin locks to ensure that
they are bounded.
</p>

<h2 id="measuringOutput">Measuring Output Latency</h2>

<p>
There are several techniques available to measure output latency,
with varying degrees of accuracy and ease of running, described below. Also
see the <a href="testing_circuit.html">Testing circuit</a> for an example test environment.
</p>

<h3>LED and oscilloscope test</h3>
<p>
This test measures latency in relation to the device's LED indicator.
If your production device does not have an LED, you can install the
LED on a prototype form factor device. For even better accuracy
on prototype devices with exposed circuitry, connect one
oscilloscope probe to the LED directly to bypass the light
sensor latency.
</p>

<p>
If you cannot install an LED on either your production or prototype device,
try the following workarounds:
</p>

<ul>
<li>Use a General Purpose Input/Output (GPIO) pin for the same purpose.</li>
<li>Use JTAG or another debugging port.</li>
<li>Use the screen backlight. This might be risky as the
backlight may have a non-negligible latency, and can contribute to
an inaccurate latency reading.
</li>
</ul>

<p>To conduct this test:</p>

<ol>
<li>Run an app that periodically pulses the LED at
the same time it outputs audio.

<p class="note"><b>Note:</b> To get useful results, it is crucial to use the correct
APIs in the test app so that you're exercising the fast audio output path.
See <a href="latency_design.html">Design For Reduced Latency</a> for
background.
</p>
</li>
<li>Place a light sensor next to the LED.</li>
<li>Connect the probes of a dual-channel oscilloscope to both the wired headphone
jack (line output) and light sensor.</li>
<li>Use the oscilloscope to measure
the time difference between the line output signal and the light
sensor signal.</li>
</ol>

<p>The difference in time is the approximate audio output latency,
assuming that the LED latency and light sensor latency are both zero.
Typically, the LED and light sensor each have a relatively low latency
on the order of one millisecond or less, which is low enough
to ignore.</p>

<h3>Larsen test</h3>
<p>
One of the easiest latency tests is an audio feedback
(Larsen effect) test. This provides a crude measure of combined output
and input latency by timing an impulse response loop. This test is not very useful
by itself because it cannot separate output latency from input latency, but it can
be useful for calibrating other tests.</p>

<p>To conduct this test:</p>
<ol>
<li>Run an app that captures audio from the microphone and immediately plays the
captured data back over the speaker.</li>
<li>Create a sound externally,
such as tapping a pencil by the microphone. This noise generates a feedback loop.</li>
<li>Measure the time between feedback pulses to get the sum of the output latency, input latency, and application overhead.</li>
</ol>
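
<p>
The last step above can be sketched as follows, assuming the feedback loop has
been recorded as a list of samples at a known sample rate. The threshold-based
pulse detection and all names are illustrative assumptions. A real recording
may need filtering or a more robust onset detector.
</p>

```python
def pulse_onsets(samples, sample_rate_hz, threshold, holdoff_s):
    """Return the times, in seconds, at which feedback pulses begin.

    A pulse begins when the absolute sample value reaches the threshold.
    Further crossings are ignored for holdoff_s seconds so that the ringing
    of a single pulse is not counted more than once.
    """
    onsets = []
    holdoff_frames = int(holdoff_s * sample_rate_hz)
    next_allowed = 0
    for index, value in enumerate(samples):
        if index >= next_allowed and abs(value) >= threshold:
            onsets.append(index / sample_rate_hz)
            next_allowed = index + holdoff_frames
    return onsets


def mean_round_trip_ms(onsets):
    """Average the spacing between consecutive pulse onsets, in milliseconds."""
    intervals = [later - earlier for earlier, later in zip(onsets, onsets[1:])]
    if not intervals:
        raise ValueError("need at least two pulses to measure an interval")
    return 1000.0 * sum(intervals) / len(intervals)
```

<p>
The returned value is the sum of output latency, input latency, and application
overhead, as described in the final step.
</p>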

<p>This method does not break down the
component times, which is important when the output latency
and input latency are independent. This method is therefore not recommended for
measuring output latency by itself, but it might be useful in combination with
other tests.</p>

<h2 id="measuringInput">Measuring Input Latency</h2>

<p>
Input latency is more difficult to measure than output latency. The following
tests might help.
</p>

<p>
One approach is to first determine the output latency
using the LED and oscilloscope method and then use
the audio feedback (Larsen) test to determine the sum of output
latency and input latency. The difference between these two
measurements is the input latency.
</p>
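
<p>
The subtraction described above is shown below as a small sketch with
hypothetical names. Because the Larsen measurement also includes application
overhead, the sketch accepts an optional estimate of that overhead. If the
overhead is left at zero, it remains folded into the result.
</p>

```python
def input_latency_ms(round_trip_ms, output_latency_ms, app_overhead_ms=0.0):
    """Estimate input latency from a Larsen round trip and an output latency.

    round_trip_ms: time between feedback pulses from the Larsen test.
    output_latency_ms: result of the LED and oscilloscope test.
    app_overhead_ms: optional estimate of the test app's own delay.
    """
    result = round_trip_ms - output_latency_ms - app_overhead_ms
    if result < 0:
        raise ValueError("measurements are inconsistent: negative input latency")
    return result
```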

<p>
Another technique is to use a GPIO pin on a prototype device.
Externally, pulse a GPIO input at the same time that you present
an audio signal to the device. Run an app that compares the
difference in arrival times of the GPIO signal and audio data.
</p>

<h2 id="reducing">Reducing Latency</h2>

<p>To achieve low audio latency, pay special attention throughout the
system to scheduling, interrupt handling, power management, and device
driver design. Your goal is to prevent any part of the platform from
blocking a <code>SCHED_FIFO</code> audio thread for more than a couple
of milliseconds. By adopting such a systematic approach, you can reduce
audio latency and get the side benefit of more predictable performance
overall.
</p>


<p>
Audio underruns, when they do occur, are often detectable only under certain
conditions or only at the transitions. Try stressing the system by launching
new apps and scrolling quickly through various displays. But be aware
that some test conditions are so stressful as to be beyond the design
goals. For example, taking a bugreport puts such enormous load on the
system that it may be acceptable to have an underrun in that case.
</p>

<p>
When testing for underruns:
</p>
<ul>
<li>Configure any DSP after the app processor so that it adds
minimal latency.</li>
<li>Run tests under different conditions
such as having the screen on or off, USB plugged in or unplugged,
WiFi on or off, Bluetooth on or off, and telephony and data radios
on or off.</li>
<li>Select relatively quiet music that you're very familiar with, and in which
underruns are easy to hear.</li>
<li>Use wired headphones for extra sensitivity.</li>
<li>Give yourself breaks so that you don't experience "ear fatigue."</li>
</ul>

<p>
Once you find and correct the underlying causes of underruns, reduce
the buffer counts and sizes to take advantage of the improvement.
The eager approach of reducing buffer counts and sizes <i>before</i>
analyzing underruns and fixing the causes of underruns only
results in frustration.
</p>

<h3 id="tools">Tools</h3>
<p>
<code>systrace</code> is an excellent general-purpose tool
for diagnosing system-level performance glitches.
</p>

<p>
The output of <code>dumpsys media.audio_flinger</code> also contains a
useful section called "simple moving statistics." This has a summary
of the variability of elapsed times for each audio mix and I/O cycle.
Ideally, all the time measurements should be about equal to the mean or
nominal cycle time. If you see a very low minimum or high maximum, this is an
indication of a problem, likely a high scheduling latency or interrupt
disable time. The <i>tail</i> part of the output is especially helpful,
as it highlights the variability beyond +/- 3 standard deviations.
</p>
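
<p>
To illustrate the kind of summary described above, the following sketch computes
the mean, minimum, maximum, and standard deviation of a list of cycle times and
collects the values lying more than three standard deviations from the mean.
It does not parse <code>dumpsys</code> output, whose exact format is not covered
here. The function name and the sample data are hypothetical.
</p>

```python
import statistics


def summarize_cycle_times(cycle_ms):
    """Summarize cycle times and list the outliers beyond 3 standard deviations."""
    mean = statistics.mean(cycle_ms)
    stdev = statistics.pstdev(cycle_ms)
    # The "tail": cycles whose duration is far from the mean in either direction.
    tail = [t for t in cycle_ms if abs(t - mean) > 3 * stdev]
    return {
        "mean": mean,
        "min": min(cycle_ms),
        "max": max(cycle_ms),
        "stdev": stdev,
        "tail": tail,
    }


# Hypothetical data: 99 cycles near a 5 ms nominal period and one late cycle.
summary = summarize_cycle_times([5.0] * 99 + [25.0])
```

<p>
In this made-up data set, the single 25 ms cycle is the only entry in the tail,
which is the kind of outlier that would point to a scheduling or interrupt
problem.
</p>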