page.title=Contributors to Audio Latency
@jd:body

<!--
    Copyright 2013 The Android Open Source Project

    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
-->
<div id="qv-wrapper">
  <div id="qv">
    <h2>In this document</h2>
    <ol id="auto-toc">
    </ol>
  </div>
</div>

<p>
This page focuses on the contributors to output latency,
but a similar discussion applies to input latency.
</p>
<p>
Assuming the analog circuitry does not contribute significantly, the major
surface-level contributors to audio latency are the following:
</p>

<ul>
<li>Application</li>
<li>Total number of buffers in pipeline</li>
<li>Size of each buffer, in frames</li>
<li>Additional latency after the app processor, such as from a DSP</li>
</ul>

<p>
As accurate as the above list of contributors may be, it is also misleading.
The reason is that buffer count and buffer size are more of an
<em>effect</em> than a <em>cause</em>. What usually happens is that
a given buffer scheme is implemented and tested, but during testing, an audio
underrun or overrun is heard as a "click" or "pop." To compensate, the
system designer then increases buffer sizes or buffer counts.
This has the desired result of eliminating the underruns or overruns, but it also
has the undesired side effect of increasing latency.
For more information about buffer sizes, see the video
<a href="https://youtu.be/PnDK17zP9BI">Audio latency: buffer sizes</a>.
</p>

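<p>
To see why buffer size and count translate directly into latency, note that each
buffer in the pipeline adds its size in frames divided by the sample rate.
The following is a minimal sketch of that arithmetic; the buffer count, buffer size,
and sample rate are hypothetical values chosen for illustration, not measurements
from any particular device.
</p>

<pre>
/* Sketch: estimating the latency contributed by a buffer pipeline.
 * The numbers below are hypothetical, not values from a real device. */
#include &lt;stdio.h&gt;

int main(void) {
    const unsigned sampleRateHz    = 48000; /* typical output sample rate */
    const unsigned framesPerBuffer = 192;   /* hypothetical size of each buffer */
    const unsigned bufferCount     = 4;     /* hypothetical number of buffers in the pipeline */

    /* Each buffer contributes framesPerBuffer / sampleRateHz seconds of latency. */
    double latencyMs = 1000.0 * framesPerBuffer * bufferCount / sampleRateHz;
    printf("Pipeline latency: %.1f ms (%u buffers of %u frames at %u Hz)\n",
           latencyMs, bufferCount, framesPerBuffer, sampleRateHz);
    return 0;
}
</pre>
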
<p>
A better approach is to understand the causes of the
underruns and overruns, and then correct those. This eliminates the
audible artifacts and may permit even smaller or fewer buffers
and thus reduce latency.
</p>

<p>
In our experience, the most common causes of underruns and overruns include:
</p>
<ul>
<li>Linux CFS (Completely Fair Scheduler)</li>
<li>high-priority threads with SCHED_FIFO scheduling</li>
<li>priority inversion</li>
<li>long scheduling latency</li>
<li>long-running interrupt handlers</li>
<li>long interrupt disable time</li>
<li>power management</li>
<li>security kernels</li>
</ul>

<h3 id="linuxCfs">Linux CFS and SCHED_FIFO scheduling</h3>
<p>
The Linux CFS is designed to be fair to competing workloads sharing a common CPU
resource. This fairness is represented by a per-thread <em>nice</em> parameter.
The nice value ranges from -20 (least nice, or most CPU time allocated)
to 19 (nicest, or least CPU time allocated). In general, all threads with a given
nice value receive approximately equal CPU time, and threads with a
numerically lower nice value should expect to
receive more CPU time. However, CFS is "fair" only over relatively long
periods of observation. Over short-term observation windows,
CFS may allocate the CPU resource in unexpected ways. For example, it
may take the CPU away from a thread with numerically low niceness
and give it to a thread with numerically high niceness. In the case of audio,
this can result in an underrun or overrun.
</p>

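<p>
As a concrete (and purely illustrative) example of the nice parameter, a thread can
change its own nice value with <code>setpriority()</code>; on Linux, an id of 0 with
<code>PRIO_PROCESS</code> applies to the calling thread. This is a hedged sketch of
the mechanism only; the nice value shown is hypothetical, and high-performance audio
threads avoid CFS altogether, as described next.
</p>

<pre>
/* Sketch: adjusting the calling thread's nice value under CFS.
 * Illustrative only; fast audio threads do not rely on nice values. */
#include &lt;errno.h&gt;
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
#include &lt;sys/resource.h&gt;

int main(void) {
    /* Nice values range from -20 (most CPU time) to 19 (least CPU time). */
    if (setpriority(PRIO_PROCESS, 0, -16) != 0) {
        /* Lowering nice below 0 normally requires privilege (CAP_SYS_NICE). */
        fprintf(stderr, "setpriority failed: %s\n", strerror(errno));
        return 1;
    }
    printf("nice is now %d\n", getpriority(PRIO_PROCESS, 0));
    return 0;
}
</pre>
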
<p>
The obvious solution is to avoid CFS for high-performance audio
threads. Beginning with Android 4.1, such threads now use the
<code>SCHED_FIFO</code> scheduling policy rather than the <code>SCHED_NORMAL</code> (also called
<code>SCHED_OTHER</code>) scheduling policy implemented by CFS.
</p>

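<p>
The sketch below shows the standard pthread call for requesting <code>SCHED_FIFO</code>
on the current thread. It is illustrative only: on Android the switch is performed by
the audio framework on behalf of the app, not by application code, and the priority
value 2 is simply one value in the range described in the next section.
</p>

<pre>
/* Sketch: moving the calling thread from CFS (SCHED_NORMAL) to SCHED_FIFO.
 * Typically requires privilege or a suitable RLIMIT_RTPRIO allowance. */
#include &lt;pthread.h&gt;
#include &lt;sched.h&gt;
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;

int promote_to_sched_fifo(int priority) {
    struct sched_param param;
    memset(&amp;param, 0, sizeof(param));
    param.sched_priority = priority;   /* for example, 2 */
    int err = pthread_setschedparam(pthread_self(), SCHED_FIFO, &amp;param);
    if (err != 0) {
        fprintf(stderr, "pthread_setschedparam failed: %s\n", strerror(err));
    }
    return err;
}

int main(void) {
    return promote_to_sched_fifo(2) == 0 ? 0 : 1;
}
</pre>
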
<h3 id="schedFifo">SCHED_FIFO priorities</h3>
<p>
Though the high-performance audio threads now use <code>SCHED_FIFO</code>, they
are still susceptible to other, higher-priority <code>SCHED_FIFO</code> threads.
These are typically kernel worker threads, but there may also be a few
non-audio user threads with policy <code>SCHED_FIFO</code>. The available <code>SCHED_FIFO</code>
priorities range from 1 to 99. The audio threads run at priority
2 or 3. This leaves priority 1 available for lower-priority threads,
and priorities 4 to 99 for higher-priority threads. We recommend
you use priority 1 whenever possible, and reserve priorities 4 to 99 for
those threads that are guaranteed to complete within a bounded amount
of time, execute with a period shorter than the period of the audio threads,
and are known to not interfere with scheduling of the audio threads.
</p>

<h3 id="rms">Rate-monotonic scheduling</h3>
<p>
For more information on the theory of assignment of fixed priorities,
see the Wikipedia article
<a href="http://en.wikipedia.org/wiki/Rate-monotonic_scheduling">Rate-monotonic scheduling</a> (RMS).
A key point is that fixed priorities should be allocated strictly based on period,
with higher priorities assigned to threads with shorter periods, not based on perceived "importance."
Non-periodic threads may be modeled as periodic threads, using the maximum frequency of execution
and maximum computation per execution. If a non-periodic thread cannot be modeled as
a periodic thread (for example, it could execute with unbounded frequency or unbounded computation
per execution), then it should not be assigned a fixed priority, as that would be incompatible
with the scheduling of true periodic threads.
</p>

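<p>
The following sketch applies the rate-monotonic rule mechanically: priorities are derived
only from periods, with the shortest period receiving the highest priority. The thread
names and periods are hypothetical and chosen purely for illustration.
</p>

<pre>
/* Sketch: rate-monotonic assignment of fixed priorities.
 * Hypothetical tasks and periods; only the rule matters:
 * the shorter the period, the higher the priority. */
#include &lt;stdio.h&gt;

struct task { const char *name; unsigned period_us; int priority; };

int main(void) {
    struct task tasks[] = {
        { "audio_fast_mixer",  2000, 0 },  /* hypothetical 2 ms period  */
        { "touch_sampler",     8000, 0 },  /* hypothetical 8 ms period  */
        { "sensor_batcher",   20000, 0 },  /* hypothetical 20 ms period */
    };
    const int n = sizeof(tasks) / sizeof(tasks[0]);

    /* Rank each task by how many tasks have a strictly longer period;
     * the shortest period ends up with the numerically highest priority. */
    for (int i = 0; i &lt; n; i++) {
        int rank = 0;
        for (int j = 0; j &lt; n; j++) {
            if (tasks[j].period_us &gt; tasks[i].period_us) {
                rank++;
            }
        }
        tasks[i].priority = 1 + rank;
        printf("%-18s period %6u us -&gt; priority %d\n",
               tasks[i].name, tasks[i].period_us, tasks[i].priority);
    }
    return 0;
}
</pre>
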
<h3 id="priorityInversion">Priority inversion</h3>
<p>
<a href="http://en.wikipedia.org/wiki/Priority_inversion">Priority inversion</a>
is a classic failure mode of real-time systems,
where a higher-priority task is blocked for an unbounded time waiting
for a lower-priority task to release a resource such as (shared
state protected by) a
<a href="http://en.wikipedia.org/wiki/Mutual_exclusion">mutex</a>.
See the article "<a href="avoiding_pi.html">Avoiding priority inversion</a>" for techniques to
mitigate it.
</p>

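<p>
As one hedged illustration of the general idea behind those techniques, state that must be
shared between a high-priority audio thread and an ordinary thread can often be exchanged
through a lock-free atomic variable rather than a mutex, so the high-priority reader never
blocks on the writer. The variable and values below are hypothetical.
</p>

<pre>
/* Sketch: sharing a parameter without a mutex (C11 atomics), so a
 * high-priority reader cannot be blocked by a lower-priority writer.
 * Illustrative only; see "Avoiding priority inversion" for the techniques
 * used in the Android audio pipeline. */
#include &lt;stdatomic.h&gt;
#include &lt;stdio.h&gt;

/* Volume in Q15 fixed point: written by a control thread, read by the audio thread. */
static _Atomic unsigned g_volume_q15 = 32768;   /* 1.0 in Q15 */

/* Called from an ordinary (control) thread. */
void set_volume_q15(unsigned v) {
    atomic_store_explicit(&amp;g_volume_q15, v, memory_order_release);
}

/* Called from the high-priority audio thread; never blocks. */
unsigned get_volume_q15(void) {
    return atomic_load_explicit(&amp;g_volume_q15, memory_order_acquire);
}

int main(void) {
    set_volume_q15(16384);  /* 0.5 in Q15 */
    printf("volume = %u\n", get_volume_q15());
    return 0;
}
</pre>
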
<h3 id="schedLatency">Scheduling latency</h3>
<p>
Scheduling latency is the time between when a thread becomes
ready to run and when the resulting context switch completes so that the
thread actually runs on a CPU. The shorter the latency the better, and
anything over two milliseconds causes problems for audio. Long scheduling
latency is most likely to occur during mode transitions, such as
bringing up or shutting down a CPU, switching between a security kernel
and the normal kernel, switching from full power to low-power mode,
or adjusting the CPU clock frequency and voltage.
</p>

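<p>
A rough way to observe scheduling latency, similar in spirit to tools such as
<code>cyclictest</code>, is to sleep until an absolute deadline and measure how late the
thread actually wakes. The sketch below is illustrative only; the period and the
two-millisecond reporting threshold are example values.
</p>

<pre>
/* Sketch: measuring wakeup (scheduling) latency against an absolute deadline.
 * Run it under SCHED_FIFO and under CFS to compare the worst cases observed. */
#include &lt;stdio.h&gt;
#include &lt;time.h&gt;

static long long ns_of(struct timespec t) {
    return (long long)t.tv_sec * 1000000000LL + t.tv_nsec;
}

int main(void) {
    const long period_ns = 2000000;   /* hypothetical 2 ms audio-like period */
    struct timespec target, now;
    clock_gettime(CLOCK_MONOTONIC, &amp;target);

    for (int i = 0; i &lt; 1000; i++) {
        target.tv_nsec += period_ns;
        if (target.tv_nsec &gt;= 1000000000L) {
            target.tv_nsec -= 1000000000L;
            target.tv_sec += 1;
        }
        /* Sleep until the absolute deadline, then see how late the wakeup was. */
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &amp;target, NULL);
        clock_gettime(CLOCK_MONOTONIC, &amp;now);
        long long late_us = (ns_of(now) - ns_of(target)) / 1000;
        if (late_us &gt; 2000) {
            printf("wakeup %d was %lld us late\n", i, late_us);
        }
    }
    return 0;
}
</pre>
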
<h3 id="interrupts">Interrupts</h3>
<p>
In many designs, CPU 0 services all external interrupts. So a
long-running interrupt handler may delay other interrupts, in particular
audio direct memory access (DMA) completion interrupts. Design interrupt handlers
to finish quickly and defer lengthy work to a thread (preferably
a CFS thread or <code>SCHED_FIFO</code> thread of priority 1).
</p>

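<p>
In the Linux kernel, one common way to follow this advice is to keep the hard interrupt
handler minimal and push the lengthy part onto a workqueue, which executes in a CFS
kworker thread. The fragment below is a hedged sketch of that pattern for a hypothetical
audio DMA interrupt; it is not a complete driver, and the handler names are invented for
illustration.
</p>

<pre>
/* Sketch (kernel side): short hard IRQ handler plus deferred work.
 * Hypothetical handler names; registration, device data, and error
 * handling are omitted. */
#include &lt;linux/interrupt.h&gt;
#include &lt;linux/workqueue.h&gt;

static void lengthy_work_fn(struct work_struct *work)
{
    /* Long-running bookkeeping runs here, in a CFS kworker thread,
     * outside interrupt context. */
}
static DECLARE_WORK(lengthy_work, lengthy_work_fn);

static irqreturn_t example_audio_dma_irq(int irq, void *dev_id)
{
    /* Do only what must happen immediately, such as acknowledging
     * the DMA completion, then defer the rest to a thread. */
    schedule_work(&amp;lengthy_work);
    return IRQ_HANDLED;
}
</pre>
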
<p>
Similarly, disabling interrupts on CPU 0 for a long period
delays the servicing of audio interrupts.
Long interrupt disable times typically happen while waiting for a kernel
<i>spin lock</i>. Review these spin locks to ensure their hold times are bounded.
</p>

<h3 id="power">Power, performance, and thermal management</h3>
<p>
<a href="http://en.wikipedia.org/wiki/Power_management">Power management</a>
is a broad term that encompasses efforts to monitor
and reduce power consumption while optimizing performance.
<a href="http://en.wikipedia.org/wiki/Thermal_management_of_electronic_devices_and_systems">Thermal management</a>
and <a href="http://en.wikipedia.org/wiki/Computer_cooling">computer cooling</a>
are similar but seek to measure and control heat to avoid damage due to excess heat.
In the Linux kernel, the CPU
<a href="http://en.wikipedia.org/wiki/Governor_%28device%29">governor</a>
is responsible for low-level policy, while user mode configures high-level policy.
Techniques used include:
</p>

<ul>
<li>dynamic voltage scaling</li>
<li>dynamic frequency scaling</li>
<li>dynamic core enabling</li>
<li>cluster switching</li>
<li>power gating</li>
<li>hotplug (hotswap)</li>
<li>various sleep modes (halt, stop, idle, suspend, etc.)</li>
<li>process migration</li>
<li><a href="http://en.wikipedia.org/wiki/Processor_affinity">processor affinity</a></li>
</ul>

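<p>
As a hedged illustration of how user mode configures the high-level policy, the cpufreq
governor for a CPU is typically exposed through sysfs. The path and the governor name
below vary by device and kernel, and changing the governor is shown only as an example of
the mechanism, not as a recommendation.
</p>

<pre>
/* Sketch: reading and requesting the cpufreq governor for CPU 0 from user space.
 * Sysfs paths and available governors vary by device; writing requires root. */
#include &lt;stdio.h&gt;

#define GOV_PATH "/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor"

int main(void) {
    char current[64] = "";
    FILE *f = fopen(GOV_PATH, "r");
    if (f) {
        if (fgets(current, sizeof(current), f)) {
            printf("current governor: %s", current);
        }
        fclose(f);
    }

    /* Request a different governor; the name must be one the kernel supports. */
    f = fopen(GOV_PATH, "w");
    if (f) {
        fputs("interactive\n", f);
        fclose(f);
    } else {
        fprintf(stderr, "cannot write %s\n", GOV_PATH);
    }
    return 0;
}
</pre>
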
<p>
Some management operations can result in "work stoppages," or
periods during which the application processor performs no useful work.
These work stoppages can interfere with audio, so such management should be designed
for an acceptable worst-case work stoppage while audio is active.
Of course, when thermal runaway is imminent, avoiding permanent damage
is more important than audio!
</p>

<h3 id="security">Security kernels</h3>
<p>
A <a href="http://en.wikipedia.org/wiki/Security_kernel">security kernel</a> for
<a href="http://en.wikipedia.org/wiki/Digital_rights_management">Digital rights management</a>
(DRM) may run on the same application processor core(s) as those used
for the main operating system kernel and application code. Any time
during which a security kernel operation is active on a core is effectively a
stoppage of ordinary work that would normally run on that core.
In particular, this may include audio work. By its nature, the internal
behavior of a security kernel is inscrutable from higher-level layers, and thus
any performance anomalies caused by a security kernel are especially
pernicious. For example, security kernel operations do not typically appear in
context switch traces. We call this "dark time" — time that elapses
yet cannot be observed. Security kernels should be designed for an
acceptable worst-case work stoppage while audio is active.
</p>