page.title=USB Digital Audio
@jd:body

<div id="qv-wrapper">
  <div id="qv">
    <h2>In this document</h2>
    <ol id="auto-toc">
    </ol>
  </div>
</div>

<p>
This article reviews Android support for USB digital audio and related
USB-based protocols.
</p>

<h3 id="audience">Audience</h3>

<p>
The target audience of this article is Android device OEMs, SoC vendors,
USB audio peripheral suppliers, advanced audio application developers,
and others seeking a detailed understanding of USB digital audio internals on Android.
</p>

<p>
End users should see the <a href="https://support.google.com/android/">Help Center</a> instead.
Though this article is not oriented towards end users,
certain audiophile consumers may find portions of interest.
</p>

<h2 id="overview">Overview of USB</h2>

<p>
Universal Serial Bus (USB) is informally described in the Wikipedia article
<a href="http://en.wikipedia.org/wiki/USB">USB</a>,
and is formally defined by the standards published by the
<a href="http://www.usb.org/">USB Implementers Forum, Inc</a>.
For convenience, we summarize the key USB concepts here,
but the standards are the authoritative reference.
</p>

<h3 id="terminology">Basic concepts and terminology</h3>

<p>
USB is a <a href="http://en.wikipedia.org/wiki/Bus_(computing)">bus</a>
with a single initiator of data transfer operations, called the <i>host</i>.
The host communicates with
<a href="http://en.wikipedia.org/wiki/Peripheral">peripherals</a> via the bus.
</p>

<p>
<b>Note:</b> the terms <i>device</i> or <i>accessory</i> are common synonyms for
<i>peripheral</i>. We avoid those terms here, as they could be confused with
Android <a href="http://en.wikipedia.org/wiki/Mobile_device">device</a>
or the Android-specific concept called
<a href="http://developer.android.com/guide/topics/connectivity/usb/accessory.html">accessory mode</a>.
</p>

<p>
A critical host role is <i>enumeration</i>:
the process of detecting which peripherals are connected to the bus,
and querying their properties expressed via <i>descriptors</i>.
</p>

<p>
A peripheral may be one physical object
but actually implement multiple logical <i>functions</i>.
For example, a webcam peripheral could have both a camera function and a
microphone audio function.
</p>

<p>
Each peripheral function has an <i>interface</i> that
defines the protocol to communicate with that function.
</p>

<p>
The host communicates with a peripheral over a
<a href="http://en.wikipedia.org/wiki/Stream_(computing)">pipe</a>
to an <a href="http://en.wikipedia.org/wiki/Communication_endpoint">endpoint</a>,
a data source or sink
provided by one of the peripheral's functions.
</p>

<p>
There are two kinds of pipes: <i>message</i> and <i>stream</i>.
A message pipe is used for bi-directional control and status.
A stream pipe is used for uni-directional data transfer.
</p>

<p>
The host initiates all data transfers,
hence the terms <i>input</i> and <i>output</i> are expressed relative to the host.
An input operation transfers data from the peripheral to the host,
while an output operation transfers data from the host to the peripheral.
</p>

<p>
There are three major data transfer modes:
<i>interrupt</i>, <i>bulk</i>, and <i>isochronous</i>.
Isochronous mode will be discussed further in the context of audio.
</p>

<p>
The peripheral may have <i>terminals</i> that connect to the outside world,
beyond the peripheral itself. In this way, the peripheral serves
to translate between USB protocol and "real world" signals.
The terminals are logical objects of the function.
</p>

<h2 id="androidModes">Android USB modes</h2>

<h3 id="developmentMode">Development mode</h3>

<p>
<i>Development mode</i> has been present since the initial release of Android.
The Android device appears as a USB peripheral
to a host PC running a desktop operating system such as Linux,
Mac OS X, or Windows. The only visible peripheral function is either
<a href="http://en.wikipedia.org/wiki/Android_software_development#Fastboot">Android fastboot</a>
or
<a href="http://developer.android.com/tools/help/adb.html">Android Debug Bridge (adb)</a>.
The fastboot and adb protocols are layered over USB bulk data transfer mode.
</p>

<h3 id="hostMode">Host mode</h3>

<p>
<i>Host mode</i> was introduced in Android 3.1 (API level 12).
</p>

<p>
As the Android device must act as host, and most Android devices include
a micro-USB connector that does not directly permit host operation,
an on-the-go (<a href="http://en.wikipedia.org/wiki/USB_On-The-Go">OTG</a>) adapter
such as this is usually required:
</p>

<img src="audio/images/otg.jpg" style="image-orientation: 90deg;" height="50%" width="50%" alt="OTG">

<p>
An Android device might not provide sufficient power to operate a
particular peripheral, depending on how much power the peripheral needs,
and how much the Android device is capable of supplying. Even if
adequate power is available, the Android device's battery life may
be significantly shortened. For these situations, use a powered
<a href="http://en.wikipedia.org/wiki/USB_hub">hub</a> such as this:
</p>

<img src="audio/images/hub.jpg" alt="Powered hub">

<h3 id="accessoryMode">Accessory mode</h3>

<p>
<i>Accessory mode</i> was introduced in Android 3.1 (API level 12) and back-ported to Android 2.3.4.
In this mode, the Android device operates as a USB peripheral,
under the control of another device such as a dock that serves as host.
The difference between development mode and accessory mode
is that additional USB functions are visible to the host, beyond adb.
The Android device begins in development mode and then
transitions to accessory mode via a re-negotiation process.
</p>

<p>
Accessory mode was extended with additional features in Android 4.1,
in particular audio, described below.
</p>

<h2 id="audioClass">USB audio</h2>

<h3 id="class">USB classes</h3>

<p>
Each peripheral function has an associated <i>device class</i> document
that specifies the standard protocol for that function.
This enables <i>class compliant</i> hosts and peripheral functions
to inter-operate, without detailed knowledge of each other's workings.
Class compliance is critical if the host and peripheral are provided by
different entities.
</p>

<p>
The term <i>driverless</i> is a common synonym for <i>class compliant</i>,
indicating that it is possible to use the standard features of such a
peripheral without requiring an operating-system-specific
<a href="http://en.wikipedia.org/wiki/Device_driver">driver</a> to be installed.
One can assume that a peripheral advertised as "no driver needed"
for major desktop operating systems
will be class compliant, though there may be exceptions.
</p>

<h3 id="audioClass">USB audio class</h3>

<p>
Here we concern ourselves only with peripherals that implement
audio functions, and thus adhere to the audio device class. There are two
editions of the USB audio class specification: class 1 (UAC1) and 2 (UAC2).
</p>
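
<p>
As a rough illustration of how a host can recognize an audio function during
enumeration, the Python sketch below decodes a standard 9-byte USB interface
descriptor and checks for the audio class code (0x01). The function name
<code>describe_interface</code> and the sample bytes are hypothetical, chosen
only for this example; this is not the code path that Android itself uses.
</p>

```python
import struct

USB_DT_INTERFACE = 0x04      # standard interface descriptor type
USB_CLASS_AUDIO = 0x01       # bInterfaceClass value for the audio class

# Audio class subclass codes (bInterfaceSubClass)
AUDIO_SUBCLASSES = {
    0x01: "AudioControl",
    0x02: "AudioStreaming",
    0x03: "MIDIStreaming",
}

# bInterfaceProtocol distinguishes the two editions of the audio class
UAC_VERSIONS = {0x00: "UAC1", 0x20: "UAC2"}


def describe_interface(raw):
    """Decode a 9-byte interface descriptor; return a dict, or None if malformed."""
    if len(raw) != 9:
        return None
    (length, dtype, number, alt_setting, num_endpoints,
     iface_class, subclass, protocol, _string_index) = struct.unpack("9B", raw)
    if length != 9 or dtype != USB_DT_INTERFACE:
        return None
    is_audio = iface_class == USB_CLASS_AUDIO
    return {
        "interface": number,
        "alt_setting": alt_setting,
        "endpoints": num_endpoints,
        "is_audio": is_audio,
        "subclass": AUDIO_SUBCLASSES.get(subclass, "unknown") if is_audio else None,
        "uac_version": UAC_VERSIONS.get(protocol, "unknown") if is_audio else None,
    }


# Hypothetical descriptor: interface 1, alternate setting 1, one endpoint,
# audio class, AudioStreaming subclass, class 1 protocol.
example = bytes([9, 0x04, 1, 1, 1, 0x01, 0x02, 0x00, 0])
info = describe_interface(example)
print(info["is_audio"], info["subclass"], info["uac_version"])
```

<p>
A class compliant host needs only these standard fields, not vendor-specific
knowledge, to decide that the function speaks the audio class protocol.
</p>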

<h3 id="otherClasses">Comparison with other classes</h3>

<p>
USB includes many other device classes, some of which may be confused
with the audio class. The
<a href="http://en.wikipedia.org/wiki/USB_mass_storage_device_class">mass storage class</a>
(MSC) is used for
sector-oriented access to media, while
<a href="http://en.wikipedia.org/wiki/Media_Transfer_Protocol">Media Transfer Protocol</a>
(MTP) is for full file access to media.
Both MSC and MTP may be used for transferring audio files,
but only USB audio class is suitable for real-time streaming.
</p>

<h3 id="audioTerminals">Audio terminals</h3>

<p>
The terminals of an audio peripheral are typically analog.
The analog signal presented at the peripheral's input terminal is converted to digital by an
<a href="http://en.wikipedia.org/wiki/Analog-to-digital_converter">analog-to-digital converter</a>
(ADC),
and is carried over USB protocol to be consumed by
the host. The ADC is a data <i>source</i>
for the host. Similarly, the host sends a
digital audio signal over USB protocol to the peripheral, where a
<a href="http://en.wikipedia.org/wiki/Digital-to-analog_converter">digital-to-analog converter</a>
(DAC)
converts it and presents the result at an analog output terminal.
The DAC is a <i>sink</i> for the host.
</p>

<h3 id="channels">Channels</h3>

<p>
A peripheral with an audio function can include a source terminal, sink terminal, or both.
Each direction may have one channel (<i>mono</i>), two channels
(<i>stereo</i>), or more.
Peripherals with more than two channels are called <i>multichannel</i>.
It is common to interpret a stereo stream as consisting of
<i>left</i> and <i>right</i> channels, and by extension to interpret a multichannel stream as having
spatial locations corresponding to each channel. However, it is also quite appropriate
(more so for USB audio than for
<a href="http://en.wikipedia.org/wiki/HDMI">HDMI</a>)
to not assign any particular
standard spatial meaning to each channel. In this case, it is up to the
application and user to define how each channel is used.
For example, a four-channel USB input stream might have the first three
channels attached to various microphones within a room, and the final
channel receiving input from an AM radio.
</p>

<h3 id="isochronous">Isochronous transfer mode</h3>

<p>
USB audio uses isochronous transfer mode for its real-time characteristics,
at the expense of error recovery.
In isochronous mode, bandwidth is guaranteed, and data transmission
errors are detected using a cyclic redundancy check (CRC). But there is
no packet acknowledgement or re-transmission in the event of error.
</p>

<p>
Isochronous transmissions occur each Start Of Frame (SOF) period.
The SOF period is one millisecond for full-speed, and 125 microseconds for
high-speed. Each full-speed frame carries up to 1023 bytes of payload,
and a high-speed frame carries up to 1024 bytes. Putting these together,
we calculate the maximum transfer rate as 1,023,000 bytes per second for
full-speed, or 8,192,000 bytes per second for high-speed.
This sets a theoretical upper limit on the combined audio
sample rate, channel count, and bit depth. The practical limit is lower.
</p>
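
<p>
To make the arithmetic concrete, the Python sketch below compares the raw PCM
data rate of a stream against the theoretical limits quoted above. It uses only
the figures from this section; the function names are hypothetical, and real
devices have additional overhead that this sketch ignores.
</p>

```python
# Maximum isochronous payload per second, from the figures quoted above.
MAX_BYTES_PER_SECOND = {
    "full-speed": 1023 * 1000,   # 1023 bytes every 1 ms
    "high-speed": 1024 * 8000,   # 1024 bytes every 125 microseconds
}


def pcm_bytes_per_second(sample_rate_hz, channels, bytes_per_sample):
    """Raw PCM data rate for one direction of an audio stream."""
    return sample_rate_hz * channels * bytes_per_sample


def fits(speed, sample_rate_hz, channels, bytes_per_sample):
    """True if the stream is within the theoretical isochronous limit."""
    needed = pcm_bytes_per_second(sample_rate_hz, channels, bytes_per_sample)
    return MAX_BYTES_PER_SECOND[speed] >= needed


# 44.1 kHz, stereo, 16-bit: 176,400 bytes per second
print(pcm_bytes_per_second(44100, 2, 2))
# 48 kHz stereo 16-bit fits comfortably within full-speed
print(fits("full-speed", 48000, 2, 2))
# 96 kHz, 8 channels, 32-bit exceeds full-speed but not high-speed
print(fits("full-speed", 96000, 8, 4), fits("high-speed", 96000, 8, 4))
```

<p>
The product of sample rate, channel count, and sample size is what counts
against the limit, so a multichannel high-resolution stream can require
high-speed even when a stereo stream does not.
</p>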

<p>
Within isochronous mode, there are three sub-modes:
</p>

<ul>
<li>Adaptive</li>
<li>Asynchronous</li>
<li>Synchronous</li>
</ul>

<p>
In adaptive sub-mode, the peripheral sink or source adapts to a potentially varying sample rate
of the host.
</p>

<p>
In asynchronous (also called implicit feedback) sub-mode,
the sink or source determines the sample rate, and the host accommodates.
The primary theoretical advantage of asynchronous sub-mode is that the source
or sink USB clock is physically and electrically closer to (and indeed may
be the same as, or derived from) the clock that drives the DAC or ADC.
This proximity means that asynchronous sub-mode should be less susceptible
to clock jitter. In addition, the clock used by the DAC or ADC may be
designed for higher accuracy and lower drift than the host clock.
</p>

<p>
In synchronous sub-mode, a fixed number of bytes is transferred each SOF period.
The audio sample rate is effectively derived from the USB clock.
Synchronous sub-mode is not commonly used with audio because both
host and peripheral are at the mercy of the USB clock.
</p>

<p>
The table below summarizes the isochronous sub-modes:
</p>

<table>
<tr>
  <th>Sub-mode</th>
  <th>Byte count<br />per packet</th>
  <th>Sample rate<br />determined by</th>
  <th>Used for audio</th>
</tr>
<tr>
  <td>adaptive</td>
  <td>variable</td>
  <td>host</td>
  <td>yes</td>
</tr>
<tr>
  <td>asynchronous</td>
  <td>variable</td>
  <td>peripheral</td>
  <td>yes</td>
</tr>
<tr>
  <td>synchronous</td>
  <td>fixed</td>
  <td>USB clock</td>
  <td>no</td>
</tr>
</table>
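
<p>
A peripheral advertises the sub-mode of each isochronous endpoint in the
<code>bmAttributes</code> field of its standard endpoint descriptor. The Python
sketch below decodes that field; the function name and the sample values are
hypothetical, and the sketch ignores the remaining bits of the field.
</p>

```python
# Bits 1..0 of bmAttributes: transfer type
TRANSFER_TYPES = ("control", "isochronous", "bulk", "interrupt")
# Bits 3..2 of bmAttributes: synchronization type (isochronous endpoints only)
SYNC_TYPES = ("none", "asynchronous", "adaptive", "synchronous")


def decode_bm_attributes(bm_attributes):
    """Return (transfer type, synchronization type) for an endpoint."""
    transfer = TRANSFER_TYPES[bm_attributes & 0x03]
    sync = SYNC_TYPES[(bm_attributes >> 2) & 0x03]
    if transfer != "isochronous":
        sync = None   # the sub-mode is meaningful only for isochronous endpoints
    return transfer, sync


print(decode_bm_attributes(0x05))   # isochronous, asynchronous
print(decode_bm_attributes(0x09))   # isochronous, adaptive
print(decode_bm_attributes(0x02))   # bulk, no sub-mode
```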

<p>
In practice, the sub-mode does of course matter, but other factors
should also be considered.
</p>

<h2 id="androidSupport">Android support for USB audio class</h2>

<h3 id="developmentAudio">Development mode</h3>

<p>
USB audio is not supported in development mode.
</p>

<h3 id="hostAudio">Host mode</h3>

<p>
Android 5.0 (API level 21) and above supports a subset of USB audio class 1 (UAC1) features:
</p>

<ul>
<li>The Android device must act as host</li>
<li>The audio format must be PCM (interface type I)</li>
<li>The bit depth must be 16 bits, 24 bits, or 32 bits, where
24 bits of useful audio data are left-justified within the most significant
bits of the 32-bit word</li>
<li>The sample rate must be either 48, 44.1, 32, 24, 22.05, 16, 12, 11.025, or 8 kHz</li>
<li>The channel count must be 1 (mono) or 2 (stereo)</li>
</ul>
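
<p>
The constraints above can be sketched as a simple check, together with an
illustration of what "left-justified" means for 24-bit audio in a 32-bit word.
This is a minimal Python sketch; the names are hypothetical and it is not the
validation logic used by the Android framework.
</p>

```python
SUPPORTED_SAMPLE_RATES_HZ = {48000, 44100, 32000, 24000, 22050,
                             16000, 12000, 11025, 8000}
SUPPORTED_BIT_DEPTHS = {16, 24, 32}
SUPPORTED_CHANNEL_COUNTS = {1, 2}


def is_supported(sample_rate_hz, bit_depth, channels):
    """Check a PCM configuration against the subset listed above."""
    return (sample_rate_hz in SUPPORTED_SAMPLE_RATES_HZ
            and bit_depth in SUPPORTED_BIT_DEPTHS
            and channels in SUPPORTED_CHANNEL_COUNTS)


def left_justify_24_in_32(sample24):
    """Place a signed 24-bit sample in the top 24 bits of a 32-bit word."""
    if not -(2 ** 23) <= sample24 <= 2 ** 23 - 1:
        raise ValueError("sample does not fit in 24 bits")
    return sample24 * 256   # same as shifting left by 8; the low byte is zero


print(is_supported(44100, 16, 2))            # True
print(is_supported(96000, 24, 2))            # False: 96 kHz is not listed
print(hex(left_justify_24_in_32(0x7FFFFF)))  # 0x7fffff00
```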

<p>
Perusal of the Android framework source code may show additional code
beyond the minimum needed to support these features. But this code
has not been validated, so more advanced features are not yet claimed.
</p>

<h3 id="accessoryAudio">Accessory mode</h3>

<p>
Android 4.1 (API level 16) added limited support for audio playback to the host.
While in accessory mode, Android automatically routes its audio output to USB.
That is, the Android device serves as a data source to the host, for example a dock.
</p>

<p>
Accessory mode audio has these features:
</p>

<ul>
<li>
The Android device must be controlled by a knowledgeable host that
can first transition the Android device from development mode to accessory mode,
and then the host must transfer audio data from the appropriate endpoint.
Thus the Android device does not appear "driverless" to the host.
</li>
<li>The direction must be <i>input</i>, expressed relative to the host</li>
<li>The audio format must be 16-bit PCM</li>
<li>The sample rate must be 44.1 kHz</li>
<li>The channel count must be 2 (stereo)</li>
</ul>

<p>
Accessory mode audio has not been widely adopted,
and is not currently recommended for new designs.
</p>

<h2 id="applications">Applications of USB digital audio</h2>

<p>
As the name indicates, the USB digital audio signal is represented
by a <a href="http://en.wikipedia.org/wiki/Digital_data">digital</a> data stream
rather than the <a href="http://en.wikipedia.org/wiki/Analog_signal">analog</a>
signal used by the common TRS mini
<a href="http://en.wikipedia.org/wiki/Phone_connector_(audio)">headset connector</a>.
Eventually any digital signal must be converted to analog before it can be heard.
There are tradeoffs in choosing where to place that conversion.
</p>

<h3 id="comparison">A tale of two DACs</h3>

<p>
In the example diagram below, we compare two designs. First we have a
mobile device with Application Processor (AP), on-board DAC, amplifier,
and analog TRS connector attached to headphones. We also consider a
mobile device with USB connected to external USB DAC and amplifier,
also with headphones.
</p>

<img src="audio/images/dac.png" alt="DAC comparison">

<p>
Which design is better? The answer depends on your needs.
Each has advantages and disadvantages.
<b>Note:</b> this is an artificial comparison, since
a real Android device would probably have both options available.
</p>

<p>
The first design A is simpler, less expensive, uses less power,
and will be a more reliable design assuming otherwise equally reliable components.
However, there are usually audio quality tradeoffs vs. other requirements.
For example, if this is a mass-market device, it may be designed to fit
the needs of the general consumer, not for the audiophile.
</p>

<p>
In the second design, the external audio peripheral C can be designed for
higher audio quality and greater power output without impacting the cost of
the basic mass market Android device B. Yes, it is a more expensive design,
but the cost is absorbed only by those who want it.
</p>

<p>
Mobile devices are notorious for having high-density
circuit boards, which can result in more opportunities for
<a href="http://en.wikipedia.org/wiki/Crosstalk_(electronics)">crosstalk</a>
that degrades adjacent analog signals. Digital communication is less susceptible to
<a href="http://en.wikipedia.org/wiki/Noise_(electronics)">noise</a>,
so moving the DAC from the Android device A to an external circuit board
C allows the final analog stages to be physically and electrically
isolated from the dense and noisy circuit board, resulting in higher fidelity audio.
</p>

<p>
On the other hand,
the second design is more complex, and with added complexity come more
opportunities for things to fail. There is also additional latency
from the USB controllers.
</p>

<h3 id="applications">Applications</h3>

<p>
Typical USB host mode audio applications include:
</p>

<ul>
<li>music listening</li>
<li>telephony</li>
<li>instant messaging and voice chat</li>
<li>recording</li>
</ul>

<p>
For all of these applications, Android detects a compatible USB digital
audio peripheral, and automatically routes audio playback and capture
appropriately, based on the audio policy rules.
Stereo content is played on the first two channels of the peripheral.
</p>

<p>
There are no APIs specific to USB digital audio.
For advanced usage, the automatic routing may interfere with applications
that are USB-aware. For such applications, disable automatic routing
via the corresponding control in the Media section of
<a href="http://developer.android.com/tools/index.html">Settings / Developer Options</a>.
</p>

<h2 id="compatibility">Implementing USB audio</h2>

<h3 id="recommendationsPeripheral">Recommendations for audio peripheral vendors</h3>

<p>
In order to inter-operate with Android devices, audio peripheral vendors should:
</p>

<ul>
<li>design for audio class compliance;
currently Android targets class 1, but it is wise to plan for class 2</li>
<li>avoid <a href="http://en.wiktionary.org/wiki/quirk">quirks</a></li>
<li>test for inter-operability with reference and popular Android devices</li>
<li>clearly document supported features, audio class compliance, power requirements, etc.
so that consumers can make informed decisions</li>
</ul>

<h3 id="recommendationsAndroid">Recommendations for Android device OEMs and SoC vendors</h3>

<p>
In order to support USB digital audio, device OEMs and SoC vendors should:
</p>

<ul>
<li>enable all kernel features needed: USB host mode, USB audio, isochronous transfer mode</li>
<li>keep up-to-date with recent kernel releases and patches;
despite the noble goal of class compliance, there are extant audio peripherals
with <a href="http://en.wiktionary.org/wiki/quirk">quirks</a>,
and recent kernels have workarounds for such quirks
</li>
<li>enable USB audio policy as described below</li>
<li>test for inter-operability with common USB audio peripherals</li>
</ul>

<h3 id="enable">How to enable USB audio policy</h3>

<p>
To enable USB audio, add an entry to the
audio policy configuration file. This is typically
located here:
<pre>device/oem/codename/audio_policy.conf</pre>
The pathname component "oem" should be replaced by the name
of the OEM who manufactures the Android device,
and "codename" should be replaced by the device code name.
</p>

<p>
An example entry is shown here:
</p>

<pre>
audio_hw_modules {
  ...
  usb {
    outputs {
      usb_accessory {
        sampling_rates 44100
        channel_masks AUDIO_CHANNEL_OUT_STEREO
        formats AUDIO_FORMAT_PCM_16_BIT
        devices AUDIO_DEVICE_OUT_USB_ACCESSORY
      }
      usb_device {
        sampling_rates dynamic
        channel_masks dynamic
        formats dynamic
        devices AUDIO_DEVICE_OUT_USB_DEVICE
      }
    }
    inputs {
      usb_device {
        sampling_rates dynamic
        channel_masks AUDIO_CHANNEL_IN_STEREO
        formats AUDIO_FORMAT_PCM_16_BIT
        devices AUDIO_DEVICE_IN_USB_DEVICE
      }
    }
  }
  ...
}
</pre>
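
<p>
When reviewing such a file, it can help to load it into a nested structure and
confirm that the expected USB entries are present. The Python sketch below is a
hypothetical helper for that purpose, not Android's own parser. It assumes one
statement per line, as in the example entry, and it skips the "..." placeholder
lines.
</p>

```python
def parse_policy(text):
    """Parse nested 'name { ... }' blocks and 'key value' lines into dicts."""
    root = {}
    stack = [root]
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "...":
            continue                      # skip blank lines and elided content
        if line.endswith("{"):
            child = {}
            stack[-1][line[:-1].strip()] = child
            stack.append(child)           # descend into the new block
        elif line == "}":
            stack.pop()                   # return to the enclosing block
        else:
            key, _, value = line.partition(" ")
            stack[-1][key] = value.strip()
    return root


# Abbreviated sample, taken from the example entry above.
SAMPLE = """
audio_hw_modules {
  usb {
    outputs {
      usb_device {
        sampling_rates dynamic
        devices AUDIO_DEVICE_OUT_USB_DEVICE
      }
    }
  }
}
"""

policy = parse_policy(SAMPLE)
usb_out = policy["audio_hw_modules"]["usb"]["outputs"]["usb_device"]
print(usb_out["sampling_rates"], usb_out["devices"])
```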

<h3 id="sourceCode">Source code</h3>

<p>
The audio Hardware Abstraction Layer (HAL)
implementation for USB audio is located here:
<pre>hardware/libhardware/modules/usbaudio/</pre>
The USB audio HAL relies heavily on
<i>tinyalsa</i>, described at <a href="audio_terminology.html">Audio Terminology</a>.
Though USB audio relies on isochronous transfers,
this is abstracted away by the ALSA implementation.
So the USB audio HAL and tinyalsa do not need to concern
themselves with this part of USB protocol.
</p>