\section{Built-in Module \sectcode{audioop}}
\label{module-audioop}
\bimodindex{audioop}

The \code{audioop} module contains some useful operations on sound fragments.
It operates on sound fragments consisting of signed integer samples
8, 16 or 32 bits wide, stored in Python strings. This is the same
format as used by the \code{al} and \code{sunaudiodev} modules. All
scalar items are integers, unless specified otherwise.

A few of the more complicated operations only take 16-bit samples;
otherwise the sample size (in bytes) is always a parameter of the operation.
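
As a rough illustration of the fragment format described above, the
following sketch packs signed integers into a fragment and unpacks them
again with the standard \code{struct} module. The two helper functions
are hypothetical (they are not part of \code{audioop}), and the sketch
assumes that samples are stored in native byte order:
\bcode\begin{verbatim}
import struct

# Assumed mapping from sample width in bytes to a struct format code.
CODES = {1: 'b', 2: 'h', 4: 'i'}

def make_fragment(samples, width):
    # Pack signed integers into a fragment of width-byte samples.
    fmt = '=%d%s' % (len(samples), CODES[width])
    return struct.pack(fmt, *samples)

def read_fragment(fragment, width):
    # Unpack a fragment back into a list of signed integers.
    count = len(fragment) // width
    fmt = '=%d%s' % (count, CODES[width])
    return list(struct.unpack(fmt, fragment))
\end{verbatim}\ecode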

The module defines the following variables and functions:

\renewcommand{\indexsubitem}{(in module audioop)}
\begin{excdesc}{error}
This exception is raised on all errors, such as an unknown number of
bytes per sample.
\end{excdesc}

\begin{funcdesc}{add}{fragment1\, fragment2\, width}
Return a fragment which is the sum of the two fragments passed as
parameters. \var{width} is the sample width in bytes, either
\code{1}, \code{2} or \code{4}. Both fragments should have the same
length.
\end{funcdesc}

\begin{funcdesc}{adpcm2lin}{adpcmfragment\, width\, state}
Decode an Intel/DVI ADPCM coded fragment to a linear fragment. See
the description of \code{lin2adpcm} for details on ADPCM coding.
Return a tuple \code{(\var{sample}, \var{newstate})} where the sample
has the width specified in \var{width}.
\end{funcdesc}

\begin{funcdesc}{adpcm32lin}{adpcmfragment\, width\, state}
Decode an alternative 3-bit ADPCM code. See \code{lin2adpcm3} for
details.
\end{funcdesc}

\begin{funcdesc}{avg}{fragment\, width}
Return the average over all samples in the fragment.
\end{funcdesc}

\begin{funcdesc}{avgpp}{fragment\, width}
Return the average peak-peak value over all samples in the fragment.
No filtering is done, so the usefulness of this routine is
questionable.
\end{funcdesc}

\begin{funcdesc}{bias}{fragment\, width\, bias}
Return a fragment that is the original fragment with a bias added to
each sample.
\end{funcdesc}

\begin{funcdesc}{cross}{fragment\, width}
Return the number of zero crossings in the fragment passed as an
argument.
\end{funcdesc}

\begin{funcdesc}{findfactor}{fragment\, reference}
Return a factor \var{F} such that
\code{rms(add(fragment, mul(reference, -F)))} is minimal, i.e.,
return the factor with which you should multiply \var{reference} to
make it match as well as possible to \var{fragment}. The fragments
should both contain 2-byte samples.

The time taken by this routine is proportional to \code{len(fragment)}.
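
Minimizing the rms of the difference is a least-squares problem, and
its solution is the sum of the products of corresponding samples
divided by the sum of the squared reference samples. The following
pure Python sketch illustrates that computation; it is not the
module's own code, the function name is hypothetical, and it assumes
native byte order and a reference that is not all zeros:
\bcode\begin{verbatim}
import struct

def findfactor_sketch(fragment, reference):
    n = len(fragment) // 2
    if len(reference) // 2 != n:
        raise ValueError("fragments must have the same length")
    f = struct.unpack('=%dh' % n, fragment)
    r = struct.unpack('=%dh' % n, reference)
    # Least-squares factor: sum(f*r) / sum(r*r), one pass over the data.
    sum_fr = sum(a * b for a, b in zip(f, r))
    sum_rr = sum(b * b for b in r)
    return sum_fr / float(sum_rr)
\end{verbatim}\ecode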
\end{funcdesc}

\begin{funcdesc}{findfit}{fragment\, reference}
Try to match \var{reference} as well as possible to a portion of
\var{fragment} (which should be the longer fragment). This is
(conceptually) done by taking slices out of \var{fragment}, using
\code{findfactor} to compute the best match, and minimizing the
result. The fragments should both contain 2-byte samples. Return a
tuple \code{(\var{offset}, \var{factor})} where \var{offset} is the
(integer) offset into \var{fragment} where the optimal match started
and \var{factor} is the (floating-point) factor as per
\code{findfactor}.
\end{funcdesc}

\begin{funcdesc}{findmax}{fragment\, length}
Search \var{fragment} for a slice of length \var{length} samples (not
bytes!)\ with maximum energy, i.e., return \var{i} for which
\code{rms(fragment[i*2:(i+length)*2])} is maximal. The fragment
should contain 2-byte samples.

The routine takes time proportional to \code{len(fragment)}.
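
For a fixed slice length, maximizing the rms is the same as maximizing
the sum of the squared samples, and that sum can be updated in constant
time when the slice moves by one sample. The following sketch shows one
way to obtain the linear running time this way; it is an illustration
under these assumptions (native byte order, hypothetical function
name), not the module's own code:
\bcode\begin{verbatim}
import struct

def findmax_sketch(fragment, length):
    n = len(fragment) // 2
    if length < 1 or length > n:
        raise ValueError("length must be between 1 and the sample count")
    s = struct.unpack('=%dh' % n, fragment)
    # Energy (sum of squares) of the first window.
    energy = sum(x * x for x in s[:length])
    best_energy, best_i = energy, 0
    for i in range(1, n - length + 1):
        # Slide the window: add the entering sample, drop the leaving one.
        energy += s[i + length - 1] ** 2 - s[i - 1] ** 2
        if energy > best_energy:
            best_energy, best_i = energy, i
    return best_i
\end{verbatim}\ecode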
\end{funcdesc}

\begin{funcdesc}{getsample}{fragment\, width\, index}
Return the value of sample \var{index} from the fragment.
\end{funcdesc}

\begin{funcdesc}{lin2lin}{fragment\, width\, newwidth}
Convert samples between 1-, 2- and 4-byte formats.
\end{funcdesc}

\begin{funcdesc}{lin2adpcm}{fragment\, width\, state}
Convert samples to 4-bit Intel/DVI ADPCM encoding. ADPCM coding is an
adaptive coding scheme, whereby each 4-bit number is the difference
between one sample and the next, divided by a (varying) step. The
Intel/DVI ADPCM algorithm has been selected for use by the IMA, so it
may well become a standard.

The \var{state} argument is a tuple containing the state of the coder. The coder
returns a tuple \code{(\var{adpcmfrag}, \var{newstate})}, and the
\var{newstate} should be passed to the next call of \code{lin2adpcm}. In the
initial call \code{None} can be passed as the state. \var{adpcmfrag}
is the ADPCM coded fragment, packed with two 4-bit values per byte.
\end{funcdesc}

\begin{funcdesc}{lin2adpcm3}{fragment\, width\, state}
This is an alternative ADPCM coder that uses only 3 bits per sample.
It is not compatible with the Intel/DVI ADPCM coder and its output is
not packed (due to laziness on the side of the author). Its use is
discouraged.
\end{funcdesc}

\begin{funcdesc}{lin2ulaw}{fragment\, width}
Convert samples in the audio fragment to U-LAW encoding and return
this as a Python string. U-LAW is an audio encoding format whereby
you get a dynamic range of about 14 bits using only 8-bit samples. It
is used by the Sun audio hardware, among others.
\end{funcdesc}

\begin{funcdesc}{minmax}{fragment\, width}
Return a tuple consisting of the minimum and maximum values of all
samples in the sound fragment.
\end{funcdesc}

\begin{funcdesc}{max}{fragment\, width}
Return the maximum of the {\em absolute value} of all samples in a
fragment.
\end{funcdesc}

\begin{funcdesc}{maxpp}{fragment\, width}
Return the maximum peak-peak value in the sound fragment.
\end{funcdesc}

\begin{funcdesc}{mul}{fragment\, width\, factor}
Return a fragment that has all samples in the original fragment
multiplied by the floating-point value \var{factor}. Overflow is
silently ignored.
\end{funcdesc}

\begin{funcdesc}{ratecv}{fragment\, width\, nchannels\, inrate\, outrate\, state\optional{\, weightA\, weightB}}
Convert the frame rate of the input fragment.

The \var{state} argument is a tuple containing the state of the converter. The
converter returns a tuple \code{(\var{newfragment}, \var{newstate})},
and \var{newstate} should be passed to the next call of \code{ratecv}.

The \var{weightA} and \var{weightB} arguments are parameters for a
simple digital filter and default to 1 and 0, respectively.
\end{funcdesc}

\begin{funcdesc}{reverse}{fragment\, width}
Reverse the samples in a fragment and return the modified fragment.
\end{funcdesc}

\begin{funcdesc}{rms}{fragment\, width}
Return the root-mean-square of the fragment, i.e.
\iftexi
the square root of the sum of all squared sample values
divided by the number of samples.
\else
% in eqn: sqrt { sum S sub i sup 2 over n }
\begin{displaymath}
\catcode`_=8
\sqrt{\frac{\sum{{S_{i}}^{2}}}{n}}
\end{displaymath}
\fi
This is a measure of the power in an audio signal.
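
The formula can be written out in pure Python as follows. This is only
a sketch of the definition: the function name is hypothetical, native
byte order is assumed, and the module itself may round its result
differently:
\bcode\begin{verbatim}
import math
import struct

def rms_sketch(fragment, width):
    # Assumed struct codes for 1-, 2- and 4-byte signed samples.
    code = {1: 'b', 2: 'h', 4: 'i'}[width]
    n = len(fragment) // width
    if n == 0:
        return 0.0
    samples = struct.unpack('=%d%s' % (n, code), fragment)
    # Square root of the mean of the squared samples.
    return math.sqrt(sum(s * s for s in samples) / float(n))
\end{verbatim}\ecode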
\end{funcdesc}

\begin{funcdesc}{tomono}{fragment\, width\, lfactor\, rfactor}
Convert a stereo fragment to a mono fragment. The left channel is
multiplied by \var{lfactor} and the right channel by \var{rfactor}
before adding the two channels to give a mono signal.
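
The following sketch illustrates this mixing for 2-byte samples. It is
not the module's own code: the function name is hypothetical, and it
assumes that the stereo fragment stores samples in alternating
left/right order, in native byte order, and that results outside the
16-bit range are clamped:
\bcode\begin{verbatim}
import struct

def tomono_sketch(fragment, lfactor, rfactor):
    n = len(fragment) // 2
    s = struct.unpack('=%dh' % n, fragment)
    mono = []
    # Assumption: even positions are left samples, odd ones are right.
    for left, right in zip(s[0::2], s[1::2]):
        value = int(left * lfactor + right * rfactor)
        # Assumption: clamp to the signed 16-bit range.
        mono.append(max(-32768, min(32767, value)))
    return struct.pack('=%dh' % len(mono), *mono)
\end{verbatim}\ecode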
\end{funcdesc}

\begin{funcdesc}{tostereo}{fragment\, width\, lfactor\, rfactor}
Generate a stereo fragment from a mono fragment. Each pair of samples
in the stereo fragment is computed from the mono sample, whereby left
channel samples are multiplied by \var{lfactor} and right channel
samples by \var{rfactor}.
\end{funcdesc}

\begin{funcdesc}{ulaw2lin}{fragment\, width}
Convert sound fragments in U-LAW encoding to linearly encoded sound
fragments. U-LAW encoding always uses 8-bit samples, so \var{width}
refers only to the sample width of the output fragment here.
\end{funcdesc}

Note that operations such as \code{mul} or \code{max} make no
distinction between mono and stereo fragments, i.e.\ all samples are
treated equally. If this is a problem, the stereo fragment should be split
into two mono fragments first and recombined later. Here is an example
of how to do that:
\bcode\begin{verbatim}
import audioop

def mul_stereo(sample, width, lfactor, rfactor):
    # Split the stereo fragment into its left and right channels.
    lsample = audioop.tomono(sample, width, 1, 0)
    rsample = audioop.tomono(sample, width, 0, 1)
    # Scale each channel by its own factor.
    lsample = audioop.mul(lsample, width, lfactor)
    rsample = audioop.mul(rsample, width, rfactor)
    # Expand each channel back to stereo and recombine.
    lsample = audioop.tostereo(lsample, width, 1, 0)
    rsample = audioop.tostereo(rsample, width, 0, 1)
    return audioop.add(lsample, rsample, width)
\end{verbatim}\ecode
%
If you use the ADPCM coder to build network packets and you want your
protocol to be stateless (i.e.\ to be able to tolerate packet loss),
you should not only transmit the data but also the state. Note that
you should send the {\em initial} state (the one you passed to
\code{lin2adpcm}) along to the decoder, not the final state (as returned by
the coder). If you want to use \code{struct} to store the state in
binary, you can code the first element (the predicted value) in 16 bits
and the second (the delta index) in 8.
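
A minimal sketch of such a packet layout is given below. The helper
names, the choice of network byte order and the treatment of a
\code{None} state as a pair of zeros are all assumptions made for
illustration; only the 16-bit and 8-bit field sizes come from the
description above:
\bcode\begin{verbatim}
import struct

# 16-bit predicted value, 8-bit delta index (network byte order assumed).
STATE_FORMAT = '!hb'

def pack_state(state):
    if state is None:
        state = (0, 0)    # assumption: None stands for a zeroed state
    valpred, index = state
    return struct.pack(STATE_FORMAT, valpred, index)

def make_packet(initial_state, adpcmfrag):
    # Prefix the coded data with the state the coder started from.
    return pack_state(initial_state) + adpcmfrag

def split_packet(packet):
    # Return (state, adpcmfrag), ready to be passed to the decoder.
    size = struct.calcsize(STATE_FORMAT)
    state = struct.unpack(STATE_FORMAT, packet[:size])
    return state, packet[size:]
\end{verbatim}\ecode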

The ADPCM coders have never been tried against other ADPCM coders,
only against themselves. It could well be that I misinterpreted the
standards, in which case they will not be interoperable with the
respective standards.

The \code{find...} routines might look a bit funny at first sight.
They are primarily meant to do echo cancellation. A reasonably
fast way to do this is to pick the most energetic piece of the output
sample, locate that in the input sample and subtract the whole output
sample from the input sample:
\bcode\begin{verbatim}
import audioop

def echocancel(outputdata, inputdata):
    # Find the most energetic 800-sample slice of the output.
    pos = audioop.findmax(outputdata, 800)    # one tenth second
    out_test = outputdata[pos*2:]
    in_test = inputdata[pos*2:]
    # Locate it in the input and get the matching factor.
    ipos, factor = audioop.findfit(in_test, out_test)
    # Optional (for better cancellation):
    # factor = audioop.findfactor(in_test[ipos*2:ipos*2+len(out_test)],
    #                             out_test)
    # Pad with silence so the scaled output lines up with the input.
    prefill = '\0'*(pos+ipos)*2
    postfill = '\0'*(len(inputdata)-len(prefill)-len(outputdata))
    outputdata = prefill + audioop.mul(outputdata,2,-factor) + postfill
    return audioop.add(inputdata, outputdata, 2)
\end{verbatim}\ecode