:mod:`audioop` --- Manipulate raw audio data
============================================

.. module:: audioop
   :synopsis: Manipulate raw audio data.


The :mod:`audioop` module contains some useful operations on sound fragments.
It operates on sound fragments consisting of signed integer samples 8, 16, 24
or 32 bits wide, stored in :term:`bytes-like object`\ s.  All scalar items are
integers, unless specified otherwise.

.. versionchanged:: 3.4
   Support for 24-bit samples was added.
   All functions now accept any :term:`bytes-like object`.
   String input now results in an immediate error.

.. index::
   single: Intel/DVI ADPCM
   single: ADPCM, Intel/DVI
   single: a-LAW
   single: u-LAW

This module provides support for a-LAW, u-LAW and Intel/DVI ADPCM encodings.

.. This para is mostly here to provide an excuse for the index entries...

A few of the more complicated operations only take 16-bit samples; otherwise the
sample size (in bytes) is always a parameter of the operation.

The module defines the following variables and functions:


.. exception:: error

   This exception is raised on all errors, such as an unknown number of bytes
   per sample.


.. function:: add(fragment1, fragment2, width)

   Return a fragment which is the sample-by-sample sum of the two fragments
   passed as parameters.  *width* is the sample width in bytes, either ``1``,
   ``2``, ``3`` or ``4``.  Both fragments should have the same length.  Samples
   are truncated in case of overflow.

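As an illustration of the overflow rule, the following pure-Python sketch adds
two fragments of 16-bit samples and clips each sum to the representable range.
It is not the module's implementation: the helper name ``add_16bit`` is
hypothetical, and the sketch assumes native-endian 2-byte samples and that
"truncated" means clipped to the smallest or largest sample value.

```python
from array import array


def add_16bit(fragment1: bytes, fragment2: bytes) -> bytes:
    """Add two fragments of signed 16-bit samples, clipping on overflow.

    Hypothetical helper for illustration only; assumes array type "h"
    is 2 bytes wide and that samples use the native byte order.
    """
    a = array("h", fragment1)
    b = array("h", fragment2)
    if len(a) != len(b):
        raise ValueError("fragments must have the same length")
    lo, hi = -0x8000, 0x7FFF  # range of a signed 16-bit sample
    # Sum in Python integers (no overflow), then clip into the sample range.
    summed = array("h", (max(lo, min(hi, x + y)) for x, y in zip(a, b)))
    return summed.tobytes()


loud = array("h", [30000, -30000, 100]).tobytes()
result = array("h", add_16bit(loud, loud))
print(result.tolist())  # [32767, -32768, 200]
```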

.. function:: adpcm2lin(adpcmfragment, width, state)

   Decode an Intel/DVI ADPCM coded fragment to a linear fragment.  See the
   description of :func:`lin2adpcm` for details on ADPCM coding. Return a tuple
   ``(sample, newstate)`` where the sample has the width specified in *width*.


.. function:: alaw2lin(fragment, width)

   Convert sound fragments in a-LAW encoding to linearly encoded sound fragments.
   a-LAW encoding always uses 8-bit samples, so *width* refers only to the sample
   width of the output fragment here.


.. function:: avg(fragment, width)

   Return the average over all samples in the fragment.


.. function:: avgpp(fragment, width)

   Return the average peak-peak value over all samples in the fragment. No
   filtering is done, so the usefulness of this routine is questionable.


.. function:: bias(fragment, width, bias)

   Return a fragment that is the original fragment with a bias added to each
   sample.  Samples wrap around in case of overflow.


.. function:: byteswap(fragment, width)

   "Byteswap" all samples in a fragment and return the modified fragment.
   Converts big-endian samples to little-endian and vice versa.

   .. versionadded:: 3.4

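The byte swap can be pictured as reversing the bytes of every sample while
leaving the order of the samples alone.  The sketch below does this in pure
Python; ``byteswap_sketch`` is a hypothetical name used for illustration and
is not part of the module.

```python
def byteswap_sketch(fragment: bytes, width: int) -> bytes:
    """Reverse the byte order inside each sample of *width* bytes.

    Hypothetical helper for illustration only.
    """
    if width not in (1, 2, 3, 4):
        raise ValueError("width should be 1, 2, 3 or 4")
    if len(fragment) % width:
        raise ValueError("not a whole number of samples")
    # Slice out one sample at a time and reverse its bytes.
    return b"".join(
        fragment[i:i + width][::-1] for i in range(0, len(fragment), width)
    )


swapped = byteswap_sketch(b"\x01\x02\x03\x04", 2)
print(swapped)  # b'\x02\x01\x04\x03'
```

Applying the swap twice returns the original fragment.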

.. function:: cross(fragment, width)

   Return the number of zero crossings in the fragment passed as an argument.


.. function:: findfactor(fragment, reference)

   Return a factor *F* such that ``rms(add(fragment, mul(reference, -F)))`` is
   minimal, i.e., return the factor with which you should multiply *reference* to
   make it match as well as possible to *fragment*.  The fragments should both
   contain 2-byte samples.

   The time taken by this routine is proportional to ``len(fragment)``.

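Minimizing the RMS of ``fragment - F * reference`` is an ordinary least-squares
problem: setting the derivative of ``sum((f_i - F * r_i) ** 2)`` with respect
to *F* to zero gives ``F = sum(f_i * r_i) / sum(r_i ** 2)``.  The sketch below
evaluates that closed form in pure Python.  It assumes native-endian 2-byte
samples; the name ``findfactor_sketch`` is hypothetical, and the result is only
expected to agree with the module up to rounding.

```python
from array import array


def findfactor_sketch(fragment: bytes, reference: bytes) -> float:
    """Least-squares factor F that best scales *reference* onto *fragment*.

    Hypothetical helper for illustration only; assumes 16-bit samples.
    """
    f = array("h", fragment)
    r = array("h", reference)
    if len(f) != len(r):
        raise ValueError("fragments must have the same length")
    energy = sum(x * x for x in r)  # sum(r_i ** 2)
    if energy == 0:
        raise ValueError("reference contains only silence")
    # Closed-form minimizer of sum((f_i - F * r_i) ** 2).
    return sum(x * y for x, y in zip(f, r)) / energy


reference = array("h", [100, -200, 300])
fragment = array("h", [2 * s for s in reference])
factor = findfactor_sketch(fragment.tobytes(), reference.tobytes())
print(factor)  # 2.0
```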

.. function:: findfit(fragment, reference)

   Try to match *reference* as well as possible to a portion of *fragment* (which
   should be the longer fragment).  This is (conceptually) done by taking slices
   out of *fragment*, using :func:`findfactor` to compute the best match, and
   minimizing the result.  The fragments should both contain 2-byte samples.
   Return a tuple ``(offset, factor)`` where *offset* is the (integer) offset into
   *fragment* where the optimal match started and *factor* is the (floating-point)
   factor as per :func:`findfactor`.


.. function:: findmax(fragment, length)

   Search *fragment* for a slice of length *length* samples (not bytes!) with
   maximum energy, i.e., return *i* for which ``rms(fragment[i*2:(i+length)*2])``
   is maximal.  The fragment should contain 2-byte samples.

   The routine takes time proportional to ``len(fragment)``.


.. function:: getsample(fragment, width, index)

   Return the value of sample *index* from the fragment.

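To make the layout of a fragment concrete, the sketch below reads one sample
out of a bytes-like object by hand.  Sample *index* occupies the *width* bytes
starting at byte offset ``index * width`` and is decoded as a signed integer.
The helper name ``getsample_sketch`` is hypothetical, and the use of the
platform's native byte order is an assumption of this sketch.

```python
import sys


def getsample_sketch(fragment: bytes, width: int, index: int) -> int:
    """Decode sample *index* from a fragment of *width*-byte signed samples.

    Hypothetical helper for illustration only; assumes native byte order.
    """
    if width not in (1, 2, 3, 4):
        raise ValueError("width should be 1, 2, 3 or 4")
    start = index * width
    if index < 0 or start + width > len(fragment):
        raise IndexError("sample index out of range")
    return int.from_bytes(fragment[start:start + width], sys.byteorder,
                          signed=True)


# Three 24-bit samples packed back to back.
values = (1000, -2, 70000)
fragment = b"".join(v.to_bytes(3, sys.byteorder, signed=True) for v in values)
print([getsample_sketch(fragment, 3, i) for i in range(3)])  # [1000, -2, 70000]
```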

.. function:: lin2adpcm(fragment, width, state)

   Convert samples to 4-bit Intel/DVI ADPCM encoding.  ADPCM coding is an adaptive
   coding scheme, whereby each 4-bit number is the difference between one sample
   and the next, divided by a (varying) step.  The Intel/DVI ADPCM algorithm has
   been selected for use by the IMA, so it may well become a standard.

   *state* is a tuple containing the state of the coder.  The coder returns a tuple
   ``(adpcmfrag, newstate)``, and the *newstate* should be passed to the next call
   of :func:`lin2adpcm`.  In the initial call, ``None`` can be passed as the state.
   *adpcmfrag* is the ADPCM coded fragment, packed two 4-bit values per byte.


.. function:: lin2alaw(fragment, width)

   Convert samples in the audio fragment to a-LAW encoding and return this as a
   bytes object.  a-LAW is an audio encoding format whereby you get a dynamic
   range of about 13 bits using only 8-bit samples.  It is used by the Sun audio
   hardware, among others.


.. function:: lin2lin(fragment, width, newwidth)

   Convert samples between 1-, 2-, 3- and 4-byte formats.

   .. note::

      In some audio formats, such as .WAV files, 16, 24 and 32 bit samples are
      signed, but 8 bit samples are unsigned.  So when converting to 8 bit wide
      samples for these formats, you need to also add 128 to the result::

         new_frames = audioop.lin2lin(frames, old_width, 1)
         new_frames = audioop.bias(new_frames, 1, 128)

      The same, in reverse, has to be applied when converting from 8 to 16, 24
      or 32 bit width samples.

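The arithmetic behind the note can be sketched in pure Python for the common
16-bit to unsigned 8-bit case.  The sketch assumes that narrowing keeps the most
significant byte of each sample and that the bias wraps around modulo 256; the
helper name ``signed16_to_unsigned8`` is hypothetical and native-endian input
is assumed.

```python
from array import array


def signed16_to_unsigned8(frames: bytes) -> bytes:
    """Narrow signed 16-bit samples to unsigned 8-bit samples.

    Hypothetical helper for illustration only.
    """
    samples = array("h", frames)
    # s >> 8 keeps the high byte as a signed value in -128..127;
    # adding 128 (modulo 256) shifts it into the unsigned range 0..255.
    return bytes(((s >> 8) + 128) & 0xFF for s in samples)


frames = array("h", [-32768, 0, 32767]).tobytes()
print(list(signed16_to_unsigned8(frames)))  # [0, 128, 255]
```

Silence (sample value ``0``) maps to ``128``, the midpoint of the unsigned
range.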

.. function:: lin2ulaw(fragment, width)

   Convert samples in the audio fragment to u-LAW encoding and return this as a
   bytes object.  u-LAW is an audio encoding format whereby you get a dynamic
   range of about 14 bits using only 8-bit samples.  It is used by the Sun audio
   hardware, among others.


.. function:: max(fragment, width)

   Return the maximum of the *absolute value* of all samples in a fragment.


.. function:: maxpp(fragment, width)

   Return the maximum peak-peak value in the sound fragment.


.. function:: minmax(fragment, width)

   Return a tuple consisting of the minimum and maximum values of all samples in
   the sound fragment.


.. function:: mul(fragment, width, factor)

   Return a fragment that has all samples in the original fragment multiplied by
   the floating-point value *factor*.  Samples are truncated in case of overflow.


.. function:: ratecv(fragment, width, nchannels, inrate, outrate, state[, weightA[, weightB]])

   Convert the frame rate of the input fragment.

   *state* is a tuple containing the state of the converter.  The converter returns
   a tuple ``(newfragment, newstate)``, and *newstate* should be passed to the next
   call of :func:`ratecv`.  The initial call should pass ``None`` as the state.

   The *weightA* and *weightB* arguments are parameters for a simple digital filter
   and default to ``1`` and ``0`` respectively.


.. function:: reverse(fragment, width)

   Reverse the samples in a fragment and return the modified fragment.


.. function:: rms(fragment, width)

   Return the root-mean-square of the fragment, i.e. ``sqrt(sum(S_i^2)/n)``.

   This is a measure of the power in an audio signal.

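The formula can be evaluated directly in pure Python, as in the sketch below.
It assumes native-endian 2-byte samples and returns a float, so the value may
differ from the module's result by rounding; ``rms_sketch`` is a hypothetical
name used for illustration.

```python
import math
from array import array


def rms_sketch(fragment: bytes) -> float:
    """Root-mean-square of a fragment of signed 16-bit samples.

    Hypothetical helper for illustration only.
    """
    samples = array("h", fragment)
    if not samples:
        return 0.0  # treat an empty fragment as silence
    # sqrt(sum(S_i^2) / n)
    return math.sqrt(sum(s * s for s in samples) / len(samples))


square_wave = array("h", [300, -300, 300, -300]).tobytes()
print(rms_sketch(square_wave))  # 300.0
```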

.. function:: tomono(fragment, width, lfactor, rfactor)

   Convert a stereo fragment to a mono fragment.  The left channel is multiplied by
   *lfactor* and the right channel by *rfactor* before adding the two channels to
   give a mono signal.

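The down-mix can be sketched in pure Python as follows.  The sketch assumes
that a stereo fragment interleaves its samples as left, right, left, right,
and that samples are native-endian and 2 bytes wide; clipping and rounding
details are assumptions and may differ from the module.  ``tomono_sketch`` is
a hypothetical name.

```python
import math
from array import array


def tomono_sketch(fragment: bytes, lfactor: float, rfactor: float) -> bytes:
    """Mix an interleaved 16-bit stereo fragment down to mono.

    Hypothetical helper for illustration only.
    """
    samples = array("h", fragment)
    if len(samples) % 2:
        raise ValueError("not a whole number of stereo frames")
    left, right = samples[0::2], samples[1::2]  # de-interleave the channels
    lo, hi = -0x8000, 0x7FFF
    mixed = array(
        "h",
        (max(lo, min(hi, math.floor(l * lfactor + r * rfactor)))
         for l, r in zip(left, right)),
    )
    return mixed.tobytes()


stereo = array("h", [100, 200, -50, 50]).tobytes()  # frames (L, R), (L, R)
mono = array("h", tomono_sketch(stereo, 0.5, 0.5))
print(mono.tolist())  # [150, 0]
```

Passing ``1, 0`` or ``0, 1`` as the factors extracts the left or right channel
on its own.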

.. function:: tostereo(fragment, width, lfactor, rfactor)

   Generate a stereo fragment from a mono fragment.  Each pair of samples in the
   stereo fragment is computed from the mono sample, whereby left channel samples
   are multiplied by *lfactor* and right channel samples by *rfactor*.


.. function:: ulaw2lin(fragment, width)

   Convert sound fragments in u-LAW encoding to linearly encoded sound fragments.
   u-LAW encoding always uses 8-bit samples, so *width* refers only to the sample
   width of the output fragment here.

Note that operations such as :func:`.mul` or :func:`.max` make no distinction
between mono and stereo fragments, i.e. all samples are treated equally.  If this
is a problem, the stereo fragment should be split into two mono fragments first
and recombined later.  Here is an example of how to do that::

   import audioop

   def mul_stereo(sample, width, lfactor, rfactor):
       # Split the stereo fragment into its left and right channels.
       lsample = audioop.tomono(sample, width, 1, 0)
       rsample = audioop.tomono(sample, width, 0, 1)
       # Scale each channel independently.
       lsample = audioop.mul(lsample, width, lfactor)
       rsample = audioop.mul(rsample, width, rfactor)
       # Rebuild two stereo fragments, each silent in the other channel.
       lsample = audioop.tostereo(lsample, width, 1, 0)
       rsample = audioop.tostereo(rsample, width, 0, 1)
       return audioop.add(lsample, rsample, width)

If you use the ADPCM coder to build network packets and you want your protocol
to be stateless (i.e. to be able to tolerate packet loss) you should not only
transmit the data but also the state.  Note that you should send the *initial*
state (the one you passed to :func:`lin2adpcm`) along to the decoder, not the
final state (as returned by the coder).  If you want to use
:class:`struct.Struct` to store the state in binary you can code the first
element (the predicted value) in 16 bits and the second (the delta index) in 8.

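One possible way to lay out such a state header is sketched below.  The names
``ADPCM_STATE``, ``pack_state`` and ``unpack_state`` are hypothetical, the
choice of network byte order is an assumption, and treating an initial state
of ``None`` as ``(0, 0)`` is an assumption about the coder's starting values.

```python
import struct

# Signed 16-bit predicted value followed by an unsigned 8-bit delta index,
# in network byte order with no padding (3 bytes in total).
ADPCM_STATE = struct.Struct("!hB")


def pack_state(state):
    """Serialize the state that was passed to the coder for this packet."""
    if state is None:
        state = (0, 0)  # assumed equivalent of the initial None state
    predicted, index = state
    return ADPCM_STATE.pack(predicted, index)


def unpack_state(header):
    """Recover the ``(predicted, index)`` tuple from the start of a packet."""
    return ADPCM_STATE.unpack(header[:ADPCM_STATE.size])


header = pack_state((-1234, 42))
print(len(header), unpack_state(header))  # 3 (-1234, 42)
```
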
The ADPCM coders have never been tried against other ADPCM coders, only against
themselves.  It could well be that I misinterpreted the standards in which case
they will not be interoperable with the respective standards.

The :func:`find\*` routines might look a bit funny at first sight. They are
primarily meant to do echo cancellation.  A reasonably fast way to do this is to
pick the most energetic piece of the output sample, locate that in the input
sample and subtract the whole output sample from the input sample::

   import audioop

   def echocancel(outputdata, inputdata):
       # Both fragments hold 2-byte samples, hence the factor 2 in byte offsets.
       pos = audioop.findmax(outputdata, 800)    # one tenth second
       out_test = outputdata[pos*2:]
       in_test = inputdata[pos*2:]
       ipos, factor = audioop.findfit(in_test, out_test)
       # Optional (for better cancellation):
       # factor = audioop.findfactor(in_test[ipos*2:ipos*2+len(out_test)],
       #              out_test)
       # Fragments are bytes-like objects, so pad with bytes, not str.
       prefill = b'\0'*(pos+ipos)*2
       postfill = b'\0'*(len(inputdata)-len(prefill)-len(outputdata))
       outputdata = prefill + audioop.mul(outputdata, 2, -factor) + postfill
       return audioop.add(inputdata, outputdata, 2)
