| Georg Brandl | 116aa62 | 2007-08-15 14:28:22 +0000 | [diff] [blame] | 1 |  | 
:mod:`audioop` --- Manipulate raw audio data
============================================

.. module:: audioop
   :synopsis: Manipulate raw audio data.


The :mod:`audioop` module contains some useful operations on sound fragments.
It operates on sound fragments consisting of signed integer samples 8, 16 or 32
bits wide, stored in bytes objects.  All scalar items are integers, unless
specified otherwise.

.. index::
   single: Intel/DVI ADPCM
   single: ADPCM, Intel/DVI
   single: a-LAW
   single: u-LAW

This module provides support for a-LAW, u-LAW and Intel/DVI ADPCM encodings.

.. This para is mostly here to provide an excuse for the index entries...

A few of the more complicated operations only take 16-bit samples; otherwise the
sample size (in bytes) is always a parameter of the operation.

The module defines the following variables and functions:


.. exception:: error

   This exception is raised on all errors, such as an unknown number of bytes per
   sample.


.. function:: add(fragment1, fragment2, width)

   Return a fragment which is the sum of the two fragments passed as parameters.
   *width* is the sample width in bytes, either ``1``, ``2`` or ``4``.  Both
   fragments should have the same length.

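   As a rough illustration of the arithmetic, the following pure-Python sketch
   adds two fragments of 2-byte samples element by element.  The helper name
   ``add16`` is hypothetical, and clamping each sum to the 16-bit range is an
   assumption of this sketch, not documented behaviour of :func:`add`::

      from array import array

      def add16(fragment1, fragment2):
          """Add two fragments of native-endian signed 16-bit samples."""
          a = array('h', fragment1)
          b = array('h', fragment2)
          if len(a) != len(b):
              raise ValueError("fragments differ in length")
          # Assumption: clamp each sum to the signed 16-bit range.
          total = [max(-32768, min(32767, x + y)) for x, y in zip(a, b)]
          return array('h', total).tobytes()

      left = array('h', [100, -200, 300]).tobytes()
      right = array('h', [1, 2, 3]).tobytes()
      result = array('h', add16(left, right)).tolist()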

.. function:: adpcm2lin(adpcmfragment, width, state)

   Decode an Intel/DVI ADPCM coded fragment to a linear fragment.  See the
   description of :func:`lin2adpcm` for details on ADPCM coding. Return a tuple
   ``(sample, newstate)`` where the sample has the width specified in *width*.


.. function:: alaw2lin(fragment, width)

   Convert sound fragments in a-LAW encoding to linearly encoded sound fragments.
   a-LAW encoding always uses 8-bit samples, so *width* refers only to the sample
   width of the output fragment here.


.. function:: avg(fragment, width)

   Return the average over all samples in the fragment.


.. function:: avgpp(fragment, width)

   Return the average peak-peak value over all samples in the fragment. No
   filtering is done, so the usefulness of this routine is questionable.


.. function:: bias(fragment, width, bias)

   Return a fragment that is the original fragment with a bias added to each
   sample.


.. function:: cross(fragment, width)

   Return the number of zero crossings in the fragment passed as an argument.


.. function:: findfactor(fragment, reference)

   Return a factor *F* such that ``rms(add(fragment, mul(reference, -F)))`` is
   minimal, i.e., return the factor with which you should multiply *reference* to
   make it match as well as possible to *fragment*.  The fragments should both
   contain 2-byte samples.

   The time taken by this routine is proportional to ``len(fragment)``.

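   Minimising the sum of ``(f_i - F*r_i)**2`` over *F* gives the least-squares
   solution ``F = sum(f_i*r_i) / sum(r_i**2)``.  The pure-Python sketch below
   evaluates that formula in a single pass; the helper name ``best_factor`` is
   hypothetical, and the sketch is not claimed to match the module's
   implementation::

      from array import array

      def best_factor(fragment, reference):
          """Least-squares scale factor between two 16-bit fragments."""
          f = array('h', fragment)
          r = array('h', reference)
          if len(f) != len(r):
              raise ValueError("fragments differ in length")
          energy = sum(x * x for x in r)
          if energy == 0:
              raise ValueError("reference contains only silence")
          # One pass over the samples, hence time linear in the length.
          return sum(x * y for x, y in zip(f, r)) / energy

      reference = array('h', [10, -20, 30]).tobytes()
      fragment = array('h', [30, -60, 90]).tobytes()
      factor = best_factor(fragment, reference)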

.. function:: findfit(fragment, reference)

   Try to match *reference* as well as possible to a portion of *fragment* (which
   should be the longer fragment).  This is (conceptually) done by taking slices
   out of *fragment*, using :func:`findfactor` to compute the best match, and
   minimizing the result.  The fragments should both contain 2-byte samples.
   Return a tuple ``(offset, factor)`` where *offset* is the (integer) offset into
   *fragment* where the optimal match started and *factor* is the (floating-point)
   factor as per :func:`findfactor`.


.. function:: findmax(fragment, length)

   Search *fragment* for a slice of length *length* samples (not bytes!) with
   maximum energy, i.e., return *i* for which ``rms(fragment[i*2:(i+length)*2])``
   is maximal.  The fragment should contain 2-byte samples.

   The routine takes time proportional to ``len(fragment)``.

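   One way to reach linear time is to keep a running sum of squares and update
   it as the window slides by one sample.  The sketch below illustrates that
   idea in pure Python; the helper name ``max_energy_offset`` is hypothetical,
   and returning the first maximum on ties is an assumption of the sketch::

      from array import array

      def max_energy_offset(fragment, length):
          """Return the sample offset of the most energetic window."""
          samples = array('h', fragment)
          if not 0 < length <= len(samples):
              raise ValueError("length out of range")
          energy = sum(s * s for s in samples[:length])
          best_energy, best_index = energy, 0
          for i in range(1, len(samples) - length + 1):
              # Add the sample entering the window, drop the one leaving it.
              energy += samples[i + length - 1] ** 2 - samples[i - 1] ** 2
              if energy > best_energy:
                  best_energy, best_index = energy, i
          return best_index

      data = array('h', [0, 1, 0, 9, 9, 0, 1]).tobytes()
      offset = max_energy_offset(data, 2)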

.. function:: getsample(fragment, width, index)

   Return the value of sample *index* from the fragment.


.. function:: lin2adpcm(fragment, width, state)

   Convert samples to 4-bit Intel/DVI ADPCM encoding.  ADPCM coding is an adaptive
   coding scheme, whereby each 4-bit number is the difference between one sample
   and the next, divided by a (varying) step.  The Intel/DVI ADPCM algorithm has
   been selected for use by the IMA, so it may well become a standard.

   *state* is a tuple containing the state of the coder.  The coder returns a tuple
   ``(adpcmfrag, newstate)``, and the *newstate* should be passed to the next call
   of :func:`lin2adpcm`.  In the initial call, ``None`` can be passed as the state.
   *adpcmfrag* is the ADPCM coded fragment, packed two 4-bit values per byte.


.. function:: lin2alaw(fragment, width)

   Convert samples in the audio fragment to a-LAW encoding and return this as a
   bytes object.  a-LAW is an audio encoding format whereby you get a dynamic
   range of about 13 bits using only 8-bit samples.  It is used by the Sun audio
   hardware, among others.


.. function:: lin2lin(fragment, width, newwidth)

   Convert samples between 1-, 2- and 4-byte formats.

   .. note::

      In some audio formats, such as .WAV files, 16- and 32-bit samples are
      signed, but 8-bit samples are unsigned.  So when converting to 8-bit wide
      samples for these formats, you need to also add 128 to the result::

         new_frames = audioop.lin2lin(frames, old_width, 1)
         new_frames = audioop.bias(new_frames, 1, 128)

      The same, in reverse, has to be applied when converting from 8- to 16- or
      32-bit wide samples.

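   At the byte level, adding 128 modulo 256 to an 8-bit sample is the same as
   flipping its top bit, so one transformation converts in both directions.
   The pure-Python sketch below shows this; the helper name
   ``flip_sign_convention`` is hypothetical, and the wrap-around arithmetic is
   an assumption of the sketch rather than a statement about :func:`bias`::

      def flip_sign_convention(data):
          """Toggle 8-bit samples between signed and unsigned form."""
          return bytes(b ^ 0x80 for b in data)

      unsigned = bytes([0, 128, 255])      # WAV style: 128 is silence
      signed = flip_sign_convention(unsigned)
      # Read the resulting bytes as two's-complement values.
      values = [b - 256 if b > 127 else b for b in signed]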

.. function:: lin2ulaw(fragment, width)

   Convert samples in the audio fragment to u-LAW encoding and return this as a
   bytes object.  u-LAW is an audio encoding format whereby you get a dynamic
   range of about 14 bits using only 8-bit samples.  It is used by the Sun audio
   hardware, among others.


.. function:: minmax(fragment, width)

   Return a tuple consisting of the minimum and maximum values of all samples in
   the sound fragment.


.. function:: max(fragment, width)

   Return the maximum of the *absolute value* of all samples in a fragment.


.. function:: maxpp(fragment, width)

   Return the maximum peak-peak value in the sound fragment.


.. function:: mul(fragment, width, factor)

   Return a fragment that has all samples in the original fragment multiplied by
   the floating-point value *factor*.  Overflow is silently ignored.


.. function:: ratecv(fragment, width, nchannels, inrate, outrate, state[, weightA[, weightB]])

   Convert the frame rate of the input fragment.

   *state* is a tuple containing the state of the converter.  The converter returns
   a tuple ``(newfragment, newstate)``, and *newstate* should be passed to the next
   call of :func:`ratecv`.  The initial call should pass ``None`` as the state.

   The *weightA* and *weightB* arguments are parameters for a simple digital filter
   and default to ``1`` and ``0`` respectively.


.. function:: reverse(fragment, width)

   Reverse the samples in a fragment and return the modified fragment.


.. function:: rms(fragment, width)

   Return the root-mean-square of the fragment, i.e. ``sqrt(sum(S_i^2)/n)``.

   This is a measure of the power in an audio signal.

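   The formula can be written out directly in pure Python, as in the sketch
   below for 2-byte samples.  The helper name ``rms16`` is hypothetical; the
   sketch returns a float and does not try to reproduce any rounding applied
   by :func:`rms`::

      import math
      from array import array

      def rms16(fragment):
          """Root-mean-square of native-endian signed 16-bit samples."""
          samples = array('h', fragment)
          if not samples:
              return 0.0
          return math.sqrt(sum(s * s for s in samples) / len(samples))

      square_wave = array('h', [3, -3, 3, -3]).tobytes()
      level = rms16(square_wave)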

.. function:: tomono(fragment, width, lfactor, rfactor)

   Convert a stereo fragment to a mono fragment.  The left channel is multiplied by
   *lfactor* and the right channel by *rfactor* before adding the two channels to
   give a mono signal.

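   A minimal pure-Python sketch of this mixing step for 2-byte samples follows.
   It assumes interleaved frames with the left sample first, truncates towards
   zero and clamps to the 16-bit range; these details, and the helper name
   ``tomono16``, are assumptions of the sketch::

      from array import array

      def tomono16(fragment, lfactor, rfactor):
          """Mix interleaved 16-bit stereo samples down to mono."""
          stereo = array('h', fragment)
          mono = array('h')
          # Even positions hold left samples, odd positions right samples.
          for left, right in zip(stereo[0::2], stereo[1::2]):
              value = int(left * lfactor + right * rfactor)
              mono.append(max(-32768, min(32767, value)))
          return mono.tobytes()

      frames = array('h', [100, 200, -50, 50]).tobytes()
      mixed = array('h', tomono16(frames, 0.5, 0.5)).tolist()
      left_only = array('h', tomono16(frames, 1, 0)).tolist()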

.. function:: tostereo(fragment, width, lfactor, rfactor)

   Generate a stereo fragment from a mono fragment.  Each pair of samples in the
   stereo fragment is computed from the mono sample, whereby left channel samples
   are multiplied by *lfactor* and right channel samples by *rfactor*.


.. function:: ulaw2lin(fragment, width)

   Convert sound fragments in u-LAW encoding to linearly encoded sound fragments.
   u-LAW encoding always uses 8-bit samples, so *width* refers only to the sample
   width of the output fragment here.

Note that operations such as :func:`mul` or :func:`max` make no distinction
between mono and stereo fragments, i.e. all samples are treated equally.  If this
is a problem, the stereo fragment should be split into two mono fragments first
and recombined later.  Here is an example of how to do that::

   import audioop

   def mul_stereo(sample, width, lfactor, rfactor):
       # Split the stereo fragment into its left and right channels.
       lsample = audioop.tomono(sample, width, 1, 0)
       rsample = audioop.tomono(sample, width, 0, 1)
       # Scale each channel by its own factor.
       lsample = audioop.mul(lsample, width, lfactor)
       rsample = audioop.mul(rsample, width, rfactor)
       # Expand each channel back to stereo and recombine them.
       lsample = audioop.tostereo(lsample, width, 1, 0)
       rsample = audioop.tostereo(rsample, width, 0, 1)
       return audioop.add(lsample, rsample, width)

If you use the ADPCM coder to build network packets and you want your protocol
to be stateless (i.e. to be able to tolerate packet loss), you should transmit
not only the data but also the state.  Note that you should send the *initial*
state (the one you passed to :func:`lin2adpcm`) along to the decoder, not the
final state (as returned by the coder).  If you want to use
:class:`struct.Struct` to store the state in binary, you can code the first
element (the predicted value) in 16 bits and the second (the delta index) in 8.

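The layout described above can be sketched with the :mod:`struct` module as
follows.  The helper names, the little-endian byte order and the treatment of
an initial ``None`` state as ``(0, 0)`` are assumptions of this sketch, not
part of the module::

   import struct

   # Signed 16-bit predicted value followed by an unsigned 8-bit index.
   STATE = struct.Struct('<hB')

   def pack_state(state):
       """Serialise an ADPCM coder state into three bytes."""
       if state is None:
           # Assumption: a fresh coder starts from (0, 0).
           state = (0, 0)
       predicted, index = state
       return STATE.pack(predicted, index)

   def unpack_state(data):
       """Recover the (predicted value, index) tuple from three bytes."""
       return STATE.unpack(data)

   header = pack_state((-1234, 57))
   restored = unpack_state(header)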
The ADPCM coders have never been tried against other ADPCM coders, only against
themselves.  It could well be that I misinterpreted the standards, in which case
they will not be interoperable with the respective standards.

The :func:`find\*` routines might look a bit funny at first sight. They are
primarily meant to do echo cancellation.  A reasonably fast way to do this is to
pick the most energetic piece of the output sample, locate that in the input
sample and subtract the whole output sample from the input sample::

   import audioop

   def echocancel(outputdata, inputdata):
       pos = audioop.findmax(outputdata, 800)    # one tenth second
       out_test = outputdata[pos*2:]
       in_test = inputdata[pos*2:]
       ipos, factor = audioop.findfit(in_test, out_test)
       # Optional (for better cancellation):
       # factor = audioop.findfactor(in_test[ipos*2:ipos*2+len(out_test)],
       #              out_test)
       # Pad the scaled output with silence so that it lines up with the input.
       prefill = b'\0'*(pos+ipos)*2
       postfill = b'\0'*(len(inputdata)-len(prefill)-len(outputdata))
       outputdata = prefill + audioop.mul(outputdata, 2, -factor) + postfill
       return audioop.add(inputdata, outputdata, 2)