CELT is a very low delay audio codec designed for high-quality communications.

Traditional full-bandwidth codecs such as Vorbis and AAC can offer high
quality but they require codec delays of hundreds of milliseconds, which
makes them unsuitable for real-time interactive applications like
teleconferencing. Speech-targeted codecs, such as Speex or G.722, have lower
delays of 20-40 ms, but their speech focus and limited sampling rates
restrict their quality, especially for music.

Additionally, the other mandatory components of a full network audio system
(audio interfaces, routers, jitter buffers) each add their own delay. On
lower-speed networks, serializing a packet onto the network cable takes
considerable time, and over long distances the speed of light imposes a
significant delay.

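To give a feel for the magnitudes involved, here is a rough sketch in Python.
It is not part of CELT. The function names, packet size, link speed, distance,
and the assumption that signals in fiber travel at about two thirds of the
speed of light in vacuum are all illustrative choices, not figures from this
document:

```python
SPEED_OF_LIGHT_M_S = 299_792_458  # speed of light in vacuum, metres per second


def serialization_delay_ms(packet_bytes, link_bps):
    """Time to clock one packet onto a link of the given bit rate."""
    return 1000.0 * packet_bytes * 8 / link_bps


def propagation_delay_ms(distance_km, velocity_factor=2 / 3):
    """One-way propagation time; velocity_factor is an assumed value for fiber."""
    speed_m_s = SPEED_OF_LIGHT_M_S * velocity_factor
    return 1000.0 * distance_km * 1000 / speed_m_s


if __name__ == "__main__":
    # Hypothetical 100-byte packet on a hypothetical 128 kbit/sec link.
    print(serialization_delay_ms(100, 128_000))  # 6.25
    # Hypothetical 5000 km path: roughly 25 ms before any other delay is added.
    print(propagation_delay_ms(5000))
```

Under these assumed numbers, propagation alone over 5000 km takes about 25 ms,
which is why the remaining parts of the chain have to be kept as short as
possible.
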
In teleconferencing it is important to keep delay low so that the participants
can communicate fluidly without talking on top of each other and so that their
own voices don't return after a round trip as an annoying echo.

For network music performance, research has shown that the total one-way delay
must be kept under 25 ms to avoid degrading the musicians' performance.

Since many of the sources of delay in a complete system are outside of the
user's control (such as the speed of light), it is often only possible to
reduce the total delay by reducing the codec delay.

Low delay has traditionally been considered a challenging area in audio codec
design, because as a codec is forced to work on the smaller chunks of audio
required for low delay, it has access to less redundancy and less perceptual
information which it can use to reduce the size of the transmitted audio.

CELT is designed to bridge the gap between "music" and "speech" codecs,
permitting new, very high quality teleconferencing applications, and to go
further by permitting latencies much lower than speech codecs normally
provide, enabling applications such as remote musical collaboration even over
long distances.

In keeping with the Xiph.Org mission, CELT is also designed to accomplish
this without copyright or patent encumbrance. Only by keeping the formats
that drive our Internet communication free and unencumbered can we maximize
innovation, collaboration, and interoperability. Fortunately, CELT is ahead
of the adoption curve in its target application space, so there should be
no reason for someone who needs what CELT provides to go with a proprietary
codec.

CELT has been tested on x86, x86_64, ARM, and the TI C55x DSPs, and should
be portable to any platform with a working C compiler and on the order of
100 MIPS of processing power.

The code is still at an early stage, so it may be broken from time to time, and
the bit-stream is not frozen yet, so it differs from one version to
another. Oh, and don't complain if it sets your house on fire.

Complaints and accolades can be directed to the CELT mailing list:
http://lists.xiph.org/mailman/listinfo/celt-dev/

To compile:
% ./configure
% make

For platforms without fast floating point support (such as ARM) use the
--enable-fixed argument to configure to build a fixed-point version of CELT.

There are Ogg-based encode/decode tools in tools/. These are quite similar to
the speexenc/speexdec tools. Use the --help option for details.

There is also a basic tool for testing the encoder and decoder called
"testcelt" located in libcelt/:

% testcelt <rate> <channels> <frame size> <bytes per packet> input.sw output.sw

where input.sw is a 16-bit (machine-endian) audio file sampled at a rate
between 32000 Hz and 96000 Hz. The output file is already decompressed.

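To try testcelt without a recording at hand, a test input can be synthesized.
The Python sketch below is not part of CELT. It assumes that a .sw file is
headerless (raw) signed 16-bit mono PCM in native byte order, which the
description above suggests but does not spell out. The function name, tone
frequency, duration, and amplitude are arbitrary choices for illustration:

```python
import array
import math


def write_sine_sw(path, rate_hz=44100, freq_hz=440.0, seconds=1.0, amplitude=0.5):
    """Write a mono sine tone as raw signed 16-bit samples in native byte order.

    Returns the number of samples written.
    """
    n = int(rate_hz * seconds)
    # Type code "h" is a signed short stored in the machine's own byte order.
    samples = array.array(
        "h",
        (
            int(round(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / rate_hz)))
            for i in range(n)
        ),
    )
    with open(path, "wb") as f:
        samples.tofile(f)
    return n


if __name__ == "__main__":
    write_sine_sw("input.sw")
```

The resulting file can then be passed to testcelt as input.sw, with the
<rate> argument matching the rate used to generate it.
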
For example, for a 44.1 kHz mono stream at ~64 kbit/sec and with 256 sample
frames:

% testcelt 44100 1 256 46 input.sw output.sw

The bitrate works out to 44100/256*46*8 = 63393.75 bits/sec.

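The same arithmetic applies to other settings. A minimal sketch in Python
(not part of CELT; the function names are made up for illustration) that
computes the bitrate for a given set of testcelt parameters and, in the other
direction, the nearest whole number of bytes per packet for a target bitrate:

```python
def bitrate_bps(rate_hz, frame_size, bytes_per_packet):
    """Bits per second when one packet is sent for every frame."""
    frames_per_second = rate_hz / frame_size
    return frames_per_second * bytes_per_packet * 8


def bytes_per_packet_for(target_bps, rate_hz, frame_size):
    """Nearest whole number of bytes per packet for a target bitrate."""
    frames_per_second = rate_hz / frame_size
    return round(target_bps / (8 * frames_per_second))


if __name__ == "__main__":
    print(bitrate_bps(44100, 256, 46))              # 63393.75
    print(bytes_per_packet_for(64000, 44100, 256))  # 46
```

Because the packet size is a whole number of bytes, the achievable bitrate
only approximates the target, which is why the example lands slightly under
64 kbit/sec.
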
All even frame sizes from 64 to 512 are currently supported, although
power-of-two sizes are recommended and most CELT development is done
using a size of 256. The delay imposed by CELT is 1.25x to 1.5x the
frame duration, depending on the frame size and some details of CELT's
internal operation. For 256 sample frames the delay is 1.5x, or 384
samples, so the total codec delay in the above example is 8.71 ms
(1000*384/44100).
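
The delay calculation can be sketched in Python as follows. This is not part
of CELT, and the function name is made up for illustration. The delay factor
is passed in by the caller because this document only gives it for 256 sample
frames (1.5x) and otherwise states a range of 1.25x to 1.5x:

```python
def codec_delay_ms(rate_hz, frame_size, delay_factor):
    """Codec delay in milliseconds for a frame size and delay factor."""
    delay_samples = frame_size * delay_factor
    return 1000.0 * delay_samples / rate_hz


if __name__ == "__main__":
    # 256 sample frames at 44.1 kHz with the 1.5x factor: 384 samples of delay.
    print(round(codec_delay_ms(44100, 256, 1.5), 2))  # 8.71
    # Same frame size and factor at 48 kHz.
    print(codec_delay_ms(48000, 256, 1.5))            # 8.0
```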