| .de PP |
| .LP |
| .. |
| .de pT |
| .IP \fB\\$1\fP |
| .. |
| .TL |
| CMIF video file format |
| .AU |
| Jack Jansen |
| (Version of 27-Feb-92) |
| .SH |
| Introduction |
| .PP |
| The CMIF video format was invented to allow various applications |
| to exchange video data. The format consists of |
| a header containing global information (like data format) |
| followed by a sequence of frames, each consisting of a header |
| followed by the actual frame data. |
| All information except pixel data is |
| encoded in ASCII. Pixel data is \fIalways\fP encoded in Silicon Graphics |
| order, which means that the first pixel in the frame is the lower left |
| pixel on the screen. |
| .PP |
| All ASCII data except the first line of the file |
| is in python format. This means that |
| outer parentheses can be omitted, and parentheses around a tuple with |
| one element can also be omitted. So, the lines |
| .IP |
| .ft C |
| .nf |
| ('grey',(4)) |
| ('grey',4) |
| 'grey',4 |
| .LP |
| have the same meaning. |
| To ease parsing in C programs, however, it is advisable to put |
| no parentheses around single items, and to put parentheses around |
| lists. So, the second format above is preferred. |
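| .PP |
| As an illustration only, the equivalence of the three spellings can be |
| checked with a short Python sketch. The helper name \fCparse_cmif_line\fP |
| is hypothetical and not part of any CMIF tool; it assumes that the |
| standard \fCast.literal_eval\fP function is an acceptable reader for these lines. |

```python
import ast

def parse_cmif_line(line):
    """Parse one ASCII line written as a Python literal into a tuple.

    Hypothetical helper: the name and the normalisation are assumptions.
    """
    value = ast.literal_eval(line.strip())
    # A bare single item (for example "40") is wrapped in a tuple so that
    # callers always receive the same type.
    if not isinstance(value, tuple):
        value = (value,)
    return value

for text in ("('grey',(4))", "('grey',4)", "'grey',4"):
    print(parse_cmif_line(text))   # each prints ('grey', 4)
```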
| .PP |
| The current version is version 3, but this document will also briefly |
| explain what the previous formats looked like. |
| .SH |
| Header |
| .PP |
| The header consists of three lines. The first line identifies the file |
| as a CMIF video file, and gives the version number. |
| It looks as follows: |
| .IP |
| .ft C |
| CMIF video 3.0 |
| .LP |
| All programs expect the layout to be exactly like this, so no |
| extra spaces, etc. should be added. |
| .PP |
| The second line specifies the data format. Its format is a python |
| tuple with two members. The first member is a string giving the format |
| type and the second is a tuple containing type-specific information. |
| The following formats are currently understood: |
| .pT rgb |
| The video data is 24 bit RGB packed into 32 bit words. |
| R is the least significant byte, then G and then B. The top byte is |
| unused. |
| .IP |
| There is no type-specific information, so the complete data format |
| line is |
| .IP |
| .ft C |
| ('rgb',()) |
| .pT grey |
| The video data is greyscale, at most 8 bits. Data is packed into |
| 8 bit bytes (in the low-order bits). The extra information is the |
| number of significant bits, so an example data format line is |
| .IP |
| .ft C |
| ('grey',(6)) |
| .pT yiq |
| The video data is in YIQ format. This is a format that has one luminance |
| component, Y, and two chrominance components, I and Q. The luminance and |
| chrominance components are encoded in \fItwo\fP pixel arrays: first an |
| array of 8-bit luminance values followed by an array of 16-bit chrominance |
| values. See the section on chrominance coding for details. |
| .IP |
| The type specific part contains the number of bits for Y, I and Q, |
| the chrominance packfactor and the colormap offset. So, a sample format |
| information line of |
| .IP |
| .ft C |
| ('yiq',(5,3,3,2,1024)) |
| .IP |
| means that the pictures have 5 bit Y values (in the luminance array), |
| 3 bits of I and Q each (in the chrominance array), chrominance data |
| is packed for 2x2 pixels, and the first colormap index used is 1024. |
| .pT hls |
| The video data is in HLS format. L is the luminance component, H and S |
| are the chrominance components. The data format and type specific information |
| are the same as for the yiq format. |
| .pT hsv |
| The video data is in HSV format. V is the luminance component, H and S |
| are the chrominance components. Again, data format and type specific |
| information are the same as for the yiq format. |
| .pT rgb8 |
| The video data is in 8 bit dithered rgb format. This is the format |
| used internally by the Indigo. Bits 0-2 are green, bits 3-4 are blue and |
| bits 5-7 are red. Because rgb8 is treated more-or-less like yiq format |
| internally, the type-specific information is the same, with zeroes for |
| the (unused) chrominance sizes: |
| .IP |
| .ft C |
| ('rgb8',(8,0,0,0,0)) |
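| .LP |
| The bit layouts given above for the rgb and rgb8 formats can be sketched |
| in Python as follows. The function names are hypothetical, and the sketch |
| assumes that each pixel has already been read from the file into an integer; |
| the byte order of the 32 bit words in the file is not covered here. |

```python
def unpack_rgb(word):
    """Split a 32 bit 'rgb' pixel; R is the least significant byte."""
    r = word & 0xFF
    g = (word >> 8) & 0xFF
    b = (word >> 16) & 0xFF
    return r, g, b              # the top byte is unused and ignored

def unpack_rgb8(byte):
    """Split an 8 bit 'rgb8' pixel: bits 0-2 green, 3-4 blue, 5-7 red."""
    g = byte & 0x07
    b = (byte >> 3) & 0x03
    r = (byte >> 5) & 0x07
    return r, g, b

print(unpack_rgb(0x00332211))    # (17, 34, 51)
print(unpack_rgb8(0b10111010))   # (5, 2, 3)
```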
| .PP |
| The third header line contains width and height of the video image, |
| in pixels, and the pack factor of the picture. For compatibility, RGB |
| images must have a pack factor of 0 (zero), and non-RGB images must |
| have a pack factor of at least 1. |
| The packfactor is the factor by which the original video signal was |
| reduced in each direction to obtain the stored pictures. In other words, if only every third pixel |
| of every third line is stored (so every 9 original pixels have one pixel in the |
| data) the packfactor is three. Width and height are the size of the |
| \fIoriginal\fP picture. |
| Viewers are expected to enlarge the picture so it is shown in the |
| original size. RGB videos cannot be packed. |
| So, a size line like |
| .IP |
| .ft C |
| 200,200,2 |
| .LP |
| means that this was a 200x200 picture that is stored as 100x100 pixels. |
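| .PP |
| A minimal sketch of a reader for the three header lines is given below. |
| The function \fCread_header\fP and the dictionary keys are hypothetical; |
| the sketch assumes a text stream, treats a pack factor of 0 as 1 when |
| computing the stored size, and uses integer division, none of which is |
| prescribed by the format. |

```python
import ast
import io

def read_header(f):
    """Read the three header lines of a version 3 file from text stream f."""
    first = f.readline().rstrip("\n")
    if first != "CMIF video 3.0":          # layout must match exactly
        raise ValueError("not a CMIF video 3.0 file: %r" % first)
    fmt, info = ast.literal_eval(f.readline().strip())
    if not isinstance(info, tuple):        # ('grey',(6)) yields a bare 6
        info = (info,)
    width, height, packfactor = ast.literal_eval(f.readline().strip())
    # Assumption: pack factor 0 (RGB) means "not packed", so divide by 1.
    div = packfactor or 1
    return {
        "format": fmt, "info": info,
        "width": width, "height": height, "packfactor": packfactor,
        "stored_width": width // div, "stored_height": height // div,
    }

sample = io.StringIO("CMIF video 3.0\n('grey',(6))\n200,200,2\n")
hdr = read_header(sample)
print(hdr["format"], hdr["info"], hdr["stored_width"], hdr["stored_height"])
# prints: grey (6,) 100 100
```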
| .SH |
| Frame header |
| .PP |
| Each frame is preceded by a single header line. This line contains timing information |
| and optional size information. The time information is mandatory, and |
| contains the time this frame should be displayed, in milliseconds since |
| the start of the film. Frames should be stored in chronological order. |
| .PP |
| An optional second number is interpreted as the size of the luminance |
| data in bytes. Currently this number, if present, should always be the |
| same as \fCwidth*height/(packfactor*packfactor)\fP (times 4 for RGB |
| data), but this might change if we come up with variable-length encoding |
| for frame data. |
| .PP |
| An optional third number is the size of the chrominance data |
| in bytes. If present, the number should be equal to |
| .ft C |
| luminance_size*2/(chrompack*chrompack). |
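| .PP |
| The size rules for the frame header can be sketched in Python as shown below. |
| Both function names are hypothetical. The sketch assumes that the chrominance |
| size is twice the luminance size divided by the square of the chrominance |
| pack factor (16 bit chrominance values), and that a pack factor of 0, as |
| used for RGB, behaves like 1 in the size formula. |

```python
import ast

def parse_frame_header(line):
    """Return (time_ms, lum_size, chrom_size); absent sizes are None."""
    value = ast.literal_eval(line.strip())
    fields = value if isinstance(value, tuple) else (value,)
    fields = tuple(fields) + (None,) * (3 - len(fields))
    return fields[:3]

def expected_sizes(width, height, packfactor, chrompack=0, rgb=False):
    """Expected byte counts for one frame, following the text above."""
    pf = packfactor or 1            # assumption: pack factor 0 acts as 1
    lum = width * height // (pf * pf)
    if rgb:
        lum *= 4                    # 32 bit words
    # No chrominance array when chrompack is 0 (rgb, grey, rgb8).
    chrom = lum * 2 // (chrompack * chrompack) if chrompack else 0
    return lum, chrom

print(parse_frame_header("40,10000,5000"))        # (40, 10000, 5000)
print(expected_sizes(200, 200, 2, chrompack=2))   # (10000, 5000)
```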
| .SH |
| Frame data |
| .PP |
| For RGB films, the frame data is an array of 32 bit pixels containing |
| RGB data in the lower 24 bits. For greyscale films, the frame data |
| is an array of 8 bit pixels. For split luminance/chrominance films the |
| data consists of two parts: first an array of 8 bit luminance values |
| followed by an array of 16 bit chrominance values. |
| .PP |
| For all data formats, the data is stored left-to-right, bottom-to-top. |
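| .PP |
| Because many display libraries expect the top row first, a viewer may need |
| to reverse the row order. The following Python sketch shows one way to do |
| this; the function name is hypothetical, and width and height are assumed |
| to be the stored (packed) dimensions. |

```python
def rows_top_down(data, width, height, bytes_per_pixel=1):
    """Split frame data into rows and return them top row first."""
    stride = width * bytes_per_pixel
    if len(data) != stride * height:
        raise ValueError("frame data has unexpected length")
    rows = [data[i * stride:(i + 1) * stride] for i in range(height)]
    rows.reverse()                  # file order is bottom-to-top
    return rows

frame = bytes([1, 2, 3, 4, 5, 6])   # 3x2 greyscale frame, as stored
print(rows_top_down(frame, 3, 2))   # [b'\x04\x05\x06', b'\x01\x02\x03']
```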
| .SH |
| Chrominance coding |
| .PP |
| Since the human eye is apparently more sensitive to luminance changes |
| than to chrominance changes, we support a coding where we split the luminance |
| and chrominance components of the video image. The main point of this |
| is that it allows us to transmit chrominance data in a coarser granularity |
| than luminance data, for instance one chrominance pixel for every |
| 2x2 luminance pixels. According to the theory this should result in an |
| acceptable picture while reducing the data by a fair amount. |
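| .PP |
| As a rough worked example of the saving, the Python sketch below computes |
| the average number of stored bytes per pixel for a split format. It assumes |
| one 8 bit luminance value per pixel and one 16 bit chrominance value per |
| block of chrompack by chrompack pixels, and compares this with the 4 bytes |
| per pixel of the rgb format. The function name is hypothetical. |

```python
def bytes_per_pixel(chrompack):
    """Average stored bytes per luminance pixel for split formats."""
    # One byte of luminance per pixel, plus two bytes of chrominance
    # shared by chrompack x chrompack pixels.
    return 1 + 2 / (chrompack * chrompack)

print(bytes_per_pixel(2))        # 1.5
print(bytes_per_pixel(2) / 4)    # 0.375 of the size of the same frame in rgb
```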
| .PP |
| The coding of split chrominance/luminance data is a bit tricky, to |
| make maximum use of the graphics hardware on the Personal Iris. Therefore, |
| there are the following constraints on the number of bits used: |
| .IP - |
| No more than 8 luminance bits, |
| .IP - |
| No more than 11 bits total, |
| .IP - |
| The luminance bits are in the low-end of the data word, and are stored |
| as 8 bit bytes, |
| .IP - |
| The two sets of chrominance bits are stored in 16 bit words, correctly |
| aligned, |
| .IP - |
| The color map offset is added to the chrominance data. The offset should |
| be at most 4096-256-2**(total number of bits). To reduce interference with |
| other applications the offset should be at least 1024. |
| .LP |
| So, as an example, an HLS video with 5 bits L, 4 bits H, 2 bits S and an |
| offset of 1024 will look as follows in-core and in-file: |
| .IP |
| .nf |
| .ft C |
| 31 15 11 10 9 8 5 4 0 |
| +-----------------------------------+ |
| incore + 0+ 1+ S + H + L + |
| +-----------------------------------+ |
| +----------+ |
| L-array + 0 + L + |
| +----------+ |
| +-----------------------+ |
| C-array + 0+ 1+ S + H + 0 + |
| +-----------------------+ |
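| .LP |
| The constraints and the split shown above can be sketched in Python as follows. |
| All names are hypothetical. The split reads the diagram literally: the |
| luminance byte holds the low-order bits of the in-core value, and the |
| chrominance word holds the same value with those bits cleared. |

```python
def check_bits(lbits, c1bits, c2bits, offset):
    """Return a list of violated constraints (empty if there are none)."""
    total = lbits + c1bits + c2bits
    problems = []
    if lbits > 8:
        problems.append("more than 8 luminance bits")
    if total > 11:
        problems.append("more than 11 bits total")
    if offset > 4096 - 256 - 2 ** total:
        problems.append("colormap offset too large")
    if offset < 1024:
        problems.append("colormap offset below 1024")
    return problems

def split_pixel(incore, lbits):
    """Split an in-core value into (L-array byte, C-array word)."""
    lmask = (1 << lbits) - 1
    return incore & lmask, incore & 0xFFFF & ~lmask

def join_pixel(lum, chrom):
    """Rebuild the in-core value from the two array entries."""
    return chrom | lum

print(check_bits(5, 4, 2, 1024))          # []
lum, chrom = split_pixel(0b101101010011, 5)
print(lum, chrom, join_pixel(lum, chrom)) # 19 2880 2899
```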