This is a README for the font compression reference code. It's very rough in
this snapshot, but will be cleaned up some for public release.

= How to run the compression test tool =

This document describes how to run the compression reference code. At this
writing, the code is intended to produce a bytestream that can be
reconstructed into a working font, but the reference decompression code is not
done, and the exact format of that bytestream is subject to change.

== Building the tool ==

On a standard Unix-style environment, it should be as simple as running "ant".
A couple of paths to compression subprocesses are hardcoded in
CompressionRunner.java, namely "/usr/bin/lzma" and "/bin/bzip2". These are the
default locations in Ubuntu, but if they're elsewhere on your system, you'll
need to change them.

The tool depends on sfntly for much of the font work. The lib/ directory
contains a snapshot jar. If you want to use the latest sfntly sources, then cd
to the java subdirectory, run "ant", then copy these files to
$(thisproject)/lib:

dist/lib/sfntly.jar
dist/tools/conversion/eot/eotconverter.jar
dist/tools/conversion/woff/woffconverter.jar

There's also a dependency on guava (see references below).

The dependencies are subject to their own licenses.

== Setting up the test ==

A run of the tool evaluates a base configuration plus one or more test
configurations, for each font. It measures the file size of the test as a ratio
over the base file size, then graphs the value of that ratio sorted across all
files given on the command line.

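As a rough illustration of the measurement described above, here is a minimal
Python sketch (not the tool's own Java code). The function name size_ratios
and the byte counts are hypothetical, chosen only for the example.

```python
def size_ratios(base_sizes, test_sizes):
    """Return test/base size ratios, sorted ascending, one per font.

    Both arguments map a font file name to a compressed size in bytes.
    (Hypothetical helper; the names are not taken from the tool.)
    """
    ratios = [test_sizes[name] / base_sizes[name] for name in base_sizes]
    return sorted(ratios)


# Made-up sizes for three fonts, for illustration only.
base = {"a.ttf": 1000, "b.ttf": 2000, "c.ttf": 4000}
test = {"a.ttf": 900, "b.ttf": 1500, "c.ttf": 3800}
print(size_ratios(base, test))
```

A ratio below 1.0 means the test configuration produced a smaller file than
the base configuration for that font.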
The test parameters are set by command line options (an improvement from the
last snapshot). The base is set by the -b command line option, and the
additional tests are specified by repeated -x command line options (see below).

Each test is specified by a string description. It is a colon-separated list of
stages. The final stage is entropy compression and can be one of "gzip",
"lzma", "bzip2", "woff", "eot" (with actual wire-format MTX compression), or
"uncomp" (for raw, uncompressed TTFs). Also, the new wire-format draft
WOFF2 spec is available as "woff2", and takes an entropy coding as an
optional argument, as in "woff2/gzip" or "woff2/lzma".

Other stages may optionally include subparameters (following a slash, and
comma-separated). The stages are:

glyf: performs glyf-table preprocessing based on MTX. There are subparameters:
1. cbbox (composite bounding box). When specified, the bounding box for
   composite glyphs is included, otherwise stripped.
2. sbbox (simple bounding box). When specified, the bounding box for simple
   glyphs is included.
3. code: the bytecode is separated out into a separate stream.
4. triplet: triplet coding (as in MTX) is used.
5. push: push sequences are separated; if unset, pushes are kept inline in the
   bytecode.
6. reslice: components of the glyf table are separated into individual
   streams, taking the MTX idea of separating the bytecodes further.

hmtx: strips lsbs (left side bearings) from the hmtx table. Based on the idea
that lsbs can be reconstructed from bbox.

hdmx: performs the delta coding on hdmx, essentially the same as MTX.

cmap: compresses the cmap table: the wire format representation is the inverse
of the cmap table plus exceptions (one glyph encoded by multiple character
codes).

kern: compresses the kern table (not robust, intended just for rough testing).

strip: the subparameters are a list of tables to be stripped entirely
(comma-separated).

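The description syntax above can be sketched as a small parser. This is a
minimal Python illustration rather than the tool's actual Java parsing code;
the function name parse_description is hypothetical, and the sketch does no
validation of stage or subparameter names.

```python
def parse_description(desc):
    """Split a test description into (stage, subparameters) pairs.

    Stages are colon-separated. Subparameters, if any, follow a slash and
    are comma-separated. (Hypothetical helper, for illustration only.)
    """
    stages = []
    for part in desc.split(":"):
        name, _, params = part.partition("/")
        stages.append((name, params.split(",") if params else []))
    return stages


# The last pair is the final (entropy compression) stage.
print(parse_description("glyf/cbbox,triplet:hdmx:gzip"))
```

In this sketch a stage without subparameters, such as hdmx, gets an empty
list, and an entropy coding argument such as the "lzma" in "woff2/lzma" is
returned the same way as any other subparameter.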
The string roughly corresponding to MTX is:

glyf/cbbox,code,triplet,push,hop:hdmx:gzip

Meaning: glyph encoding is used, with simple glyph bboxes stripped (but
composite glyph bboxes included), triplet coding, push sequences, and hop
codes. The hdmx table is compressed. And finally, gzip is used as the entropy
coder.

This differs from MTX in a number of small ways: LZCOMP is not exactly the same
as gzip. MTX uses three separate compression streams (the base font including
triplet-coded glyph data, the bytecodes, and the push sequences), while this
test uses a single stream. MTX also compresses the CVT table (an upper bound on
the impact of this can be estimated by testing strip/cvt).

Lastly, as a point of methodology, the code by default strips the dsig table,
which would be invalidated by any non-bit-identical change to the font data. If
it is desired to keep this table, add the keepdsig stage.

The string representing the currently most aggressive optimization level is:

glyf/triplet,code,push,reslice:hdmx:hmtx:cmap:kern:lzma

In addition to the MTX one above, it strips the bboxes from composite glyphs,
reslices the glyf table, compresses the hmtx, cmap, and kern tables, and uses
lzma as the entropy coding.

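Two of the table stages named in this string, hmtx and cmap, rest on simple
ideas that can be sketched in a few lines. The Python below is only an
illustration under stated assumptions, not the tool's Java code or its wire
format: it assumes each lsb equals the xMin of the glyph's bounding box, it
keeps the lowest character code per glyph as the inverse entry, and all
function names and sample values are hypothetical.

```python
def strip_lsbs(hmtx, x_mins):
    """Return advance widths only, or None if some lsb differs from xMin.

    hmtx is a list of (advance_width, lsb) pairs, one per glyph, and x_mins
    holds the xMin of each glyph's bounding box (assumed layout).
    """
    if any(lsb != x_min for (_, lsb), x_min in zip(hmtx, x_mins)):
        return None
    return [advance for advance, _ in hmtx]


def restore_lsbs(advances, x_mins):
    """Rebuild (advance_width, lsb) pairs, taking each lsb from xMin."""
    return list(zip(advances, x_mins))


def invert_cmap(cmap):
    """Split a character-code-to-glyph map into an inverse map plus exceptions."""
    inverse = {}     # glyph id -> lowest character code mapped to it
    exceptions = []  # (character code, glyph id) for additional codes
    for code in sorted(cmap):
        glyph = cmap[code]
        if glyph in inverse:
            exceptions.append((code, glyph))
        else:
            inverse[glyph] = code
    return inverse, exceptions


def rebuild_cmap(inverse, exceptions):
    """Reverse invert_cmap: recover the character-code-to-glyph map."""
    cmap = {code: glyph for glyph, code in inverse.items()}
    cmap.update(exceptions)
    return cmap


# Made-up data: three glyphs, with glyph 3 reachable from two character codes.
hmtx = [(500, 0), (600, 40), (550, -10)]
x_mins = [0, 40, -10]
cmap = {0x20: 3, 0x41: 36, 0xA0: 3}

advances = strip_lsbs(hmtx, x_mins)
inverse, exceptions = invert_cmap(cmap)
print(advances, inverse, exceptions)
```

Both transformations are lossless in this sketch: restore_lsbs and
rebuild_cmap give back the original hmtx pairs and cmap mapping.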
The string corresponding to the current WOFF Ultra Condensed draft spec
document is:

glyf/cbbox,triplet,code,reslice:woff2/lzma

The current C++ codebase can roundtrip compressed files as long as no per-table
entropy coding is specified, as below (this will be fixed soon):

glyf/cbbox,triplet,code,reslice:woff2

== Running the tool ==

java -jar build/jar/compression.jar *.ttf > chart.html

The tool takes a list of OpenType fonts on the command line, and generates an
HTML chart, which it simply outputs to stdout. This chart uses the Google Chart
API for plotting.

Options:

-b <desc>

Sets the baseline experiment description.

[ -x <desc> ]...

Sets an experiment description. Can be used multiple times.

-o

Outputs the actual compressed file, substituting ".wof2" for ".ttf" in
the input file name. Only useful when a single -x parameter is specified.

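The options above can be combined into a single invocation. The Python sketch
below only assembles and prints such a command line; the helper name
build_command and the font file names are hypothetical, and placing the
options before the font list is an assumption, not something this README
specifies.

```python
import shlex


def build_command(base, experiments, fonts, jar="build/jar/compression.jar"):
    """Assemble an argv list: -b once, -x per experiment, then the fonts."""
    argv = ["java", "-jar", jar, "-b", base]
    for desc in experiments:
        argv += ["-x", desc]
    return argv + list(fonts)


cmd = build_command(
    "woff",
    [
        "glyf/cbbox,triplet,code,reslice:woff2/lzma",
        "glyf/cbbox,triplet,code,reslice:woff2",
    ],
    ["a.ttf", "b.ttf"],
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell with its output redirected to
chart.html, as in the basic invocation above.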
= Decompressing the fonts =

See the cpp/ directory (including cpp/README) for the C++ implementation of
decompression. This code is based on OTS, and successfully roundtrips the
basic compression as described in the draft spec.

= References =

sfntly: http://code.google.com/p/sfntly/
Guava: http://code.google.com/p/guava-libraries/
MTX: http://www.w3.org/Submission/MTX/

Also please refer to documents (currently Google Docs):

WOFF Ultra Condensed file format: proposals and discussion of wire format
issues

WOFF Ultra Condensed: more discussion of results and compression techniques.
This tool was used to prepare the data in that document.
155This tool was used to prepare the data in that document.