Raph Levien | dcecdd8 | 2012-03-23 11:21:16 -0700

This is a README for the font compression reference code. It’s very rough in
this snapshot, but will be cleaned up some for public release.

= How to run the compression test tool =

This document describes how to run the compression reference code. At this
writing, the code is intended to produce a bytestream that can be
reconstructed into a working font, but the reference decompression code is
not done, and the exact format of that bytestream is subject to change.

== Building the tool ==

On a standard Unix-style environment, it should be as simple as running “ant”.
A couple of paths to compression subprocesses are hardcoded in
CompressionRunner.java, namely “/usr/bin/lzma” and “/bin/bzip2”. These are the
default locations in Ubuntu, but if they’re elsewhere on your system, you’ll
need to change them.

The tool depends on sfntly for much of the font work. The lib/ directory
contains a snapshot jar. If you want to use the latest sfntly sources, cd to
the java subdirectory, run “ant”, then copy the following files to
$(thisproject)/lib:

dist/lib/sfntly.jar
dist/tools/conversion/eot/eotconverter.jar
dist/tools/conversion/woff/woffconverter.jar

There’s also a dependency on guava (see references below).

The dependencies are subject to their own licenses.

== Setting up the test ==

A run of the tool evaluates a “base” configuration plus one or more test
configurations, for each font. It measures the file size of each test as a
ratio over the base file size, then graphs the value of that ratio sorted
across all files given on the command line.

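As a sketch of the measurement (illustrative only, not the tool’s own code;
the font names and sizes here are made up), the ratio computation amounts to:

```python
def size_ratios(base_sizes, test_sizes):
    """base_sizes/test_sizes map font name -> compressed file size in bytes.

    Returns the per-font test/base ratios, sorted, as plotted in the chart.
    """
    ratios = [test_sizes[f] / base_sizes[f] for f in base_sizes]
    return sorted(ratios)

print(size_ratios({"a.ttf": 100, "b.ttf": 200},
                  {"a.ttf": 80, "b.ttf": 150}))
# -> [0.75, 0.8]
```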
The test parameters are set by command line options (an improvement from the
last snapshot). The base is set by the -b command line option, and the
additional tests are specified by repeated -x command line options (see below).

Each test is specified by a string description. It is a colon-separated list of
stages. The final stage is entropy compression and can be one of “gzip”,
“lzma”, “bzip2”, “woff”, “eot” (with actual wire-format MTX compression), or
“uncomp” (for raw, uncompressed TTFs). Also, the new wire-format draft
WOFF2 spec is available as "woff2", and takes an entropy coding as an
optional argument, as in "woff2/gzip" or "woff2/lzma".

Other stages may optionally include subparameters (following a slash, and
comma-separated). The stages are:

glyf: performs glyf-table preprocessing based on MTX. There are subparameters:

  1. cbbox (composite bounding box): when specified, the bounding box for
     composite glyphs is included, otherwise stripped
  2. sbbox (simple bounding box): when specified, the bounding box for
     simple glyphs is included
  3. code: the bytecode is separated out into a separate stream
  4. triplet: triplet coding (as in MTX) is used
  5. push: push sequences are separated; if unset, pushes are kept inline
     in the bytecode
  6. reslice: components of the glyf table are separated into individual
     streams, taking the MTX idea of separating the bytecodes further

hmtx: strips lsb’s from the hmtx table. Based on the idea that lsb’s can be
reconstructed from bbox.

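The idea can be sketched as follows (an illustrative sketch, assuming the
common TrueType convention that a glyph’s lsb equals its bounding-box xMin;
the function names are hypothetical, not the tool’s API):

```python
def strip_lsbs(hmtx):
    """hmtx: list of (advance_width, lsb) pairs -> advance widths only."""
    return [aw for aw, _lsb in hmtx]

def restore_lsbs(advances, xmins):
    """Rebuild (advance_width, lsb) pairs, taking lsb from each glyph's
    bounding-box xMin as stored in the glyf table."""
    return list(zip(advances, xmins))

# Round trip: stripping then restoring recovers the original hmtx entries.
hmtx = [(500, 20), (610, 35)]
xmins = [20, 35]
assert restore_lsbs(strip_lsbs(hmtx), xmins) == hmtx
```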
hdmx: performs the delta coding on hdmx, essentially the same as MTX.

cmap: compresses the cmap table: the wire-format representation is the
inverse of the cmap table, plus exceptions (one glyph encoded by multiple
character codes).

kern: compresses kern table (not robust, intended just for rough testing).

strip: the subparameters are a list of tables to be stripped entirely
(comma-separated).

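The descriptor syntax above (colon-separated stages, each with optional
comma-separated subparameters after a slash) can be parsed along these lines;
this is an illustrative sketch, not the tool’s own parser:

```python
def parse_descriptor(desc):
    """Parse e.g. 'glyf/cbbox,triplet:hdmx:gzip' into (stage, subparams) pairs."""
    stages = []
    for part in desc.split(":"):
        # Everything after the first '/' is the comma-separated subparam list.
        name, _, params = part.partition("/")
        stages.append((name, params.split(",") if params else []))
    return stages

print(parse_descriptor("glyf/cbbox,code,triplet,push:hdmx:gzip"))
# -> [('glyf', ['cbbox', 'code', 'triplet', 'push']), ('hdmx', []), ('gzip', [])]
```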
The string roughly corresponding to MTX is:

glyf/cbbox,code,triplet,push,hop:hdmx:gzip

Meaning: glyph encoding is used, with simple glyph bboxes stripped (but
composite glyph bboxes included), triplet coding, push sequences, and hop
codes. The hdmx table is compressed. And finally, gzip is used as the entropy
coder.

This differs from MTX in a number of small ways: LZCOMP is not exactly the same
as gzip. MTX uses three separate compression streams (the base font including
triplet-coded glyph data, the bytecodes, and the push sequences), while this
test uses a single stream. MTX also compresses the CVT table (an upper bound on
the impact of this can be estimated by testing strip/cvt).

Lastly, as a point of methodology, the code by default strips the “dsig” table,
which would be invalidated by any non-bit-identical change to the font data. If
it is desired to keep this table, add the “keepdsig” stage.

The string representing the currently most aggressive optimization level is:

glyf/triplet,code,push,reslice:hdmx:hmtx:cmap:kern:lzma

Relative to the MTX string above, it also strips the bboxes from composite
glyphs, reslices the glyf table, compresses the hmtx, cmap, and kern tables,
and uses lzma as the entropy coding.

The string corresponding to the current WOFF Ultra Condensed draft spec
document is:

glyf/cbbox,triplet,code,reslice:woff2/lzma

The current C++ codebase can roundtrip compressed files as long as no per-table
entropy coding is specified, as below (this will be fixed soon).

glyf/cbbox,triplet,code,reslice:woff2

== Running the tool ==

java -jar build/jar/compression.jar *.ttf > chart.html

The tool takes a list of OpenType fonts on the command line, and generates an
HTML chart, which it simply outputs to stdout. This chart uses the Google Chart
API for plotting.

Options:

-b <desc>

Sets the baseline experiment description.

[ -x <desc> ]...

Sets an experiment description. Can be used multiple times.

-o

Outputs the actual compressed file, substituting ".wof2" for ".ttf" in
the input file name. Only useful when a single -x parameter is specified.

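The -o name substitution can be sketched as follows (illustrative only, not
the tool’s own code):

```python
def output_name(input_name):
    """Mimic -o's naming: replace a trailing '.ttf' with '.wof2'."""
    if input_name.endswith(".ttf"):
        return input_name[: -len(".ttf")] + ".wof2"
    return input_name

print(output_name("MyFont.ttf"))
# -> MyFont.wof2
```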
= Decompressing the fonts =

See the cpp/ directory (including cpp/README) for the C++ implementation of
decompression. This code is based on OTS, and successfully roundtrips the
basic compression as described in the draft spec.

= References =

sfntly: http://code.google.com/p/sfntly/
Guava: http://code.google.com/p/guava-libraries/
MTX: http://www.w3.org/Submission/MTX/

Also please refer to these documents (currently Google Docs):

WOFF Ultra Condensed file format: proposals and discussion of wire format
issues.

WOFF Ultra Condensed: more discussion of results and compression techniques.
This tool was used to prepare the data in that document.