<html>
  <head>
  <title>Pretokenized Headers (PTH)</title>
  <link type="text/css" rel="stylesheet" href="../menu.css" />
  <link type="text/css" rel="stylesheet" href="../content.css" />
  <style type="text/css">
    td {
      vertical-align: top;
    }
  </style>
</head>
<body>

<!--#include virtual="../menu.html.incl"-->

<div id="content">
<h1>Pretokenized Headers</h1>

<p><a href="http://en.wikipedia.org/wiki/Precompiled_header">Precompiled
headers</a> are a general approach employed by many compilers to reduce
compilation time. The underlying motivation is that, within a codebase, the
same (and often large) header files are frequently included by multiple source
files. Consequently, compile times can often be greatly improved by caching
some of the (redundant) work done by a compiler to process headers.
Precompiled header files, which represent one of possibly many ways to
implement this optimization, are literally files that serve as an on-disk cache
containing the vital information necessary to reduce some (or all) of the work
needed to process a corresponding header file. While the details of precompiled
headers vary between compilers, precompiled headers have been shown to be
highly effective at speeding up program compilation on systems with very large
system headers (e.g., Mac OS X).</p>

<p>Clang supports an implementation of precompiled headers known as
<em>pre-tokenized headers</em> (PTH). Clang's pre-tokenized headers support most
of the same interfaces as GCC's pre-compiled headers (as well as others) but are
completely different in their implementation. This page first describes the
interface for using PTH and then briefly elaborates on its design and
implementation.</p>


<h2>Using Pretokenized Headers (High-level Interface)</h2>

<p>The high-level interface to generate a PTH file is the same as GCC's:</p>

<pre>
  $ gcc -x c-header test.h -o test.h.gch
  $ clang -x c-header test.h -o test.h.pth
</pre>

<p>A PTH file can then be used as a prefix header when a <tt>-include</tt>
option is passed to <tt>clang</tt>:</p>

<pre>
  $ clang -include test.h test.c -o test
</pre>

<p>The <tt>clang</tt> driver will first check if a PTH file for <tt>test.h</tt>
is available; if so, the contents of <tt>test.h</tt> (and the files it includes)
will be processed from the PTH file. Otherwise, <tt>clang</tt> falls back to
directly processing the content of <tt>test.h</tt>. This mirrors the behavior of
GCC.</p>

<p><b>NOTE:</b> <tt>clang</tt> does <em>not</em> automatically use PTH files
for headers that are directly included within a source file. For example:</p>

<pre>
  $ clang -x c-header test.h -o test.h.pth
  $ cat test.c
  #include "test.h"
  $ clang test.c -o test
</pre>

<p>In this example, <tt>clang</tt> will not automatically use the PTH file for
<tt>test.h</tt> since <tt>test.h</tt> was included directly in the source file
and not specified on the command line using <tt>-include</tt>.</p>

<h2>Using Pretokenized Headers (Low-level Interface)</h2>

<p>The low-level Clang driver, <tt>clang-cc</tt>, supports three command line
options for generating and using PTH files.</p>

<p>To generate PTH files using <tt>clang-cc</tt>, use the option
<tt>-emit-pth</tt>:</p>

<pre>
  $ clang-cc test.h -emit-pth -o test.h.pth
</pre>

<p>This option is transparently used by <tt>clang</tt> when generating PTH
files. Similarly, PTH files can be used as prefix headers using the
<tt>-include-pth</tt> option:</p>

<pre>
  $ clang-cc -include-pth test.h.pth test.c -o test.s
</pre>

<p>Alternatively, Clang's PTH files can be used as a raw &quot;token-cache&quot;
(or &quot;content&quot; cache) of the source included by the original header
file. This means that the contents of the PTH file are searched as substitutes
for <em>any</em> source files that are used by <tt>clang-cc</tt> to process a
source file. This is done by specifying the <tt>-token-cache</tt> option:</p>

<pre>
  $ cat test.h
  #include &lt;stdio.h&gt;
  $ clang-cc -emit-pth test.h -o test.h.pth
  $ cat test.c
  #include "test.h"
  $ clang-cc test.c -o test -token-cache test.h.pth
</pre>

<p>In this example the contents of <tt>stdio.h</tt> (and the files it includes)
will be retrieved from <tt>test.h.pth</tt>, as the PTH file is being used in
this case as a raw cache of the contents of <tt>test.h</tt>. This is a low-level
interface used both to implement the high-level PTH interface and to provide
alternative means to use PTH-style caching.</p>
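
<p>The idea of a content cache can be illustrated with a minimal sketch in C.
It is not Clang's implementation: the <tt>cache_entry</tt> type, the
<tt>cache_lookup</tt> and <tt>resolve_source</tt> functions, and the use of
plain strings in place of token streams are all assumptions made for
illustration. The point is only that every file the preprocessor opens is
first looked up by name in the cache, and is read from disk only on a miss.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cache entry: one source file captured when the cache was built. */
struct cache_entry {
    const char *path;    /* file name as seen by the preprocessor */
    const char *content; /* stand-in for that file's cached token stream */
};

/* Where the preprocessor should get a file's contents from. */
enum source_origin { FROM_CACHE, FROM_DISK };

/* Return the cached content for `path`, or NULL if the cache has no entry. */
const char *cache_lookup(const struct cache_entry *entries, size_t count,
                         const char *path)
{
    assert(path != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(entries[i].path, path) == 0)
            return entries[i].content;
    }
    return NULL;
}

/* Decide how an included file will be processed: a hit substitutes the
 * cached content, a miss falls back to lexing the file from disk. */
enum source_origin resolve_source(const struct cache_entry *entries,
                                  size_t count, const char *path)
{
    return cache_lookup(entries, count, path) != NULL ? FROM_CACHE : FROM_DISK;
}
```

<p>A real cache would presumably use a hash table rather than a linear scan;
the scan keeps the sketch short.</p>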

<h2>PTH Design and Implementation</h2>

<p>Unlike GCC's precompiled headers, which cache the full ASTs and preprocessor
state of a header file, Clang's pretokenized header files mainly cache the raw
lexer <em>tokens</em> that are needed to segment the stream of characters in a
source file into keywords, identifiers, and operators. Consequently, PTH mainly
speeds up the lexing and preprocessing of a source file, while parsing and
type-checking must be completely redone every time a PTH file is used.</p>
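
<p>To make the notion of a cached token concrete, the following C sketch
stores each token as a small fixed-size record that can be decoded by index
from a byte buffer. The record layout, the field names, and the
<tt>token_encode</tt> and <tt>token_decode</tt> helpers are hypothetical; they
are not the actual PTH file format.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size in bytes of one serialized record in this hypothetical layout. */
#define TOKEN_RECORD_SIZE 8

/* Hypothetical cached token: enough to recover a token without re-lexing. */
struct cached_token {
    uint8_t kind;    /* keyword, identifier, operator, ... */
    uint8_t flags;   /* e.g. "starts a line", "has leading space" */
    uint16_t length; /* length of the token's spelling in bytes */
    uint32_t offset; /* byte offset of the token in its source file */
};

/* Serialize one token into TOKEN_RECORD_SIZE bytes, little-endian. */
void token_encode(const struct cached_token *t, unsigned char *buf)
{
    assert(t != NULL && buf != NULL);
    buf[0] = t->kind;
    buf[1] = t->flags;
    buf[2] = (unsigned char)(t->length & 0xff);
    buf[3] = (unsigned char)(t->length >> 8);
    for (int i = 0; i < 4; i++)
        buf[4 + i] = (unsigned char)((t->offset >> (8 * i)) & 0xff);
}

/* Decode the record at position `index`. Because records have a fixed
 * size, locating a token is simple pointer arithmetic. */
struct cached_token token_decode(const unsigned char *buf, size_t index)
{
    const unsigned char *p = buf + index * TOKEN_RECORD_SIZE;
    struct cached_token t;
    t.kind = p[0];
    t.flags = p[1];
    t.length = (uint16_t)(p[2] | (p[3] << 8));
    t.offset = 0;
    for (int i = 0; i < 4; i++)
        t.offset |= (uint32_t)p[4 + i] << (8 * i);
    return t;
}
```

<p>Decoding such records replaces character-by-character lexing, but the
parser still has to consume every token, which is why parsing and
type-checking are not sped up.</p>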

<h3>Basic Design Tradeoffs</h3>

<p>In the long term there are plans to provide an alternate PCH implementation
for Clang that also caches the work for parsing and type checking the contents
of header files. The current implementation of PCH in Clang as pretokenized
header files was motivated by the following factors:</p>

<ul>
<li><p><em>Language independence</em>: PTH files are (roughly) language
independent. They work with any language that Clang's lexer can handle,
including C, Objective-C, and (in the early stages) C++. This means that
development of language features at the parsing level or above (which covers
almost all of the interesting pieces) does not require PTH to be
modified.</p></li>

<li><p><em>Simple design</em>: Relatively speaking, PTH has a simple design and
implementation, making it easy to test. Further, because the machinery for PTH
resides at the lower levels of the Clang library stack, it is fairly
straightforward to profile and optimize.</p></li>
</ul>

<p>Further, compared to GCC's PCH implementation (which is the dominant
precompiled header file implementation that Clang can be directly compared
against) the PTH design in Clang yields several attractive features:</p>

<ul>

<li><p><em>Architecture independence</em>: In contrast to GCC's PCH files (and
those of several other compilers), Clang's PTH files are architecture
independent, requiring only a single PTH file when building a program for
multiple architectures.</p>

<p>For example, on Mac OS X one may wish to
compile a &quot;universal binary&quot; that runs on PowerPC, 32-bit Intel
(i386), and 64-bit Intel architectures. GCC requires a separate PCH file for
each architecture, as the definitions of types in the AST are
architecture-specific. Since a Clang PTH file essentially represents a lexical
cache of header files, a single PTH file can be safely used when compiling for
multiple architectures. This can also reduce compile times because only a single
PTH file needs to be generated during a build instead of several.</p></li>

<li><p><em>Reduced memory pressure</em>: Like GCC, Clang reads its cache files
via memory mapping (i.e., <tt>mmap</tt>). Clang, however, memory maps PTH files
as read-only, meaning that multiple invocations of <tt>clang-cc</tt> can share
the same pages in memory from a memory-mapped PTH file. In comparison, GCC
memory maps its PCH files but then modifies those pages in memory, incurring
copy-on-write costs. The read-only nature of PTH can greatly reduce memory
pressure for builds involving multiple cores, thus improving overall
scalability.</p></li>

</ul>
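
<p>The read-only mapping described above can be sketched with the POSIX
<tt>mmap</tt> interface. This is an illustration rather than Clang's code: the
<tt>mapped_file</tt> type and the <tt>map_readonly</tt> and
<tt>unmap_file</tt> helpers are hypothetical names. Because the mapping is
created with <tt>PROT_READ</tt> only, the process never writes to the pages, so
the kernel never has to make private copies of them and concurrent compiler
processes can share the same physical memory.</p>

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical handle for a mapped cache file; not a Clang type. */
struct mapped_file {
    const unsigned char *data; /* read-only view of the file contents */
    size_t size;               /* length of the mapping in bytes */
};

/* Map `path` read-only. Returns 0 on success, -1 on failure
 * (including an empty file, which cannot be mapped). */
int map_readonly(const char *path, struct mapped_file *out)
{
    assert(path != NULL && out != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }

    /* PROT_READ without PROT_WRITE: pages are never dirtied, so they
     * remain backed by the page cache and shareable across processes. */
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return -1;

    out->data = p;
    out->size = (size_t)st.st_size;
    return 0;
}

/* Release a mapping created by map_readonly. */
void unmap_file(struct mapped_file *mf)
{
    if (mf->data != NULL)
        munmap((void *)mf->data, mf->size);
    mf->data = NULL;
    mf->size = 0;
}
```

<p>A mapping that is later written to (as described for GCC's PCH files) would
instead need <tt>PROT_WRITE</tt> with <tt>MAP_PRIVATE</tt>, and each modified
page would be copied per process.</p>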

<p>Despite these strengths, PTH's simple design suffers some algorithmic
handicaps compared to other PCH strategies such as those used by GCC. While PTH
can greatly speed up the processing time of a header file, the amount of work
required to process a header file is still roughly linear in the size of the
header file. In contrast, the amount of work done by GCC to process a
precompiled header is (theoretically) constant (the ASTs for the header are
literally memory mapped into the compiler). This means that the compiler only
needs to process, during actual compilation, those pieces of the header file
that are referenced by the source file including the header. While GCC's
particular implementation of PCH diminishes some of these algorithmic
strengths via the use of copy-on-write pages, the approach itself can
fundamentally dominate at an algorithmic level, especially when one considers
header files of arbitrary size.</p>

<p>Consequently, as alluded to earlier, there are plans to potentially implement
an alternative PCH implementation for Clang based on the lazy deserialization of
ASTs. This approach would theoretically have the same constant-time algorithmic
advantages just mentioned but would also retain some of the strengths of PTH
such as reduced memory pressure (ideal for multi-core builds).</p>

<h3>Internal PTH Optimizations</h3>

<p>While the main optimization employed by PTH is to reduce lexing time of
header files by caching pre-lexed tokens, PTH also employs several other
optimizations to speed up the processing of header files:</p>

<ul>

<li><p><em><tt>stat</tt> caching</em>: PTH files cache information obtained via
calls to <tt>stat</tt> that <tt>clang-cc</tt> uses to resolve which files are
included by <tt>#include</tt> directives. This greatly reduces the overhead
involved in context-switching to the kernel to resolve included files.</p></li>

<li><p><em>Fast skipping of <tt>#ifdef</tt>...<tt>#endif</tt> chains</em>:
PTH files record the basic structure of nested preprocessor blocks. When the
condition of the preprocessor block is false, all of its tokens are immediately
skipped instead of requiring them to be handled by Clang's
preprocessor.</p></li>

</ul>
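
<p>The fast-skipping optimization can be sketched as a precomputed jump table
over the token stream. The code below is a simplified illustration and not the
PTH on-disk format: the <tt>tok_kind</tt> enumeration, the
<tt>build_jump_table</tt> and <tt>skip_false_block</tt> functions, and the
fixed nesting limit are assumptions, and <tt>#else</tt> and <tt>#elif</tt> are
ignored. When the cache is generated, each opening conditional is paired with
its matching <tt>#endif</tt>; when the cache is used, a false condition becomes
a single index jump instead of a scan over every token in the block.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Simplified token kinds: only what is needed to show block structure. */
enum tok_kind { TOK_IF, TOK_ENDIF, TOK_OTHER };

/* Arbitrary nesting limit for this sketch. */
#define MAX_DEPTH 64

/* For each TOK_IF at index i, store the index of its matching TOK_ENDIF
 * in jump[i]; every other entry is set to its own index. Returns 0 on
 * success, -1 if the blocks are unbalanced or nested too deeply. */
int build_jump_table(const enum tok_kind *toks, size_t n, size_t *jump)
{
    assert(toks != NULL && jump != NULL);
    size_t stack[MAX_DEPTH]; /* indices of currently open TOK_IF tokens */
    size_t depth = 0;

    for (size_t i = 0; i < n; i++) {
        jump[i] = i;
        if (toks[i] == TOK_IF) {
            if (depth == MAX_DEPTH)
                return -1;
            stack[depth++] = i;
        } else if (toks[i] == TOK_ENDIF) {
            if (depth == 0)
                return -1;
            jump[stack[--depth]] = i; /* close the innermost open block */
        }
    }
    return depth == 0 ? 0 : -1;
}

/* Given the index of a TOK_IF whose condition is false, return the index
 * of the first token after the whole block, in constant time. */
size_t skip_false_block(const size_t *jump, size_t if_index)
{
    return jump[if_index] + 1;
}
```

<p>Building the table costs one pass when the cache is generated; every later
compilation that hits a false condition pays only for the lookup.</p>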

</div>
</body>
</html>