blob: 2d2e1e3e1ba40c7716270726763db8ebee152eac [file] [log] [blame]
<html>
<head>
<title>Pretokenized Headers (PTH)</title>
<link type="text/css" rel="stylesheet" href="../menu.css" />
<link type="text/css" rel="stylesheet" href="../content.css" />
<style type="text/css">
td {
vertical-align: top;
}
</style>
</head>
<body>
<!--#include virtual="../menu.html.incl"-->
<div id="content">
<h1>Pretokenized Headers</h1>
<p> <a href="http://en.wikipedia.org/wiki/Precompiled_header">Precompiled
headers</a> are a general approach employed by many compilers to reduce
compilation time. The underlying motivation of the approach is that it is
common for the same (and often large) header files to be included by
multiple source files. Consequently, compile times can often be greatly improved
by caching some of the (redundant) work done by a compiler to process headers.
Precompiled header files, which represent one of many ways to implement
this optimization, are literally files that represent an on-disk cache that
contains the vital information necessary to reduce some (or all) of the work
needed to process a corresponding header file. While details of precompiled
headers vary between compilers, precompiled headers have been shown to be a
highly effective at speeding up program compilation on systems with very large
system headers (e.g., Mac OS/X).</p>
<p>Clang supports an implementation of precompiled headers known as
<em>pre-tokenized headers</em> (PTH). Clang's pre-tokenized headers support most
of same interfaces as GCC's pre-compiled headers (as well as others) but are
completely different in their implementation. This first describes the interface
for using PTH and then briefly elaborates on its design and implementation.</p>
<h2>Using Pretokenized Headers with <tt>clang</tt></h2>
<p>The high-level <tt>clang</tt> driver supports an interface to use PTH files
that is similar to GCC's interface for precompiled headers.</p>
<h3>Generating a PTH File</h3>
<p>To generate a PTH file using <tt>clang</tt>, one invokes <tt>clang</tt> using
the <b><tt>-x <i>&lt;language&gt;</i>-header</tt></b> option. This mirrors the
interface in GCC for generating PCH files:</p>
<pre>
$ gcc -x c-header test.h -o test.h.gch
$ clang -x c-header test.h -o test.h.pth
</pre>
<h3>Using a PTH File</h3>
<p>A PTH file can then be used as a prefix header when a
<b><tt>-include</tt></b> option is passed to <tt>clang</tt>:</p>
<pre>
$ clang -include test.h test.c -o test
</pre>
<p>The <tt>clang</tt> driver will first check if a PTH file for <tt>test.h</tt>
is available; if so, the contents of <tt>test.h</tt> (and the files it includes)
will be processed from the PTH file. Otherwise, <tt>clang</tt> falls back to
directly processing the content of <tt>test.h</tt>. This mirrors the behavior of
GCC.</p>
<p><b>NOTE:</b> <tt>clang</tt> does <em>not</em> automatically used PTH files
for headers that are directly included within a source file. For example:</p>
<pre>
$ clang -x c-header test.h -o test.h.pth
$ cat test.c
#include "test.h"
$ clang test.c -o test
</pre>
<p>In this example, <tt>clang</tt> will not automatically use the PTH file for
<tt>test.h</tt> since <tt>test.h</tt> was included directly in the source file
and not specified on the command line using <tt>-include</tt>.</p>
<h2>Using Pretokenized Headers with <tt>clang-cc</tt> (Low-level Interface)</h2>
<p>The low-level Clang compiler tool, <tt>clang-cc</tt>, supports three command
line options for generating and using PTH files.<p>
<p>To generate PTH files using <tt>clang-cc</tt>, use the option
<b><tt>-emit-pth</tt></b>:
<pre> $ clang-cc test.h -emit-pth -o test.h.pth </pre>
<p>This option is transparently used by <tt>clang</tt> when generating PTH
files. Similarly, PTH files can be used as prefix headers using the
<b><tt>-include-pth</tt></b> option:</p>
<pre>
$ clang-cc -include-pth test.h.pth test.c -o test.s
</pre>
<p>Alternatively, Clang's PTH files can be used as a raw &quot;token-cache&quot;
(or &quot;content&quot; cache) of the source included by the original header
file. This means that the contents of the PTH file are searched as substitutes
for <em>any</em> source files that are used by <tt>clang-cc</tt> to process a
source file. This is done by specifying the <b><tt>-token-cache</tt></b>
option:</p>
<pre>
$ cat test.h
#include &lt;stdio.h&gt;
$ clang-cc -emit-pth test.h -o test.h.pth
$ cat test.c
#include "test.h"
$ clang-cc test.c -o test -token-cache test.h.pth
</pre>
<p>In this example the contents of <tt>stdio.h</tt> (and the files it includes)
will be retrieved from <tt>test.h.pth</tt>, as the PTH file is being used in
this case as a raw cache of the contents of <tt>test.h</tt>. This is a low-level
interface used to both implement the high-level PTH interface as well as to
provide alternative means to use PTH-style caching.</p>
<h2>PTH Design and Implementation</h2>
<p>Unlike GCC's precompiled headers, which cache the full ASTs and preprocessor
state of a header file, Clang's pretokenized header files mainly cache the raw
lexer <em>tokens</em> that are needed to segment the stream of characters in a
source file into keywords, identifiers, and operators. Consequently, PTH serves
to mainly directly speed up the lexing and preprocessing of a source file, while
parsing and type-checking must be completely redone every time a PTH file is
used.</p>
<h3>Basic Design Tradeoffs</h3>
<p>In the long term there are plans to provide an alternate PCH implementation
for Clang that also caches the work for parsing and type checking the contents
of header files. The current implementation of PCH in Clang as pretokenized
header files was motivated by the following factors:<p>
<ul>
<li><p><b>Language independence</b>: PTH files work with any language that
Clang's lexer can handle, including C, Objective-C, and (in the early stages)
C++. This means development on language features at the parsing level or above
(which is basically almost all interesting pieces) does not require PTH to be
modified.</p></li>
<li><b>Simple design</b>: Relatively speaking, PTH has a simple design and
implementation, making it easy to test. Further, because the machinery for PTH
resides at the lower-levels of the Clang library stack it is fairly
straightforward to profile and optimize.</li>
</ul>
<p>Further, compared to GCC's PCH implementation (which is the dominate
precompiled header file implementation that Clang can be directly compared
against) the PTH design in Clang yields several attractive features:</p>
<ul>
<li><p><b>Architecture independence</b>: In contrast to GCC's PCH files (and
those of several other compilers), Clang's PTH files are architecture
independent, requiring only a single PTH file when building an program for
multiple architectures.</p>
<p>For example, on Mac OS X one may wish to
compile a &quot;universal binary&quot; that runs on PowerPC, 32-bit Intel
(i386), and 64-bit Intel architectures. In contrast, GCC requires a PCH file for
each architecture, as the definitions of types in the AST are
architecture-specific. Since a Clang PTH file essentially represents a lexical
cache of header files, a single PTH file can be safely used when compiling for
multiple architectures. This can also reduce compile times because only a single
PTH file needs to be generated during a build instead of several.</p></li>
<li><p><b>Reduced memory pressure</b>: Similar to GCC,
Clang reads PTH files via the use of memory mapping (i.e., <tt>mmap</tt>).
Clang, however, memory maps PTH files as read-only, meaning that multiple
invocations of <tt>clang-cc</tt> can share the same pages in memory from a
memory-mapped PTH file. In comparison, GCC also memory maps its PCH files but
also modifies those pages in memory, incurring the copy-on-write costs. The
read-only nature of PTH can greatly reduce memory pressure for builds involving
multiple cores, thus improving overall scalability.</p></li>
<li><p><b>Fast generation</b>: PTH files can be generated in a small fraction
of the time needed to generate GCC's PCH files. Since PTH/PCH generation is a
serial operation that typically blocks progress during a build, faster
generation time leads to improved processor utilization with parallel builds on
multicore machines.</p></li>
</ul>
<p>Despite these strengths, PTH's simple design suffers some algorithmic
handicaps compared to other PCH strategies such as those used by GCC. While PTH
can greatly speed up the processing time of a header file, the amount of work
required to process a header file is still roughly linear in the size of the
header file. In contrast, the amount of work done by GCC to process a
precompiled header is (theoretically) constant (the ASTs for the header are
literally memory mapped into the compiler). This means that only the pieces of
the header file that are referenced by the source file including the header are
the only ones the compiler needs to process during actual compilation. While
GCC's particular implementation of PCH mitigates some of these algorithmic
strengths via the use of copy-on-write pages, the approach itself can
fundamentally dominate at an algorithmic level, especially when one considers
header files of arbitrary size.</p>
<p>There are plans to potentially implement an complementary PCH implementation
for Clang based on the lazy deserialization of ASTs. This approach would
theoretically have the same constant-time algorithmic advantages just mentioned
but would also retain some of the strengths of PTH such as reduced memory
pressure (ideal for multi-core builds).</p>
<h3>Internal PTH Optimizations</h3>
<p>While the main optimization employed by PTH is to reduce lexing time of
header files by caching pre-lexed tokens, PTH also employs several other
optimizations to speed up the processing of header files:</p>
<ul>
<li><p><em><tt>stat</tt> caching</em>: PTH files cache information obtained via
calls to <tt>stat</tt> that <tt>clang-cc</tt> uses to resolve which files are
included by <tt>#include</tt> directives. This greatly reduces the overhead
involved in context-switching to the kernel to resolve included files.</p></li>
<li><p><em>Fasting skipping of <tt>#ifdef</tt>...<tt>#endif</tt> chains</em>:
PTH files record the basic structure of nested preprocessor blocks. When the
condition of the preprocessor block is false, all of its tokens are immediately
skipped instead of requiring them to be handled by Clang's
preprocessor.</p></li>
</ul>
</div>
</body>
</html>