| <html> |
| <head> |
| <title>Pretokenized Headers (PTH)</title> |
| <link type="text/css" rel="stylesheet" href="../menu.css" /> |
| <link type="text/css" rel="stylesheet" href="../content.css" /> |
| <style type="text/css"> |
| td { |
| vertical-align: top; |
| } |
| </style> |
| </head> |
| <body> |
| |
| <!--#include virtual="../menu.html.incl"--> |
| |
| <div id="content"> |
| <h1>Pretokenized Headers</h1> |
| |
| <p> <a href="http://en.wikipedia.org/wiki/Precompiled_header">Precompiled |
| headers</a> are a general approach employed by many compilers to reduce |
| compilation time. The underlying motivation of the approach is that it is |
| common for the same (and often large) header files to be included by |
| multiple source files. Consequently, compile times can often be greatly improved |
| by caching some of the (redundant) work done by a compiler to process headers. |
| Precompiled header files, which represent one of many ways to implement |
| this optimization, are literally files that represent an on-disk cache that |
| contains the vital information necessary to reduce some (or all) of the work |
| needed to process a corresponding header file. While details of precompiled |
| headers vary between compilers, precompiled headers have been shown to be |
| highly effective at speeding up program compilation on systems with very large |
| system headers (e.g., Mac OS X).</p> |
| |
| <p>Clang supports an implementation of precompiled headers known as |
| <em>pre-tokenized headers</em> (PTH). Clang's pre-tokenized headers support most |
| of the same interfaces as GCC's pre-compiled headers (as well as others) but are |
| completely different in their implementation. This document first describes the |
| interface for using PTH and then briefly elaborates on its design and |
| implementation.</p> |
| |
| |
| <h2>Using Pretokenized Headers with <tt>clang</tt></h2> |
| |
| <p>The high-level <tt>clang</tt> driver supports an interface to use PTH files |
| that is similar to GCC's interface for precompiled headers.</p> |
| |
| <h3>Generating a PTH File</h3> |
| |
| <p>To generate a PTH file using <tt>clang</tt>, one invokes <tt>clang</tt> using |
| the <b><tt>-x <i>&lt;language&gt;</i>-header</tt></b> option. This mirrors the |
| interface in GCC for generating PCH files:</p> |
| |
| <pre> |
| $ gcc -x c-header test.h -o test.h.gch |
| $ clang -x c-header test.h -o test.h.pth |
| </pre> |
| |
| <h3>Using a PTH File</h3> |
| |
| <p>A PTH file can then be used as a prefix header when a |
| <b><tt>-include</tt></b> option is passed to <tt>clang</tt>:</p> |
| |
| <pre> |
| $ clang -include test.h test.c -o test |
| </pre> |
| |
| <p>The <tt>clang</tt> driver will first check if a PTH file for <tt>test.h</tt> |
| is available; if so, the contents of <tt>test.h</tt> (and the files it includes) |
| will be processed from the PTH file. Otherwise, <tt>clang</tt> falls back to |
| directly processing the content of <tt>test.h</tt>. This mirrors the behavior of |
| GCC.</p> |
| |
| <p><b>NOTE:</b> <tt>clang</tt> does <em>not</em> automatically use PTH files |
| for headers that are directly included within a source file. For example:</p> |
| |
| <pre> |
| $ clang -x c-header test.h -o test.h.pth |
| $ cat test.c |
| #include "test.h" |
| $ clang test.c -o test |
| </pre> |
| |
| <p>In this example, <tt>clang</tt> will not automatically use the PTH file for |
| <tt>test.h</tt> since <tt>test.h</tt> was included directly in the source file |
| and not specified on the command line using <tt>-include</tt>.</p> |
| |
| <h2>Using Pretokenized Headers with <tt>clang-cc</tt> (Low-level Interface)</h2> |
| |
| <p>The low-level Clang compiler tool, <tt>clang-cc</tt>, supports three command |
| line options for generating and using PTH files.</p> |
| |
| <p>To generate PTH files using <tt>clang-cc</tt>, use the option |
| <b><tt>-emit-pth</tt></b>:</p> |
| |
| <pre> $ clang-cc test.h -emit-pth -o test.h.pth </pre> |
| |
| <p>This option is transparently used by <tt>clang</tt> when generating PTH |
| files. Similarly, PTH files can be used as prefix headers using the |
| <b><tt>-include-pth</tt></b> option:</p> |
| |
| <pre> |
| $ clang-cc -include-pth test.h.pth test.c -o test.s |
| </pre> |
| |
| <p>Alternatively, Clang's PTH files can be used as a raw "token-cache" |
| (or "content" cache) of the source included by the original header |
| file. This means that the contents of the PTH file are searched as substitutes |
| for <em>any</em> source files that are used by <tt>clang-cc</tt> to process a |
| source file. This is done by specifying the <b><tt>-token-cache</tt></b> |
| option:</p> |
| |
| <pre> |
| $ cat test.h |
| #include &lt;stdio.h&gt; |
| $ clang-cc -emit-pth test.h -o test.h.pth |
| $ cat test.c |
| #include "test.h" |
| $ clang-cc test.c -o test -token-cache test.h.pth |
| </pre> |
| |
| <p>In this example the contents of <tt>stdio.h</tt> (and the files it includes) |
| will be retrieved from <tt>test.h.pth</tt>, as the PTH file is being used in |
| this case as a raw cache of the contents of <tt>test.h</tt>. This low-level |
| interface is used both to implement the high-level PTH interface and to |
| provide alternative means of using PTH-style caching.</p> |
| |
| <h2>PTH Design and Implementation</h2> |
| |
| <p>Unlike GCC's precompiled headers, which cache the full ASTs and preprocessor |
| state of a header file, Clang's pretokenized header files mainly cache the raw |
| lexer <em>tokens</em> that are needed to segment the stream of characters in a |
| source file into keywords, identifiers, and operators. Consequently, PTH mainly |
| speeds up the lexing and preprocessing of a source file, while parsing and |
| type-checking must be completely redone every time a PTH file is used.</p> |
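| <p>The idea of caching tokens rather than ASTs can be illustrated with a toy |
| sketch in Python. Everything in it is an assumption made for illustration: the |
| regular-expression lexer, the JSON encoding, and the function names |
| (<tt>lex</tt>, <tt>emit_cache</tt>, <tt>load_cache</tt>) are hypothetical and |
| do not reflect Clang's actual lexer or the on-disk PTH format. The sketch only |
| shows that tokens are produced once and later replayed without rescanning |
| characters; a consumer would still have to parse them every time.</p> |

```python
import json
import re

# Hypothetical toy lexer: identifiers, integer literals, and single-character
# punctuation. Keyword classification is omitted for brevity.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def lex(text):
    """Scan the characters of text and return a list of (kind, spelling)."""
    tokens = []
    for spelling in TOKEN_RE.findall(text):
        if spelling[0].isalpha() or spelling[0] == "_":
            kind = "identifier"
        elif spelling[0].isdigit():
            kind = "number"
        else:
            kind = "punct"
        tokens.append((kind, spelling))
    return tokens


def emit_cache(text):
    """Lex once and serialize the tokens (a stand-in for writing a PTH file)."""
    return json.dumps(lex(text))


def load_cache(blob):
    """Replay the cached tokens without rescanning any characters."""
    return [tuple(tok) for tok in json.loads(blob)]


header = "int add(int a, int b);"
cached = emit_cache(header)
replayed = load_cache(cached)
```

| <p>Here <tt>replayed</tt> holds the same token list that <tt>lex(header)</tt> |
| would produce, but obtaining it does not involve the character-level scan.</p> |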
| |
| <h3>Basic Design Tradeoffs</h3> |
| |
| <p>In the long term there are plans to provide an alternate PCH implementation |
| for Clang that also caches the work for parsing and type checking the contents |
| of header files. The current implementation of PCH in Clang as pretokenized |
| header files was motivated by the following factors:</p> |
| |
| <ul> |
| |
| <li><p><b>Language independence</b>: PTH files work with any language that |
| Clang's lexer can handle, including C, Objective-C, and (in the early stages) |
| C++. This means development on language features at the parsing level or above |
| (which covers almost all of the interesting pieces) does not require PTH to be |
| modified.</p></li> |
| |
| <li><p><b>Simple design</b>: Relatively speaking, PTH has a simple design and |
| implementation, making it easy to test. Further, because the machinery for PTH |
| resides at the lower levels of the Clang library stack, it is fairly |
| straightforward to profile and optimize.</p></li> |
| </ul> |
| |
| <p>Further, compared to GCC's PCH implementation (which is the dominant |
| precompiled header file implementation that Clang can be directly compared |
| against) the PTH design in Clang yields several attractive features:</p> |
| |
| <ul> |
| |
| <li><p><b>Architecture independence</b>: In contrast to GCC's PCH files (and |
| those of several other compilers), Clang's PTH files are architecture |
| independent, requiring only a single PTH file when building a program for |
| multiple architectures.</p> |
| |
| <p>For example, on Mac OS X one may wish to |
| compile a "universal binary" that runs on PowerPC, 32-bit Intel |
| (i386), and 64-bit Intel architectures. GCC requires a separate PCH file for |
| each architecture, as the definitions of types in the AST are |
| architecture-specific. Since a Clang PTH file essentially represents a lexical |
| cache of header files, a single PTH file can be safely used when compiling for |
| multiple architectures. This can also reduce compile times because only a single |
| PTH file needs to be generated during a build instead of several.</p></li> |
| |
| <li><p><b>Reduced memory pressure</b>: Similar to GCC, |
| Clang reads PTH files via the use of memory mapping (i.e., <tt>mmap</tt>). |
| Clang, however, memory maps PTH files as read-only, meaning that multiple |
| invocations of <tt>clang-cc</tt> can share the same pages in memory from a |
| memory-mapped PTH file. In comparison, GCC also memory maps its PCH files but |
| modifies those pages in memory, incurring copy-on-write costs. The |
| read-only nature of PTH can greatly reduce memory pressure for builds involving |
| multiple cores, thus improving overall scalability.</p></li> |
| |
| <li><p><b>Fast generation</b>: PTH files can be generated in a small fraction |
| of the time needed to generate GCC's PCH files. Since PTH/PCH generation is a |
| serial operation that typically blocks progress during a build, faster |
| generation time leads to improved processor utilization with parallel builds on |
| multicore machines.</p></li> |
| |
| </ul> |
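| <p>The read-only mapping described under "Reduced memory pressure" can be |
| sketched with Python's standard <tt>mmap</tt> module. This is only an |
| illustration of the general technique, not Clang's code: the helper name |
| <tt>map_read_only</tt> and the temporary file standing in for a PTH file are |
| assumptions. Because the mapping is never written, the operating system has no |
| reason to make private copies of its pages.</p> |

```python
import mmap
import os
import tempfile


def map_read_only(path):
    """Map a file read-only and return (file object, mapping)."""
    f = open(path, "rb")
    mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    return f, mapping


# Stand-in for a PTH file; the contents are arbitrary bytes.
fd, path = tempfile.mkstemp(suffix=".pth")
os.write(fd, b"cached tokens")
os.close(fd)

f, mapping = map_read_only(path)
first = mapping[:6]  # reading from the mapping works as usual

try:
    mapping[0:1] = b"X"  # a read-only mapping rejects modification
    write_rejected = False
except TypeError:
    write_rejected = True

mapping.close()
f.close()
os.remove(path)
```

| <p>A design that patches the mapped data in place would instead need a |
| private, writable mapping, and each process would pay for its own copy of |
| every page it touches.</p> |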
| |
| <p>Despite these strengths, PTH's simple design suffers some algorithmic |
| handicaps compared to other PCH strategies such as those used by GCC. While PTH |
| can greatly speed up the processing time of a header file, the amount of work |
| required to process a header file is still roughly linear in the size of the |
| header file. In contrast, the amount of work done by GCC to process a |
| precompiled header is (theoretically) constant (the ASTs for the header are |
| literally memory mapped into the compiler). This means that only the pieces of |
| the header file that are referenced by the source file including the header |
| need to be processed by the compiler during actual compilation. While |
| GCC's particular implementation of PCH mitigates some of these algorithmic |
| strengths via the use of copy-on-write pages, the approach itself can |
| fundamentally dominate at an algorithmic level, especially when one considers |
| header files of arbitrary size.</p> |
| |
| <p>There are plans to potentially implement a complementary PCH implementation |
| for Clang based on the lazy deserialization of ASTs. This approach would |
| theoretically have the same constant-time algorithmic advantages just mentioned |
| but would also retain some of the strengths of PTH such as reduced memory |
| pressure (ideal for multi-core builds).</p> |
| |
| <h3>Internal PTH Optimizations</h3> |
| |
| <p>While the main optimization employed by PTH is to reduce lexing time of |
| header files by caching pre-lexed tokens, PTH also employs several other |
| optimizations to speed up the processing of header files:</p> |
| |
| <ul> |
| |
| <li><p><em><tt>stat</tt> caching</em>: PTH files cache information obtained via |
| calls to <tt>stat</tt> that <tt>clang-cc</tt> uses to resolve which files are |
| included by <tt>#include</tt> directives. This greatly reduces the overhead |
| involved in context-switching to the kernel to resolve included files.</p></li> |
| |
| <li><p><em>Fast skipping of <tt>#ifdef</tt>...<tt>#endif</tt> chains</em>: |
| PTH files record the basic structure of nested preprocessor blocks. When the |
| condition of the preprocessor block is false, all of its tokens are immediately |
| skipped instead of requiring them to be handled by Clang's |
| preprocessor.</p></li> |
| |
| </ul> |
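| <p>The conditional-skipping optimization can be sketched as a precomputed |
| jump table. The Python below is a hypothetical illustration, not the PTH |
| data structure: tokens are plain strings, only <tt>#ifdef</tt> and |
| <tt>#endif</tt> are handled (no <tt>#else</tt> or <tt>#elif</tt>), and the |
| names <tt>build_skip_table</tt> and <tt>preprocess</tt> are made up. The table |
| is built once, when the cache is generated; at use time a false condition |
| costs a single jump regardless of how many tokens the block contains.</p> |

```python
def build_skip_table(tokens):
    """Map the index of each '#ifdef' to the index of its matching '#endif'."""
    table = {}
    stack = []
    for i, tok in enumerate(tokens):
        if tok == "#ifdef":
            stack.append(i)
        elif tok == "#endif":
            table[stack.pop()] = i
    return table


def preprocess(tokens, skip_table, defined):
    """Return surviving tokens, jumping over blocks whose macro is undefined."""
    out = []
    i = 0
    while i != len(tokens):
        tok = tokens[i]
        if tok == "#ifdef":
            if tokens[i + 1] in defined:
                i += 2  # keep the block; step past the macro name
            else:
                i = skip_table[i] + 1  # one jump past the matching #endif
        elif tok == "#endif":
            i += 1
        else:
            out.append(tok)
            i += 1
    return out


tokens = ["a", "#ifdef", "FOO", "b", "#ifdef", "BAR", "c", "#endif", "#endif", "d"]
table = build_skip_table(tokens)
only_foo = preprocess(tokens, table, {"FOO"})
nothing_defined = preprocess(tokens, table, set())
```

| <p>With only <tt>FOO</tt> defined, the inner block is skipped in one step; with |
| nothing defined, the outer block (including the nested one) is skipped without |
| visiting any of its tokens.</p> |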
| |
| </div> |
| </body> |
| </html> |