==========================
Pretokenized Headers (PTH)
==========================

This document first describes the low-level interface for using PTH and
then briefly elaborates on its design and implementation. If you are
interested in the end-user view, please see the :ref:`User's Manual
<usersmanual-precompiled-headers>`.

Using Pretokenized Headers with ``clang`` (Low-level Interface)
===============================================================

The Clang compiler frontend, ``clang -cc1``, supports three command-line
options for generating and using PTH files.

To generate PTH files using ``clang -cc1``, use the option ``-emit-pth``:

.. code-block:: console

  $ clang -cc1 test.h -emit-pth -o test.h.pth

This option is transparently used by ``clang`` when generating PTH
files. Similarly, PTH files can be used as prefix headers using the
``-include-pth`` option:

.. code-block:: console

  $ clang -cc1 -include-pth test.h.pth test.c -o test.s

Alternatively, Clang's PTH files can be used as a raw "token-cache" (or
"content" cache) of the source included by the original header file.
This means that the contents of the PTH file are searched as substitutes
for *any* source files that are used by ``clang -cc1`` to process a
source file. This is done by specifying the ``-token-cache`` option:

.. code-block:: console

  $ cat test.h
  #include <stdio.h>
  $ clang -cc1 -emit-pth test.h -o test.h.pth
  $ cat test.c
  #include "test.h"
  $ clang -cc1 test.c -o test -token-cache test.h.pth

In this example the contents of ``stdio.h`` (and the files it includes)
will be retrieved from ``test.h.pth``, as the PTH file is being used in
this case as a raw cache of the contents of ``test.h``. This is a
low-level interface used both to implement the high-level PTH interface
and to provide an alternative means of using PTH-style caching.

PTH Design and Implementation
=============================

Unlike GCC's precompiled headers, which cache the full ASTs and
preprocessor state of a header file, Clang's pretokenized header files
mainly cache the raw lexer *tokens* that are needed to segment the
stream of characters in a source file into keywords, identifiers, and
operators. Consequently, PTH mainly speeds up the lexing and
preprocessing of a source file, while parsing and type-checking must be
completely redone every time a PTH file is used.
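
To make the idea of caching lexer tokens concrete, the following is a
minimal sketch of lexing a buffer once into a flat array of fixed-size
records. It is an illustration only and does not reflect Clang's actual
token representation or PTH file layout; ``tok_kind``, ``struct tok``, and
``pretokenize`` are hypothetical names, and the keyword list is
deliberately tiny.

.. code-block:: c

  #include <assert.h>
  #include <ctype.h>
  #include <stddef.h>
  #include <string.h>

  /* Hypothetical token kinds; a real lexer has far more. */
  enum tok_kind { TOK_KEYWORD, TOK_IDENTIFIER, TOK_NUMBER, TOK_OPERATOR };

  /* Fixed-size record holding no pointers, so an array of these could
     be written to a file and read back without any fix-ups. */
  struct tok {
    enum tok_kind kind;
    unsigned offset; /* byte offset of the token in the source buffer */
    unsigned length; /* token length in bytes */
  };

  static int is_keyword(const char *s, size_t n) {
    static const char *const kw[] = {"int", "return", "if", "else"};
    for (size_t i = 0; i < sizeof kw / sizeof kw[0]; ++i)
      if (strlen(kw[i]) == n && memcmp(kw[i], s, n) == 0)
        return 1;
    return 0;
  }

  /* Lex `src` once, filling `out` with at most `cap` records.
     Returns the number of tokens recorded. */
  size_t pretokenize(const char *src, struct tok *out, size_t cap) {
    assert(src != NULL && out != NULL);
    size_t n = 0, i = 0;
    while (src[i] != '\0' && n < cap) {
      unsigned char c = (unsigned char)src[i];
      if (isspace(c)) {
        ++i;
        continue;
      }
      size_t start = i;
      enum tok_kind kind;
      if (isalpha(c) || c == '_') {
        while (isalnum((unsigned char)src[i]) || src[i] == '_')
          ++i;
        kind = is_keyword(src + start, i - start) ? TOK_KEYWORD
                                                  : TOK_IDENTIFIER;
      } else if (isdigit(c)) {
        while (isdigit((unsigned char)src[i]))
          ++i;
        kind = TOK_NUMBER;
      } else {
        ++i; /* treat any other character as a one-byte operator */
        kind = TOK_OPERATOR;
      }
      out[n].kind = kind;
      out[n].offset = (unsigned)start;
      out[n].length = (unsigned)(i - start);
      ++n;
    }
    return n;
  }

A later compilation that reads such an array back can skip the
character-level scanning entirely, but it still has to parse and
type-check the tokens, which is the tradeoff described above.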

Basic Design Tradeoffs
----------------------

In the long term there are plans to provide an alternate PCH
implementation for Clang that also caches the work for parsing and type
checking the contents of header files. The current implementation of PCH
in Clang as pretokenized header files was motivated by the following
factors:

**Language independence**
    PTH files work with any language that
    Clang's lexer can handle, including C, Objective-C, and (in the early
    stages) C++. This means development on language features at the
    parsing level or above (which covers almost all of the interesting
    pieces) does not require PTH to be modified.

**Simple design**
    Relatively speaking, PTH has a simple design and
    implementation, making it easy to test. Further, because the
    machinery for PTH resides at the lower levels of the Clang library
    stack, it is fairly straightforward to profile and optimize.

Further, compared to GCC's PCH implementation (which is the dominant
precompiled header file implementation that Clang can be directly
compared against) the PTH design in Clang yields several attractive
features:

**Architecture independence**
    In contrast to GCC's PCH files (and
    those of several other compilers), Clang's PTH files are architecture
    independent, requiring only a single PTH file when building a
    program for multiple architectures.

    For example, on Mac OS X one may wish to compile a "universal binary"
    that runs on PowerPC, 32-bit Intel (i386), and 64-bit Intel
    architectures. GCC requires a separate PCH file for each
    architecture, as the definitions of types in the AST are
    architecture-specific. Since a Clang PTH file essentially represents
    a lexical cache of header files, a single PTH file can be safely used
    when compiling for multiple architectures. This can also reduce
    compile times because only a single PTH file needs to be generated
    during a build instead of several.

**Reduced memory pressure**
    Similar to GCC, Clang reads PTH files
    via the use of memory mapping (i.e., ``mmap``). Clang, however,
    memory maps PTH files as read-only, meaning that multiple invocations
    of ``clang -cc1`` can share the same pages in memory from a
    memory-mapped PTH file. In comparison, GCC memory maps its PCH
    files but then modifies those pages in memory, incurring
    copy-on-write costs. The read-only nature of PTH can greatly reduce
    memory pressure for builds involving multiple cores, thus improving
    overall scalability.
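
The read-only mapping described above can be sketched with the POSIX
``open``, ``fstat``, and ``mmap`` calls. This is a minimal illustration
under stated assumptions, not Clang's actual file-loading code;
``map_readonly`` and ``unmap_readonly`` are hypothetical helper names.

.. code-block:: c

  #include <assert.h>
  #include <fcntl.h>
  #include <stddef.h>
  #include <sys/mman.h>
  #include <sys/stat.h>
  #include <unistd.h>

  /* Map the file at `path` read-only. On success, returns the mapping
     and stores its size in `*size`; on failure, returns NULL. */
  const void *map_readonly(const char *path, size_t *size) {
    assert(path != NULL && size != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
      return NULL;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
      close(fd);
      return NULL;
    }
    /* PROT_READ only: the process cannot write to these pages, so no
       private copies are ever made and every process mapping the file
       can be served from the same cached pages. */
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
      return NULL;
    *size = (size_t)st.st_size;
    return p;
  }

  /* Release a mapping obtained from map_readonly(). */
  void unmap_readonly(const void *p, size_t size) {
    munmap((void *)p, size);
  }

By contrast, a mapping created with ``PROT_READ | PROT_WRITE`` and
``MAP_PRIVATE`` gives each process its own copy of every page it writes
to, which is the copy-on-write cost mentioned above.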

**Fast generation**
    PTH files can be generated in a small fraction
    of the time needed to generate GCC's PCH files. Since PTH/PCH
    generation is a serial operation that typically blocks progress
    during a build, faster generation time leads to improved processor
    utilization with parallel builds on multicore machines.

Despite these strengths, PTH's simple design suffers some algorithmic
handicaps compared to other PCH strategies such as those used by GCC.
While PTH can greatly speed up the processing time of a header file, the
amount of work required to process a header file is still roughly linear
in the size of the header file. In contrast, the amount of work done by
GCC to process a precompiled header is (theoretically) constant (the
ASTs for the header are literally memory mapped into the compiler). This
means that the compiler needs to process only those pieces of the header
file that are referenced by the source file including the header. While
GCC's particular implementation of PCH gives up some of these
algorithmic advantages through its use of copy-on-write pages, the
approach itself can fundamentally dominate at an algorithmic level,
especially when one considers header files of arbitrary size.

There are plans to potentially add a complementary PCH
implementation for Clang based on the lazy deserialization of ASTs. This
approach would theoretically have the same constant-time algorithmic
advantages just mentioned but would also retain some of the strengths of
PTH such as reduced memory pressure (ideal for multi-core builds).

Internal PTH Optimizations
--------------------------

While the main optimization employed by PTH is to reduce lexing time of
header files by caching pre-lexed tokens, PTH also employs several other
optimizations to speed up the processing of header files:

- ``stat`` caching: PTH files cache information obtained via calls to
  ``stat`` that ``clang -cc1`` uses to resolve which files are included
  by ``#include`` directives. This greatly reduces the overhead
  involved in context-switching to the kernel to resolve included
  files.

- Fast skipping of ``#ifdef`` ... ``#endif`` chains: PTH files
  record the basic structure of nested preprocessor blocks. When the
  condition of the preprocessor block is false, all of its tokens are
  immediately skipped instead of requiring them to be handled by
  Clang's preprocessor.
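
The ``stat`` caching item above can be illustrated with a small lookup
table placed in front of the POSIX ``stat`` call. This sketch keeps its
cache in memory for the lifetime of one process, whereas PTH stores the
information in the PTH file itself; the names ``stat_cache`` and
``cached_stat`` and the fixed table sizes are assumptions made for the
example.

.. code-block:: c

  #include <assert.h>
  #include <string.h>
  #include <sys/stat.h>

  #define STAT_CACHE_CAP 64
  #define STAT_PATH_MAX 256

  struct stat_entry {
    char path[STAT_PATH_MAX];
    int result;     /* return value of stat(): 0 or -1 */
    struct stat st; /* valid only when result == 0 */
  };

  /* Zero-initialize before first use. */
  struct stat_cache {
    struct stat_entry entries[STAT_CACHE_CAP];
    int count;
    int syscalls; /* how many times the real stat() was invoked */
  };

  /* Behaves like stat(), but answers repeated queries for the same
     path from the cache. Failed lookups are cached as well, so probing
     the same missing path twice costs only one system call. */
  int cached_stat(struct stat_cache *c, const char *path, struct stat *out) {
    assert(c != NULL && path != NULL && out != NULL);
    for (int i = 0; i < c->count; ++i) {
      if (strcmp(c->entries[i].path, path) == 0) {
        if (c->entries[i].result == 0)
          *out = c->entries[i].st;
        return c->entries[i].result;
      }
    }
    int r = stat(path, out);
    ++c->syscalls;
    /* Remember the answer if there is room and the path fits. */
    if (c->count < STAT_CACHE_CAP && strlen(path) < STAT_PATH_MAX) {
      struct stat_entry *e = &c->entries[c->count++];
      strcpy(e->path, path);
      e->result = r;
      if (r == 0)
        e->st = *out;
    }
    return r;
  }

The linear scan keeps the sketch short; a real implementation would
more plausibly use a hash table keyed by path.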
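
The fast skipping of conditional blocks can be sketched as a precomputed
jump table over a token stream: each opening conditional records the
index of its matching close, so a false condition becomes a single jump.
This is a simplified model rather than the PTH on-disk format; it ignores
``#else`` and ``#elif``, and ``pp_kind``, ``build_skip_table``, and
``count_visited`` are hypothetical names.

.. code-block:: c

  #include <assert.h>
  #include <stddef.h>

  /* Simplified token stream: only conditional structure matters here. */
  enum pp_kind { PP_IF, PP_ENDIF, PP_OTHER };

  #define MAX_NESTING 32

  /* For each PP_IF at index i, store the index of its matching
     PP_ENDIF in skip[i]; every other entry is set to its own index.
     Returns 0 on success, or -1 if the blocks are unbalanced or
     nested more deeply than MAX_NESTING. */
  int build_skip_table(const enum pp_kind *toks, size_t n, size_t *skip) {
    assert(toks != NULL && skip != NULL);
    size_t stack[MAX_NESTING];
    size_t depth = 0;
    for (size_t i = 0; i < n; ++i) {
      skip[i] = i;
      if (toks[i] == PP_IF) {
        if (depth == MAX_NESTING)
          return -1;
        stack[depth++] = i;
      } else if (toks[i] == PP_ENDIF) {
        if (depth == 0)
          return -1;
        skip[stack[--depth]] = i;
      }
    }
    return depth == 0 ? 0 : -1;
  }

  /* Count the tokens visited when walking the stream. taken[i] is
     consulted only where toks[i] is PP_IF; when it is 0, the walk
     jumps to the matching PP_ENDIF without touching the block body. */
  size_t count_visited(const enum pp_kind *toks, size_t n,
                       const size_t *skip, const int *taken) {
    assert(toks != NULL && skip != NULL && taken != NULL);
    size_t visited = 0;
    for (size_t i = 0; i < n; ++i) {
      ++visited;
      if (toks[i] == PP_IF && !taken[i])
        i = skip[i]; /* the loop increment then steps past the PP_ENDIF */
    }
    return visited;
  }

Building the table costs one pass when the cache is generated; every
later use pays only for the blocks whose conditions are true.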