<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html> <head>
<title>Precompiled Headers (PCH)</title>
</head>

<body>

<!--#include virtual="../menu.html.incl"-->

<div id="content">

<h1>Precompiled Headers</h1>

<p>This document describes the design and implementation of Clang's
precompiled headers (PCH). If you are interested in the end-user
view, please see the <a
href="UsersManual.html#precompiledheaders">User's Manual</a>.</p>

<h2>Using precompiled headers with <tt>clang-cc</tt></h2>

<p>The low-level Clang compiler, <tt>clang-cc</tt>, supports two
command-line options for generating and using PCH files.</p>

<p>To generate PCH files using <tt>clang-cc</tt>, use the option
<b><tt>-emit-pch</tt></b>:</p>

<pre> $ clang-cc test.h -emit-pch -o test.h.pch </pre>

<p>This option is transparently used by <tt>clang</tt> when generating
PCH files. The resulting PCH file contains the serialized form of the
compiler's internal representation after it has completed parsing and
semantic analysis. The PCH file can then be used as a prefix header
with the <b><tt>-include-pch</tt></b> option:</p>

<pre>
$ clang-cc -include-pch test.h.pch test.c -o test.s
</pre>

<h2>PCH Design Philosophy</h2>

<p>Precompiled headers are meant to improve overall compile times for
projects, so the design of precompiled headers is entirely driven by
performance concerns. The use case for precompiled headers is
relatively simple: when there is a common set of headers that is
included in nearly every source file in the project, we
<i>precompile</i> that bundle of headers into a single precompiled
header (PCH file). Then, when compiling the source files in the
project, we load the PCH file first (as a prefix header), which acts
as a stand-in for that bundle of headers.</p>

<p>A precompiled header implementation improves performance when:</p>
<ul>
<li>Loading the PCH file is significantly faster than re-parsing the
bundle of headers stored within the PCH file. Thus, a precompiled
header design attempts to minimize the cost of reading the PCH
file. Ideally, this cost should not vary with the size of the
precompiled header file.</li>

<li>The cost of generating the PCH file initially is not so large
that it counters the per-source-file performance improvement due to
eliminating the need to parse the bundled headers in the first
place. This is particularly important on multi-core systems, because
PCH file generation serializes the build when all compilations
require the PCH file to be up-to-date.</li>
</ul>

<p>Clang's precompiled headers are designed with a compact on-disk
representation, which minimizes both PCH creation time and the time
required to initially load the PCH file. The PCH file itself contains
a serialized representation of Clang's abstract syntax trees and
supporting data structures, stored using the same compressed bitstream
as <a href="http://llvm.org/docs/BitCodeFormat.html">LLVM's bitcode
file format</a>.</p>

<p>Clang's precompiled headers are loaded "lazily" from disk. When a
PCH file is initially loaded, Clang reads only a small amount of data
from the PCH file to establish where certain important data structures
are stored. The amount of data read in this initial load is
independent of the size of the PCH file, such that a larger PCH file
does not lead to longer PCH load times. The actual header data in the
PCH file--macros, functions, variables, types, etc.--is loaded only
when it is referenced from the user's code, at which point only that
entity (and those entities it depends on) is deserialized from the
PCH file. With this approach, the cost of using a precompiled header
for a translation unit is proportional to the amount of code actually
used from the header, rather than being proportional to the size of
the header itself.</p> </body>

<h2>Precompiled Header Contents</h2>

<img src="PCHLayout.png" align="right" alt="Precompiled header layout">

<p>Clang's precompiled headers are organized into several different
blocks, each of which contains the serialized representation of a part
of Clang's internal representation. Each of the blocks corresponds to
either a block or a record within <a
href="http://llvm.org/docs/BitCodeFormat.html">LLVM's bitstream
format</a>. The contents of each of these logical blocks are described
below.</p>

<h3 name="metadata">Metadata Block</h3>

<p>The metadata block contains several records that provide
information about how the precompiled header was built. This metadata
is primarily used to validate the use of a precompiled header. For
example, a precompiled header built for x86 (32-bit) cannot be used
when compiling for x86-64 (64-bit). The metadata block contains
information about:</p>

<dl>
<dt>Language options</dt>
<dd>Describes the particular language dialect used to compile the
PCH file, including major options (e.g., Objective-C support) and more
minor options (e.g., support for "//" comments). The contents of this
record correspond to the <code>LangOptions</code> class.</dd>

<dt>Target architecture</dt>
<dd>The target triple that describes the architecture, platform, and
ABI for which the PCH file was generated, e.g.,
<code>i386-apple-darwin9</code>.</dd>

<dt>PCH version</dt>
<dd>The major and minor version numbers of the precompiled header
format. Changes in the minor version number should not affect backward
compatibility, while changes in the major version number imply that a
newer compiler cannot read an older precompiled header (and
vice-versa).</dd>

<dt>Original file name</dt>
<dd>The full path of the header that was used to generate the
precompiled header.</dd>

<dt>Predefines buffer</dt>
<dd>Although not explicitly stored as part of the metadata, the
predefines buffer is used in the validation of the precompiled header.
The predefines buffer itself contains code generated by the compiler
to initialize the preprocessor state according to the current target,
platform, and command-line options. For example, the predefines buffer
will contain "<code>#define __STDC__ 1</code>" when we are compiling C
without Microsoft extensions. The predefines buffer itself is stored
within the <a href="#sourcemgr">source manager block</a>, but its
contents are verified along with the rest of the metadata.</dd>

</dl>
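
<p>As a rough illustration of how these records might be used for
validation, the following C++ sketch compares the metadata recorded in
a PCH file against the settings of the current compilation. The
<code>PCHMetadata</code> structure, the <code>PCHStatus</code>
enumeration, and the <code>validatePCH</code> function are hypothetical
names invented for this example and are not Clang's actual interfaces;
the language options and the predefines buffer are reduced to plain
strings.</p>

```cpp
#include <string>

// Hypothetical, simplified view of the PCH metadata (not Clang's real types).
struct PCHMetadata {
  unsigned MajorVersion;
  unsigned MinorVersion;
  std::string TargetTriple; // e.g., "i386-apple-darwin9"
  std::string LangOptions;  // stand-in for the serialized LangOptions record
  std::string Predefines;   // contents of the predefines buffer
};

enum class PCHStatus {
  Ok,
  VersionMismatch,
  TargetMismatch,
  LangOptionsMismatch,
  PredefinesMismatch
};

// Returns the first mismatch found between the PCH file and the current
// compilation, or PCHStatus::Ok if the PCH file can be used.
inline PCHStatus validatePCH(const PCHMetadata &PCH,
                             const PCHMetadata &Current) {
  // Only the major version is compared: minor version changes are
  // expected to remain compatible.
  if (PCH.MajorVersion != Current.MajorVersion)
    return PCHStatus::VersionMismatch;
  if (PCH.TargetTriple != Current.TargetTriple)
    return PCHStatus::TargetMismatch;
  if (PCH.LangOptions != Current.LangOptions)
    return PCHStatus::LangOptionsMismatch;
  if (PCH.Predefines != Current.Predefines)
    return PCHStatus::PredefinesMismatch;
  return PCHStatus::Ok;
}
```

<p>Under these assumptions, a PCH file built for
<code>i386-apple-darwin9</code> would be rejected with
<code>PCHStatus::TargetMismatch</code> when the current compilation
targets a different triple, while a difference in the minor version
alone would be accepted.</p>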

<h3 name="sourcemgr">Source Manager Block</h3>

<p>The source manager block contains the serialized representation of
Clang's <a
href="InternalsManual.html#SourceLocation">SourceManager</a> class,
which handles the mapping from source locations (as represented in
Clang's abstract syntax tree) into actual column/line positions within
a source file or macro instantiation. The precompiled header's
representation of the source manager also includes information about
all of the headers that were (transitively) included when building the
precompiled header.</p>

<p>The bulk of the source manager block is dedicated to information
about the various files, buffers, and macro instantiations to which
a source location can refer. Each of these is referenced by a numeric
"file ID", which is a unique number (allocated starting at 1) stored
in the source location. Clang serializes the information for each kind
of file ID, along with an index that maps file IDs to the position
within the PCH file where the information about that file ID is
stored. The data associated with a file ID is loaded only when
required by the front end, e.g., to emit a diagnostic that includes a
macro instantiation history inside the header itself.</p>

<p>The source manager block also contains information about all of the
headers that were included when building the precompiled header. This
includes information about the controlling macro for the header (e.g.,
when the preprocessor identified that the contents of the header
depend on a macro like <code>LLVM_CLANG_SOURCEMANAGER_H</code>)
along with a cached version of the results of the <code>stat()</code>
system calls performed when building the precompiled header. The
latter is particularly useful in reducing system time when searching
for include files.</p>

<h3 name="preprocessor">Preprocessor Block</h3>

<p>The preprocessor block contains the serialized representation of
the preprocessor. Specifically, it contains all of the macros that
have been defined by the end of the header used to build the
precompiled header, along with the token sequences that make up each
macro. The macro definitions are only read from the PCH file when the
name of the macro first occurs in the program. This lazy loading of
macro definitions is triggered by lookups into the <a
href="#idtable">identifier table</a>.</p>

<h3 name="types">Types Block</h3>

<p>The types block contains the serialized representation of all of
the types referenced in the translation unit. Each Clang type node
(<code>PointerType</code>, <code>FunctionProtoType</code>, etc.) has a
corresponding record type in the PCH file. When types are deserialized
from the precompiled header, the data within the record is used to
reconstruct the appropriate type node using the AST context.</p>

<p>Each type has a type ID, an integer that uniquely identifies that
type. Type ID 0 represents the NULL type, type IDs
less than <code>NUM_PREDEF_TYPE_IDS</code> represent predefined types
(<code>void</code>, <code>float</code>, etc.), while other
"user-defined" type IDs are assigned consecutively from
<code>NUM_PREDEF_TYPE_IDS</code> upward as the types are encountered.
The PCH file has an associated mapping from the user-defined type
IDs to the location within the types block where the serialized
representation of that type resides, enabling lazy deserialization of
types. When a type is referenced from within the PCH file, that
reference is encoded using the type ID shifted left by 3 bits. The
lower three bits are used to represent the <code>const</code>,
<code>volatile</code>, and <code>restrict</code> qualifiers, as in
Clang's <a
href="http://clang.llvm.org/docs/InternalsManual.html#Type">QualType</a>
class.</p>
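
<p>The encoding of a type reference can be sketched in a few lines of
C++. This is a minimal illustration rather than Clang's code: the
function and constant names are hypothetical, and the particular bit
assigned to each qualifier is an assumption made for the example.</p>

```cpp
#include <cassert>
#include <cstdint>

// Qualifier bits stored in the low three bits of an encoded type
// reference. The bit assigned to each qualifier is an assumption made
// for this sketch.
constexpr std::uint32_t QualConst = 0x1;
constexpr std::uint32_t QualRestrict = 0x2;
constexpr std::uint32_t QualVolatile = 0x4;
constexpr std::uint32_t QualMask = 0x7;
constexpr unsigned NumQualBits = 3;

struct DecodedTypeRef {
  std::uint32_t TypeID;
  std::uint32_t Quals;
};

// Packs a type ID and its qualifiers into a single integer: the type ID
// occupies the upper bits and the qualifiers the lower three bits.
inline std::uint32_t encodeTypeRef(std::uint32_t TypeID, std::uint32_t Quals) {
  assert((Quals & ~QualMask) == 0 && "only three qualifier bits exist");
  assert(TypeID <= (UINT32_MAX >> NumQualBits) && "type ID too large");
  return (TypeID << NumQualBits) | Quals;
}

// Splits an encoded reference back into its type ID and qualifiers.
inline DecodedTypeRef decodeTypeRef(std::uint32_t Encoded) {
  return DecodedTypeRef{Encoded >> NumQualBits, Encoded & QualMask};
}
```

<p>With these assumed bit assignments, a <code>const volatile</code>
reference to the type with ID 5 would be encoded as
<code>(5 &lt;&lt; 3) | 0x5</code>, i.e., 45, and decoding 45 yields
type ID 5 with the same two qualifiers.</p>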

<h3 name="decls">Declarations Block</h3>

<p>The declarations block contains the serialized representation of
all of the declarations referenced in the translation unit. Each Clang
declaration node (<code>VarDecl</code>, <code>FunctionDecl</code>,
etc.) has a corresponding record type in the PCH file. When
declarations are deserialized from the precompiled header, the data
within the record is used to build and populate a new instance of the
corresponding <code>Decl</code> node. As with types, each declaration
node has a numeric ID that is used to refer to that declaration within
the PCH file. In addition, a lookup table provides a mapping from that
numeric ID to the offset within the precompiled header where that
declaration is described.</p>
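
<p>The ID-to-offset lookup table is what makes lazy deserialization
possible. The following C++ sketch models the idea under simplifying
assumptions: the "PCH file" is an in-memory buffer of NUL-terminated
names, a declaration is reduced to its name, and declaration IDs start
at 1 with 0 meaning "no declaration". The <code>LazyDeclReader</code>
class and its members are hypothetical and do not correspond to Clang's
actual reader.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a declaration node; it only records a name.
struct Decl {
  std::string Name;
};

class LazyDeclReader {
public:
  // Buffer holds NUL-terminated names (the "serialized" declarations);
  // Offsets[ID - 1] is the byte offset of the record for declaration ID.
  // Offsets are assumed to be valid positions within Buffer.
  LazyDeclReader(std::string Buffer, std::vector<std::size_t> Offsets)
      : Data(std::move(Buffer)), Index(std::move(Offsets)),
        Cache(Index.size()) {}

  // Returns the declaration with the given ID, deserializing it on first
  // use. Returns nullptr for ID 0 or an ID outside the index.
  const Decl *getDecl(std::uint32_t ID) {
    if (ID == 0 || ID > Index.size())
      return nullptr;
    std::unique_ptr<Decl> &Slot = Cache[ID - 1];
    if (!Slot) {
      // First reference: decode the record found at the indexed offset.
      Slot.reset(new Decl{std::string(Data.c_str() + Index[ID - 1])});
      ++NumDeserialized;
    }
    return Slot.get();
  }

  std::size_t numDeserialized() const { return NumDeserialized; }

private:
  std::string Data;
  std::vector<std::size_t> Index;
  std::vector<std::unique_ptr<Decl>> Cache;
  std::size_t NumDeserialized = 0;
};
```

<p>In this model, constructing the reader does no decoding at all, and
asking twice for the same ID decodes the record only once, so the work
done tracks the declarations actually used rather than the size of the
buffer.</p>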

<p>Declarations in Clang's abstract syntax trees are stored
hierarchically. At the top of the hierarchy is the translation unit
(<code>TranslationUnitDecl</code>), which contains all of the
declarations in the translation unit. These declarations---such as
functions or struct types---may also contain other declarations inside
them, and so on. Within Clang, each declaration is stored within a <a
href="http://clang.llvm.org/docs/InternalsManual.html#DeclContext">declaration
context</a>, as represented by the <code>DeclContext</code> class.
Declaration contexts provide the mechanism to perform name lookup
within a given declaration (e.g., find the member named <code>x</code>
in a structure) and iterate over the declarations stored within a
context (e.g., iterate over all of the fields of a structure for
structure layout).</p>

<p>In Clang's precompiled header format, deserializing a declaration
that is a <code>DeclContext</code> is a separate operation from
deserializing all of the declarations stored within that declaration
context. Therefore, Clang will deserialize the translation unit
declaration without deserializing the declarations within that
translation unit. When required, the declarations stored within a
declaration context will be deserialized. There are two representations
of the declarations within a declaration context, which correspond to
the name-lookup and iteration behavior described above:</p>

<ul>
<li>When the front end performs name lookup to find a name
<code>x</code> within a given declaration context (for example,
during semantic analysis of the expression <code>p-&gt;x</code>,
where <code>p</code>'s type is defined in the precompiled header),
Clang deserializes a hash table mapping from the names within that
declaration context to the declaration IDs that represent each
visible declaration with that name. The entire hash table is
deserialized at this point (into the <code>llvm::DenseMap</code>
stored within each <code>DeclContext</code> object), but the actual
declarations are not yet deserialized. In a second step, those
declarations with the name <code>x</code> will be deserialized and
will be used as the result of name lookup.</li>

<li>When the front end performs iteration over all of the
declarations within a declaration context, all of those declarations
are immediately deserialized. For large declaration contexts (e.g.,
the translation unit), this operation is expensive; however, large
declaration contexts are not traversed in normal compilation, since
such a traversal is unnecessary. The code generator and semantic
analysis do commonly traverse declaration contexts for
structs, classes, unions, and enumerations, but those contexts
contain relatively few declarations in the common case.</li>
</ul>
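
<p>The difference between the two access patterns can be sketched in
C++ as follows. This is an illustrative model only: the
<code>LazyDeclContext</code> class is hypothetical, a
<code>std::unordered_map</code> stands in for the
<code>llvm::DenseMap</code> mentioned above, and "deserializing" a
declaration is modeled by recording its ID in a set.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using DeclID = std::uint32_t;

// Hypothetical model of a declaration context backed by a PCH file.
class LazyDeclContext {
public:
  // Stored is the "on-disk" list of (name, declaration ID) pairs.
  explicit LazyDeclContext(std::vector<std::pair<std::string, DeclID>> Stored)
      : OnDisk(std::move(Stored)) {}

  // Name lookup: step 1 loads the whole name table on first use, step 2
  // deserializes only the declarations that match Name.
  std::vector<DeclID> lookup(const std::string &Name) {
    if (!TableLoaded) {
      for (const auto &Entry : OnDisk)
        Table[Entry.first].push_back(Entry.second);
      TableLoaded = true;
    }
    auto Found = Table.find(Name);
    if (Found == Table.end())
      return {};
    for (DeclID ID : Found->second)
      Deserialized.insert(ID);
    return Found->second;
  }

  // Iteration: every declaration in the context is deserialized at once.
  std::vector<DeclID> decls() {
    std::vector<DeclID> All;
    for (const auto &Entry : OnDisk) {
      Deserialized.insert(Entry.second);
      All.push_back(Entry.second);
    }
    return All;
  }

  std::size_t numDeserialized() const { return Deserialized.size(); }

private:
  std::vector<std::pair<std::string, DeclID>> OnDisk;
  std::unordered_map<std::string, std::vector<DeclID>> Table;
  bool TableLoaded = false;
  std::set<DeclID> Deserialized;
};
```

<p>In this model, looking up <code>x</code> in a context that also
contains <code>y</code> and <code>z</code> deserializes one
declaration, whereas a call to <code>decls()</code> deserializes all
three.</p>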

<h3 name="stmt">Statements and Expressions</h3>

<p>Statements and expressions are stored in the precompiled header in
both the <a href="#types">types</a> and the <a
href="#decls">declarations</a> blocks, because every statement or
expression will be associated with either a type or declaration. The
actual statement and expression records are stored immediately
following the declaration or type that owns the statement or
expression. For example, the statement representing the body of a
function will be stored directly following the declaration of the
function.</p>

<p>As with types and declarations, each statement and expression kind
in Clang's abstract syntax tree (<code>ForStmt</code>,
<code>CallExpr</code>, etc.) has a corresponding record type in the
precompiled header, which contains the serialized representation of
that statement or expression.</p>

<h3 name="idtable">Identifier Table Block</h3>

<p>The identifier table block contains an on-disk hash table that maps
each identifier mentioned within the precompiled header to the
serialized representation of the identifier's information (e.g., the
<code>IdentifierInfo</code> structure). The serialized representation
contains:</p>

<ul>
<li>The actual identifier string.</li>
<li>Flags that describe whether this identifier is the name of a
built-in, a poisoned identifier, an extension token, or a
macro.</li>
<li>If the identifier names a macro, the offset of the macro
definition within the <a href="#preprocessor">preprocessor
block</a>.</li>
<li>If the identifier names one or more declarations visible from
translation unit scope, the <a href="#decls">declaration IDs</a> of these
declarations.</li>
</ul>

<p>When a precompiled header is loaded, the precompiled header
mechanism introduces itself into the identifier table as an external
lookup source. Thus, when the user program refers to an identifier
that has not yet been seen, Clang will perform a lookup into the
precompiled header's identifier table. If an identifier is found, its
contents---macro definitions, flags, top-level declarations,
etc.---will be deserialized, at which point the corresponding
<code>IdentifierInfo</code> structure will have the same contents it
would have after parsing the headers in the precompiled header.</p>

<p>Within the PCH file, the identifiers used to name declarations are
represented with an integral value. A separate table provides a
mapping from this integral value (the identifier ID) to the location
within the on-disk
hash table where that identifier is stored. This mapping is used when
deserializing the name of a declaration, the identifier of a token, or
any other construct in the PCH file that refers to a name.</p>
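
<p>The external lookup mechanism can be illustrated with a small C++
sketch. All of the names below (<code>ExternalIdentifierSource</code>,
<code>PCHIdentifierSource</code>, and the reduced
<code>IdentifierInfo</code> and <code>IdentifierTable</code>) are
hypothetical simplifications chosen for this example, and an in-memory
map stands in for the on-disk hash table; Clang's real classes differ
in detail.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, reduced identifier record.
struct IdentifierInfo {
  bool IsMacro = false;
  std::vector<std::uint32_t> DeclIDs;
};

// Interface through which the identifier table consults an external source.
class ExternalIdentifierSource {
public:
  virtual ~ExternalIdentifierSource() = default;
  // Returns true and fills Info if Name is known to the external source.
  virtual bool find(const std::string &Name, IdentifierInfo &Info) = 0;
};

// Stand-in for the on-disk hash table stored in the PCH file.
class PCHIdentifierSource : public ExternalIdentifierSource {
public:
  explicit PCHIdentifierSource(
      std::unordered_map<std::string, IdentifierInfo> Entries)
      : OnDisk(std::move(Entries)) {}

  bool find(const std::string &Name, IdentifierInfo &Info) override {
    ++NumLookups;
    auto Found = OnDisk.find(Name);
    if (Found == OnDisk.end())
      return false;
    Info = Found->second;
    return true;
  }

  std::size_t numLookups() const { return NumLookups; }

private:
  std::unordered_map<std::string, IdentifierInfo> OnDisk;
  std::size_t NumLookups = 0;
};

class IdentifierTable {
public:
  void setExternalSource(ExternalIdentifierSource *Source) {
    External = Source;
  }

  // Returns the record for Name. The external source is consulted only
  // the first time an identifier is seen; later requests hit the table.
  IdentifierInfo &get(const std::string &Name) {
    auto Found = Table.find(Name);
    if (Found != Table.end())
      return Found->second;
    IdentifierInfo Info;
    if (External)
      External->find(Name, Info);
    return Table.emplace(Name, std::move(Info)).first->second;
  }

private:
  std::unordered_map<std::string, IdentifierInfo> Table;
  ExternalIdentifierSource *External = nullptr;
};
```

<p>In this sketch, an identifier that appears in the precompiled
header is populated from the external source on first use, and an
identifier that is new to the program simply receives an empty
record.</p>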

<h3 name="method-pool">Method Pool Block</h3>

<p>The method pool block is represented as an on-disk hash table that
serves two purposes: it provides a mapping from the names of
Objective-C selectors to the set of Objective-C instance and class
methods that have that particular selector (which is required for
semantic analysis in Objective-C) and also stores all of the selectors
used by entities within the precompiled header. The design of the
method pool is similar to that of the <a href="#idtable">identifier
table</a>: the first time a particular selector is formed during the
compilation of the program, Clang will search in the on-disk hash
table of selectors; if found, Clang will read the Objective-C methods
associated with that selector into the appropriate front-end data
structure (<code>Sema::InstanceMethodPool</code> and
<code>Sema::FactoryMethodPool</code> for instance and class methods,
respectively).</p>

<p>As with identifiers, selectors are represented by numeric values
within the PCH file. A separate index maps these numeric selector
values to the offset of the selector within the on-disk hash table,
and will be used when deserializing an Objective-C method declaration
(or other Objective-C construct) that refers to the selector.</p>

</div>

</html>