<html>
<head>
  <title>Precompiled Headers (PCH)</title>
  <link type="text/css" rel="stylesheet" href="../menu.css" />
  <link type="text/css" rel="stylesheet" href="../content.css" />
  <style type="text/css">
    td {
      vertical-align: top;
    }
  </style>
</head>

<body>

<!--#include virtual="../menu.html.incl"-->

<div id="content">

<h1>Precompiled Headers</h1>

  <p>This document describes the design and implementation of Clang's
  precompiled headers (PCH). If you are interested in the end-user
  view, please see the <a
   href="UsersManual.html#precompiledheaders">User's Manual</a>.</p>

  <p><b>Table of Contents</b></p>
  <ul>
    <li><a href="#usage">Using Precompiled Headers with
      <tt>clang</tt></a></li>
    <li><a href="#philosophy">Design Philosophy</a></li>
    <li><a href="#contents">Precompiled Header Contents</a>
      <ul>
        <li><a href="#metadata">Metadata Block</a></li>
        <li><a href="#sourcemgr">Source Manager Block</a></li>
        <li><a href="#preprocessor">Preprocessor Block</a></li>
        <li><a href="#types">Types Block</a></li>
        <li><a href="#decls">Declarations Block</a></li>
        <li><a href="#stmt">Statements and Expressions</a></li>
        <li><a href="#idtable">Identifier Table Block</a></li>
        <li><a href="#method-pool">Method Pool Block</a></li>
      </ul>
    </li>
    <li><a href="#tendrils">Precompiled Header Integration
    Points</a></li>
</ul>

<h2 id="usage">Using Precompiled Headers with <tt>clang</tt></h2>

<p>The Clang compiler frontend, <tt>clang -cc1</tt>, supports two command-line
options for generating and using PCH files.</p>

<p>To generate PCH files using <tt>clang -cc1</tt>, use the option
<b><tt>-emit-pch</tt></b>:</p>

<pre> $ clang -cc1 test.h -emit-pch -o test.h.pch </pre>

<p>This option is transparently used by <tt>clang</tt> when generating
PCH files. The resulting PCH file contains the serialized form of the
compiler's internal representation after it has completed parsing and
semantic analysis. The PCH file can then be used as a prefix header
with the <b><tt>-include-pch</tt></b> option:</p>

<pre>
  $ clang -cc1 -include-pch test.h.pch test.c -o test.s
</pre>

<h2 id="philosophy">Design Philosophy</h2>

<p>Precompiled headers are meant to improve overall compile times for
  projects, so the design of precompiled headers is entirely driven by
  performance concerns. The use case for precompiled headers is
  relatively simple: when there is a common set of headers that is
  included in nearly every source file in the project, we
  <i>precompile</i> that bundle of headers into a single precompiled
  header (PCH file). Then, when compiling the source files in the
  project, we load the PCH file first (as a prefix header), which acts
  as a stand-in for that bundle of headers.</p>

<p>A precompiled header implementation improves performance when:</p>
<ul>
  <li>Loading the PCH file is significantly faster than re-parsing the
  bundle of headers stored within the PCH file. Thus, a precompiled
  header design attempts to minimize the cost of reading the PCH
  file. Ideally, this cost should not vary with the size of the
  precompiled header file.</li>

  <li>The cost of generating the PCH file initially is not so large
  that it counters the per-source-file performance improvement due to
  eliminating the need to parse the bundled headers in the first
  place. This is particularly important on multi-core systems, because
  PCH file generation serializes the build when all compilations
  require the PCH file to be up-to-date.</li>
</ul>

<p>Clang's precompiled headers are designed with a compact on-disk
representation, which minimizes both PCH creation time and the time
required to initially load the PCH file. The PCH file itself contains
a serialized representation of Clang's abstract syntax trees and
supporting data structures, stored using the same compressed bitstream
as <a href="http://llvm.org/docs/BitCodeFormat.html">LLVM's bitcode
file format</a>.</p>

<p>Clang's precompiled headers are loaded "lazily" from disk. When a
PCH file is initially loaded, Clang reads only a small amount of data
from the PCH file to establish where certain important data structures
are stored. The amount of data read in this initial load is
independent of the size of the PCH file, such that a larger PCH file
does not lead to longer PCH load times. The actual header data in the
PCH file--macros, functions, variables, types, etc.--is loaded only
when it is referenced from the user's code, at which point only that
entity (and those entities it depends on) are deserialized from the
PCH file. With this approach, the cost of using a precompiled header
for a translation unit is proportional to the amount of code actually
used from the header, rather than being proportional to the size of
the header itself.</p>
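
<p>The lazy-loading idea can be illustrated with a minimal Python sketch.
It is not Clang's reader and does not use the real PCH layout: the toy file
format (an entity count, a fixed-size offset index, then payloads) and the
names <code>write_blob</code> and <code>LazyReader</code> are assumptions
made only for this illustration. The initial load reads just the four-byte
header, and each entity is fetched and cached the first time it is
requested.</p>

<pre>
import io
import struct


def write_blob(entities):
    """Build a toy file: entity count, fixed-size offset index, then payloads."""
    header_size = 4 + 4 * len(entities)
    offsets = []
    body = b""
    for payload in entities:
        offsets.append(header_size + len(body))
        body += struct.pack("!I", len(payload)) + payload
    index = b"".join(struct.pack("!I", offset) for offset in offsets)
    return struct.pack("!I", len(entities)) + index + body


class LazyReader:
    """Hypothetical reader that deserializes entities only on first use."""

    def __init__(self, blob):
        self.stream = io.BytesIO(blob)
        # Initial load: read only the fixed-size header, whatever the file size.
        (self.count,) = struct.unpack("!I", self.stream.read(4))
        self.cache = {}

    def get(self, index):
        if index not in self.cache:
            # Find the payload through the index, then read just that payload.
            self.stream.seek(4 + 4 * index)
            (offset,) = struct.unpack("!I", self.stream.read(4))
            self.stream.seek(offset)
            (size,) = struct.unpack("!I", self.stream.read(4))
            self.cache[index] = self.stream.read(size)
        return self.cache[index]
</pre>

<p>In this sketch, asking for one entity out of many touches only that
entity's index slot and payload, which mirrors the cost model described
above.</p>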

<p>When given the <code>-print-stats</code> option, Clang produces
statistics describing how much of the precompiled header was actually
loaded from disk. For a simple "Hello, World!" program that includes
the Apple <code>Cocoa.h</code> header (which is built as a precompiled
header), this option illustrates how little of the actual precompiled
header is required:</p>

<pre>
*** PCH Statistics:
  933 stat cache hits
  4 stat cache misses
  895/39981 source location entries read (2.238563%)
  19/15315 types read (0.124061%)
  20/82685 declarations read (0.024188%)
  154/58070 identifiers read (0.265197%)
  0/7260 selectors read (0.000000%)
  0/30842 statements read (0.000000%)
  4/8400 macros read (0.047619%)
  1/4995 lexical declcontexts read (0.020020%)
  0/4413 visible declcontexts read (0.000000%)
  0/7230 method pool entries read (0.000000%)
  0 method pool misses
</pre>

<p>For this small program, only a tiny fraction of the source
locations, types, declarations, identifiers, and macros were actually
deserialized from the precompiled header. These statistics can be
useful to determine whether the precompiled header implementation can
be improved by making more of the implementation lazy.</p>

<p>Precompiled headers can be chained. When you create a PCH while
including an existing PCH, Clang can create the new PCH by referencing
the original file and only writing the new data to the new file. For
example, you could create a PCH out of all the headers that are very
commonly used throughout your project, and then create a PCH for every
single source file in the project that includes the code that is
specific to that file, so that recompiling the file itself is very fast,
without duplicating the data from the common headers for every file.</p>

<h2 id="contents">Precompiled Header Contents</h2>

<img src="PCHLayout.png" align="right" alt="Precompiled header layout">

<p>Clang's precompiled headers are organized into several different
blocks, each of which contains the serialized representation of a part
of Clang's internal representation. Each of the blocks corresponds to
either a block or a record within <a
 href="http://llvm.org/docs/BitCodeFormat.html">LLVM's bitstream
format</a>. The contents of each of these logical blocks are described
below.</p>

<p>For a given precompiled header, the <a
href="http://llvm.org/cmds/llvm-bcanalyzer.html"><code>llvm-bcanalyzer</code></a>
utility can be used to examine the actual structure of the bitstream
for the precompiled header. This information can be used both to help
understand the structure of the precompiled header and to isolate
areas where precompiled headers can still be optimized, e.g., through
the introduction of abbreviations.</p>

<h3 id="metadata">Metadata Block</h3>

<p>The metadata block contains several records that provide
information about how the precompiled header was built. This metadata
is primarily used to validate the use of a precompiled header. For
example, a precompiled header built for a 32-bit x86 target cannot be used
when compiling for a 64-bit x86 target. The metadata block contains
information about:</p>

<dl>
  <dt>Language options</dt>
  <dd>Describes the particular language dialect used to compile the
PCH file, including major options (e.g., Objective-C support) and more
minor options (e.g., support for "//" comments). The contents of this
record correspond to the <code>LangOptions</code> class.</dd>

  <dt>Target architecture</dt>
  <dd>The target triple that describes the architecture, platform, and
ABI for which the PCH file was generated, e.g.,
<code>i386-apple-darwin9</code>.</dd>

  <dt>PCH version</dt>
  <dd>The major and minor version numbers of the precompiled header
format. Changes in the minor version number should not affect backward
compatibility, while changes in the major version number imply that a
newer compiler cannot read an older precompiled header (and
vice-versa).</dd>

  <dt>Original file name</dt>
  <dd>The full path of the header that was used to generate the
precompiled header.</dd>

  <dt>Predefines buffer</dt>
  <dd>Although not explicitly stored as part of the metadata, the
predefines buffer is used in the validation of the precompiled header.
The predefines buffer itself contains code generated by the compiler
to initialize the preprocessor state according to the current target,
platform, and command-line options. For example, the predefines buffer
will contain "<code>#define __STDC__ 1</code>" when we are compiling C
without Microsoft extensions. The predefines buffer itself is stored
within the <a href="#sourcemgr">source manager block</a>, but its
contents are verified along with the rest of the metadata.</dd>

</dl>
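
<p>As a rough illustration of how such metadata could drive validation,
the following Python sketch compares the metadata recorded in a PCH file
against the settings of the current compilation. The
<code>PCHMetadata</code> record, its field names, and the error strings are
hypothetical and chosen only for this example; they are not Clang's data
structures or diagnostics.</p>

<pre>
from dataclasses import dataclass


@dataclass(frozen=True)
class PCHMetadata:
    # Hypothetical record; the field names are illustrative, not Clang's.
    language_options: tuple   # e.g. (("objc", False),)
    target_triple: str        # e.g. "i386-apple-darwin9"
    version_major: int
    version_minor: int
    predefines: str           # text of the predefines buffer


def validation_errors(pch, current):
    """List the reasons a PCH cannot be used; an empty list means usable."""
    errors = []
    # Only the major version is compared: per the text, minor version
    # changes should not affect backward compatibility.
    if pch.version_major != current.version_major:
        errors.append("PCH format version mismatch")
    if pch.target_triple != current.target_triple:
        errors.append("target triple mismatch")
    if pch.language_options != current.language_options:
        errors.append("language options mismatch")
    if pch.predefines != current.predefines:
        errors.append("predefines buffer mismatch")
    return errors
</pre>

<p>With this sketch, a PCH recorded for <code>i386-apple-darwin9</code>
and checked against a 64-bit x86 triple yields a single "target triple
mismatch" entry, matching the 32-bit versus 64-bit example above.</p>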

<p>A chained PCH file (that is, one that references another PCH) has
a slightly different metadata block, which contains the following
information:</p>

<dl>
  <dt>Referenced file</dt>
  <dd>The name of the referenced PCH file. It is looked up like a file
specified using <tt>-include-pch</tt>.</dd>

  <dt>PCH version</dt>
  <dd>This is the same as in normal PCH files.</dd>

  <dt>Original file name</dt>
  <dd>The full path of the header that was used to generate this
precompiled header.</dd>

</dl>

<p>The language options, target architecture, and predefines buffer data
are taken from the end of the chain, since they have to match anyway.</p>

<h3 id="sourcemgr">Source Manager Block</h3>

<p>The source manager block contains the serialized representation of
Clang's <a
 href="InternalsManual.html#SourceLocation">SourceManager</a> class,
which handles the mapping from source locations (as represented in
Clang's abstract syntax tree) into actual column/line positions within
a source file or macro instantiation. The precompiled header's
representation of the source manager also includes information about
all of the headers that were (transitively) included when building the
precompiled header.</p>

<p>The bulk of the source manager block is dedicated to information
about the various files, buffers, and macro instantiations into which
a source location can refer. Each of these is referenced by a numeric
"file ID", which is a unique number (allocated starting at 1) stored
in the source location. Clang serializes the information for each kind
of file ID, along with an index that maps file IDs to the position
within the PCH file where the information about that file ID is
stored. The data associated with a file ID is loaded only when
required by the front end, e.g., to emit a diagnostic that includes a
macro instantiation history inside the header itself.</p>

<p>The source manager block also contains information about all of the
headers that were included when building the precompiled header. This
includes information about the controlling macro for the header (e.g.,
when the preprocessor identified that the contents of the header
depend on a macro like <code>LLVM_CLANG_SOURCEMANAGER_H</code>)
along with a cached version of the results of the <code>stat()</code>
system calls performed when building the precompiled header. The
latter is particularly useful in reducing system time when searching
for include files.</p>

<h3 id="preprocessor">Preprocessor Block</h3>

<p>The preprocessor block contains the serialized representation of
the preprocessor. Specifically, it contains all of the macros that
have been defined by the end of the header used to build the
precompiled header, along with the token sequences that comprise each
macro. The macro definitions are only read from the PCH file when the
name of the macro first occurs in the program. This lazy loading of
macro definitions is triggered by lookups into the <a
 href="#idtable">identifier table</a>.</p>

<h3 id="types">Types Block</h3>

<p>The types block contains the serialized representation of all of
the types referenced in the translation unit. Each Clang type node
(<code>PointerType</code>, <code>FunctionProtoType</code>, etc.) has a
corresponding record type in the PCH file. When types are deserialized
from the precompiled header, the data within the record is used to
reconstruct the appropriate type node using the AST context.</p>

<p>Each type has a unique type ID, which is an integer that uniquely
identifies that type. Type ID 0 represents the NULL type, type IDs
less than <code>NUM_PREDEF_TYPE_IDS</code> represent predefined types
(<code>void</code>, <code>float</code>, etc.), while other
"user-defined" type IDs are assigned consecutively from
<code>NUM_PREDEF_TYPE_IDS</code> upward as the types are encountered.
The PCH file has an associated mapping from the user-defined type
IDs to the location within the types block where the serialized
representation of that type resides, enabling lazy deserialization of
types. When a type is referenced from within the PCH file, that
reference is encoded using the type ID shifted left by 3 bits. The
lower three bits are used to represent the <code>const</code>,
<code>volatile</code>, and <code>restrict</code> qualifiers, as in
Clang's <a
 href="http://clang.llvm.org/docs/InternalsManual.html#Type">QualType</a>
class.</p>
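
<p>The type-reference encoding can be sketched in a few lines of Python.
The helper names and, in particular, the assignment of individual bits to
qualifiers are assumptions made for illustration; the text above only
states that the lower three bits hold the qualifiers and that the type ID
occupies the remaining bits.</p>

<pre>
# Assumed bit assignment for the three qualifiers (illustrative only).
QUALIFIER_BITS = {"const": 1, "restrict": 2, "volatile": 4}


def encode_type_ref(type_id, qualifiers=()):
    """Pack a type ID and its qualifiers into one integer reference."""
    # Multiplying by 8 is the same as shifting left by 3 bits.
    return type_id * 8 + sum(QUALIFIER_BITS[q] for q in set(qualifiers))


def decode_type_ref(ref):
    """Split a reference back into (type ID, set of qualifier names)."""
    type_id, bits = divmod(ref, 8)
    qualifiers = {
        name for name, bit in QUALIFIER_BITS.items() if (bits // bit) % 2 == 1
    }
    return type_id, qualifiers
</pre>

<p>For example, under these assumptions a <code>const volatile</code>
reference to type ID 42 encodes as 42 * 8 + 1 + 4 = 341, and decoding 341
recovers the ID and both qualifiers.</p>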

<h3 id="decls">Declarations Block</h3>

<p>The declarations block contains the serialized representation of
all of the declarations referenced in the translation unit. Each Clang
declaration node (<code>VarDecl</code>, <code>FunctionDecl</code>,
etc.) has a corresponding record type in the PCH file. When
declarations are deserialized from the precompiled header, the data
within the record is used to build and populate a new instance of the
corresponding <code>Decl</code> node. As with types, each declaration
node has a numeric ID that is used to refer to that declaration within
the PCH file. In addition, a lookup table provides a mapping from that
numeric ID to the offset within the precompiled header where that
declaration is described.</p>

<p>Declarations in Clang's abstract syntax trees are stored
hierarchically. At the top of the hierarchy is the translation unit
(<code>TranslationUnitDecl</code>), which contains all of the
declarations in the translation unit. These declarations (such as
functions or struct types) may also contain other declarations inside
them, and so on. Within Clang, each declaration is stored within a <a
href="http://clang.llvm.org/docs/InternalsManual.html#DeclContext">declaration
context</a>, as represented by the <code>DeclContext</code> class.
Declaration contexts provide the mechanism to perform name lookup
within a given declaration (e.g., find the member named <code>x</code>
in a structure) and iterate over the declarations stored within a
context (e.g., iterate over all of the fields of a structure for
structure layout).</p>

<p>In Clang's precompiled header format, deserializing a declaration
that is a <code>DeclContext</code> is a separate operation from
deserializing all of the declarations stored within that declaration
context. Therefore, Clang will deserialize the translation unit
declaration without deserializing the declarations within that
translation unit. When required, the declarations stored within a
declaration context will be deserialized. There are two representations
of the declarations within a declaration context, which correspond to
the name-lookup and iteration behavior described above:</p>

<ul>
  <li>When the front end performs name lookup to find a name
  <code>x</code> within a given declaration context (for example,
  during semantic analysis of the expression <code>p-&gt;x</code>,
  where <code>p</code>'s type is defined in the precompiled header),
  Clang deserializes a hash table mapping from the names within that
  declaration context to the declaration IDs that represent each
  visible declaration with that name. The entire hash table is
  deserialized at this point (into the <code>llvm::DenseMap</code>
  stored within each <code>DeclContext</code> object), but the actual
  declarations are not yet deserialized. In a second step, those
  declarations with the name <code>x</code> will be deserialized and
  will be used as the result of name lookup.</li>

  <li>When the front end performs iteration over all of the
  declarations within a declaration context, all of those declarations
  are immediately deserialized. For large declaration contexts (e.g.,
  the translation unit), this operation is expensive; however, large
  declaration contexts are not traversed in normal compilation, since
  such a traversal is unnecessary. The code generator and semantic
  analysis do commonly traverse declaration contexts for structs,
  classes, unions, and enumerations, but those contexts contain
  relatively few declarations in the common case.</li>
</ul>
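
<p>The two access paths can be modeled with a small Python sketch. The
class <code>LazyDeclContext</code> and its members are hypothetical and
stand in for the behavior described in the list above; Clang's actual
<code>DeclContext</code> is a C++ class with a different interface.</p>

<pre>
class LazyDeclContext:
    """Toy model of a declaration context backed by serialized data."""

    def __init__(self, stored_names, stored_decl_ids, load_decl):
        self._stored_names = stored_names        # name to list of declaration IDs
        self._stored_decl_ids = stored_decl_ids  # all declaration IDs, in order
        self._load_decl = load_decl              # callable: ID to declaration
        self.lookup_table = None                 # in-memory copy of the name table
        self.loaded = {}                         # declaration ID to declaration

    def _decl(self, decl_id):
        if decl_id not in self.loaded:
            self.loaded[decl_id] = self._load_decl(decl_id)
        return self.loaded[decl_id]

    def lookup(self, name):
        # Step 1: bring in the entire name table, but no declarations yet.
        if self.lookup_table is None:
            self.lookup_table = dict(self._stored_names)
        # Step 2: deserialize only the declarations that carry this name.
        return [self._decl(i) for i in self.lookup_table.get(name, [])]

    def all_decls(self):
        # Iteration deserializes every declaration in the context.
        return [self._decl(i) for i in self._stored_decl_ids]
</pre>

<p>In this model, looking up <code>x</code> loads the whole name table but
only the declarations named <code>x</code>, while <code>all_decls</code>
forces everything in the context to be loaded.</p>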

<h3 id="stmt">Statements and Expressions</h3>

<p>Statements and expressions are stored in the precompiled header in
both the <a href="#types">types</a> and the <a
 href="#decls">declarations</a> blocks, because every statement or
expression will be associated with either a type or declaration. The
actual statement and expression records are stored immediately
following the declaration or type that owns the statement or
expression. For example, the statement representing the body of a
function will be stored directly following the declaration of the
function.</p>

<p>As with types and declarations, each statement and expression kind
in Clang's abstract syntax tree (<code>ForStmt</code>,
<code>CallExpr</code>, etc.) has a corresponding record type in the
precompiled header, which contains the serialized representation of
that statement or expression. Each substatement or subexpression
within an expression is stored as a separate record (which keeps most
records to a fixed size). Within the precompiled header, the
subexpressions of an expression are stored, in reverse order, prior to
the expression that owns those expressions, using a form of <a
href="http://en.wikipedia.org/wiki/Reverse_Polish_notation">Reverse
Polish Notation</a>. For example, an expression <code>3 - 4 + 5</code>
would be represented as follows:</p>

<table border="1">
  <tr><td><code>IntegerLiteral(5)</code></td></tr>
  <tr><td><code>IntegerLiteral(4)</code></td></tr>
  <tr><td><code>IntegerLiteral(3)</code></td></tr>
  <tr><td><code>BinaryOperator(-)</code></td></tr>
  <tr><td><code>BinaryOperator(+)</code></td></tr>
  <tr><td>STOP</td></tr>
</table>

<p>When reading this representation, Clang evaluates each expression
record it encounters, builds the appropriate abstract syntax tree node,
and then pushes that expression onto a stack. When a record contains <i>N</i>
subexpressions--<code>BinaryOperator</code> has two of them--those
expressions are popped from the top of the stack. The special STOP
code indicates that we have reached the end of a serialized expression
or statement; other expression or statement records may follow, but
they are part of a different expression.</p>

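<p>The stack-based reading of the <code>3 - 4 + 5</code> records can be
sketched in Python as follows. The tuple-based records and the function
names are assumptions made for this illustration, as is the choice to pop
the left operand first (which follows from the operands being written in
reverse order); the real reader operates on bitstream records and builds
Clang AST nodes.</p>

<pre>
def read_expression(records):
    """Rebuild one expression tree from records up to the STOP marker."""
    stack = []
    for kind, value in records:
        if kind == "STOP":
            break
        if kind == "IntegerLiteral":
            stack.append(("literal", value))
        elif kind == "BinaryOperator":
            # Assumption: the left operand is popped first, because the
            # operands were written in reverse order.
            lhs = stack.pop()
            rhs = stack.pop()
            stack.append(("binary", value, lhs, rhs))
        else:
            raise ValueError("unknown record kind: %s" % kind)
    if len(stack) != 1:
        raise ValueError("malformed expression stream")
    return stack[0]


def to_text(node):
    """Render a tree with explicit parentheses."""
    if node[0] == "literal":
        return str(node[1])
    _, op, lhs, rhs = node
    return "(%s %s %s)" % (to_text(lhs), op, to_text(rhs))


def evaluate(node):
    """Evaluate a tree that uses only + and - operators."""
    if node[0] == "literal":
        return node[1]
    _, op, lhs, rhs = node
    left, right = evaluate(lhs), evaluate(rhs)
    return left + right if op == "+" else left - right


RECORDS = [
    ("IntegerLiteral", 5),
    ("IntegerLiteral", 4),
    ("IntegerLiteral", 3),
    ("BinaryOperator", "-"),
    ("BinaryOperator", "+"),
    ("STOP", None),
]
</pre>

<p>Reading <code>RECORDS</code> with <code>read_expression</code> yields a
tree that renders as <code>((3 - 4) + 5)</code> and evaluates to 4, the
same grouping as the original left-associative expression.</p>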
<h3 id="idtable">Identifier Table Block</h3>

<p>The identifier table block contains an on-disk hash table that maps
each identifier mentioned within the precompiled header to the
serialized representation of the identifier's information (e.g., the
<code>IdentifierInfo</code> structure). The serialized representation
contains:</p>

<ul>
  <li>The actual identifier string.</li>
  <li>Flags that describe whether this identifier is the name of a
  built-in, a poisoned identifier, an extension token, or a
  macro.</li>
  <li>If the identifier names a macro, the offset of the macro
  definition within the <a href="#preprocessor">preprocessor
  block</a>.</li>
  <li>If the identifier names one or more declarations visible from
  translation unit scope, the <a href="#decls">declaration IDs</a> of these
  declarations.</li>
</ul>

<p>When a precompiled header is loaded, the precompiled header
mechanism introduces itself into the identifier table as an external
lookup source. Thus, when the user program refers to an identifier
that has not yet been seen, Clang will perform a lookup into the
identifier table. If an identifier is found, its contents (macro
definitions, flags, top-level declarations, etc.) will be deserialized,
at which point the corresponding <code>IdentifierInfo</code> structure
will have the same contents it would have after parsing the headers in
the precompiled header.</p>

<p>Within the PCH file, the identifiers used to name declarations are
represented with an integral value. A separate table provides a mapping
from this integral value (the identifier ID) to the location within the
on-disk hash table where that identifier is stored. This mapping is used when
deserializing the name of a declaration, the identifier of a token, or
any other construct in the PCH file that refers to a name.</p>
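
<p>The external-lookup arrangement can be approximated with the Python
sketch below. Both classes and the dictionary-shaped identifier records are
hypothetical; they only mirror the behavior described above, where the
table consults the precompiled header the first time an unseen identifier
is requested and afterwards answers from memory.</p>

<pre>
class ToyPCHIdentifierSource:
    """Stands in for the on-disk hash table of a precompiled header."""

    def __init__(self, on_disk):
        self.on_disk = on_disk   # identifier string to stored record
        self.lookups = 0         # how often the "disk" table was consulted

    def lookup(self, name):
        self.lookups += 1
        stored = self.on_disk.get(name)
        if stored is None:
            return None
        return dict(stored, name=name)


class ToyIdentifierTable:
    """In-memory identifier table with an optional external lookup source."""

    def __init__(self, external_source=None):
        self.entries = {}
        self.external_source = external_source

    def get(self, name):
        if name not in self.entries:
            info = None
            if self.external_source is not None:
                # First sighting of this identifier: ask the external source.
                info = self.external_source.lookup(name)
            if info is None:
                # Unknown to the precompiled header: start with an empty record.
                info = {"name": name, "is_macro": False, "decl_ids": []}
            self.entries[name] = info
        return self.entries[name]
</pre>

<p>Repeated requests for the same identifier hit the in-memory entry, so
the external source is consulted at most once per identifier in this
sketch.</p>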

<h3 id="method-pool">Method Pool Block</h3>

<p>The method pool block is represented as an on-disk hash table that
serves two purposes: it provides a mapping from the names of
Objective-C selectors to the set of Objective-C instance and class
methods that have that particular selector (which is required for
semantic analysis in Objective-C) and also stores all of the selectors
used by entities within the precompiled header. The design of the
method pool is similar to that of the <a href="#idtable">identifier
table</a>: the first time a particular selector is formed during the
compilation of the program, Clang will search in the on-disk hash
table of selectors; if found, Clang will read the Objective-C methods
associated with that selector into the appropriate front-end data
structure (<code>Sema::InstanceMethodPool</code> and
<code>Sema::FactoryMethodPool</code> for instance and class methods,
respectively).</p>

<p>As with identifiers, selectors are represented by numeric values
within the PCH file. A separate index maps these numeric selector
values to the offset of the selector within the on-disk hash table,
and will be used when deserializing an Objective-C method declaration
(or other Objective-C construct) that refers to the selector.</p>

<h2 id="tendrils">Precompiled Header Integration Points</h2>

<p>The "lazy" deserialization behavior of precompiled headers requires
their integration into several completely different submodules of
Clang. For example, lazily deserializing the declarations during name
lookup requires that the name-lookup routines be able to query the
precompiled header to find entities within the PCH file.</p>

<p>For each Clang data structure that requires direct interaction with
the precompiled header logic, there is an abstract class that provides
the interface between the two modules. The <code>PCHReader</code>
class, which handles the loading of a precompiled header, inherits
from all of these abstract classes to provide lazy deserialization of
Clang's data structures. <code>PCHReader</code> implements the
following abstract classes:</p>

<dl>
  <dt><code>StatSysCallCache</code></dt>
  <dd>This abstract interface is associated with the
    <code>FileManager</code> class, and is used whenever the file
    manager is going to perform a <code>stat()</code> system call.</dd>

  <dt><code>ExternalSLocEntrySource</code></dt>
  <dd>This abstract interface is associated with the
    <code>SourceManager</code> class, and is used whenever the
    <a href="#sourcemgr">source manager</a> needs to load the details
    of a file, buffer, or macro instantiation.</dd>

  <dt><code>IdentifierInfoLookup</code></dt>
  <dd>This abstract interface is associated with the
    <code>IdentifierTable</code> class, and is used whenever the
    program source refers to an identifier that has not yet been seen.
    In this case, the precompiled header implementation searches for
    this identifier within its <a href="#idtable">identifier table</a>
    to load any top-level declarations or macros associated with that
    identifier.</dd>

  <dt><code>ExternalASTSource</code></dt>
  <dd>This abstract interface is associated with the
    <code>ASTContext</code> class, and is used whenever the abstract
    syntax tree nodes need to be loaded from the precompiled header. It
    provides the ability to deserialize declarations and types
    identified by their numeric values, read the bodies of functions
    when required, and read the declarations stored within a
    declaration context (either for iteration or for name lookup).</dd>

  <dt><code>ExternalSemaSource</code></dt>
  <dd>This abstract interface is associated with the <code>Sema</code>
    class, and is used whenever semantic analysis needs to read
    information from the <a href="#method-pool">global method
    pool</a>.</dd>
</dl>
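
<p>The shape of this design, one reader class implementing several narrow
interfaces that different subsystems hold on to, can be sketched in Python.
All names below (<code>StatCache</code>, <code>ExternalDeclSource</code>,
<code>ToyReader</code>, <code>ToyFileManager</code>) are hypothetical
simplifications and do not reproduce the signatures of the Clang classes
listed above.</p>

<pre>
from abc import ABC, abstractmethod


class StatCache(ABC):
    """Narrow interface a file manager could consult before calling stat()."""

    @abstractmethod
    def stat(self, path):
        """Return cached file information for path, or None if unknown."""


class ExternalDeclSource(ABC):
    """Narrow interface an AST context could use to load declarations."""

    @abstractmethod
    def get_decl(self, decl_id):
        """Return the declaration with the given numeric ID."""


class ToyReader(StatCache, ExternalDeclSource):
    """One object that implements every interface, like a PCH reader."""

    def __init__(self, stats, decls):
        self._stats = stats
        self._decls = decls

    def stat(self, path):
        return self._stats.get(path)

    def get_decl(self, decl_id):
        return self._decls[decl_id]


class ToyFileManager:
    """Knows only about the StatCache interface, not about the reader."""

    def __init__(self, stat_cache):
        self.stat_cache = stat_cache

    def exists(self, path):
        return self.stat_cache.stat(path) is not None
</pre>

<p>Each subsystem depends only on the interface it needs, so the reader
can be attached to the file manager, the AST context, and other components
without those components knowing about precompiled headers.</p>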

</div>

</body>
</html>