<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
          "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>Precompiled Headers (PCH)</title>
  <link type="text/css" rel="stylesheet" href="../menu.css">
  <link type="text/css" rel="stylesheet" href="../content.css">
  <style type="text/css">
    td {
      vertical-align: top;
    }
  </style>
</head>

<body>

<!--#include virtual="../menu.html.incl"-->

<div id="content">

<h1>Precompiled Headers</h1>

  <p>This document describes the design and implementation of Clang's
  precompiled headers (PCH). If you are interested in the end-user
  view, please see the <a
   href="UsersManual.html#precompiledheaders">User's Manual</a>.</p>

  <p><b>Table of Contents</b></p>
  <ul>
    <li><a href="#usage">Using Precompiled Headers with
      <tt>clang</tt></a></li>
    <li><a href="#philosophy">Design Philosophy</a></li>
    <li><a href="#contents">Precompiled Header Contents</a>
      <ul>
        <li><a href="#metadata">Metadata Block</a></li>
        <li><a href="#sourcemgr">Source Manager Block</a></li>
        <li><a href="#preprocessor">Preprocessor Block</a></li>
        <li><a href="#types">Types Block</a></li>
        <li><a href="#decls">Declarations Block</a></li>
        <li><a href="#stmt">Statements and Expressions</a></li>
        <li><a href="#idtable">Identifier Table Block</a></li>
        <li><a href="#method-pool">Method Pool Block</a></li>
      </ul>
    </li>
    <li><a href="#tendrils">Precompiled Header Integration
      Points</a></li>
</ul>

<h2 id="usage">Using Precompiled Headers with <tt>clang</tt></h2>

<p>The Clang compiler frontend, <tt>clang -cc1</tt>, supports two command-line
options for generating and using PCH files.</p>

<p>To generate PCH files using <tt>clang -cc1</tt>, use the option
<b><tt>-emit-pch</tt></b>:</p>

<pre> $ clang -cc1 test.h -emit-pch -o test.h.pch </pre>

<p>This option is transparently used by <tt>clang</tt> when generating
PCH files. The resulting PCH file contains the serialized form of the
compiler's internal representation after it has completed parsing and
semantic analysis. The PCH file can then be used as a prefix header
with the <b><tt>-include-pch</tt></b> option:</p>

<pre>
  $ clang -cc1 -include-pch test.h.pch test.c -o test.s
</pre>

<h2 id="philosophy">Design Philosophy</h2>

<p>Precompiled headers are meant to improve overall compile times for
  projects, so the design of precompiled headers is entirely driven by
  performance concerns. The use case for precompiled headers is
  relatively simple: when there is a common set of headers that is
  included in nearly every source file in the project, we
  <i>precompile</i> that bundle of headers into a single precompiled
  header (PCH file). Then, when compiling the source files in the
  project, we load the PCH file first (as a prefix header), which acts
  as a stand-in for that bundle of headers.</p>

<p>A precompiled header implementation improves performance when:</p>
<ul>
  <li>Loading the PCH file is significantly faster than re-parsing the
  bundle of headers stored within the PCH file. Thus, a precompiled
  header design attempts to minimize the cost of reading the PCH
  file. Ideally, this cost should not vary with the size of the
  precompiled header file.</li>

  <li>The cost of generating the PCH file initially is not so large
  that it counters the per-source-file performance improvement due to
  eliminating the need to parse the bundled headers in the first
  place. This is particularly important on multi-core systems, because
  PCH file generation serializes the build when all compilations
  require the PCH file to be up-to-date.</li>
</ul>

<p>Clang's precompiled headers are designed with a compact on-disk
representation, which minimizes both PCH creation time and the time
required to initially load the PCH file. The PCH file itself contains
a serialized representation of Clang's abstract syntax trees and
supporting data structures, stored using the same compressed bitstream
as <a href="http://llvm.org/docs/BitCodeFormat.html">LLVM's bitcode
file format</a>.</p>

<p>Clang's precompiled headers are loaded "lazily" from disk. When a
PCH file is initially loaded, Clang reads only a small amount of data
from the PCH file to establish where certain important data structures
are stored. The amount of data read in this initial load is
independent of the size of the PCH file, such that a larger PCH file
does not lead to longer PCH load times. The actual header data in the
PCH file--macros, functions, variables, types, etc.--is loaded only
when it is referenced from the user's code, at which point only that
entity (and those entities it depends on) is deserialized from the
PCH file. With this approach, the cost of using a precompiled header
for a translation unit is proportional to the amount of code actually
used from the header, rather than being proportional to the size of
the header itself.</p>

<p>When given the <code>-print-stats</code> option, Clang produces
statistics describing how much of the precompiled header was actually
loaded from disk. For a simple "Hello, World!" program that includes
the Apple <code>Cocoa.h</code> header (which is built as a precompiled
header), this option illustrates how little of the actual precompiled
header is required:</p>

<pre>
*** PCH Statistics:
  933 stat cache hits
  4 stat cache misses
  895/39981 source location entries read (2.238563%)
  19/15315 types read (0.124061%)
  20/82685 declarations read (0.024188%)
  154/58070 identifiers read (0.265197%)
  0/7260 selectors read (0.000000%)
  0/30842 statements read (0.000000%)
  4/8400 macros read (0.047619%)
  1/4995 lexical declcontexts read (0.020020%)
  0/4413 visible declcontexts read (0.000000%)
  0/7230 method pool entries read (0.000000%)
  0 method pool misses
</pre>

<p>For this small program, only a tiny fraction of the source
locations, types, declarations, identifiers, and macros were actually
deserialized from the precompiled header. These statistics can be
useful to determine whether the precompiled header implementation can
be improved by making more of the implementation lazy.</p>

<p>Precompiled headers can be chained. When you create a PCH while
including an existing PCH, Clang can create the new PCH by referencing
the original file and only writing the new data to the new file. For
example, you could create one PCH out of all the headers that are very
commonly used throughout your project, and then create a chained PCH
for every single source file in the project that includes the code
specific to that file. Recompiling the file itself is then very fast,
and the data from the common headers is not duplicated for every
file.</p>

<h2 id="contents">Precompiled Header Contents</h2>

<img src="PCHLayout.png" style="float:right" alt="Precompiled header layout">

<p>Clang's precompiled headers are organized into several different
blocks, each of which contains the serialized representation of a part
of Clang's internal representation. Each of the blocks corresponds to
either a block or a record within <a
 href="http://llvm.org/docs/BitCodeFormat.html">LLVM's bitstream
format</a>. The contents of each of these logical blocks are described
below.</p>

<p>For a given precompiled header, the <a
href="http://llvm.org/cmds/llvm-bcanalyzer.html"><code>llvm-bcanalyzer</code></a>
utility can be used to examine the actual structure of the bitstream
for the precompiled header. This information can be used both to help
understand the structure of the precompiled header and to isolate
areas where precompiled headers can still be optimized, e.g., through
the introduction of abbreviations.</p>

<h3 id="metadata">Metadata Block</h3>

<p>The metadata block contains several records that provide
information about how the precompiled header was built. This metadata
is primarily used to validate the use of a precompiled header. For
example, a precompiled header built for a 32-bit x86 target cannot be used
when compiling for a 64-bit x86 target. The metadata block contains
information about:</p>

<dl>
  <dt>Language options</dt>
  <dd>Describes the particular language dialect used to compile the
PCH file, including major options (e.g., Objective-C support) and more
minor options (e.g., support for "//" comments). The contents of this
record correspond to the <code>LangOptions</code> class.</dd>

  <dt>Target architecture</dt>
  <dd>The target triple that describes the architecture, platform, and
ABI for which the PCH file was generated, e.g.,
<code>i386-apple-darwin9</code>.</dd>

  <dt>PCH version</dt>
  <dd>The major and minor version numbers of the precompiled header
format. Changes in the minor version number should not affect backward
compatibility, while changes in the major version number imply that a
newer compiler cannot read an older precompiled header (and
vice-versa).</dd>

  <dt>Original file name</dt>
  <dd>The full path of the header that was used to generate the
precompiled header.</dd>

  <dt>Predefines buffer</dt>
  <dd>Although not explicitly stored as part of the metadata, the
predefines buffer is used in the validation of the precompiled header.
The predefines buffer itself contains code generated by the compiler
to initialize the preprocessor state according to the current target,
platform, and command-line options. For example, the predefines buffer
will contain "<code>#define __STDC__ 1</code>" when we are compiling C
without Microsoft extensions. The predefines buffer itself is stored
within the <a href="#sourcemgr">source manager block</a>, but its
contents are verified along with the rest of the metadata.</dd>

</dl>
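
<p>The validation described above can be pictured with a small Python
sketch. It is illustrative only and is not Clang's implementation: the
<code>PCHMetadata</code> class, its field names, and the
<code>validate</code> function are hypothetical stand-ins for the
records listed above, and the exact set of checks Clang performs is an
assumption here.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PCHMetadata:
    """Hypothetical stand-in for the information in the metadata block."""
    language_options: frozenset  # e.g. {"objc", "line-comments"}
    target_triple: str           # e.g. "i386-apple-darwin9"
    version_major: int           # major version of the PCH format
    predefines: str              # text of the predefines buffer


def validate(pch, current):
    """Return a list of mismatches; an empty list means the PCH is usable."""
    problems = []
    if pch.version_major != current.version_major:
        problems.append("PCH major version differs")
    if pch.target_triple != current.target_triple:
        problems.append("target triple differs")
    if pch.language_options != current.language_options:
        problems.append("language options differ")
    if pch.predefines != current.predefines:
        problems.append("predefines buffer differs")
    return problems


# A PCH built for 32-bit x86, checked against a 64-bit x86 compilation.
built = PCHMetadata(frozenset({"objc"}), "i386-apple-darwin9", 1,
                    "#define __STDC__ 1\n")
wanted = PCHMetadata(frozenset({"objc"}), "x86_64-apple-darwin9", 1,
                     "#define __STDC__ 1\n")
mismatches = validate(built, wanted)
```

<p>In this sketch, <code>mismatches</code> reports only the differing
target triple, which matches the 32-bit versus 64-bit example given
above.</p>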

<p>A chained PCH file (that is, one that references another PCH) has
a slightly different metadata block, which contains the following
information:</p>

<dl>
  <dt>Referenced file</dt>
  <dd>The name of the referenced PCH file. It is looked up like a file
specified using <tt>-include-pch</tt>.</dd>

  <dt>PCH version</dt>
  <dd>This is the same as in normal PCH files.</dd>

  <dt>Original file name</dt>
  <dd>The full path of the header that was used to generate this
precompiled header.</dd>

</dl>

<p>The language options, target architecture, and predefines buffer
are taken from the end of the chain, since they have to match anyway.</p>

<h3 id="sourcemgr">Source Manager Block</h3>

<p>The source manager block contains the serialized representation of
Clang's <a
 href="InternalsManual.html#SourceLocation">SourceManager</a> class,
which handles the mapping from source locations (as represented in
Clang's abstract syntax tree) into actual column/line positions within
a source file or macro instantiation. The precompiled header's
representation of the source manager also includes information about
all of the headers that were (transitively) included when building the
precompiled header.</p>

<p>The bulk of the source manager block is dedicated to information
about the various files, buffers, and macro instantiations into which
a source location can refer. Each of these is referenced by a numeric
"file ID", which is a unique number (allocated starting at 1) stored
in the source location. Clang serializes the information for each kind
of file ID, along with an index that maps file IDs to the position
within the PCH file where the information about that file ID is
stored. The data associated with a file ID is loaded only when
required by the front end, e.g., to emit a diagnostic that includes a
macro instantiation history inside the header itself.</p>
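
<p>As a rough illustration of the index described above, the following
Python sketch stores each entry as a NUL-terminated string and keeps a
separate list of offsets indexed by file ID. The storage layout, the
<code>build_block</code> helper, and the <code>LazySourceEntries</code>
class are assumptions made for this example; they do not reflect the
actual bitstream encoding.</p>

```python
def build_block(entries):
    """Serialize entries back to back and record where each one starts."""
    blob = bytearray()
    offsets = []
    for text in entries:
        offsets.append(len(blob))
        blob += text.encode() + b"\0"
    return bytes(blob), offsets


class LazySourceEntries:
    """Loads the data for a file ID only when it is first requested."""

    def __init__(self, blob, offsets):
        self._blob = blob
        self._offsets = offsets  # file ID n (starting at 1) -> offsets[n - 1]
        self._loaded = {}
        self.reads = 0           # number of entries actually deserialized

    def entry(self, file_id):
        if file_id not in self._loaded:
            start = self._offsets[file_id - 1]
            end = self._blob.index(b"\0", start)
            self._loaded[file_id] = self._blob[start:end].decode()
            self.reads += 1
        return self._loaded[file_id]


blob, offsets = build_block(["test.h", "<built-in>", "macro FOO"])
table = LazySourceEntries(blob, offsets)
second = table.entry(2)   # deserializes only the entry for file ID 2
again = table.entry(2)    # served from the cache, no further read
```

<p>Only the index has to be available up front; the entries for file
IDs 1 and 3 are never touched in this example.</p>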

<p>The source manager block also contains information about all of the
headers that were included when building the precompiled header. This
includes information about the controlling macro for the header (e.g.,
when the preprocessor identified that the contents of the header
depended on a macro like <code>LLVM_CLANG_SOURCEMANAGER_H</code>)
along with a cached version of the results of the <code>stat()</code>
system calls performed when building the precompiled header. The
latter is particularly useful in reducing system time when searching
for include files.</p>

<h3 id="preprocessor">Preprocessor Block</h3>

<p>The preprocessor block contains the serialized representation of
the preprocessor. Specifically, it contains all of the macros that
have been defined by the end of the header used to build the
precompiled header, along with the token sequences that comprise each
macro. The macro definitions are only read from the PCH file when the
name of the macro first occurs in the program. This lazy loading of
macro definitions is triggered by lookups into the <a
 href="#idtable">identifier table</a>.</p>

<h3 id="types">Types Block</h3>

<p>The types block contains the serialized representation of all of
the types referenced in the translation unit. Each Clang type node
(<code>PointerType</code>, <code>FunctionProtoType</code>, etc.) has a
corresponding record type in the PCH file. When types are deserialized
from the precompiled header, the data within the record is used to
reconstruct the appropriate type node using the AST context.</p>

<p>Each type has a unique type ID, which is an integer that uniquely
identifies that type. Type ID 0 represents the NULL type; type IDs
less than <code>NUM_PREDEF_TYPE_IDS</code> represent predefined types
(<code>void</code>, <code>float</code>, etc.); and other
"user-defined" type IDs are assigned consecutively from
<code>NUM_PREDEF_TYPE_IDS</code> upward as the types are encountered.
The PCH file has an associated mapping from each user-defined type ID
to the location within the types block where the serialized
representation of that type resides, enabling lazy deserialization of
types. When a type is referenced from within the PCH file, that
reference is encoded using the type ID shifted left by 3 bits. The
lower three bits are used to represent the <code>const</code>,
<code>volatile</code>, and <code>restrict</code> qualifiers, as in
Clang's <a
 href="http://clang.llvm.org/docs/InternalsManual.html#Type">QualType</a>
class.</p>
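
<p>The encoding of a type reference can be sketched in a few lines of
Python. The shift by 3 bits comes from the description above; the
assignment of individual bits to <code>const</code>,
<code>restrict</code>, and <code>volatile</code> is an assumption made
for this example, as are the function names.</p>

```python
# Assumed bit assignment for the three qualifier bits (illustrative only).
CONST, RESTRICT, VOLATILE = 0x1, 0x2, 0x4
QUAL_BITS = 3
QUAL_MASK = (1 << QUAL_BITS) - 1


def encode_type_ref(type_id, quals=0):
    """Pack a type ID and its qualifier bits into one integer."""
    if quals & ~QUAL_MASK:
        raise ValueError("only three qualifier bits are available")
    return (type_id << QUAL_BITS) | quals


def decode_type_ref(ref):
    """Split an encoded reference back into (type ID, qualifier bits)."""
    return ref >> QUAL_BITS, ref & QUAL_MASK


# A "const volatile" reference to the type with ID 42.
ref = encode_type_ref(42, CONST | VOLATILE)
type_id, quals = decode_type_ref(ref)
```

<p>Because the qualifiers travel with the reference rather than with
the type record, a qualified and an unqualified use of the same type
can share a single serialized type.</p>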

<h3 id="decls">Declarations Block</h3>

<p>The declarations block contains the serialized representation of
all of the declarations referenced in the translation unit. Each Clang
declaration node (<code>VarDecl</code>, <code>FunctionDecl</code>,
etc.) has a corresponding record type in the PCH file. When
declarations are deserialized from the precompiled header, the data
within the record is used to build and populate a new instance of the
corresponding <code>Decl</code> node. As with types, each declaration
node has a numeric ID that is used to refer to that declaration within
the PCH file. In addition, a lookup table provides a mapping from that
numeric ID to the offset within the precompiled header where that
declaration is described.</p>

<p>Declarations in Clang's abstract syntax trees are stored
hierarchically. At the top of the hierarchy is the translation unit
(<code>TranslationUnitDecl</code>), which contains all of the
declarations in the translation unit. These declarations (such as
functions or struct types) may also contain other declarations inside
them, and so on. Within Clang, each declaration is stored within a <a
href="http://clang.llvm.org/docs/InternalsManual.html#DeclContext">declaration
context</a>, as represented by the <code>DeclContext</code> class.
Declaration contexts provide the mechanism to perform name lookup
within a given declaration (e.g., find the member named <code>x</code>
in a structure) and iterate over the declarations stored within a
context (e.g., iterate over all of the fields of a structure for
structure layout).</p>

<p>In Clang's precompiled header format, deserializing a declaration
that is a <code>DeclContext</code> is a separate operation from
deserializing all of the declarations stored within that declaration
context. Therefore, Clang will deserialize the translation unit
declaration without deserializing the declarations within that
translation unit. When required, the declarations stored within a
declaration context will be deserialized. There are two representations
of the declarations within a declaration context, which correspond to
the name-lookup and iteration behavior described above:</p>

<ul>
  <li>When the front end performs name lookup to find a name
  <code>x</code> within a given declaration context (for example,
  during semantic analysis of the expression <code>p-&gt;x</code>,
  where <code>p</code>'s type is defined in the precompiled header),
  Clang deserializes a hash table mapping from the names within that
  declaration context to the declaration IDs that represent each
  visible declaration with that name. The entire hash table is
  deserialized at this point (into the <code>llvm::DenseMap</code>
  stored within each <code>DeclContext</code> object), but the actual
  declarations are not yet deserialized. In a second step, those
  declarations with the name <code>x</code> will be deserialized and
  will be used as the result of name lookup.</li>

  <li>When the front end performs iteration over all of the
  declarations within a declaration context, all of those declarations
  are immediately deserialized. For large declaration contexts (e.g.,
  the translation unit), this operation is expensive; however, large
  declaration contexts are not traversed in normal compilation, since
  such a traversal is unnecessary. The code generator and semantic
  analysis do commonly traverse declaration contexts for structs,
  classes, unions, and enumerations, but those contexts contain
  relatively few declarations in the common case.</li>
</ul>
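
<p>The two access paths in the list above can be modeled with a short
Python sketch. The <code>LazyDeclContext</code> class and its plain
dictionaries are hypothetical; they stand in for the on-disk hash
table and declaration records and are not Clang's data structures.</p>

```python
class LazyDeclContext:
    """Toy declaration context that deserializes its contents on demand."""

    def __init__(self, stored_names, stored_decls):
        self._stored_names = stored_names  # "on disk": name -> [decl IDs]
        self._stored_decls = stored_decls  # "on disk": decl ID -> declaration
        self._name_table = None            # in-memory copy of the name table
        self.loaded = {}                   # declarations deserialized so far

    def _decl(self, decl_id):
        if decl_id not in self.loaded:
            self.loaded[decl_id] = self._stored_decls[decl_id]
        return self.loaded[decl_id]

    def lookup(self, name):
        # Step 1: the entire name table is read, but no declarations yet.
        if self._name_table is None:
            self._name_table = dict(self._stored_names)
        # Step 2: only the declarations carrying this name are deserialized.
        return [self._decl(i) for i in self._name_table.get(name, [])]

    def decls(self):
        # Iteration deserializes every declaration in the context.
        return [self._decl(i) for i in sorted(self._stored_decls)]


ctx = LazyDeclContext({"x": [7], "y": [8]},
                      {7: "FieldDecl x", 8: "FieldDecl y"})
found = ctx.lookup("x")
loaded_after_lookup = set(ctx.loaded)
everything = ctx.decls()
```

<p>After the lookup of <code>x</code>, only declaration 7 has been
loaded; the call to <code>decls()</code> then pulls in the rest, which
is why iteration is the expensive path for large contexts.</p>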

<h3 id="stmt">Statements and Expressions</h3>

<p>Statements and expressions are stored in the precompiled header in
both the <a href="#types">types</a> and the <a
 href="#decls">declarations</a> blocks, because every statement or
expression will be associated with either a type or declaration. The
actual statement and expression records are stored immediately
following the declaration or type that owns the statement or
expression. For example, the statement representing the body of a
function will be stored directly following the declaration of the
function.</p>

<p>As with types and declarations, each statement and expression kind
in Clang's abstract syntax tree (<code>ForStmt</code>,
<code>CallExpr</code>, etc.) has a corresponding record type in the
precompiled header, which contains the serialized representation of
that statement or expression. Each substatement or subexpression
within an expression is stored as a separate record (which keeps most
records to a fixed size). Within the precompiled header, the
subexpressions of an expression are stored, in reverse order, prior to
the expression that owns those subexpressions, using a form of <a
href="http://en.wikipedia.org/wiki/Reverse_Polish_notation">Reverse
Polish Notation</a>. For example, an expression <code>3 - 4 + 5</code>
would be represented as follows:</p>

<table border="1">
  <tr><td><code>IntegerLiteral(5)</code></td></tr>
  <tr><td><code>IntegerLiteral(4)</code></td></tr>
  <tr><td><code>IntegerLiteral(3)</code></td></tr>
  <tr><td><code>BinaryOperator(-)</code></td></tr>
  <tr><td><code>BinaryOperator(+)</code></td></tr>
  <tr><td>STOP</td></tr>
</table>

<p>When reading this representation, Clang evaluates each expression
record it encounters, builds the appropriate abstract syntax tree node,
and then pushes that expression on to a stack. When a record contains <i>N</i>
subexpressions--<code>BinaryOperator</code> has two of them--those
expressions are popped from the top of the stack. The special STOP
code indicates that we have reached the end of a serialized expression
or statement; other expression or statement records may follow, but
they are part of a different expression.</p>
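
<p>The stack-based reading of these records can be sketched in Python
as follows. The tuple-based record format and the function names are
invented for this example. Because subexpressions are stored in
reverse order, the sketch assumes that the first value popped is the
first (left-hand) operand.</p>

```python
def read_expr(records):
    """Rebuild an expression tree from records in on-disk order."""
    stack = []
    for record in records:
        kind = record[0]
        if kind == "STOP":
            break  # end of this serialized expression
        if kind == "IntegerLiteral":
            stack.append(("int", record[1]))
        elif kind == "BinaryOperator":
            lhs = stack.pop()  # stored last, so it is on top of the stack
            rhs = stack.pop()
            stack.append((record[1], lhs, rhs))
        else:
            raise ValueError("unknown record kind: " + kind)
    return stack.pop()


def evaluate(node):
    """Evaluate a tree built by read_expr (supports + and - only)."""
    if node[0] == "int":
        return node[1]
    op, lhs, rhs = node
    left, right = evaluate(lhs), evaluate(rhs)
    if op == "-":
        return left - right
    if op == "+":
        return left + right
    raise ValueError("unsupported operator: " + op)


# The records for "3 - 4 + 5", in the order shown in the table above.
records = [
    ("IntegerLiteral", 5),
    ("IntegerLiteral", 4),
    ("IntegerLiteral", 3),
    ("BinaryOperator", "-"),
    ("BinaryOperator", "+"),
    ("STOP",),
]
tree = read_expr(records)
value = evaluate(tree)
```

<p>Reading the records yields the tree for <code>(3 - 4) + 5</code>,
so <code>value</code> is 4.</p>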

<h3 id="idtable">Identifier Table Block</h3>

<p>The identifier table block contains an on-disk hash table that maps
each identifier mentioned within the precompiled header to the
serialized representation of the identifier's information (e.g., the
<code>IdentifierInfo</code> structure). The serialized representation
contains:</p>

<ul>
  <li>The actual identifier string.</li>
  <li>Flags that describe whether this identifier is the name of a
  built-in, a poisoned identifier, an extension token, or a
  macro.</li>
  <li>If the identifier names a macro, the offset of the macro
  definition within the <a href="#preprocessor">preprocessor
  block</a>.</li>
  <li>If the identifier names one or more declarations visible from
  translation unit scope, the <a href="#decls">declaration IDs</a> of these
  declarations.</li>
</ul>

<p>When a precompiled header is loaded, the precompiled header
mechanism introduces itself into the identifier table as an external
lookup source. Thus, when the user program refers to an identifier
that has not yet been seen, Clang will perform a lookup into the
identifier table. If an identifier is found, its contents (macro
definitions, flags, top-level declarations, etc.) will be deserialized,
at which point the corresponding <code>IdentifierInfo</code> structure
will have the same contents it would have after parsing the headers in
the precompiled header.</p>

<p>Within the PCH file, the identifiers used to name declarations are
represented with an integral value. A separate table provides a mapping
from this integral value (the identifier ID) to the location within the
on-disk hash table where that identifier is stored. This mapping is used
when deserializing the name of a declaration, the identifier of a token,
or any other construct in the PCH file that refers to a name.</p>
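
<p>The external lookup behavior can be illustrated with a small Python
model. The <code>IdentifierTable</code> class below, its dictionary
fields, and the use of a plain callable as the external source are all
assumptions made for this sketch; they only mirror the idea that the
PCH is consulted the first time an identifier is seen.</p>

```python
class IdentifierTable:
    """Toy identifier table that consults an external source on a miss."""

    def __init__(self, external=None):
        self._known = {}
        self._external = external   # callable: name -> info dict, or None
        self.external_lookups = 0   # how often the external source was asked

    def get(self, name):
        if name not in self._known:
            info = None
            if self._external is not None:
                self.external_lookups += 1
                info = self._external(name)
            if info is None:
                # Not in the PCH: create a fresh, empty entry.
                info = {"name": name, "is_macro": False, "decl_ids": []}
            self._known[name] = info
        return self._known[name]


# Hypothetical contents of the on-disk hash table.
on_disk = {
    "NSString": {"name": "NSString", "is_macro": False, "decl_ids": [12]},
    "YES": {"name": "YES", "is_macro": True, "decl_ids": []},
}
table = IdentifierTable(external=on_disk.get)
first = table.get("NSString")    # miss in memory, found in the "PCH"
second = table.get("NSString")   # already known, no external lookup
local = table.get("my_local")    # not in the "PCH" either
```

<p>Each identifier triggers at most one external lookup, so
identifiers that the user's code never mentions are never read from
the PCH file.</p>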

<h3 id="method-pool">Method Pool Block</h3>

<p>The method pool block is represented as an on-disk hash table that
serves two purposes: it provides a mapping from the names of
Objective-C selectors to the set of Objective-C instance and class
methods that have that particular selector (which is required for
semantic analysis in Objective-C) and also stores all of the selectors
used by entities within the precompiled header. The design of the
method pool is similar to that of the <a href="#idtable">identifier
table</a>: the first time a particular selector is formed during the
compilation of the program, Clang will search in the on-disk hash
table of selectors; if found, Clang will read the Objective-C methods
associated with that selector into the appropriate front-end data
structure (<code>Sema::InstanceMethodPool</code> and
<code>Sema::FactoryMethodPool</code> for instance and class methods,
respectively).</p>

<p>As with identifiers, selectors are represented by numeric values
within the PCH file. A separate index maps these numeric selector
values to the offset of the selector within the on-disk hash table,
and will be used when deserializing an Objective-C method declaration
(or other Objective-C construct) that refers to the selector.</p>

<h2 id="tendrils">Precompiled Header Integration Points</h2>

<p>The "lazy" deserialization behavior of precompiled headers requires
their integration into several completely different submodules of
Clang. For example, lazily deserializing the declarations during name
lookup requires that the name-lookup routines be able to query the
precompiled header to find entities within the PCH file.</p>

<p>For each Clang data structure that requires direct interaction with
the precompiled header logic, there is an abstract class that provides
the interface between the two modules. The <code>PCHReader</code>
class, which handles the loading of a precompiled header, inherits
from all of these abstract classes to provide lazy deserialization of
Clang's data structures. <code>PCHReader</code> implements the
following abstract classes:</p>

<dl>
  <dt><code>StatSysCallCache</code></dt>
  <dd>This abstract interface is associated with the
    <code>FileManager</code> class, and is used whenever the file
    manager is going to perform a <code>stat()</code> system call.</dd>

  <dt><code>ExternalSLocEntrySource</code></dt>
  <dd>This abstract interface is associated with the
    <code>SourceManager</code> class, and is used whenever the
    <a href="#sourcemgr">source manager</a> needs to load the details
    of a file, buffer, or macro instantiation.</dd>

  <dt><code>IdentifierInfoLookup</code></dt>
  <dd>This abstract interface is associated with the
    <code>IdentifierTable</code> class, and is used whenever the
    program source refers to an identifier that has not yet been seen.
    In this case, the precompiled header implementation searches for
    this identifier within its <a href="#idtable">identifier table</a>
    to load any top-level declarations or macros associated with that
    identifier.</dd>

  <dt><code>ExternalASTSource</code></dt>
  <dd>This abstract interface is associated with the
    <code>ASTContext</code> class, and is used whenever the abstract
    syntax tree nodes need to be loaded from the precompiled header. It
    provides the ability to deserialize declarations and types
    identified by their numeric values, read the bodies of functions
    when required, and read the declarations stored within a
    declaration context (either for iteration or for name lookup).</dd>

  <dt><code>ExternalSemaSource</code></dt>
  <dd>This abstract interface is associated with the <code>Sema</code>
    class, and is used whenever semantic analysis needs to read
    information from the <a href="#method-pool">global method
    pool</a>.</dd>
</dl>

</div>

</body>
</html>