<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
          "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
  <title>Clang - Features and Goals</title>
  <link type="text/css" rel="stylesheet" href="menu.css" />
  <link type="text/css" rel="stylesheet" href="content.css" />
  <style type="text/css">
</style>
</head>
<body>

<!--#include virtual="menu.html.incl"-->

<div id="content">

<!--*************************************************************************-->
<h1>Clang - Features and Goals</h1>
<!--*************************************************************************-->

<p>
This page describes the <a href="index.html#goals">features and goals</a> of
Clang in more detail and explains what we mean by each of them.
These features are:
</p>

<p>End-User Features:</p>

<ul>
<li><a href="#performance">Fast compiles and low memory use</a></li>
<li><a href="#expressivediags">Expressive diagnostics</a></li>
<li><a href="#gcccompat">GCC compatibility</a></li>
</ul>

<p>Utility and Applications:</p>

<ul>
<li><a href="#libraryarch">Library based architecture</a></li>
<li><a href="#diverseclients">Support diverse clients</a></li>
<li><a href="#ideintegration">Integration with IDEs</a></li>
<li><a href="#license">Use the LLVM 'BSD' License</a></li>
</ul>

<p>Internal Design and Implementation:</p>

<ul>
<li><a href="#real">A real-world, production quality compiler</a></li>
<li><a href="#simplecode">A simple and hackable code base</a></li>
<li><a href="#unifiedparser">A single unified parser for C, Objective C, C++,
    and Objective C++</a></li>
<li><a href="#conformance">Conformance with C/C++/ObjC and their
    variants</a></li>
</ul>

<!--*************************************************************************-->
<h2><a name="enduser">End-User Features</a></h2>
<!--*************************************************************************-->


<!--=======================================================================-->
<h3><a name="performance">Fast compiles and Low Memory Use</a></h3>
<!--=======================================================================-->

<p>A major focus of our work on clang is to make it fast, light and scalable.
The library-based architecture of clang makes it straightforward to time and
profile the cost of each layer of the stack, and the driver has a number of
options for performance analysis.</p>

<p>While there is still much that can be done, we find that the clang front-end
is significantly quicker than GCC and uses less memory. For example, when
compiling "Carbon.h" on Mac OS X, we see that clang is 2.5x faster than GCC:</p>

<img class="img_slide" src="feature-compile1.png" width="400" height="300" />

<p>Carbon.h is a monster: it transitively includes 558 files, 12.3M of code,
declares 10000 functions, has 2000 struct definitions, 8000 fields, 20000 enum
constants, etc. (see slide 25+ of the <a href="clang_video-07-25-2007.html">clang
talk</a> for more information). It is also #include'd into almost every C file
in a GUI app on the Mac, so its compile time is very important.</p>

<p>From the slide above, you can see that we can measure the time to preprocess
the file independently from the time to parse it, and independently from the
time to build the ASTs for the code. GCC doesn't provide a way to measure the
parser without AST building (it only provides -fsyntax-only). In our
measurements, we find that clang's preprocessor is consistently 40% faster than
GCC's, and the parser + AST builder is ~4x faster than GCC's. If you have
sources that do not depend as heavily on the preprocessor (or if you
use precompiled headers) you may see a much bigger speedup from clang.
</p>

<p>Compile-time performance is important, but when using clang as an API, memory
use is often even more so: the less memory the code takes, the more code you can
fit into memory at a time (useful for whole-program analysis tools, for
example).</p>

<img class="img_slide" src="feature-memory1.png" width="400" height="300" />

<p>Here we see a huge advantage of clang: its ASTs take <b>5x less memory</b>
than GCC's syntax trees, despite the fact that clang's ASTs capture far more
source-level information than GCC's trees do. This feat is accomplished through
the use of carefully designed APIs and efficient representations.</p>

<p>In addition to being efficient when pitted head-to-head against GCC in batch
mode, clang is built with a <a href="#libraryarch">library based
architecture</a> that makes it relatively easy to adapt it and build new tools
with it. This means that it is often possible to apply out-of-the-box thinking
and novel techniques to improve compilation in various ways.</p>

<img class="img_slide" src="feature-compile2.png" width="400" height="300" />

<p>This slide shows how the clang preprocessor can be used to make "distcc"
parallelization <b>3x</b> more scalable than when using the GCC preprocessor.
"distcc" quickly bottlenecks on the preprocessor running on the central driver
machine, so a fast preprocessor is very useful. Comparing the first two bars
of each group shows how a ~40% faster preprocessor can reduce preprocessing time
of these large C++ apps by about 40% (shocking!).</p>

<p>The third bar on the slide is the interesting part: it shows how trivial
caching of file system accesses across invocations of the preprocessor allows
clang to reduce time spent in the kernel by 10x, making distcc over 3x more
scalable. This is obviously just one simple hack; doing more interesting things
(like caching tokens across preprocessed files) would yield another substantial
speedup.</p>
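
<p>To make the idea of caching file system accesses concrete, here is a minimal
sketch in C of a small memo table placed in front of the POSIX <tt>stat()</tt>
call. It is an illustration only and not clang's implementation: the names
<tt>StatCache</tt>, <tt>cached_stat</tt> and <tt>CACHE_SLOTS</tt> are
hypothetical, the table is direct-mapped (a colliding path simply evicts the
previous entry), and failed lookups are cached too, on the assumption that a
preprocessor probes many include directories that do not contain the header.</p>

```c
#include <assert.h>
#include <string.h>
#include <sys/stat.h>

#define CACHE_SLOTS 64      /* hypothetical fixed capacity */
#define MAX_PATH_LEN 256    /* longer paths bypass the cache */

typedef struct {
    char path[MAX_PATH_LEN];
    struct stat info;       /* valid only when result == 0 */
    int result;             /* return value of the real stat() call */
    int used;
} StatEntry;

typedef struct {
    StatEntry slots[CACHE_SLOTS];
    unsigned hits, misses;
} StatCache;

/* Simple multiplicative string hash used to pick a slot. */
unsigned hash_path(const char *p) {
    unsigned h = 5381;
    while (*p) h = h * 33u + (unsigned char)*p++;
    return h % CACHE_SLOTS;
}

/* Look up `path`; only fall through to the kernel on a miss.
 * The cache must be zero-initialized before first use. */
int cached_stat(StatCache *c, const char *path, struct stat *out) {
    assert(c && path && out);
    if (strlen(path) >= MAX_PATH_LEN) return stat(path, out);
    StatEntry *e = &c->slots[hash_path(path)];
    if (e->used && strcmp(e->path, path) == 0) {
        c->hits++;
    } else {
        c->misses++;
        strcpy(e->path, path);
        e->result = stat(path, &e->info);
        e->used = 1;
    }
    if (e->result == 0) *out = e->info;
    return e->result;
}
```

<p>A long-lived driver would keep one such cache alive across preprocessor
invocations, so repeated probes for the same header never reach the kernel. A
real tool would also need an invalidation policy for files that change on disk,
which this sketch leaves out.</p>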

<p>The clean framework-based design of clang means that many things are possible
that would be very difficult in other systems, for example incremental
compilation, multithreading, intelligent caching, etc. We are only starting
to tap the full potential of the clang design.</p>


<!--=======================================================================-->
<h3><a name="expressivediags">Expressive Diagnostics</a></h3>
<!--=======================================================================-->

<p>In addition to being fast and functional, we aim to make Clang extremely user
friendly. As far as a command-line compiler goes, this basically boils down to
making the diagnostics (error and warning messages) generated by the compiler
as useful as possible. There are several ways that we do this. This section
talks about the experience provided by the command line compiler, contrasting
Clang output to GCC 4.2's output in several examples.
<!--
Other clients
that embed Clang and extract equivalent information through internal APIs.-->
</p>

<h4>Column Numbers and Caret Diagnostics</h4>

<p>First, all diagnostics produced by clang include full column number
information, and use this to print "caret diagnostics". This is a feature
provided by many commercial compilers, but is generally missing from open source
compilers. This is nice because it makes it very easy to understand exactly
what is wrong in a particular piece of code. For example:</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only -Wformat format-strings.c</b>
  format-strings.c:91: warning: too few arguments for format
  $ <b>clang -fsyntax-only format-strings.c</b>
  format-strings.c:91:13: warning: '.*' specified field precision is missing a matching 'int' argument
  <font color="darkgreen">  printf("%.*d");</font>
  <font color="blue">            ^</font>
</pre>

<p>The caret (the blue "^" character) shows exactly where the problem is, even
inside the string. This makes it really easy to jump to the problem and
helps when multiple instances of the same character occur on a line. We'll
revisit this more in the following examples.</p>
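
<p>The mechanics of a caret line are simple once a column number is available.
The following C sketch is only an illustration of the idea, not code from
clang; the function names <tt>format_caret_line</tt> and
<tt>print_caret_diagnostic</tt> are hypothetical. It assumes 1-based columns,
as in the output above, and copies tabs through so that the caret stays aligned
when both lines are shown with the same tab width.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the caret line for a 1-based column into buf and return buf. */
char *format_caret_line(const char *source_line, unsigned column,
                        char *buf, size_t buf_size) {
    assert(column >= 1 && column <= strlen(source_line));
    assert(buf_size > column);   /* column characters plus the terminator */
    size_t i;
    for (i = 0; i + 1 < column; i++)
        buf[i] = (source_line[i] == '\t') ? '\t' : ' ';
    buf[i++] = '^';
    buf[i] = '\0';
    return buf;
}

/* Print "file:line:col: warning: msg", the source line, then the caret. */
void print_caret_diagnostic(FILE *out, const char *file, unsigned line,
                            unsigned column, const char *message,
                            const char *source_line) {
    char caret[256];
    fprintf(out, "%s:%u:%u: warning: %s\n", file, line, column, message);
    fprintf(out, "%s\n", source_line);
    fprintf(out, "%s\n",
            format_caret_line(source_line, column, caret, sizeof caret));
}
```

<p>Called with the source line and column 13 from the example above, the caret
ends up under the <tt>*</tt> inside the format string.</p>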

<h4>Range Highlighting for Related Text</h4>

<p>Clang captures and accurately tracks range information for expressions,
statements, and other constructs in your program and uses this to make
diagnostics highlight related information. Here is a somewhat nonsensical
example to illustrate this:</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only t.c</b>
  t.c:7: error: invalid operands to binary + (have 'int' and 'struct A')
  $ <b>clang -fsyntax-only t.c</b>
  t.c:7:39: error: invalid operands to binary expression ('int' and 'struct A')
  <font color="darkgreen">  return y + func(y ? ((SomeA.X + 40) + SomeA) / 42 + SomeA.X : SomeA.X);</font>
  <font color="blue">                       ~~~~~~~~~~~~~~ ^ ~~~~~</font>
</pre>

<p>Here you can see that you don't even need to see the original source code to
understand what is wrong based on the Clang error: because clang prints a
caret, you know exactly <em>which</em> plus it is complaining about. The range
information highlights the left and right sides of the plus, which makes it
immediately obvious what the compiler is talking about. This is very useful for
cases involving precedence issues and many other cases.</p>
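
<p>Range highlighting can be layered on top of the caret in the same way. The
sketch below is a hedged illustration rather than clang's code: the
<tt>CharRange</tt> type and <tt>format_underline</tt> function are
hypothetical, and ranges are assumed to be half-open, 0-based character offsets
within a single source line.</p>

```c
#include <assert.h>
#include <string.h>

/* Half-open range [begin, end) of 0-based offsets within one source line. */
typedef struct { unsigned begin, end; } CharRange;

/* Fill buf with spaces, '~' under each range, and '^' at caret_offset. */
char *format_underline(unsigned caret_offset, const CharRange *ranges,
                       unsigned num_ranges, char *buf, size_t buf_size) {
    unsigned width = caret_offset + 1;
    for (unsigned r = 0; r < num_ranges; r++) {
        assert(ranges[r].begin <= ranges[r].end);
        if (ranges[r].end > width) width = ranges[r].end;
    }
    assert(buf_size > width);
    memset(buf, ' ', width);
    for (unsigned r = 0; r < num_ranges; r++)
        memset(buf + ranges[r].begin, '~', ranges[r].end - ranges[r].begin);
    buf[caret_offset] = '^';    /* the caret wins over any '~' */
    buf[width] = '\0';
    return buf;
}
```

<p>For the diagnostic above, the caret sits at offset 38 (column 39) and the two
operand ranges are [23, 37) and [40, 45), which produces the underline shown
beneath the source line.</p>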

<h4>Precision in Wording</h4>

<p>A related detail is that we have tried really hard to make the diagnostics
that come out of clang contain exactly the pertinent information about what is
wrong and why. In the example above, we tell you what the inferred types are for
the left and right hand sides, and we don't repeat what is obvious from the
caret (that this is a "binary +"). Many other examples abound; here is a simple
one:</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only t.c</b>
  t.c:5: error: invalid type argument of 'unary *'
  $ <b>clang -fsyntax-only t.c</b>
  t.c:5:11: error: indirection requires pointer operand ('int' invalid)
  <font color="darkgreen">  int y = *SomeA.X;</font>
  <font color="blue">          ^~~~~~~~</font>
</pre>

<p>In this example, not only do we tell you that there is a problem with the *
and point to it, we say exactly why and tell you what the type is (in case it is
a complicated subexpression, such as a call to an overloaded function). This
sort of attention to detail makes it much easier to understand and fix problems
quickly.</p>

<h4>No Pretty Printing of Expressions in Diagnostics</h4>

<p>Since Clang has range highlighting, it never needs to pretty print your code
back out to you. Pretty printing is particularly bad in G++ (which often emits
errors containing lowered vtable references), but even GCC can produce
inscrutable error messages in some cases when it tries to do this. In this
example P and Q have type "int*":</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only t.c</b>
  #'exact_div_expr' not supported by pp_c_expression#'t.c:12: error: called object is not a function
  $ <b>clang -fsyntax-only t.c</b>
  t.c:12:8: error: called object type 'int' is not a function or function pointer
  <font color="darkgreen">  (P-Q)();</font>
  <font color="blue">  ~~~~~^</font>
</pre>


<h4>Typedef Preservation and Selective Unwrapping</h4>

<p>Many programmers use high-level user defined types, typedefs, and other
syntactic sugar to refer to types in their program. This is useful because they
can abbreviate otherwise very long types, and it is useful to preserve the
typename in diagnostics. However, sometimes very simple typedefs can wrap
trivial types and it is important to strip off the typedef to understand what
is going on. Clang aims to handle both cases well.</p>

<p>Here is an example that shows where it is important to preserve
a typedef in C:</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only t.c</b>
  t.c:15: error: invalid operands to binary / (have 'float __vector__' and 'const int *')
  $ <b>clang -fsyntax-only t.c</b>
  t.c:15:11: error: can't convert between vector values of different size ('__m128' and 'int const *')
  <font color="darkgreen">  myvec[1]/P;</font>
  <font color="blue">  ~~~~~~~~^~</font>
</pre>

<p>Here the type printed by GCC isn't even valid, but if the error were about a
very long and complicated type (as often happens in C++) the error message would
be ugly just because it was long and hard to read. Here's an example where it
is useful for the compiler to expose underlying details of a typedef:</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only t.c</b>
  t.c:13: error: request for member 'x' in something not a structure or union
  $ <b>clang -fsyntax-only t.c</b>
  t.c:13:9: error: member reference base type 'pid_t' (aka 'int') is not a structure or union
  <font color="darkgreen">  myvar = myvar.x;</font>
  <font color="blue">          ~~~~~ ^</font>
</pre>

<p>If the user was somehow confused about how the system "pid_t" typedef is
defined, Clang helpfully displays it with "aka".</p>
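
<p>One way to picture the "aka" behaviour is a type printer that keeps the name
the user wrote and appends the fully unwrapped type when the two differ. The C
sketch below is an assumption-laden illustration, not clang's type printer: the
<tt>Type</tt> struct, <tt>desugar</tt> and <tt>format_type</tt> are hypothetical
names, and it always prints the "aka" part for typedefs, whereas the examples
above show that Clang is selective (the <tt>__m128</tt> typedef is printed
without it).</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A toy type: either a builtin or a typedef naming another type. */
typedef struct Type {
    const char *name;
    const struct Type *underlying;   /* NULL for non-typedef types */
} Type;

/* Strip every typedef layer to reach the underlying type. */
const Type *desugar(const Type *t) {
    while (t->underlying) t = t->underlying;
    return t;
}

/* Format as 'name' or, for typedefs, 'name' (aka 'underlying'). */
char *format_type(const Type *t, char *buf, size_t buf_size) {
    assert(t && buf && buf_size > 0);
    const Type *canon = desugar(t);
    if (canon == t)
        snprintf(buf, buf_size, "'%s'", t->name);
    else
        snprintf(buf, buf_size, "'%s' (aka '%s')", t->name, canon->name);
    return buf;
}
```

<p>Keeping the written name and the underlying type as separate pieces of data
is what lets a diagnostic choose between them case by case.</p>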

<h4>Automatic Macro Expansion</h4>

<p>Many errors happen in macros that are sometimes deeply nested. With
traditional compilers, you need to dig deep into the definition of the macro to
understand how you got into trouble. Here's a simple example that shows how
Clang helps you out:</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only t.c</b>
  t.c: In function 'test':
  t.c:80: error: invalid operands to binary &lt; (have 'struct mystruct' and 'float')
  $ <b>clang -fsyntax-only t.c</b>
  t.c:80:3: error: invalid operands to binary expression ('typeof(P)' (aka 'struct mystruct') and 'typeof(F)' (aka 'float'))
  <font color="darkgreen">  X = MYMAX(P, F);</font>
  <font color="blue">  ^~~~~~~~~~~</font>
  t.c:76:94: note: instantiated from:
  <font color="darkgreen">#define MYMAX(A,B) __extension__ ({ __typeof__(A) __a = (A); __typeof__(B) __b = (B); __a &lt; __b ? __b : __a; })</font>
  <font color="blue">                                                                                      ~~~ ^ ~~~</font>
</pre>

<p>This shows how clang automatically prints instantiation information and
nested range information for diagnostics as they are instantiated through macros
and also shows how some of the other pieces work in a bigger example. Here's
another real world warning that occurs in the "window" Unix package (which
implements the "wwopen" class of APIs):</p>

<pre>
  $ <b>clang -fsyntax-only t.c</b>
  t.c:22:2: warning: type specifier missing, defaults to 'int'
  <font color="darkgreen"> ILPAD();</font>
  <font color="blue"> ^</font>
  t.c:17:17: note: instantiated from:
  <font color="darkgreen">#define ILPAD() PAD((NROW - tt.tt_row) * 10) /* 1 ms per char */</font>
  <font color="blue">                ^</font>
  t.c:14:2: note: instantiated from:
  <font color="darkgreen"> register i; \</font>
  <font color="blue"> ^</font>
</pre>

<p>In practice, we've found that this is actually more useful in multiply nested
macros than in simple ones.</p>
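
<p>The chain of "instantiated from" notes can be modelled as a linked list that
leads from the innermost token back out to the place where the outermost macro
was used. The sketch below is only a guess at the shape of such a structure,
not clang's source location machinery: <tt>MacroLoc</tt>, its
<tt>invoked_at</tt> field and <tt>emit_with_macro_notes</tt> are hypothetical
names. It prints the outermost location first, then one note per macro level,
matching the order of the output above.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct MacroLoc {
    const char *file;
    unsigned line, column;
    const struct MacroLoc *invoked_at;   /* NULL when not inside a macro */
} MacroLoc;

/* Write the diagnostic for `loc` into buf, outermost location first.
 * Returns the number of characters written so far. */
size_t emit_with_macro_notes(const MacroLoc *loc, const char *message,
                             char *buf, size_t size) {
    assert(loc && message && buf && size > 0);
    int n;
    size_t used = 0;
    if (loc->invoked_at == NULL) {
        n = snprintf(buf, size, "%s:%u:%u: warning: %s\n",
                     loc->file, loc->line, loc->column, message);
    } else {
        used = emit_with_macro_notes(loc->invoked_at, message, buf, size);
        n = snprintf(buf + used, size - used,
                     "%s:%u:%u: note: instantiated from:\n",
                     loc->file, loc->line, loc->column);
    }
    assert(n >= 0 && used + (size_t)n < size);   /* refuse to truncate */
    return used + (size_t)n;
}
```

<p>Feeding it the three locations from the "window" example (14:2 inside 17:17
inside 22:2) reproduces the three header lines of that diagnostic.</p>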

<h4>Fix-it Hints</h4>

<p>simple example + template&lt;&gt; example</p>

<h4>C++ Fun Examples</h4>

<p>...</p>

<!--=======================================================================-->
<h3><a name="gcccompat">GCC Compatibility</a></h3>
<!--=======================================================================-->

<p>GCC is currently the de facto standard open source compiler, and it
routinely compiles a huge volume of code. GCC supports a huge number of
extensions and features (many of which are undocumented) and a lot of
code and header files depend on these features in order to build.</p>

<p>While it would be nice to be able to ignore these extensions and focus on
implementing the language standards to the letter, pragmatics force us to
support the GCC extensions that see the most use. Many users just want their
code to compile; they don't care to argue about whether it is pedantically C99
or not.</p>

<p>As discussed under <a href="#conformance">conformance</a>, all
extensions are explicitly recognized as such and marked with extension
diagnostics, which can be mapped to warnings, errors, or just ignored.
</p>
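
<p>The mapping of extension diagnostics can be pictured as a small policy
function. The C sketch below is a simplified illustration under our own
assumptions: the enum names and <tt>classify</tt> are hypothetical and do not
correspond to clang's actual diagnostic classes or command line flags.</p>

```c
#include <assert.h>

typedef enum { MAP_IGNORE, MAP_WARNING, MAP_ERROR } Mapping;
typedef enum { DIAG_WARNING, DIAG_EXTENSION, DIAG_ERROR } DiagClass;
typedef enum { EMIT_NOTHING, EMIT_WARNING, EMIT_ERROR } Emitted;

/* Decide what to emit for a diagnostic of the given class.  Only extension
 * diagnostics consult the user-selected mapping. */
Emitted classify(DiagClass cls, Mapping extension_mapping) {
    switch (cls) {
    case DIAG_ERROR:   return EMIT_ERROR;
    case DIAG_WARNING: return EMIT_WARNING;
    case DIAG_EXTENSION:
        switch (extension_mapping) {
        case MAP_IGNORE:  return EMIT_NOTHING;
        case MAP_WARNING: return EMIT_WARNING;
        case MAP_ERROR:   return EMIT_ERROR;
        }
    }
    assert(0 && "unknown diagnostic class or mapping");
    return EMIT_ERROR;
}
```

<p>Because every use of an extension goes through one classification point, the
same front-end can act as a permissive GCC-compatible compiler or as a strict
conformance checker.</p>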


<!--*************************************************************************-->
<h2><a name="applications">Utility and Applications</a></h2>
<!--*************************************************************************-->

<!--=======================================================================-->
<h3><a name="libraryarch">Library Based Architecture</a></h3>
<!--=======================================================================-->

<p>A major design concept for clang is its use of a library-based
architecture. In this design, various parts of the front-end can be cleanly
divided into separate libraries which can then be mixed and matched for
different needs and uses. In addition, the library-based approach encourages
good interfaces and makes it easier for new developers to get involved (because
they only need to understand small pieces of the big picture).</p>

<blockquote>
"The world needs better compiler tools, tools which are built as libraries.
This design point allows reuse of the tools in new and novel ways. However,
building the tools as libraries isn't enough: they must have clean APIs, be as
decoupled from each other as possible, and be easy to modify/extend. This
requires clean layering, decent design, and keeping the libraries independent of
any specific client."</blockquote>

<p>
Currently, clang is divided into the following libraries and tool:
</p>

<ul>
<li><b>libsupport</b> - Basic support library, from LLVM.</li>
<li><b>libsystem</b> - System abstraction library, from LLVM.</li>
<li><b>libbasic</b> - Diagnostics, SourceLocations, SourceBuffer abstraction,
    file system caching for input source files.</li>
<li><b>libast</b> - Provides classes to represent the C AST, the C type system,
    builtin functions, and various helpers for analyzing and manipulating the
    AST (visitors, pretty printers, etc).</li>
<li><b>liblex</b> - Lexing and preprocessing, identifier hash table, pragma
    handling, tokens, and macro expansion.</li>
<li><b>libparse</b> - Parsing. This library invokes coarse-grained 'Actions'
    provided by the client (e.g. libsema builds ASTs) but knows nothing about
    ASTs or other client-specific data structures.</li>
<li><b>libsema</b> - Semantic Analysis. This provides a set of parser actions
    to build a standardized AST for programs.</li>
<li><b>libcodegen</b> - Lower the AST to LLVM IR for optimization &amp; code
    generation.</li>
<li><b>librewrite</b> - Editing of text buffers (important for code rewriting
    transformation, like refactoring).</li>
<li><b>libanalysis</b> - Static analysis support.</li>
<li><b>clang</b> - A driver program, client of the libraries at various
    levels.</li>
</ul>

<p>As an example of the power of this library-based design: if you wanted to
build a preprocessor, you would take the Basic and Lexer libraries. If you want
an indexer, you would take the previous two and add the Parser library and
some actions for indexing. If you want a refactoring, static analysis, or
source-to-source compiler tool, you would then add the AST building and
semantic analyzer libraries.</p>
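
<p>The split between the parser and its client-provided actions can be sketched
in a few lines of C. This is a toy written for illustration, not the libparse
interface: the <tt>Actions</tt> struct, <tt>parse_unit</tt> and
<tt>count_function</tt> are hypothetical, and the "grammar" is just a sequence
of <tt>name();</tt> items. The point is that the parser only reports what it
recognized and never touches the client's data structures.</p>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Callbacks supplied by the client; the parser never inspects client data. */
typedef struct {
    void (*on_function)(void *client, const char *name, size_t len);
    void *client;
} Actions;

/* Toy grammar: a sequence of "name();" items separated by whitespace.
 * Returns 1 on success, 0 on a syntax error. */
int parse_unit(const char *src, const Actions *actions) {
    assert(src && actions && actions->on_function);
    const char *p = src;
    for (;;) {
        while (isspace((unsigned char)*p)) p++;
        if (*p == '\0') return 1;
        const char *start = p;
        while (isalnum((unsigned char)*p) || *p == '_') p++;
        if (p == start) return 0;                   /* expected an identifier */
        size_t len = (size_t)(p - start);
        if (p[0] != '(' || p[1] != ')' || p[2] != ';') return 0;
        p += 3;
        actions->on_function(actions->client, start, len);
    }
}

/* One possible client: an "indexer" that only counts what it is told about. */
void count_function(void *client, const char *name, size_t len) {
    (void)name;
    (void)len;
    ++*(unsigned *)client;
}
```

<p>An AST-building client would supply a different <tt>on_function</tt> that
allocates nodes, while the parsing code stays unchanged.</p>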

<p>For more information about the low-level implementation details of the
various clang libraries, please see the <a href="docs/InternalsManual.html">
clang Internals Manual</a>.</p>

<!--=======================================================================-->
<h3><a name="diverseclients">Support Diverse Clients</a></h3>
<!--=======================================================================-->

<p>Clang is designed and built with many grand plans for how we can use it. The
driving force is the fact that we use C and C++ daily, and have to suffer due to
a lack of good tools available for them. We believe that the C and C++ tools
ecosystem has been significantly limited by how difficult it is to parse and
represent the source code for these languages, and we aim to rectify this
problem in clang.</p>

<p>The problem with this goal is that different clients have very different
requirements. Consider code generation, for example: a simple front-end that
parses for code generation must analyze the code for validity and emit code
in some intermediate form to pass off to an optimizer or backend. Because
validity analysis and code generation can largely be done on the fly, there is
no hard requirement that the front-end actually build up a full AST for all
the expressions and statements in the code. TCC and GCC are examples of
compilers that either build no real AST (in the former case) or build a stripped
down and simplified AST (in the latter case) because they focus primarily on
codegen.</p>

<p>On the opposite side of the spectrum, some clients (like refactoring) want
highly detailed information about the original source code and want a complete
AST to describe it with. Refactoring wants to have information about macro
expansions, the location of every paren expression '(((x)))' vs 'x', full
position information, and much more. Further, refactoring wants to look
<em>across the whole program</em> to ensure that it is making transformations
that are safe. Making this efficient and getting this right requires a
significant amount of engineering and algorithmic work that is simply
unnecessary for a simple static compiler.</p>

<p>The beauty of the clang approach is that it does not restrict how you use it.
In particular, it is possible to use the clang preprocessor and parser to build
an extremely quick and light-weight on-the-fly code generator (similar to TCC)
that does not build an AST at all. As an intermediate step, clang supports
using the current AST generation and semantic analysis code and having a code
generation client free the AST for each function after code generation. Finally,
clang provides support for building and retaining fully-fledged ASTs, and even
supports writing them out to disk.</p>

<p>Designing the libraries with clean and simple APIs allows these high-level
policy decisions to be determined in the client, instead of forcing "one true
way" in the implementation of any of these libraries. Getting this right is
hard, and we don't always get it right the first time, but we fix any problems
when we realize we made a mistake.</p>

<!--=======================================================================-->
<h3><a name="ideintegration">Integration with IDEs</a></h3>
<!--=======================================================================-->

<p>
We believe that Integrated Development Environments (IDEs) are a great way
to pull together various pieces of the development puzzle, and aim to make clang
work well in such an environment. The chief advantage of IDEs is that they
typically have visibility across your entire project and are long-lived
processes, whereas stand-alone compiler tools are typically invoked on each
individual file in the project, and thus have limited scope.</p>

<p>There are many implications of this difference, but a significant one has to
do with efficiency and caching: sharing an address space across different files
in a project means that you can use intelligent caching and other techniques to
dramatically reduce analysis/compilation time.</p>

<p>A further difference between IDEs and batch compilers is that IDEs often
impose very different requirements on the front-end: they depend on high
performance in order to provide a "snappy" experience, and thus really want
techniques like "incremental compilation", "fuzzy parsing", etc. Finally, IDEs
often have very different requirements than code generation, often requiring
information that a codegen-only frontend can throw away. Clang is
specifically designed and built to capture this information.
</p>


<!--=======================================================================-->
<h3><a name="license">Use the LLVM 'BSD' License</a></h3>
<!--=======================================================================-->

<p>We actively intend for clang (and LLVM as a whole) to be used for
commercial projects, and the BSD license is the simplest way to allow this. We
feel that the license encourages contributors to pick up the source and work
with it, and believe that those individuals and organizations will contribute
back their work if they do not want to have to maintain a fork forever (which is
time consuming and expensive when merges are involved). Further, nobody makes
money on compilers these days, but many people need them to get bigger goals
accomplished: it makes sense for everyone to work together.</p>

<p>For more information about the LLVM/clang license, please see the <a
href="http://llvm.org/docs/DeveloperPolicy.html#license">LLVM License
Description</a>.</p>



<!--*************************************************************************-->
<h2><a name="design">Internal Design and Implementation</a></h2>
<!--*************************************************************************-->

<!--=======================================================================-->
<h3><a name="real">A real-world, production quality compiler</a></h3>
<!--=======================================================================-->

<p>
Clang is designed and built by experienced compiler developers who
are increasingly frustrated with the problems that <a
href="comparison.html">existing open source compilers</a> have. Clang is
carefully and thoughtfully designed and built to provide the foundation of a
whole new generation of C/C++/Objective C development tools, and we intend for
it to be production quality.</p>

<p>Being a production quality compiler means many things: it means being high
performance, being solid and (relatively) bug free, and it means eventually
being used and depended on by a broad range of people. While we are still in
the early development stages, we strongly believe that this will become a
reality.</p>

<!--=======================================================================-->
<h3><a name="simplecode">A simple and hackable code base</a></h3>
<!--=======================================================================-->

<p>Our goal is to make it possible for anyone with a basic understanding
of compilers and working knowledge of the C/C++/ObjC languages to understand and
extend the clang source base. A large part of this falls out of our decision to
make the AST mirror the languages as closely as possible: you have your friendly
if statement, for statement, parenthesis expression, structs, unions, etc, all
represented in a simple and explicit way.</p>

<p>In addition to a simple design, we work to make the source base approachable
by commenting it well, including citations of the language standards where
appropriate, and designing the code for simplicity. Beyond that, clang offers
a set of AST dumpers, printers, and visualizers that make it easy to put code in
and see how it is represented.</p>

<!--=======================================================================-->
<h3><a name="unifiedparser">A single unified parser for C, Objective C, C++,
and Objective C++</a></h3>
<!--=======================================================================-->

<p>Clang is the "C Language Family Front-end", which means we intend to support
the most popular members of the C family. We are convinced that the right
parsing technology for this class of languages is a hand-built recursive-descent
parser. Because it is plain C++ code, recursive descent makes it very easy for
new developers to understand the code, easily supports ad-hoc rules and other
strange hacks required by C/C++, and makes it straightforward to implement
excellent diagnostics and error recovery.</p>
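
<p>For readers who have not seen the technique, here is a minimal
recursive-descent parser for a tiny expression grammar, written in C. It is a
generic textbook-style sketch and has nothing to do with clang's actual parser:
the <tt>Parser</tt> struct and all function names are hypothetical. Each
grammar rule becomes one ordinary function, which is what makes the approach
easy to read, to special-case, and to instrument with diagnostics.</p>

```c
#include <assert.h>
#include <ctype.h>

typedef struct { const char *pos; int error; } Parser;

long parse_expr(Parser *p);

void skip_spaces(Parser *p) {
    while (isspace((unsigned char)*p->pos)) p->pos++;
}

/* primary := NUMBER | '(' expr ')' */
long parse_primary(Parser *p) {
    skip_spaces(p);
    if (*p->pos == '(') {
        p->pos++;
        long inner = parse_expr(p);
        skip_spaces(p);
        if (*p->pos == ')') p->pos++; else p->error = 1;
        return inner;
    }
    if (!isdigit((unsigned char)*p->pos)) { p->error = 1; return 0; }
    long v = 0;
    while (isdigit((unsigned char)*p->pos)) v = v * 10 + (*p->pos++ - '0');
    return v;
}

/* term := primary ('*' primary)* */
long parse_term(Parser *p) {
    long v = parse_primary(p);
    for (skip_spaces(p); *p->pos == '*'; skip_spaces(p)) {
        p->pos++;
        v *= parse_primary(p);
    }
    return v;
}

/* expr := term ('+' term)* */
long parse_expr(Parser *p) {
    long v = parse_term(p);
    for (skip_spaces(p); *p->pos == '+'; skip_spaces(p)) {
        p->pos++;
        v += parse_term(p);
    }
    return v;
}

/* Parse a whole string; returns 1 and stores the value on success. */
int evaluate(const char *text, long *result) {
    assert(text && result);
    Parser p = { text, 0 };
    long v = parse_expr(&p);
    skip_spaces(&p);
    if (p.error || *p.pos != '\0') return 0;
    *result = v;
    return 1;
}
```

<p>Precedence falls out of the call structure (<tt>parse_expr</tt> calls
<tt>parse_term</tt>, which calls <tt>parse_primary</tt>), and the point where
<tt>p->error</tt> is set is exactly where a precise diagnostic could be
issued.</p>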

<p>We believe that implementing C/C++/ObjC in a single unified parser makes the
end result easier to maintain and evolve than maintaining separate C and C++
parsers which must be bugfixed and maintained independently of each other.</p>

<!--=======================================================================-->
<h3><a name="conformance">Conformance with C/C++/ObjC and their
    variants</a></h3>
<!--=======================================================================-->

<p>When you start work on implementing a language, you find out that there is a
huge gap between how the language works and how most people understand it to
work. This gap is the difference between a normal programmer and a (scary?
super-natural?) "language lawyer", who knows the ins and outs of the language
and can grok standardese with ease.</p>

<p>In practice, being conformant with the languages means that we aim to support
the full language, including the dark and dusty corners (like trigraphs,
preprocessor arcana, C99 VLAs, etc). Where we support extensions above and
beyond what the standard officially allows, we make an effort to explicitly call
this out in the code and emit warnings about it (which are disabled by default,
but can optionally be mapped to either warnings or errors), allowing you to use
clang in "strict" mode if you desire.</p>

<p>We also intend to support "dialects" of these languages, such as C89, K&amp;R
C, C++'03, Objective-C 2, etc.</p>

</div>
</body>
</html>