<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
          "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
  <title>Clang - Features and Goals</title>
  <link type="text/css" rel="stylesheet" href="menu.css">
  <link type="text/css" rel="stylesheet" href="content.css">
  <style type="text/css">
</style>
</head>
<body>

<!--#include virtual="menu.html.incl"-->

<div id="content">

<!--*************************************************************************-->
<h1>Clang - Features and Goals</h1>
<!--*************************************************************************-->

<p>
This page describes the <a href="index.html#goals">features and goals</a> of
Clang in more detail and explains more broadly what we mean by each of them.
These features are:
</p>

<p>End-User Features:</p>

<ul>
<li><a href="#performance">Fast compiles and low memory use</a></li>
<li><a href="#expressivediags">Expressive diagnostics</a></li>
<li><a href="#gcccompat">GCC compatibility</a></li>
</ul>

<p>Utility and Applications:</p>

<ul>
<li><a href="#libraryarch">Library based architecture</a></li>
<li><a href="#diverseclients">Support diverse clients</a></li>
<li><a href="#ideintegration">Integration with IDEs</a></li>
<li><a href="#license">Use the LLVM 'BSD' License</a></li>
</ul>

<p>Internal Design and Implementation:</p>

<ul>
<li><a href="#real">A real-world, production quality compiler</a></li>
<li><a href="#simplecode">A simple and hackable code base</a></li>
<li><a href="#unifiedparser">A single unified parser for C, Objective C, C++,
    and Objective C++</a></li>
<li><a href="#conformance">Conformance with C/C++/ObjC and their
    variants</a></li>
</ul>

<!--*************************************************************************-->
<h2><a name="enduser">End-User Features</a></h2>
<!--*************************************************************************-->


<!--=======================================================================-->
<h3><a name="performance">Fast compiles and Low Memory Use</a></h3>
<!--=======================================================================-->

<p>A major focus of our work on clang is to make it fast, light and scalable.
The library-based architecture of clang makes it straightforward to time and
profile the cost of each layer of the stack, and the driver has a number of
options for performance analysis. Many detailed benchmarks can be found online.</p>

<p>Compile-time performance is important, but when using clang as an API,
memory use is often even more important: the less memory the code takes, the
more code you can fit into memory at a time (useful for whole program analysis
tools, for example).</p>

<p>In addition to being efficient when pitted head-to-head against GCC in batch
mode, clang is built with a <a href="#libraryarch">library based
architecture</a> that makes it relatively easy to adapt it and build new tools
with it. This means that it is often possible to apply out-of-the-box thinking
and novel techniques to improve compilation in various ways.</p>


<!--=======================================================================-->
<h3><a name="expressivediags">Expressive Diagnostics</a></h3>
<!--=======================================================================-->

<p>In addition to being fast and functional, we aim to make Clang extremely user
friendly. As far as a command-line compiler goes, this basically boils down to
making the diagnostics (error and warning messages) generated by the compiler
be as useful as possible. There are several ways that we do this, but the
most important are pinpointing exactly what is wrong in the program,
highlighting related information so that it is easy to understand at a glance,
and making the wording as clear as possible.</p>

<p>Here is one simple example that illustrates the difference between a typical
GCC and Clang diagnostic:</p>

<pre>
  $ <b>gcc-4.2 -fsyntax-only t.c</b>
  t.c:7: error: invalid operands to binary + (have 'int' and 'struct A')
  $ <b>clang -fsyntax-only t.c</b>
  t.c:7:39: error: invalid operands to binary expression ('int' and 'struct A')
  <span style="color:darkgreen">  return y + func(y ? ((SomeA.X + 40) + SomeA) / 42 + SomeA.X : SomeA.X);</span>
  <span style="color:blue">                       ~~~~~~~~~~~~~~ ^ ~~~~~</span>
</pre>

<p>Here you can see that you don't even need to see the original source code to
understand what is wrong based on the Clang error: because Clang prints a
caret, you know exactly <em>which</em> plus it is complaining about. The range
information highlights the left and right sides of the plus, which makes it
immediately obvious what the compiler is talking about. This is very useful for
cases involving precedence issues and many other situations.</p>

<p>Clang diagnostics are very polished and have many features. For more
information and examples, please see the <a href="diagnostics.html">Expressive
Diagnostics</a> page.</p>

<!--=======================================================================-->
<h3><a name="gcccompat">GCC Compatibility</a></h3>
<!--=======================================================================-->

<p>GCC is currently the de facto standard open source compiler, and it
routinely compiles a huge volume of code. GCC supports a huge number of
extensions and features (many of which are undocumented) and a lot of
code and header files depend on these features in order to build.</p>

<p>While it would be nice to be able to ignore these extensions and focus on
implementing the language standards to the letter, pragmatics force us to
support the GCC extensions that see the most use. Many users just want their
code to compile; they don't care to argue about whether it is pedantically C99
or not.</p>

<p>As described in the <a href="#conformance">conformance</a> section below, all
extensions are explicitly recognized as such and marked with extension
diagnostics, which can be mapped to warnings, errors, or just ignored.
</p>


<!--*************************************************************************-->
<h2><a name="applications">Utility and Applications</a></h2>
<!--*************************************************************************-->

<!--=======================================================================-->
<h3><a name="libraryarch">Library Based Architecture</a></h3>
<!--=======================================================================-->

<p>A major design concept for clang is its use of a library-based
architecture. In this design, various parts of the front-end can be cleanly
divided into separate libraries which can then be mixed and matched for
different needs and uses. In addition, the library-based approach encourages
good interfaces and makes it easier for new developers to get involved (because
they only need to understand small pieces of the big picture).</p>

<blockquote><p>
"The world needs better compiler tools, tools which are built as libraries.
This design point allows reuse of the tools in new and novel ways. However,
building the tools as libraries isn't enough: they must have clean APIs, be as
decoupled from each other as possible, and be easy to modify/extend. This
requires clean layering, decent design, and keeping the libraries independent of
any specific client."</p></blockquote>

<p>
Currently, clang is divided into the following libraries and tool:
</p>

<ul>
<li><b>libsupport</b> - Basic support library, from LLVM.</li>
<li><b>libsystem</b> - System abstraction library, from LLVM.</li>
<li><b>libbasic</b> - Diagnostics, SourceLocations, SourceBuffer abstraction,
    file system caching for input source files.</li>
<li><b>libast</b> - Provides classes to represent the C AST, the C type system,
    builtin functions, and various helpers for analyzing and manipulating the
    AST (visitors, pretty printers, etc.).</li>
<li><b>liblex</b> - Lexing and preprocessing, identifier hash table, pragma
    handling, tokens, and macro expansion.</li>
<li><b>libparse</b> - Parsing. This library invokes coarse-grained 'Actions'
    provided by the client (e.g. libsema builds ASTs) but knows nothing about
    ASTs or other client-specific data structures.</li>
<li><b>libsema</b> - Semantic Analysis. This provides a set of parser actions
    to build a standardized AST for programs.</li>
<li><b>libcodegen</b> - Lower the AST to LLVM IR for optimization &amp; code
    generation.</li>
<li><b>librewrite</b> - Editing of text buffers (important for code rewriting
    transformation, like refactoring).</li>
<li><b>libanalysis</b> - Static analysis support.</li>
<li><b>clang</b> - A driver program, client of the libraries at various
    levels.</li>
</ul>

<p>As an example of the power of this library-based design: if you wanted to
build a preprocessor, you would take the Basic and Lexer libraries. If you
wanted an indexer, you would take the previous two and add the Parser library
and some actions for indexing. If you wanted a refactoring, static analysis, or
source-to-source compiler tool, you would then add the AST building and
semantic analyzer libraries.</p>

<p>For more information about the low-level implementation details of the
various clang libraries, please see the <a href="docs/InternalsManual.html">
clang Internals Manual</a>.</p>

<!--=======================================================================-->
<h3><a name="diverseclients">Support Diverse Clients</a></h3>
<!--=======================================================================-->

<p>Clang is designed and built with many grand plans for how we can use it. The
driving force is the fact that we use C and C++ daily, and have to suffer due to
a lack of good tools available for them. We believe that the C and C++ tools
ecosystem has been significantly limited by how difficult it is to parse and
represent the source code for these languages, and we aim to rectify this
problem in clang.</p>

<p>The problem with this goal is that different clients have very different
requirements. Consider code generation, for example: a simple front-end that
parses for code generation must analyze the code for validity and emit code
in some intermediate form to pass off to an optimizer or backend. Because
validity analysis and code generation can largely be done on the fly, there is
no hard requirement that the front-end actually build up a full AST for all
the expressions and statements in the code. TCC and GCC are examples of
compilers that either build no real AST (in the former case) or build a stripped
down and simplified AST (in the latter case) because they focus primarily on
codegen.</p>

<p>On the opposite side of the spectrum, some clients (like refactoring) want
highly detailed information about the original source code and want a complete
AST to describe it with. Refactoring wants to have information about macro
expansions, the location of every paren expression '(((x)))' vs 'x', full
position information, and much more. Further, refactoring wants to look
<em>across the whole program</em> to ensure that it is making transformations
that are safe. Making this efficient and getting this right requires a
significant amount of engineering and algorithmic work that is simply
unnecessary for a simple static compiler.</p>

<p>The beauty of the clang approach is that it does not restrict how you use it.
In particular, it is possible to use the clang preprocessor and parser to build
an extremely quick and light-weight on-the-fly code generator (similar to TCC)
that does not build an AST at all. As an intermediate step, clang supports
using the current AST generation and semantic analysis code and having a code
generation client free the AST for each function after code generation. Finally,
clang provides support for building and retaining fully-fledged ASTs, and even
supports writing them out to disk.</p>

<p>Designing the libraries with clean and simple APIs allows these high-level
policy decisions to be determined in the client, instead of forcing "one true
way" in the implementation of any of these libraries. Getting this right is
hard, and we don't always get it right the first time, but we fix any problems
when we realize we made a mistake.</p>

<!--=======================================================================-->
<h3 id="ideintegration">Integration with IDEs</h3>
<!--=======================================================================-->

<p>
We believe that Integrated Development Environments (IDEs) are a great way
to pull together various pieces of the development puzzle, and aim to make clang
work well in such an environment. The chief advantage of an IDE is that it
typically has visibility across your entire project and is a long-lived
process, whereas stand-alone compiler tools are typically invoked on each
individual file in the project, and thus have limited scope.</p>

<p>There are many implications of this difference, but a significant one has to
do with efficiency and caching: sharing an address space across different files
in a project means that you can use intelligent caching and other techniques to
dramatically reduce analysis/compilation time.</p>

<p>A further difference between IDEs and batch compilers is that they often
impose very different requirements on the front-end: IDEs depend on high
performance in order to provide a "snappy" experience, and thus really want
techniques like "incremental compilation", "fuzzy parsing", etc. Finally, IDEs
often have very different requirements than code generation, often requiring
information that a codegen-only frontend can throw away. Clang is
specifically designed and built to capture this information.
</p>


<!--=======================================================================-->
<h3><a name="license">Use the LLVM 'BSD' License</a></h3>
<!--=======================================================================-->

<p>We actively intend for clang (and LLVM as a whole) to be used for
commercial projects, not only as a stand-alone compiler but also as a library
embedded inside a proprietary application. The BSD license is the simplest way
to allow this. We feel that the license encourages contributors to pick up the
source and work with it, and believe that those individuals and organizations
will contribute back their work if they do not want to have to maintain a fork
forever (which is time-consuming and expensive when merges are involved).
Further, nobody makes money on compilers these days, but many people need them
to get bigger goals accomplished: it makes sense for everyone to work
together.</p>

<p>For more information about the LLVM/clang license, please see the <a
href="http://llvm.org/docs/DeveloperPolicy.html#license">LLVM License
Description</a>.</p>



<!--*************************************************************************-->
<h2><a name="design">Internal Design and Implementation</a></h2>
<!--*************************************************************************-->

<!--=======================================================================-->
<h3><a name="real">A real-world, production quality compiler</a></h3>
<!--=======================================================================-->

<p>
Clang is designed and built by experienced compiler developers who
are increasingly frustrated with the problems that <a
href="comparison.html">existing open source compilers</a> have. Clang is
carefully and thoughtfully designed and built to provide the foundation of a
whole new generation of C/C++/Objective C development tools, and we intend for
it to be production quality.</p>

<p>Being a production quality compiler means many things: it means being high
performance, being solid and (relatively) bug free, and it means eventually
being used and depended on by a broad range of people. While we are still in
the early development stages, we strongly believe that this will become a
reality.</p>

<!--=======================================================================-->
<h3><a name="simplecode">A simple and hackable code base</a></h3>
<!--=======================================================================-->

<p>Our goal is to make it possible for anyone with a basic understanding
of compilers and working knowledge of the C/C++/ObjC languages to understand and
extend the clang source base. A large part of this falls out of our decision to
make the AST mirror the languages as closely as possible: you have your friendly
if statement, for statement, parenthesis expression, structs, unions, etc., all
represented in a simple and explicit way.</p>

<p>In addition to a simple design, we work to make the source base approachable
by commenting it well, including citations of the language standards where
appropriate, and designing the code for simplicity. Beyond that, clang offers
a set of AST dumpers, printers, and visualizers that make it easy to put code in
and see how it is represented.</p>

<!--=======================================================================-->
<h3><a name="unifiedparser">A single unified parser for C, Objective C, C++,
and Objective C++</a></h3>
<!--=======================================================================-->

<p>Clang is the "C Language Family Front-end", which means we intend to support
the most popular members of the C family. We are convinced that the right
parsing technology for this class of languages is a hand-built recursive-descent
parser. Because it is plain C++ code, a recursive-descent parser is very easy
for new developers to understand, easily supports the ad-hoc rules and other
strange hacks required by C/C++, and makes it straightforward to implement
excellent diagnostics and error recovery.</p>

<p>We believe that implementing C/C++/ObjC in a single unified parser makes the
end result easier to maintain and evolve than separate C and C++ parsers, which
must be bug-fixed and maintained independently of each other.</p>

<!--=======================================================================-->
<h3><a name="conformance">Conformance with C/C++/ObjC and their
  variants</a></h3>
<!--=======================================================================-->

<p>When you start work on implementing a language, you find out that there is a
huge gap between how the language works and how most people understand it to
work. This gap is the difference between a normal programmer and a (scary?
supernatural?) "language lawyer", who knows the ins and outs of the language
and can grok standardese with ease.</p>

<p>In practice, being conformant with the languages means that we aim to support
the full language, including the dark and dusty corners (like trigraphs,
preprocessor arcana, C99 VLAs, etc.). Where we support extensions above and
beyond what the standard officially allows, we make an effort to explicitly call
this out in the code and emit warnings about it (which are disabled by default,
but can optionally be mapped to either warnings or errors), allowing you to use
clang in "strict" mode if you desire.</p>

<p>We also intend to support "dialects" of these languages, such as C89, K&amp;R
C, C++'03, Objective-C 2, etc.</p>

</div>
</body>
</html>