<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
          "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
  <title>Clang - Features and Goals</title>
  <link type="text/css" rel="stylesheet" href="menu.css" />
  <link type="text/css" rel="stylesheet" href="content.css" />
  <style type="text/css">
</style>
</head>
<body>

<!--#include virtual="menu.html.incl"-->

<div id="content">

<h1>Clang - Features and Goals</h1>
<p>
This page describes the <a href="index.html#goals">features and goals</a> of
Clang in more detail and explains more fully what we mean by each of them.
These features are:
</p>

<ul>
<li><a href="#real">A real-world, production quality compiler</a></li>
<li><a href="#unifiedparser">A single unified parser for C, Objective C, C++,
    and Objective C++</a></li>
<li><a href="#conformance">Conformance with C/C++/ObjC and their
    variants</a></li>
</ul>

<!--=======================================================================-->
<h2><a name="real">A real-world, production quality compiler</a></h2>
<!--=======================================================================-->

<p>
Clang is designed and built by experienced commercial compiler developers who
are increasingly frustrated with the problems that <a
href="comparison.html">existing open source compilers</a> have. Clang is
carefully and thoughtfully designed and built to provide the foundation of a
whole new generation of C/C++/Objective C development tools, and we intend for
it to be commercial quality.</p>

<p>Being a production quality compiler means many things: it means being high
performance, being solid and (relatively) bug free, and eventually being used
and depended on by a broad range of people. While we are still in the early
development stages, we strongly believe that this will become a reality.</p>

<!--=======================================================================-->
<h2><a name="unifiedparser">A single unified parser for C, Objective C, C++,
and Objective C++</a></h2>
<!--=======================================================================-->

<p>Clang is the "C Language Family Front-end", which means we intend to support
the most popular members of the C family. We are convinced that the right
parsing technology for this class of languages is a hand-built recursive-descent
parser. Because it is plain C++ code, a recursive-descent parser is very easy
for new developers to understand, easily supports the ad-hoc rules and other
strange hacks required by C/C++, and makes it straightforward to implement
excellent diagnostics and error recovery.</p>
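<p>As a rough illustration of why recursive descent is easy to follow, here is
a minimal sketch of a hand-written parser for a toy arithmetic grammar. It is
not Clang's parser: the <code>ToyParser</code> class, its grammar, and its
error format are all assumptions made for this example. Each grammar rule
becomes one ordinary C++ function, and a missing operand is reported and then
treated as 0 so that parsing can continue.</p>

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy grammar (illustrative only, not Clang's):
//   expr    := term (('+' | '-') term)*
//   term    := primary ('*' primary)*
//   primary := NUMBER | '(' expr ')'
class ToyParser {
public:
  explicit ToyParser(std::string text) : src(std::move(text)) {}

  // Messages collected while parsing; parsing never aborts on an error.
  std::vector<std::string> errors;

  // expr := term (('+' | '-') term)*
  long parseExpr() {
    long value = parseTerm();
    for (;;) {
      skipSpace();
      if (consume('+'))
        value += parseTerm();
      else if (consume('-'))
        value -= parseTerm();
      else
        return value;
    }
  }

private:
  // term := primary ('*' primary)*
  long parseTerm() {
    long value = parsePrimary();
    for (;;) {
      skipSpace();
      if (consume('*'))
        value *= parsePrimary();
      else
        return value;
    }
  }

  // primary := NUMBER | '(' expr ')'
  long parsePrimary() {
    skipSpace();
    if (consume('(')) {
      long value = parseExpr();
      skipSpace();
      if (!consume(')'))
        error("expected ')'");
      return value;
    }
    if (atDigit()) {
      long value = 0;
      while (atDigit())
        value = value * 10 + (src[pos++] - '0');
      return value;
    }
    // Error recovery: report the problem, then pretend the operand was 0.
    error("expected expression");
    return 0;
  }

  bool atDigit() const {
    return pos < src.size() &&
           std::isdigit(static_cast<unsigned char>(src[pos]));
  }

  bool consume(char c) {
    if (pos < src.size() && src[pos] == c) {
      ++pos;
      return true;
    }
    return false;
  }

  void skipSpace() {
    while (pos < src.size() &&
           std::isspace(static_cast<unsigned char>(src[pos])))
      ++pos;
  }

  void error(const std::string &message) {
    errors.push_back("offset " + std::to_string(pos) + ": " + message);
  }

  std::string src;
  std::size_t pos = 0;
};
```

<p>With this sketch, parsing <code>1 + 2 * (3 + 4)</code> yields 15 with no
errors, while <code>1 + * 2</code> records a single "expected expression"
error and still returns a value, because the parser keeps going after the
report.</p>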

<p>We believe that implementing C/C++/ObjC in a single unified parser makes the
end result easier to maintain and evolve than separate C and C++ parsers, which
must be fixed and maintained independently of each other.</p>

<!--=======================================================================-->
<h2><a name="conformance">Conformance with C/C++/ObjC and their
 variants</a></h2>
<!--=======================================================================-->

<p>When you start work on implementing a language, you find out that there is a
huge gap between how the language works and how most people understand it to
work. This gap is the difference between a normal programmer and a (scary?
super-natural?) "language lawyer", who knows the ins and outs of the language
and can grok standardese with ease.</p>

<p>In practice, being conformant with the languages means that we aim to support
the full language, including the dark and dusty corners (like trigraphs,
preprocessor arcana, C99 VLAs, etc). Where we support extensions above and
beyond what the standard officially allows, we make an effort to explicitly call
this out in the code and emit diagnostics about it (these are disabled by
default, but can optionally be mapped to either warnings or errors), allowing
you to use clang in "strict" mode if you desire.</p>
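<p>The mapping described above can be pictured as a small lookup table. The
following is only a sketch under stated assumptions: the
<code>ExtensionDiagPolicy</code> class, the <code>Severity</code> enum, and the
diagnostic names used with it are hypothetical and are not Clang's actual
interfaces. It shows extension diagnostics starting out ignored and then being
promoted, either all at once or one at a time.</p>

```cpp
#include <map>
#include <string>

// Hypothetical severity levels for an extension diagnostic.
enum class Severity { Ignored, Warning, Error };

// Hypothetical policy object: every extension diagnostic is ignored by
// default, and the user may remap all of them or individual ones.
class ExtensionDiagPolicy {
public:
  // Change the severity used for diagnostics without an explicit override.
  void setDefault(Severity severity) { defaultSeverity = severity; }

  // Override the severity of a single named diagnostic.
  void map(const std::string &name, Severity severity) {
    overrides[name] = severity;
  }

  // A per-diagnostic override wins over the default.
  Severity lookup(const std::string &name) const {
    auto it = overrides.find(name);
    return it != overrides.end() ? it->second : defaultSeverity;
  }

private:
  Severity defaultSeverity = Severity::Ignored;
  std::map<std::string, Severity> overrides;
};
```

<p>In this sketch, a "strict" mode would correspond to calling
<code>setDefault(Severity::Error)</code>, so that any use of an extension is
rejected.</p>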

<p>We also intend to support "dialects" of these languages, such as C89, K&amp;R
C, C++'03, Objective-C 2, etc.</p>

<!--=======================================================================-->
<h2>Library-based architecture</h2>
<!--=======================================================================-->

A major design concept for the LLVM front-end is its library-based architecture. The various parts of the front-end are cleanly divided into separate libraries, which can then be combined for different needs and uses. In addition, the library-based approach makes it much easier for new developers to get involved and extend LLVM to do new and unique things. In the words of Chris,
<div class="quote">"The world needs better compiler tools, tools which are built as libraries. This design point allows reuse of the tools in new and novel ways. However, building the tools as libraries isn't enough: they must have clean APIs, be as decoupled from each other as possible, and be easy to modify/extend. This requires clean layering, decent design, and keeping the libraries independent of any specific client."</div>
Currently, the LLVM front-end is divided into the following libraries:
<ul>
<li>libsupport - Basic support library, reused from LLVM.</li>
<li>libsystem - System abstraction library, reused from LLVM.</li>
<li>libbasic - Diagnostics, SourceLocations, SourceBuffer abstraction, file system caching for input source files. <span class="weak_txt">(depends on above libraries)</span></li>
<li>libast - Provides classes to represent the C AST, the C type system, builtin functions, and various helpers for analyzing and manipulating the AST (visitors, pretty printers, etc). <span class="weak_txt">(depends on above libraries)</span></li>
<li>liblex - C/C++/ObjC lexing and preprocessing, identifier hash table, pragma handling, tokens, and macros. <span class="weak_txt">(depends on above libraries)</span></li>
<li>libparse - Parsing and local semantic analysis. This library invokes coarse-grained 'Actions' provided by the client, which do the real work (e.g. libsema builds ASTs). <span class="weak_txt">(depends on above libraries)</span></li>
<li>libsema - Provides a set of parser actions to build a standardized AST for programs. ASTs are 'streamed' out a top-level declaration at a time, allowing clients to use decl-at-a-time processing, build up entire translation units, or even build 'whole program' ASTs depending on how they use the APIs. <span class="weak_txt">(depends on libast and libparse)</span></li>
<li>libcodegen - Lowers the AST to LLVM IR for optimization &amp; codegen. <span class="weak_txt">(depends on libast)</span></li>
<li>librewrite - Editing of text buffers. <span class="weak_txt">(depends on libast)</span></li>
<li>libanalysis - Static analysis support. <span class="weak_txt">(depends on libast)</span></li>
<li><b>clang</b> - An example driver, client of the libraries at various levels. <span class="weak_txt">(depends on above libraries, and LLVM VMCore)</span></li>
</ul>
As an example of the power of this library-based design: if you want to build a preprocessor, you would take the Basic and Lexer libraries. If you want an indexer, you would take the previous two and add the Parser library and some actions for indexing. If you want a refactoring, static analysis, or source-to-source compiler tool, you would then add the AST building and semantic analyzer libraries.
In the end, LLVM's library-based design lets developers combine the pieces they need for tools beyond a conventional compiler.
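<p>The separation between the parser and client-provided actions can be
sketched as a callback interface. This is a deliberately tiny model, not the
real library API: the <code>Actions</code> interface, the
<code>parseDecls</code> function, and the <code>func NAME</code> input format
are all invented for illustration. The point is that the same parsing code
drives two unrelated clients, an indexer and a counter, without knowing
anything about either.</p>

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical callback interface that the parser invokes. A client decides
// what a recognized declaration means (build an AST, index it, and so on).
struct Actions {
  virtual ~Actions() = default;
  virtual void onFunctionDecl(const std::string &name, unsigned line) = 0;
};

// Toy stand-in for a parser: it recognizes lines of the form "func NAME"
// and reports each one to the client. It never inspects the client's state.
void parseDecls(const std::vector<std::string> &lines, Actions &actions) {
  const std::string keyword = "func ";
  for (std::size_t i = 0; i < lines.size(); ++i) {
    if (lines[i].compare(0, keyword.size(), keyword) == 0)
      actions.onFunctionDecl(lines[i].substr(keyword.size()),
                             static_cast<unsigned>(i + 1));
  }
}

// Client 1: an indexer that records "name:line" entries.
struct Indexer : Actions {
  std::vector<std::string> entries;
  void onFunctionDecl(const std::string &name, unsigned line) override {
    entries.push_back(name + ":" + std::to_string(line));
  }
};

// Client 2: a counter that only cares how many declarations were seen.
struct DeclCounter : Actions {
  unsigned count = 0;
  void onFunctionDecl(const std::string &, unsigned) override { ++count; }
};
```

<p>Passing an <code>Indexer</code> to <code>parseDecls</code> yields an index,
while passing a <code>DeclCounter</code> to the very same function yields a
count. Swapping the client changes the tool without touching the parser.</p>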

<h2><a name="performance">Speed and Memory</a></h2>
Another major focus of LLVM's front-end is speed (for all libraries). Even at this early stage, the Clang front-end is faster than GCC and uses less memory.<br>
<div class="img_container">
  <div class="img_title">Memory:</div>
  <img src="feature-memory1.png" />
  <div class="img_desc">This test was run using Mac OS X's Carbon.h header, which is 12.3MB spread across 558 files!
  Although large headers are one of the worst-case scenarios for GCC, they are very common, and this test shows that Clang's implementation is significantly more memory efficient.
  </div>
</div>
<div class="img_container">
  <div class="img_title">Performance:</div>
  <img src="feature-compile1.png" />
  <div class="img_desc">Even at this early stage, the C parser for Clang is able to achieve significantly better performance. Many optimizations are still possible, of course.
  </div>
</div>
<div class="img_container">
  <div class="img_title">Performance:</div>
  <img src="feature-compile2.png" />
  <div class="img_desc">By using very trivial file-system caching, Clang can significantly speed up preprocessing-bound applications like distcc. <span class="img_notes">(<a href="clang_video-07-25-2007.html">more details</a>)</span>
  </div>
</div>

<h2><a name="expressivediags">Expressive Diagnostics</a></h2>
Clang is designed to efficiently capture range information for expressions and statements, which allows it to emit very detailed diagnostic information when a problem is detected.<br>
<div class="img_container">
  <div class="img_title">Clang vs GCC:</div>
  <img src="feature-diagnostics1.png" />
  <div class="img_desc">There are several things to take note of in this example:
    <ul>
      <li>The error messages from Clang are more detailed.</li>
      <li>Clang shows you exactly where the error is, plus the range it has a problem with.</li>
    </ul>
  </div>
  <div class="img_notes"><span>Notes:</span> The first results are from Clang; the second results are from GCC.</div>
</div>
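<p>To make the idea of range information concrete, here is a minimal sketch of
how a stored location and range could be rendered under a line of source. The
<code>ColumnRange</code> struct and the <code>underline</code> function are
hypothetical names chosen for this example, and the output style is only an
approximation of a caret diagnostic: the range is marked with <code>~</code>
and the exact error position with <code>^</code>.</p>

```cpp
#include <cstddef>
#include <string>

// Hypothetical half-open column range [begin, end) within one source line.
struct ColumnRange {
  std::size_t begin;
  std::size_t end;
};

// Returns the source line followed by a marker line: '~' under every column
// of the range and '^' at the caret column. Columns are 0-based offsets.
std::string underline(const std::string &line, std::size_t caret,
                      ColumnRange range) {
  std::string marks(line.size(), ' ');
  for (std::size_t i = range.begin; i < range.end && i < marks.size(); ++i)
    marks[i] = '~';
  if (caret < marks.size())
    marks[caret] = '^';
  // Drop trailing spaces so the marker line ends at the last marker.
  marks.erase(marks.find_last_not_of(' ') + 1);
  return line + "\n" + marks + "\n";
}
```

<p>For the line <code>return x + y;</code>, a caret at the <code>+</code>
(column 9) and a range covering <code>x + y</code> (columns 7 to 12) produce a
marker line of seven spaces followed by <code>~~^~~</code>.</p>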
<h2>Better Integration with IDEs</h2>
Another design goal of Clang is to integrate extremely well with IDEs. IDEs often have very different requirements than code generation, and often need information that a codegen-only front-end can throw away. Clang is specifically designed and built to capture this information.
</div>
</body>
</html>