| <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" |
| "http://www.w3.org/TR/html4/strict.dtd"> |
| <html> |
| <head> |
| <meta http-equiv="content-type" content="text/html; charset=iso-8859-1"> |
| <title>Clang - Expressive Diagnostics</title> |
| <link type="text/css" rel="stylesheet" href="menu.css" /> |
| <link type="text/css" rel="stylesheet" href="content.css" /> |
| <style type="text/css"> |
| </style> |
| </head> |
| <body> |
| |
| <!--#include virtual="menu.html.incl"--> |
| |
| <div id="content"> |
| |
| |
| <!--=======================================================================--> |
| <h1>Expressive Diagnostics</h1> |
| <!--=======================================================================--> |
| |
| <p>In addition to being fast and functional, we aim to make Clang extremely user |
| friendly. For a command-line compiler, this largely comes down to |
| making the diagnostics (error and warning messages) generated by the compiler |
| as useful as possible. There are several ways that we do this. This section |
| describes the experience provided by the command-line compiler, contrasting |
| Clang's output with GCC 4.2's output in several examples. |
| <!-- |
| Other clients |
| that embed Clang and extract equivalent information through internal APIs.--> |
| </p> |
| |
| <h2>Column Numbers and Caret Diagnostics</h2> |
| |
| <p>First, all diagnostics produced by Clang include full column number |
| information. The Clang command-line compiler driver uses this information |
| to print "caret diagnostics". |
| (IDEs can use the information to display in-line error markup.) |
| Precise error location in the source is a feature provided by many commercial |
| compilers, but is generally missing from open source |
| compilers. It makes it easy to see exactly |
| what is wrong in a particular piece of code.</p> |
| |
| <p>The caret (the blue "^" character) shows exactly where the problem is, even |
| inside of a string. This makes it easy to jump to the problem and |
| helps when multiple instances of the same character occur on a line. (We'll |
| revisit this in the following examples.)</p> |
| |
| <pre> |
| $ <b>gcc-4.2 -fsyntax-only -Wformat format-strings.c</b> |
| format-strings.c:91: warning: too few arguments for format |
| $ <b>clang -fsyntax-only format-strings.c</b> |
| format-strings.c:91:13: <font color="magenta">warning:</font> '.*' specified field precision is missing a matching 'int' argument |
| <font color="darkgreen"> printf("%.*d");</font> |
| <font color="blue"> ^</font> |
| </pre> |
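| <p>As a rough illustration of how column information turns into a caret line, here is a minimal sketch. It is not Clang's implementation: the <code>Diagnostic</code> record and the <code>caretLine</code> and <code>render</code> helpers are hypothetical names, and the sketch assumes 1-based columns and a source line without tab characters.</p> |

```cpp
#include <cassert>
#include <string>

// Hypothetical record for one diagnostic; not Clang's internal representation.
struct Diagnostic {
    std::string file;
    unsigned line;         // 1-based line number
    unsigned column;       // 1-based column number
    std::string severity;  // e.g. "warning" or "error"
    std::string message;
};

// Build the marker line: spaces up to the reported column, then a caret.
// Assumes the source line contains no tabs, so one column is one character.
std::string caretLine(unsigned column) {
    assert(column >= 1 && "columns are 1-based");
    return std::string(column - 1, ' ') + "^";
}

// Render the "file:line:column: severity: message" header, followed by the
// offending source line and the caret line underneath it.
std::string render(const Diagnostic &d, const std::string &sourceLine) {
    return d.file + ":" + std::to_string(d.line) + ":" +
           std::to_string(d.column) + ": " + d.severity + ": " + d.message +
           "\n" + sourceLine + "\n" + caretLine(d.column) + "\n";
}
```

| <p>Because the column is part of the diagnostic itself, the same record could drive either a terminal caret or in-line markup in an editor.</p> |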
| |
| <h2>Range Highlighting for Related Text</h2> |
| |
| <p>Clang captures and accurately tracks range information for expressions, |
| statements, and other constructs in your program and uses this to make |
| diagnostics highlight related information. In the following somewhat |
| nonsensical example, you don't even need to see the original source code to |
| understand what is wrong based on the Clang error. Because Clang prints a |
| caret, you know exactly <em>which</em> plus it is complaining about. The range |
| information highlights the left and right sides of the plus, which makes it |
| immediately obvious what the compiler is talking about. |
| Range information is especially useful for |
| cases involving precedence issues, among many others.</p> |
| |
| <pre> |
| $ <b>gcc-4.2 -fsyntax-only t.c</b> |
| t.c:7: error: invalid operands to binary + (have 'int' and 'struct A') |
| $ <b>clang -fsyntax-only t.c</b> |
| t.c:7:39: <font color="red">error:</font> invalid operands to binary expression ('int' and 'struct A') |
| <font color="darkgreen"> return y + func(y ? ((SomeA.X + 40) + SomeA) / 42 + SomeA.X : SomeA.X);</font> |
| <font color="blue"> ~~~~~~~~~~~~~~ ^ ~~~~~</font> |
| </pre> |
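| <p>To make the idea of range highlighting concrete, the following sketch builds a marker line from a caret column and a list of highlighted ranges. It is only an illustration under stated assumptions: <code>ColumnRange</code> and <code>markerLine</code> are hypothetical names, columns are 1-based, ranges are half-open, and the source line is assumed to contain no tabs.</p> |

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical half-open column range [begin, end), using 1-based columns.
struct ColumnRange {
    unsigned begin;
    unsigned end;
};

// Produce a line of '~' characters under each range and a '^' at the caret.
// The caret is written last, so it wins if it falls inside a range.
std::string markerLine(unsigned caretColumn,
                       const std::vector<ColumnRange> &ranges) {
    assert(caretColumn >= 1 && "columns are 1-based");

    // The line must be wide enough for the caret and the last range column.
    unsigned width = caretColumn;
    for (const ColumnRange &r : ranges) {
        assert(r.begin >= 1 && r.begin <= r.end);
        width = std::max(width, r.end - 1);
    }

    std::string out(width, ' ');
    for (const ColumnRange &r : ranges)
        for (unsigned c = r.begin; c < r.end; ++c)
            out[c - 1] = '~';
    out[caretColumn - 1] = '^';
    return out;
}
```

| <p>For instance, a caret at column 5 with ranges covering columns 1 to 3 and 7 to 9 yields <code>~~~ ^ ~~~</code>, the same shape as the marker line in the example above.</p> |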
| |
| <h2>Precision in Wording</h2> |
| |
| <p>We have also tried hard to make the diagnostics that come |
| out of Clang contain exactly the pertinent information about what is wrong and |
| why. In the example above, we tell you what the inferred types are for |
| the left and right hand sides, and we don't repeat what is obvious from the |
| caret (e.g., that this is a "binary +").</p> |
| |
| <p>Many other examples abound. In the following example, not only do we tell you that there is a problem with the * |
| and point to it, we say exactly why and tell you what the type is (in case it is |
| a complicated subexpression, such as a call to an overloaded function). This |
| sort of attention to detail makes it much easier to understand and fix problems |
| quickly.</p> |
| |
| <pre> |
| $ <b>gcc-4.2 -fsyntax-only t.c</b> |
| t.c:5: error: invalid type argument of 'unary *' |
| $ <b>clang -fsyntax-only t.c</b> |
| t.c:5:11: <font color="red">error:</font> indirection requires pointer operand ('int' invalid) |
| <font color="darkgreen"> int y = *SomeA.X;</font> |
| <font color="blue"> ^~~~~~~~</font> |
| </pre> |
| |
| <h2>No Pretty Printing of Expressions in Diagnostics</h2> |
| |
| <p>Since Clang has range highlighting, it never needs to pretty print your code |
| back out to you. Pretty printing is particularly bad in G++ (which often emits errors |
| containing lowered vtable references), but even GCC can produce |
| inscrutable error messages in some cases when it tries to do this. In this |
| example P and Q have type "int*":</p> |
| |
| <pre> |
| $ <b>gcc-4.2 -fsyntax-only t.c</b> |
| #'exact_div_expr' not supported by pp_c_expression#'t.c:12: error: called object is not a function |
| $ <b>clang -fsyntax-only t.c</b> |
| t.c:12:8: <font color="red">error:</font> called object type 'int' is not a function or function pointer |
| <font color="darkgreen"> (P-Q)();</font> |
| <font color="blue"> ~~~~~^</font> |
| </pre> |
| |
| |
| <h2>Typedef Preservation and Selective Unwrapping</h2> |
| |
| <p>Many programmers use high-level user-defined types, typedefs, and other |
| syntactic sugar to refer to types in their program. This is useful because they |
| can abbreviate otherwise very long types, and it is useful to preserve the |
| typename in diagnostics. However, sometimes very simple typedefs can wrap |
| trivial types and it is important to strip off the typedef to understand what |
| is going on. Clang aims to handle both cases well.</p> |
| |
| <p>The following example shows where it is important to preserve |
| a typedef in C. Here the type printed by GCC isn't even valid, but if the error |
| were about a very long and complicated type (as often happens in C++) the error |
| message would be ugly just because it was long and hard to read.</p> |
| |
| <pre> |
| $ <b>gcc-4.2 -fsyntax-only t.c</b> |
| t.c:15: error: invalid operands to binary / (have 'float __vector__' and 'const int *') |
| $ <b>clang -fsyntax-only t.c</b> |
| t.c:15:11: <font color="red">error:</font> can't convert between vector values of different size ('__m128' and 'int const *') |
| <font color="darkgreen"> myvec[1]/P;</font> |
| <font color="blue"> ~~~~~~~~^~</font> |
| </pre> |
| |
| <p>The following example shows where it is useful for the compiler to expose |
| underlying details of a typedef. If the user was somehow confused about how the |
| system "pid_t" typedef is defined, Clang helpfully displays it with "aka".</p> |
| |
| <pre> |
| $ <b>gcc-4.2 -fsyntax-only t.c</b> |
| t.c:13: error: request for member 'x' in something not a structure or union |
| $ <b>clang -fsyntax-only t.c</b> |
| t.c:13:9: <font color="red">error:</font> member reference base type 'pid_t' (aka 'int') is not a structure or union |
| <font color="darkgreen"> myvar = myvar.x;</font> |
| <font color="blue"> ~~~~~ ^</font> |
| </pre> |
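| <p>The "aka" form can be pictured with a very small sketch. This is an assumption-laden illustration, not how Clang decides when to unwrap a typedef: <code>describeType</code> is a hypothetical helper that simply receives the type as written and its underlying type as strings.</p> |

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: quote the type as the user wrote it and, only when the
// underlying type is spelled differently, append it in an "aka" clause.
std::string describeType(const std::string &asWritten,
                         const std::string &underlying) {
    assert(!asWritten.empty() && "a type name is required");
    std::string out = "'" + asWritten + "'";
    if (asWritten != underlying)
        out += " (aka '" + underlying + "')";
    return out;
}
```

| <p>With these assumptions, <code>describeType("pid_t", "int")</code> gives <code>'pid_t' (aka 'int')</code>, while a type with no sugar is printed once.</p> |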
| |
| <p>In C++, type preservation includes retaining any qualification written into type names. For example, if we take a small snippet of code such as:</p> |
| |
| <blockquote> |
| <pre> |
| namespace services { |
| struct WebService { }; |
| } |
| namespace myapp { |
| namespace servers { |
| struct Server { }; |
| } |
| } |
| |
| using namespace myapp; |
| void addHTTPService(servers::Server const &server, ::services::WebService const *http) { |
| server += http; |
| } |
| </pre> |
| </blockquote> |
| |
| <p>and then compile it, we see that Clang is both providing more accurate information and is retaining the types as written by the user (e.g., "servers::Server", "::services::WebService"):</p> |
| |
| <pre> |
| $ <b>g++-4.2 -fsyntax-only t.cpp</b> |
| t.cpp:9: error: no match for 'operator+=' in 'server += http' |
| $ <b>clang -fsyntax-only t.cpp</b> |
| t.cpp:9:10: <font color="red">error:</font> invalid operands to binary expression ('servers::Server const' and '::services::WebService const *') |
| <font color="darkgreen">server += http;</font> |
| <font color="blue">~~~~~~ ^ ~~~~</font> |
| </pre> |
| |
| <p>Naturally, type preservation extends to uses of templates, and Clang retains information about how a particular template specialization (like <code>std::vector&lt;Real&gt;</code>) was spelled within the source code. For example:</p> |
| |
| <pre> |
| $ <b>g++-4.2 -fsyntax-only t.cpp</b> |
| t.cpp:12: error: no match for 'operator=' in 'str = vec' |
| $ <b>clang -fsyntax-only t.cpp</b> |
| t.cpp:12:7: <font color="red">error:</font> incompatible type assigning 'vector&lt;Real&gt;', expected 'std::string' (aka 'class std::basic_string&lt;char&gt;') |
| <font color="darkgreen">str = vec</font>; |
| <font color="blue">^ ~~~</font> |
| </pre> |
| |
| <h2>Fix-it Hints</h2> |
| |
| <p>"Fix-it" hints provide advice for fixing small, localized problems |
| in source code. When Clang produces a diagnostic about a particular |
| problem that it can work around (e.g., non-standard or redundant |
| syntax, missing keywords, common mistakes, etc.), it may also provide |
| specific guidance in the form of a code transformation to correct the |
| problem. In the following example, Clang warns about the use of a GCC |
| extension that has been considered obsolete since 1993. The underlined |
| code should be removed, then replaced with the code below the |
| caret line (".x =" or ".y =", respectively).</p> |
| |
| <pre> |
| $ <b>clang t.c</b> |
| t.c:5:28: <font color="magenta">warning:</font> use of GNU old-style field designator extension |
| <font color="darkgreen">struct point origin = { x: 0.0, y: 0.0 };</font> |
| <font color="red">~~</font> <font color="blue">^</font> |
| <font color="darkgreen">.x = </font> |
| t.c:5:36: <font color="magenta">warning:</font> use of GNU old-style field designator extension |
| <font color="darkgreen">struct point origin = { x: 0.0, y: 0.0 };</font> |
| <font color="red">~~</font> <font color="blue">^</font> |
| <font color="darkgreen">.y = </font> |
| </pre> |
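| <p>A fix-it hint can be thought of as a small text edit: a span to remove plus text to insert in its place. The sketch below shows that idea only; it is not Clang's API. <code>FixIt</code> and <code>applyFixIt</code> are hypothetical names, and offsets are 0-based character positions within a single line.</p> |

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical fix-it: replace the half-open span [begin, end) of a line
// with `replacement`. A pure insertion has begin == end.
struct FixIt {
    std::size_t begin;
    std::size_t end;
    std::string replacement;
};

// Apply one fix-it to one line of source text and return the edited line.
std::string applyFixIt(const std::string &line, const FixIt &fix) {
    assert(fix.begin <= fix.end && fix.end <= line.size());
    return line.substr(0, fix.begin) + fix.replacement + line.substr(fix.end);
}
```

| <p>Under these assumptions, replacing the span that holds <code>x:</code> with <code>.x =</code> rewrites an old-style designator, and an empty span at the end of a line models inserting a missing semicolon.</p> |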
| |
| <p>"Fix-it" hints are most useful for |
| working around common user errors and misconceptions. For example, C++ users |
| commonly forget the syntax for explicit specialization of class templates, |
| as in the error in the following example. Again, after describing the problem, |
| Clang provides the fix (adding <code>template&lt;&gt;</code>) as part of the |
| diagnostic.</p> |
| |
| <pre> |
| $ <b>clang t.cpp</b> |
| t.cpp:9:3: <font color="red">error:</font> template specialization requires 'template&lt;&gt;' |
| struct iterator_traits&lt;file_iterator&gt; { |
| <font color="blue">^</font> |
| <font color="darkgreen">template&lt;&gt; </font> |
| </pre> |
| |
| <h2>Automatic Macro Expansion</h2> |
| |
| <p>Many errors happen in macros that are sometimes deeply nested. With |
| traditional compilers, you need to dig deep into the definition of the macro to |
| understand how you got into trouble. The following simple example shows how |
| Clang helps you out by automatically printing instantiation information and |
| nested range information for diagnostics as they are instantiated through macros. |
| It also shows how some of the other pieces work in a bigger example.</p> |
| |
| <pre> |
| $ <b>gcc-4.2 -fsyntax-only t.c</b> |
| t.c: In function 'test': |
| t.c:80: error: invalid operands to binary < (have 'struct mystruct' and 'float') |
| $ <b>clang -fsyntax-only t.c</b> |
| t.c:80:3: <font color="red">error:</font> invalid operands to binary expression ('typeof(P)' (aka 'struct mystruct') and 'typeof(F)' (aka 'float')) |
| <font color="darkgreen"> X = MYMAX(P, F);</font> |
| <font color="blue"> ^~~~~~~~~~~</font> |
| t.c:76:94: note: instantiated from: |
| <font color="darkgreen">#define MYMAX(A,B) __extension__ ({ __typeof__(A) __a = (A); __typeof__(B) __b = (B); __a < __b ? __b : __a; })</font> |
| <font color="blue"> ~~~ ^ ~~~</font> |
| </pre> |
| |
| <p>Here's another real-world warning that occurs in the "window" Unix package (which |
| implements the "wwopen" class of APIs):</p> |
| |
| <pre> |
| $ <b>clang -fsyntax-only t.c</b> |
| t.c:22:2: <font color="magenta">warning:</font> type specifier missing, defaults to 'int' |
| <font color="darkgreen"> ILPAD();</font> |
| <font color="blue"> ^</font> |
| t.c:17:17: note: instantiated from: |
| <font color="darkgreen">#define ILPAD() PAD((NROW - tt.tt_row) * 10) /* 1 ms per char */</font> |
| <font color="blue"> ^</font> |
| t.c:14:2: note: instantiated from: |
| <font color="darkgreen"> register i; \</font> |
| <font color="blue"> ^</font> |
| </pre> |
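| <p>The chain of "instantiated from" notes can be pictured as walking a list of locations, from the place the macro is used down through each macro definition involved. The sketch below illustrates only that output shape; it is not Clang's implementation, and <code>Location</code>, <code>formatLocation</code>, and <code>renderExpansionChain</code> are hypothetical names.</p> |

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical source location with 1-based line and column.
struct Location {
    std::string file;
    unsigned line;
    unsigned column;
};

// Format a location as "file:line:column".
std::string formatLocation(const Location &loc) {
    return loc.file + ":" + std::to_string(loc.line) + ":" +
           std::to_string(loc.column);
}

// chain[0] is where the macro is used; each later entry is one level deeper
// in the macro definitions. The first entry carries the real message and the
// remaining entries become notes.
std::string renderExpansionChain(const std::vector<Location> &chain,
                                 const std::string &severity,
                                 const std::string &message) {
    assert(!chain.empty() && "need at least the use site");
    std::string out =
        formatLocation(chain[0]) + ": " + severity + ": " + message + "\n";
    for (std::size_t i = 1; i < chain.size(); ++i)
        out += formatLocation(chain[i]) + ": note: instantiated from:\n";
    return out;
}
```

| <p>Fed the three locations from the example above, this produces the same three header lines; a real implementation would also print the source line and marker line after each one.</p> |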
| |
| <p>In practice, we've found that Clang's treatment of macros is actually more useful in multiply nested |
| macros than in simple ones.</p> |
| |
| <h2>Quality of Implementation and Attention to Detail</h2> |
| |
| <p>Finally, we have put a lot of work into polishing the little things, because |
| little things add up over time and contribute to a great user experience.</p> |
| |
| <p>The following example shows a trivial little tweak, where we tell you to put the semicolon at |
| the end of the line that is missing it (line 4) instead of at the beginning of |
| the following line (line 5). This is particularly important with fix-it hints |
| and caret diagnostics, because otherwise you don't get the important context. |
| </p> |
| |
| <pre> |
| $ <b>gcc-4.2 t.c</b> |
| t.c: In function 'foo': |
| t.c:5: error: expected ';' before '}' token |
| $ <b>clang t.c</b> |
| t.c:4:8: <font color="red">error:</font> expected ';' after expression |
| <font color="darkgreen"> bar()</font> |
| <font color="blue"> ^</font> |
| <font color="blue"> ;</font> |
| </pre> |
| |
| <p>The following example shows much better error recovery than GCC. The message coming out |
| of GCC is completely useless for diagnosing the problem. Clang tries much harder |
| and produces a far more useful diagnosis.</p> |
| |
| <pre> |
| $ <b>gcc-4.2 t.c</b> |
| t.c:3: error: expected '=', ',', ';', 'asm' or '__attribute__' before '*' token |
| $ <b>clang t.c</b> |
| t.c:3:1: <font color="red">error:</font> unknown type name 'foo_t' |
| <font color="darkgreen">foo_t *P = 0;</font> |
| <font color="blue">^</font> |
| </pre> |
| |
| <p>The following example shows that we recover from the simple case of |
| forgetting a ; after a struct definition much better than GCC.</p> |
| |
| <pre> |
| $ <b>cat t.cc</b> |
| template&lt;class T&gt; |
| class a {} |
| class temp {}; |
| a&lt;temp&gt; b; |
| struct b { |
| } |
| $ <b>gcc-4.2 t.cc</b> |
| t.cc:3: error: multiple types in one declaration |
| t.cc:4: error: non-template type 'a' used as a template |
| t.cc:4: error: invalid type in declaration before ';' token |
| t.cc:6: error: expected unqualified-id at end of input |
| $ <b>clang t.cc</b> |
| t.cc:2:11: <font color="red">error:</font> expected ';' after class |
| <font color="darkgreen">class a {}</font> |
| <font color="blue"> ^</font> |
| <font color="blue"> ;</font> |
| t.cc:6:2: <font color="red">error:</font> expected ';' after struct |
| <font color="darkgreen">}</font> |
| <font color="blue"> ^</font> |
| <font color="blue"> ;</font> |
| </pre> |
| |
| <p>While each of these details is minor, we feel that they all add up to provide |
| a much more polished experience.</p> |
| |
| </div> |
| </body> |
| </html> |