Shih-wei Liao | f8fd82b | 2010-02-10 11:10:31 -0800 | 1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" |
| 2 | "http://www.w3.org/TR/html4/strict.dtd"> |
| 3 | <html> |
| 4 | <head> |
| 5 | <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" /> |
| 6 | <title>Clang - Expressive Diagnostics</title> |
| 7 | <link type="text/css" rel="stylesheet" href="menu.css" /> |
| 8 | <link type="text/css" rel="stylesheet" href="content.css" /> |
| 9 | <style type="text/css"> |
| 10 | </style> |
| 11 | </head> |
| 12 | <body> |
| 13 | |
| 14 | <!--#include virtual="menu.html.incl"--> |
| 15 | |
| 16 | <div id="content"> |
| 17 | |
| 18 | |
| 19 | <!--=======================================================================--> |
| 20 | <h1>Expressive Diagnostics</h1> |
| 21 | <!--=======================================================================--> |
| 22 | |
| 23 | <p>In addition to being fast and functional, we aim to make Clang extremely |
| 24 | user-friendly. For a command-line compiler, this largely comes down to |
| 25 | making the diagnostics (error and warning messages) generated by the compiler |
| 26 | as useful as possible. There are several ways that we do this. This section |
| 27 | describes the experience provided by the command-line compiler, contrasting |
| 28 | Clang's output with GCC 4.2's output in several examples. |
| 29 | <!-- |
| 30 | Other clients |
| 31 | that embed Clang and extract equivalent information through internal APIs.--> |
| 32 | </p> |
| 33 | |
| 34 | <h2>Column Numbers and Caret Diagnostics</h2> |
| 35 | |
| 36 | <p>First, all diagnostics produced by Clang include full column number |
| 37 | information and use it to print "caret diagnostics". This is a feature |
| 38 | provided by many commercial compilers, but is generally missing from open source |
| 39 | compilers. Caret diagnostics make it very easy to understand exactly |
| 40 | what is wrong in a particular piece of code. For example:</p> |
| 41 | |
| 42 | <pre> |
| 43 | $ <b>gcc-4.2 -fsyntax-only -Wformat format-strings.c</b> |
| 44 | format-strings.c:91: warning: too few arguments for format |
| 45 | $ <b>clang -fsyntax-only format-strings.c</b> |
| 46 | format-strings.c:91:13: <font color="magenta">warning:</font> '.*' specified field precision is missing a matching 'int' argument |
| 47 | <font color="darkgreen"> printf("%.*d");</font> |
| 48 | <font color="blue"> ^</font> |
| 49 | </pre> |
| 50 | |
| 51 | <p>The caret (the blue "^" character) shows exactly where the problem is, even |
| 52 | inside the string. This makes it really easy to jump to the problem and |
| 53 | helps when multiple instances of the same character occur on a line. We'll |
| 54 | revisit this in the following examples.</p> |
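<p>For reference, here is a minimal sketch of a call that satisfies the format
string from the warning above. The helper name <code>format_with_precision</code>
and its parameters are hypothetical (the original format-strings.c is not shown);
the only point is that the <code>*</code> in <code>"%.*d"</code> consumes an extra
<code>int</code> argument, the precision, before the value itself.</p>

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not taken from the original format-strings.c.
 * Formats `value` with at least `precision` digits. The '*' in "%.*d"
 * consumes one int argument (the precision) before the value printed. */
int format_with_precision(char *buf, size_t size, int precision, int value)
{
    assert(buf != NULL && size > 0);
    /* Both the precision and the value are supplied, so the format string
     * and the argument list agree. */
    return snprintf(buf, size, "%.*d", precision, value);
}
```

<p>With a sufficiently large buffer, calling
<code>format_with_precision(buf, sizeof buf, 5, 42)</code> stores
<code>00042</code> in <code>buf</code> and returns 5.</p>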
| 55 | |
| 56 | <h2>Range Highlighting for Related Text</h2> |
| 57 | |
| 58 | <p>Clang captures and accurately tracks range information for expressions, |
| 59 | statements, and other constructs in your program and uses this to make |
| 60 | diagnostics highlight related information. Here is a somewhat |
| 61 | nonsensical example that illustrates this:</p> |
| 62 | |
| 63 | <pre> |
| 64 | $ <b>gcc-4.2 -fsyntax-only t.c</b> |
| 65 | t.c:7: error: invalid operands to binary + (have 'int' and 'struct A') |
| 66 | $ <b>clang -fsyntax-only t.c</b> |
| 67 | t.c:7:39: <font color="red">error:</font> invalid operands to binary expression ('int' and 'struct A') |
| 68 | <font color="darkgreen"> return y + func(y ? ((SomeA.X + 40) + SomeA) / 42 + SomeA.X : SomeA.X);</font> |
| 69 | <font color="blue"> ~~~~~~~~~~~~~~ ^ ~~~~~</font> |
| 70 | </pre> |
| 71 | |
| 72 | <p>Here you don't even need to see the original source code to |
| 73 | understand what is wrong based on the Clang error. Because Clang prints a |
| 74 | caret, you know exactly <em>which</em> plus it is complaining about. The range |
| 75 | information highlights the left and right sides of the plus, which makes it |
| 76 | immediately obvious what the compiler is talking about. This is very useful for |
| 77 | cases involving precedence issues and many other cases.</p> |
| 78 | |
| 79 | <h2>Precision in Wording</h2> |
| 80 | |
| 81 | <p>We have also tried really hard to make the diagnostics that come |
| 82 | out of Clang contain exactly the pertinent information about what is wrong and |
| 83 | why. In the example above, we tell you what the inferred types are for |
| 84 | the left and right hand sides, and we don't repeat what is obvious from the |
| 85 | caret (that this is a "binary +"). Many other examples abound. Here is a simple |
| 86 | one:</p> |
| 87 | |
| 88 | <pre> |
| 89 | $ <b>gcc-4.2 -fsyntax-only t.c</b> |
| 90 | t.c:5: error: invalid type argument of 'unary *' |
| 91 | $ <b>clang -fsyntax-only t.c</b> |
| 92 | t.c:5:11: <font color="red">error:</font> indirection requires pointer operand ('int' invalid) |
| 93 | <font color="darkgreen"> int y = *SomeA.X;</font> |
| 94 | <font color="blue"> ^~~~~~~~</font> |
| 95 | </pre> |
| 96 | |
| 97 | <p>In this example, not only do we tell you that there is a problem with the * |
| 98 | and point to it, we say exactly why and tell you what the type is (in case it is |
| 99 | a complicated subexpression, such as a call to an overloaded function). This |
| 100 | sort of attention to detail makes it much easier to understand and fix problems |
| 101 | quickly.</p> |
| 102 | |
| 103 | <h2>No Pretty Printing of Expressions in Diagnostics</h2> |
| 104 | |
| 105 | <p>Since Clang has range highlighting, it never needs to pretty print your code |
| 106 | back out to you. Pretty printing is particularly bad in G++ (which often emits errors |
| 107 | containing lowered vtable references), but even GCC can produce |
| 108 | inscrutable error messages in some cases when it tries to do this. In this |
| 109 | example P and Q have type "int*":</p> |
| 110 | |
| 111 | <pre> |
| 112 | $ <b>gcc-4.2 -fsyntax-only t.c</b> |
| 113 | #'exact_div_expr' not supported by pp_c_expression#'t.c:12: error: called object is not a function |
| 114 | $ <b>clang -fsyntax-only t.c</b> |
| 115 | t.c:12:8: <font color="red">error:</font> called object type 'int' is not a function or function pointer |
| 116 | <font color="darkgreen"> (P-Q)();</font> |
| 117 | <font color="blue"> ~~~~~^</font> |
| 118 | </pre> |
| 119 | |
| 120 | |
| 121 | <h2>Typedef Preservation and Selective Unwrapping</h2> |
| 122 | |
| 123 | <p>Many programmers use high-level user-defined types, typedefs, and other |
| 124 | syntactic sugar to refer to types in their program. This is useful because they |
| 125 | can abbreviate otherwise very long types, and it is useful to preserve the |
| 126 | typename in diagnostics. However, sometimes very simple typedefs can wrap |
| 127 | trivial types and it is important to strip off the typedef to understand what |
| 128 | is going on. Clang aims to handle both cases well.</p> |
| 129 | |
| 130 | <p>For example, here is a case where it is important to preserve |
| 131 | a typedef in C:</p> |
| 132 | |
| 133 | <pre> |
| 134 | $ <b>gcc-4.2 -fsyntax-only t.c</b> |
| 135 | t.c:15: error: invalid operands to binary / (have 'float __vector__' and 'const int *') |
| 136 | $ <b>clang -fsyntax-only t.c</b> |
| 137 | t.c:15:11: <font color="red">error:</font> can't convert between vector values of different size ('__m128' and 'int const *') |
| 138 | <font color="darkgreen"> myvec[1]/P;</font> |
| 139 | <font color="blue"> ~~~~~~~~^~</font> |
| 140 | </pre> |
| 141 | |
| 142 | <p>Here the type printed by GCC isn't even valid, but if the error were about a |
| 143 | very long and complicated type (as often happens in C++) the error message would |
| 144 | be ugly just because it was long and hard to read. Here's an example where it |
| 145 | is useful for the compiler to expose underlying details of a typedef:</p> |
| 146 | |
| 147 | <pre> |
| 148 | $ <b>gcc-4.2 -fsyntax-only t.c</b> |
| 149 | t.c:13: error: request for member 'x' in something not a structure or union |
| 150 | $ <b>clang -fsyntax-only t.c</b> |
| 151 | t.c:13:9: <font color="red">error:</font> member reference base type 'pid_t' (aka 'int') is not a structure or union |
| 152 | <font color="darkgreen"> myvar = myvar.x;</font> |
| 153 | <font color="blue"> ~~~~~ ^</font> |
| 154 | </pre> |
| 155 | |
| 156 | <p>If the user was somehow confused about how the system "pid_t" typedef is |
| 157 | defined, Clang helpfully displays it with "aka".</p> |
| 158 | |
| 159 | <p>In C++, type preservation includes retaining any qualification written into type names. For example, if we take a small snippet of code such as:</p> |
| 160 | |
| 161 | <blockquote> |
| 162 | <pre> |
| 163 | namespace services { |
| 164 | struct WebService { }; |
| 165 | } |
| 166 | namespace myapp { |
| 167 | namespace servers { |
| 168 | struct Server { }; |
| 169 | } |
| 170 | } |
| 171 | |
| 172 | using namespace myapp; |
| 173 | void addHTTPService(servers::Server const &server, ::services::WebService const *http) { |
| 174 | server += http; |
| 175 | } |
| 176 | </pre> |
| 177 | </blockquote> |
| 178 | |
| 179 | <p>and then compile it, we see that Clang both provides more accurate information and retains the types as written by the user (e.g., "servers::Server", "::services::WebService"):</p> |
| 180 | |
| 181 | <pre> |
| 182 | $ <b>g++-4.2 -fsyntax-only t.cpp</b> |
| 183 | t.cpp:9: error: no match for 'operator+=' in 'server += http' |
| 184 | $ <b>clang -fsyntax-only t.cpp</b> |
| 185 | t.cpp:9:10: <font color="red">error:</font> invalid operands to binary expression ('servers::Server const' and '::services::WebService const *') |
| 186 | <font color="darkgreen">server += http;</font> |
| 187 | <font color="blue">~~~~~~ ^ ~~~~</font> |
| 188 | </pre> |
| 189 | |
| 190 | <p>Naturally, type preservation extends to uses of templates, and Clang retains information about how a particular template specialization (like <code>std::vector<Real></code>) was spelled within the source code. For example:</p> |
| 191 | |
| 192 | <pre> |
| 193 | $ <b>g++-4.2 -fsyntax-only t.cpp</b> |
| 194 | t.cpp:12: error: no match for 'operator=' in 'str = vec' |
| 195 | $ <b>clang -fsyntax-only t.cpp</b> |
| 196 | t.cpp:12:7: <font color="red">error:</font> incompatible type assigning 'vector<Real>', expected 'std::string' (aka 'class std::basic_string<char>') |
| 197 | <font color="darkgreen">str = vec</font>; |
| 198 | <font color="blue">^ ~~~</font> |
| 199 | </pre> |
| 200 | |
| 201 | <h2>Fix-it Hints</h2> |
| 202 | |
| 203 | <p>"Fix-it" hints provide advice for fixing small, localized problems |
| 204 | in source code. When Clang produces a diagnostic about a particular |
| 205 | problem that it can work around (e.g., non-standard or redundant |
| 206 | syntax, missing keywords, common mistakes, etc.), it may also provide |
| 207 | specific guidance in the form of a code transformation to correct the |
| 208 | problem. For example, here Clang warns about the use of a GCC |
| 209 | extension that has been considered obsolete since 1993:</p> |
| 210 | |
| 211 | <pre> |
| 212 | $ <b>clang t.c</b> |
| 213 | t.c:5:28: <font color="magenta">warning:</font> use of GNU old-style field designator extension |
| 214 | <font color="darkgreen">struct point origin = { x: 0.0, y: 0.0 };</font> |
| 215 | <font color="red">~~</font> <font color="blue">^</font> |
| 216 | <font color="darkgreen">.x = </font> |
| 217 | t.c:5:36: <font color="magenta">warning:</font> use of GNU old-style field designator extension |
| 218 | <font color="darkgreen">struct point origin = { x: 0.0, y: 0.0 };</font> |
| 219 | <font color="red">~~</font> <font color="blue">^</font> |
| 220 | <font color="darkgreen">.y = </font> |
| 221 | </pre> |
| 222 | |
| 223 | <p>The underlined code should be removed, then replaced with the code below the |
| 224 | caret line (".x =" or ".y =", respectively). "Fix-it" hints are most useful for |
| 225 | working around common user errors and misconceptions. For example, C++ users |
| 226 | commonly forget the syntax for explicit specialization of class templates, |
| 227 | as in the following error:</p> |
| 228 | |
| 229 | <pre> |
| 230 | $ <b>clang t.cpp</b> |
| 231 | t.cpp:9:3: <font color="red">error:</font> template specialization requires 'template<>' |
| 232 | struct iterator_traits<file_iterator> { |
| 233 | <font color="blue">^</font> |
| 234 | <font color="darkgreen">template<> </font> |
| 235 | </pre> |
| 236 | |
| 237 | <p>Again, after describing the problem, Clang provides the fix--add <code>template<></code>--as part of the diagnostic.</p> |
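<p>As an illustration of the first fix-it above, the sketch below shows the
initializer after the suggested replacement has been applied by hand, using C99
designated initializers. The definition of <code>struct point</code> is an
assumption (the original t.c is not shown), and <code>make_point</code> is a
hypothetical helper added only to show that designators bind by name.</p>

```c
#include <assert.h>

/* Assumed definition: the original t.c is not shown, so the field types
 * here are a guess based on the 0.0 initializers in the diagnostic. */
struct point {
    double x;
    double y;
};

/* C99 designated initializers: the form the fix-it hints suggest in place
 * of the GNU old-style "x: 0.0, y: 0.0" syntax. */
struct point origin = { .x = 0.0, .y = 0.0 };

/* Hypothetical helper: designators bind by field name, so they may appear
 * in a different order than the fields are declared. */
struct point make_point(void)
{
    struct point p = { .y = 2.0, .x = 1.0 };
    assert(p.x == 1.0 && p.y == 2.0);
    return p;
}
```

<p>The designated form is standard C99, so it is accepted without the
extension warning shown above.</p>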
| 238 | |
| 239 | <h2>Automatic Macro Expansion</h2> |
| 240 | |
| 241 | <p>Many errors happen in macros that are sometimes deeply nested. With |
| 242 | traditional compilers, you need to dig deep into the definition of the macro to |
| 243 | understand how you got into trouble. Here's a simple example that shows how |
| 244 | Clang helps you out:</p> |
| 245 | |
| 246 | <pre> |
| 247 | $ <b>gcc-4.2 -fsyntax-only t.c</b> |
| 248 | t.c: In function 'test': |
| 249 | t.c:80: error: invalid operands to binary < (have 'struct mystruct' and 'float') |
| 250 | $ <b>clang -fsyntax-only t.c</b> |
| 251 | t.c:80:3: <font color="red">error:</font> invalid operands to binary expression ('typeof(P)' (aka 'struct mystruct') and 'typeof(F)' (aka 'float')) |
| 252 | <font color="darkgreen"> X = MYMAX(P, F);</font> |
| 253 | <font color="blue"> ^~~~~~~~~~~</font> |
| 254 | t.c:76:94: note: instantiated from: |
| 255 | <font color="darkgreen">#define MYMAX(A,B) __extension__ ({ __typeof__(A) __a = (A); __typeof__(B) __b = (B); __a < __b ? __b : __a; })</font> |
| 256 | <font color="blue"> ~~~ ^ ~~~</font> |
| 257 | </pre> |
| 258 | |
| 259 | <p>This shows how Clang automatically prints instantiation information and |
| 260 | nested range information for diagnostics as they are instantiated through macros |
| 261 | and also shows how some of the other pieces work in a bigger example. Here's |
| 262 | another real-world warning that occurs in the "window" Unix package (which |
| 263 | implements the "wwopen" class of APIs):</p> |
| 264 | |
| 265 | <pre> |
| 266 | $ <b>clang -fsyntax-only t.c</b> |
| 267 | t.c:22:2: <font color="magenta">warning:</font> type specifier missing, defaults to 'int' |
| 268 | <font color="darkgreen"> ILPAD();</font> |
| 269 | <font color="blue"> ^</font> |
| 270 | t.c:17:17: note: instantiated from: |
| 271 | <font color="darkgreen">#define ILPAD() PAD((NROW - tt.tt_row) * 10) /* 1 ms per char */</font> |
| 272 | <font color="blue"> ^</font> |
| 273 | t.c:14:2: note: instantiated from: |
| 274 | <font color="darkgreen"> register i; \</font> |
| 275 | <font color="blue"> ^</font> |
| 276 | </pre> |
| 277 | |
| 278 | <p>In practice, we've found that this is actually more useful in multiply nested |
| 279 | macros than in simple ones.</p> |
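<p>For contrast with the failing <code>MYMAX(P, F)</code> call above, here is a
small sketch of the same macro used with operands that can be compared. The
function <code>larger</code> and its parameters are hypothetical; the macro
relies on the GNU statement-expression and <code>__typeof__</code> extensions,
so GCC or Clang is assumed.</p>

```c
#include <assert.h>

/* The macro from the diagnostic above. It uses GNU extensions (statement
 * expressions and __typeof__), so this sketch assumes GCC or Clang. */
#define MYMAX(A,B) __extension__ ({ __typeof__(A) __a = (A); __typeof__(B) __b = (B); __a < __b ? __b : __a; })

/* Hypothetical caller: both operands have arithmetic types, so the
 * "__a < __b" comparison inside the macro is valid, unlike the
 * struct-versus-float comparison in the example above. */
float larger(int count, float limit)
{
    float result = MYMAX(count, limit);
    /* Holds for non-NaN inputs: the result is never below `limit`. */
    assert(result >= limit);
    return result;
}
```

<p>Here <code>larger(3, 7.5f)</code> returns 7.5 and
<code>larger(9, 7.5f)</code> returns 9.0.</p>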
| 280 | |
| 281 | <h2>Quality of Implementation and Attention to Detail</h2> |
| 282 | |
| 283 | <p>Finally, we have put a lot of work into polishing the little things, because |
| 284 | little things add up over time and contribute to a great user experience. Three |
| 285 | examples are:</p> |
| 286 | |
| 287 | <pre> |
| 288 | $ <b>gcc-4.2 t.c</b> |
| 289 | t.c: In function 'foo': |
| 290 | t.c:5: error: expected ';' before '}' token |
| 291 | $ <b>clang t.c</b> |
| 292 | t.c:4:8: <font color="red">error:</font> expected ';' after expression |
| 293 | <font color="darkgreen"> bar()</font> |
| 294 | <font color="blue"> ^</font> |
| 295 | <font color="blue"> ;</font> |
| 296 | </pre> |
| 297 | |
| 298 | <p>This shows a trivial little tweak, where we tell you to put the semicolon at |
| 299 | the end of the line that is missing it (line 4) instead of at the beginning of |
| 300 | the following line (line 5). This is particularly important with fix-it hints |
| 301 | and caret diagnostics, because otherwise you don't get the important context. |
| 302 | </p> |
| 303 | |
| 304 | <pre> |
| 305 | $ <b>gcc-4.2 t.c</b> |
| 306 | t.c:3: error: expected '=', ',', ';', 'asm' or '__attribute__' before '*' token |
| 307 | $ <b>clang t.c</b> |
| 308 | t.c:3:1: <font color="red">error:</font> unknown type name 'foo_t' |
| 309 | <font color="darkgreen">foo_t *P = 0;</font> |
| 310 | <font color="blue">^</font> |
| 311 | </pre> |
| 312 | |
| 313 | <p>This shows an example of much better error recovery. The message coming out |
| 314 | of GCC is completely useless for diagnosing the problem, while Clang tries much harder |
| 315 | and produces a much more useful diagnosis.</p> |
| 316 | |
| 317 | <pre> |
| 318 | $ <b>cat t.cc</b> |
| 319 | template<class T> |
| 320 | class a {} |
| 321 | class temp {}; |
| 322 | a<temp> b; |
| 323 | struct b { |
| 324 | } |
| 325 | $ <b>gcc-4.2 t.cc</b> |
| 326 | t.cc:3: error: multiple types in one declaration |
| 327 | t.cc:4: error: non-template type 'a' used as a template |
| 328 | t.cc:4: error: invalid type in declaration before ';' token |
| 329 | t.cc:6: error: expected unqualified-id at end of input |
| 330 | $ <b>clang t.cc</b> |
| 331 | t.cc:2:11: <font color="red">error:</font> expected ';' after class |
| 332 | <font color="darkgreen">class a {}</font> |
| 333 | <font color="blue"> ^</font> |
| 334 | <font color="blue"> ;</font> |
| 335 | t.cc:6:2: <font color="red">error:</font> expected ';' after struct |
| 336 | <font color="darkgreen">}</font> |
| 337 | <font color="blue"> ^</font> |
| 338 | <font color="blue"> ;</font> |
| 339 | </pre> |
| 340 | |
| 341 | <p>This shows that we recover from the simple case of forgetting a ; after |
| 342 | a struct definition much better than GCC.</p> |
| 343 | |
| 344 | <p>While each of these details is minor, we feel that they all add up to provide |
| 345 | a much more polished experience.</p> |
| 346 | |
| 347 | </div> |
| 348 | </body> |
| 349 | </html> |