=====================
LLVM Coding Standards
=====================

.. contents::
   :local:

Introduction
============

This document attempts to describe a few coding standards that are being used in
the LLVM source tree. Although no coding standards should be regarded as
absolute requirements to be followed in all instances, coding standards are
particularly important for large-scale code bases that follow a library-based
design (like LLVM).

While this document may provide guidance for some mechanical formatting issues,
whitespace, or other "microscopic details", these are not fixed standards.
Always follow the golden rule:

.. _Golden Rule:

    **If you are extending, enhancing, or bug fixing already implemented code,
    use the style that is already being used so that the source is uniform and
    easy to follow.**

Note that some code bases (e.g. ``libc++``) have really good reasons to deviate
from the coding standards. In the case of ``libc++``, this is because the
naming and other conventions are dictated by the C++ standard. If you think
there is a specific good reason to deviate from the standards here, please bring
it up on the LLVMdev mailing list.

There are some conventions that are not uniformly followed in the code base
(e.g. the naming convention). This is because they are relatively new, and a
lot of code was written before they were put in place. Our long-term goal is
for the entire codebase to follow the convention, but we explicitly *do not*
want patches that do large-scale reformatting of existing code. On the other
hand, it is reasonable to rename the methods of a class if you're about to
change it in some other way. Just do the reformatting as a separate commit from
the functionality change.

The ultimate goal of these guidelines is to increase the readability and
maintainability of our common source base. If you have suggestions for topics to
be included, please mail them to `Chris <mailto:sabre@nondot.org>`_.

Languages, Libraries, and Standards
===================================

Most source code in LLVM and other LLVM projects using these coding standards
is C++ code. There are some places where C code is used either due to
environment restrictions, historical restrictions, or due to third-party source
code imported into the tree. Generally, our preference is for standards
conforming, modern, and portable C++ code as the implementation language of
choice.

C++ Standard Versions
---------------------

LLVM, Clang, and LLD are currently written using C++11 conforming code,
although we restrict ourselves to features which are available in the major
toolchains supported as host compilers. The LLDB project is even more
aggressive in the set of host compilers supported and thus uses still more
features. Regardless of the supported features, code is expected to (when
reasonable) be standard, portable, and modern C++11 code. We avoid unnecessary
vendor-specific extensions, etc.

C++ Standard Library
--------------------

Use the C++ standard library facilities whenever they are available for
a particular task. LLVM and related projects emphasize and rely on the standard
library facilities as much as possible. Common support libraries providing
functionality missing from the standard library for which there are standard
interfaces or active work on adding standard interfaces will often be
implemented in the LLVM namespace following the expected standard interface.

There are some exceptions such as the standard I/O streams library which are
avoided. Also, there is much more detailed information on these subjects in the
:doc:`ProgrammersManual`.

Supported C++11 Language and Library Features
---------------------------------------------

While LLVM, Clang, and LLD use C++11, not all features are available in all of
the toolchains which we support. The set of features supported for use in LLVM
is the intersection of those supported in MSVC 2013, GCC 4.7, and Clang 3.1.
The ultimate definition of this set is what build bots with those respective
toolchains accept. Don't argue with the build bots. However, we have some
guidance below to help you know what to expect.

Each toolchain provides a good reference for what it accepts:

* Clang: http://clang.llvm.org/cxx_status.html
* GCC: http://gcc.gnu.org/projects/cxx0x.html
* MSVC: http://msdn.microsoft.com/en-us/library/hh567368.aspx

In most cases, the MSVC list will be the dominating factor. Here is a summary
of the features that are expected to work. Features not on this list are
unlikely to be supported by our host compilers.

* Rvalue references: N2118_

  * But *not* Rvalue references for ``*this`` or member qualifiers (N2439_)

* Static assert: N1720_
* ``auto`` type deduction: N1984_, N1737_
* Trailing return types: N2541_
* Lambdas: N2927_

  * But *not* lambdas with default arguments.

* ``decltype``: N2343_
* Nested closing right angle brackets: N1757_
* Extern templates: N1987_
* ``nullptr``: N2431_
* Strongly-typed and forward declarable enums: N2347_, N2764_
* Local and unnamed types as template arguments: N2657_
* Range-based for-loop: N2930_

  * But ``{}`` are required around inner ``do {} while()`` loops. As a result,
    ``{}`` are required around function-like macros inside range-based for
    loops.

* ``override`` and ``final``: N2928_, N3206_, N3272_
* Atomic operations and the C++11 memory model: N2429_
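
As an illustration of the range-based for-loop restriction in the list above,
here is a minimal sketch. The macro ``APPEND_DOUBLED`` and the function
``doubleAll`` are hypothetical names invented for this example; the macro uses
the common ``do { } while (0)`` idiom, which is why the loop body is wrapped in
braces.

.. code-block:: c++

  #include <vector>

  // Hypothetical function-like macro; it expands to a do/while loop.
  #define APPEND_DOUBLED(Out, V)                                               \
    do {                                                                       \
      (Out).push_back(2 * (V));                                                \
    } while (0)

  // Hypothetical helper that returns a copy of Values with each entry doubled.
  std::vector<int> doubleAll(const std::vector<int> &Values) {
    std::vector<int> Result;
    // Braces are used here because the macro hides an inner do/while loop.
    for (int V : Values) {
      APPEND_DOUBLED(Result, V);
    }
    return Result;
  }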

.. _N2118: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2118.html
.. _N2439: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2439.htm
.. _N1720: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1720.html
.. _N1984: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n1984.pdf
.. _N1737: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1737.pdf
.. _N2541: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2541.htm
.. _N2927: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2009/n2927.pdf
.. _N2343: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2343.pdf
.. _N1757: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1757.html
.. _N1987: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n1987.htm
.. _N2431: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2431.pdf
.. _N2347: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2347.pdf
.. _N2764: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2764.pdf
.. _N2657: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2657.htm
.. _N2930: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2009/n2930.html
.. _N2928: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2009/n2928.htm
.. _N3206: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3206.htm
.. _N3272: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3272.htm
.. _N2429: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2429.htm
.. _MSVC-compatible RTTI: http://llvm.org/PR18951

The supported features in the C++11 standard libraries are less well tracked,
but also much greater. Most of the standard libraries implement most of C++11's
library. The most likely lowest common denominator is Linux support. For
libc++, the support is just poorly tested and undocumented but expected to be
largely complete. YMMV. For libstdc++, the support is documented in detail in
`the libstdc++ manual`_. There are some very minor missing facilities that are
unlikely to be common problems, and there are a few larger gaps that are worth
being aware of:

* Not all of the type traits are implemented.
* No regular expression library.
* While most of the atomics library is well implemented, the fences are
  missing. Fortunately, they are rarely needed.
* The locale support is incomplete.
* ``std::initializer_list`` (and the constructors and functions that take it as
  an argument) are not always available, so you cannot (for example) initialize
  a ``std::vector`` with a braced initializer list.
* ``std::equal()`` (and other algorithms) incorrectly assert in MSVC when given
  ``nullptr`` as an iterator.

Other than these areas you should assume the standard library is available and
working as expected until some build bot tells you otherwise. If you're in an
uncertain area of one of the above points, but you cannot test on a Linux
system, your best approach is to minimize your use of these features, and watch
the Linux build bots to find out if your usage triggered a bug. For example, if
you hit a type trait which doesn't work we can then add support to LLVM's
traits header to emulate it.
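
Two of the gaps listed above can be sidestepped in ordinary code. The following
is only a sketch of one possible approach, not an established LLVM idiom: the
function names ``makeSmallPrimes`` and ``rangesEqual`` are hypothetical. The
first fills a ``std::vector`` from a plain array instead of a braced
initializer list; the second returns early for empty ranges so that null
pointers are never handed to ``std::equal()``.

.. code-block:: c++

  #include <algorithm>
  #include <cstddef>
  #include <iterator>
  #include <vector>

  // Hypothetical helper. Instead of: std::vector<int> Primes = {2, 3, 5, 7};
  std::vector<int> makeSmallPrimes() {
    static const int Values[] = {2, 3, 5, 7};
    // The iterator-range constructor does not need std::initializer_list.
    return std::vector<int>(std::begin(Values), std::end(Values));
  }

  // Hypothetical helper comparing two ranges given as pointer plus length.
  // Empty ranges compare equal without calling std::equal(), so a null
  // pointer for an empty range never reaches the algorithm.
  bool rangesEqual(const int *A, const int *B, std::size_t N) {
    if (N == 0)
      return true;
    return std::equal(A, A + N, B);
  }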

.. _the libstdc++ manual:
  http://gcc.gnu.org/onlinedocs/gcc-4.7.3/libstdc++/manual/manual/status.html#status.iso.2011

Other Languages
---------------

Any code written in the Go programming language is not subject to the
formatting rules below. Instead, we adopt the formatting rules enforced by
the `gofmt`_ tool.

Go code should strive to be idiomatic. Two good sets of guidelines for what
this means are `Effective Go`_ and `Go Code Review Comments`_.

.. _gofmt:
  https://golang.org/cmd/gofmt/

.. _Effective Go:
  https://golang.org/doc/effective_go.html

.. _Go Code Review Comments:
  https://code.google.com/p/go-wiki/wiki/CodeReviewComments

Mechanical Source Issues
========================

Source Code Formatting
----------------------

Commenting
^^^^^^^^^^

Comments are one critical part of readability and maintainability. Everyone
knows they should comment their code, and so should you. When writing comments,
write them as English prose, which means they should use proper capitalization,
punctuation, etc. Aim to describe what the code is trying to do and why, not
*how* it does it at a micro level. Here are a few critical things to document:

.. _header file comment:

File Headers
""""""""""""

Every source file should have a header on it that describes the basic purpose of
the file. If a file does not have a header, it should not be checked into the
tree. The standard header looks like this:

.. code-block:: c++

  //===-- llvm/Instruction.h - Instruction class definition -------*- C++ -*-===//
  //
  //                     The LLVM Compiler Infrastructure
  //
  // This file is distributed under the University of Illinois Open Source
  // License. See LICENSE.TXT for details.
  //
  //===----------------------------------------------------------------------===//
  ///
  /// \file
  /// \brief This file contains the declaration of the Instruction class, which is
  /// the base class for all of the VM instructions.
  ///
  //===----------------------------------------------------------------------===//

A few things to note about this particular format: The "``-*- C++ -*-``" string
on the first line is there to tell Emacs that the source file is a C++ file, not
a C file (Emacs assumes ``.h`` files are C files by default).

.. note::

    This tag is not necessary in ``.cpp`` files. The name of the file is also
    on the first line, along with a very short description of the purpose of the
    file. This is important when printing out code and flipping through lots of
    pages.

The next section in the file is a concise note that defines the license that the
file is released under. This makes it perfectly clear what terms the source
code can be distributed under and should not be modified in any way.

The main body is a ``doxygen`` comment (identified by the ``///`` comment
marker instead of the usual ``//``) describing the purpose of the file. It
should have a ``\brief`` command that describes the file in one or two
sentences. Any additional information should be separated by a blank line. If
an algorithm is being implemented or something tricky is going on, a reference
to the paper where it is published should be included, as well as any notes or
*gotchas* in the code to watch out for.

Class overviews
"""""""""""""""

Classes are one fundamental part of a good object oriented design. As such, a
class definition should have a comment block that explains what the class is
used for and how it works. Every non-trivial class is expected to have a
``doxygen`` comment block.

Method information
""""""""""""""""""

Methods defined in a class (as well as any global functions) should also be
documented properly. A quick note about what it does and a description of the
borderline behaviour is all that is necessary here (unless something
particularly tricky or insidious is going on). The hope is that people can
figure out how to use your interfaces without reading the code itself.

Good things to talk about here are what happens when something unexpected
happens: does the method return null? Abort? Format your hard disk?
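
As a rough illustration of documenting borderline behaviour, consider the
sketch below. The function ``findFirst`` is a hypothetical wrapper made up for
this example (it simply forwards to the standard ``std::strstr``); the point is
the comment, which states what is returned when nothing is found and what the
caller must guarantee.

.. code-block:: c++

  #include <cstring>

  /// \brief Finds the first occurrence of \p Needle in \p Haystack.
  ///
  /// Both arguments must be non-null, null-terminated strings; this is not
  /// checked. An empty \p Needle matches at the start of \p Haystack.
  ///
  /// \returns a pointer to the start of the match inside \p Haystack, or null
  /// if there is no match.
  const char *findFirst(const char *Haystack, const char *Needle) {
    return std::strstr(Haystack, Needle);
  }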

Comment Formatting
^^^^^^^^^^^^^^^^^^

In general, prefer C++ style comments (``//`` for normal comments, ``///`` for
``doxygen`` documentation comments). They take less space, require
less typing, don't have nesting problems, etc. There are a few cases when it is
useful to use C style (``/* */``) comments however:

#. When writing C code: Obviously if you are writing C code, use C style
   comments.

#. When writing a header file that may be ``#include``\d by a C source file.

#. When writing a source file that is used by a tool that only accepts C style
   comments.

To comment out a large block of code, use ``#if 0`` and ``#endif``. These nest
properly and are better behaved in general than C style comments.
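
The nesting behaviour can be seen in a small sketch. The function
``computeAnswer`` and the helper ``legacyAdjust`` mentioned inside the disabled
region are hypothetical; the latter is never declared because the preprocessor
drops that region before compilation.

.. code-block:: c++

  int computeAnswer() {
    int Answer = 42;
  #if 0
    /* A C style comment inside the disabled region is harmless, whereas it
       would prematurely end an enclosing C style comment. */
    Answer = legacyAdjust(Answer); // Hypothetical helper; never compiled.
  #if 0
    Answer += 1; // A nested disabled region is fine as well.
  #endif
  #endif
    return Answer;
  }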

Doxygen Use in Documentation Comments
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Use the ``\file`` command to turn the standard file header into a file-level
comment.

Include descriptive ``\brief`` paragraphs for all public interfaces (public
classes, member and non-member functions). Explain API use and purpose in
``\brief`` paragraphs; don't just restate the information that can be inferred
from the API name. Put detailed discussion into separate paragraphs.

To refer to parameter names inside a paragraph, use the ``\p name`` command.
Don't use the ``\arg name`` command since it starts a new paragraph that
contains documentation for the parameter.

Wrap non-inline code examples in ``\code ... \endcode``.

To document a function parameter, start a new paragraph with the
``\param name`` command. If the parameter is used as an out or an in/out
parameter, use the ``\param [out] name`` or ``\param [in,out] name`` command,
respectively.

To describe a function's return value, start a new paragraph with the
``\returns`` command.

A minimal documentation comment:

.. code-block:: c++

  /// \brief Does foo and bar.
  void fooBar(bool Baz);

A documentation comment that uses all Doxygen features in a preferred way:

.. code-block:: c++

  /// \brief Does foo and bar.
  ///
  /// Does not do foo the usual way if \p Baz is true.
  ///
  /// Typical usage:
  /// \code
  ///   fooBar(false, "quux", Res);
  /// \endcode
  ///
  /// \param Quux kind of foo to do.
  /// \param [out] Result filled with bar sequence on foo success.
  ///
  /// \returns true on success.
  bool fooBar(bool Baz, StringRef Quux, std::vector<int> &Result);

Don't duplicate the documentation comment in the header file and in the
implementation file. Put the documentation comments for public APIs into the
header file. Documentation comments for private APIs can go to the
implementation file. In any case, implementation files can include additional
comments (not necessarily in Doxygen markup) to explain implementation details
as needed.

Don't duplicate the function or class name at the beginning of the comment.
For humans it is obvious which function or class is being documented;
automatic documentation processing tools are smart enough to bind the comment
to the correct declaration.

Wrong:

.. code-block:: c++

  // In Something.h:

  /// Something - An abstraction for some complicated thing.
  class Something {
  public:
    /// fooBar - Does foo and bar.
    void fooBar();
  };

  // In Something.cpp:

  /// fooBar - Does foo and bar.
  void Something::fooBar() { ... }

Correct:

.. code-block:: c++

  // In Something.h:

  /// \brief An abstraction for some complicated thing.
  class Something {
  public:
    /// \brief Does foo and bar.
    void fooBar();
  };

  // In Something.cpp:

  // Builds a B-tree in order to do foo. See paper by...
  void Something::fooBar() { ... }

It is not required to use additional Doxygen features, but sometimes it might
be a good idea to do so.

Consider:

* adding comments to any narrow namespace containing a collection of
  related functions or types;

* using top-level groups to organize a collection of related functions at
  namespace scope where the grouping is smaller than the namespace;

* using member groups and additional comments attached to member
  groups to organize within a class.

For example:

.. code-block:: c++

  class Something {
    /// \name Functions that do Foo.
    /// @{
    void fooBar();
    void fooBaz();
    /// @}
    ...
  };

``#include`` Style
^^^^^^^^^^^^^^^^^^

Immediately after the `header file comment`_ (and include guards if working on a
header file), the `minimal list of #includes`_ required by the file should be
listed. We prefer these ``#include``\s to be listed in this order:

.. _Main Module Header:
.. _Local/Private Headers:

#. Main Module Header
#. Local/Private Headers
#. ``llvm/...``
#. System ``#include``\s

and each category should be sorted lexicographically by the full path.

The `Main Module Header`_ file applies to ``.cpp`` files which implement an
interface defined by a ``.h`` file. This ``#include`` should always be included
**first** regardless of where it lives on the file system. By including a
header file first in the ``.cpp`` files that implement the interfaces, we ensure
that the header does not have any hidden dependencies which are not explicitly
``#include``\d in the header, but should be. It is also a form of documentation
in the ``.cpp`` file to indicate where the interfaces it implements are defined.

.. _fit into 80 columns:

Source Code Width
^^^^^^^^^^^^^^^^^

Write your code to fit within 80 columns of text. This helps those of us who
like to print out code and look at your code in an ``xterm`` without resizing
it.

The longer answer is that there must be some limit to the width of the code in
order to reasonably allow developers to have multiple files side-by-side in
windows on a modest display. If you are going to pick a width limit, it is
somewhat arbitrary but you might as well pick something standard. Going with 90
columns (for example) instead of 80 columns wouldn't add any significant value
and would be detrimental to printing out code. Also many other projects have
standardized on 80 columns, so some people have already configured their editors
for it (vs something else, like 90 columns).

This is one of many contentious issues in coding standards, but it is not up for
debate.

Use Spaces Instead of Tabs
^^^^^^^^^^^^^^^^^^^^^^^^^^

In all cases, prefer spaces to tabs in source files. People have different
preferred indentation levels, and different styles of indentation that they
like; this is fine. What isn't fine is that different editors/viewers expand
tabs out to different tab stops. This can cause your code to look completely
unreadable, and it is not worth dealing with.

As always, follow the `Golden Rule`_ above: follow the style of
existing code if you are modifying and extending it. If you like four spaces of
indentation, **DO NOT** do that in the middle of a chunk of code with two spaces
of indentation. Also, do not reindent a whole source file: it makes for
incredible diffs that are absolutely worthless.

Indent Code Consistently
^^^^^^^^^^^^^^^^^^^^^^^^

Okay, in your first year of programming you were told that indentation is
important. If you didn't believe and internalize this then, now is the time.
Just do it. With the introduction of C++11, there are some new formatting
challenges that merit some suggestions to help have consistent, maintainable,
and tool-friendly formatting and indentation.

Format Lambdas Like Blocks Of Code
""""""""""""""""""""""""""""""""""

When formatting a multi-line lambda, format it like a block of code, because
that is what it is. If there is only one multi-line lambda in a statement, and
there are no expressions lexically after it in the statement, drop the indent to
the standard two space indent for a block of code, as if it were an if-block
opened by the preceding part of the statement:

.. code-block:: c++

  std::sort(foo.begin(), foo.end(), [&](Foo a, Foo b) -> bool {
    if (a.blah != b.blah)
      return a.blah < b.blah;
    if (a.baz != b.baz)
      return a.baz < b.baz;
    return a.bam < b.bam;
  });

To take best advantage of this formatting, if you are designing an API which
accepts a continuation or single callable argument (be it a functor, or
a ``std::function``), it should be the last argument if at all possible.
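
A minimal sketch of such an API follows. The names ``forEachPositive`` and
``sumOfPositives`` are hypothetical and exist only for this illustration.
Because the callable is the last parameter, the lambda at the call site can be
formatted like a block of code with the standard two space indent.

.. code-block:: c++

  #include <vector>

  // Hypothetical API: the callable is the final parameter.
  template <typename Callable>
  void forEachPositive(const std::vector<int> &Values, Callable Fn) {
    for (int V : Values) {
      if (V > 0)
        Fn(V);
    }
  }

  // Hypothetical caller: the trailing lambda reads like an if-block body.
  int sumOfPositives(const std::vector<int> &Values) {
    int Sum = 0;
    forEachPositive(Values, [&](int V) {
      Sum += V;
    });
    return Sum;
  }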

If there are multiple multi-line lambdas in a statement, or there is anything
interesting after the lambda in the statement, indent the block two spaces from
the indent of the ``[]``:

.. code-block:: c++

  dyn_switch(V->stripPointerCasts(),
             [] (PHINode *PN) {
               // process phis...
             },
             [] (SelectInst *SI) {
               // process selects...
             },
             [] (LoadInst *LI) {
               // process loads...
             },
             [] (AllocaInst *AI) {
               // process allocas...
             });

Braced Initializer Lists
""""""""""""""""""""""""

With C++11, there are significantly more uses of braced lists to perform
initialization. These allow you to easily construct aggregate temporaries in
expressions among other niceness. They now have a natural way of ending up
nested within each other and within function calls in order to build up
aggregates (such as option structs) from local variables. To make matters
worse, we also have many more uses of braces in an expression context that are
*not* performing initialization.

The historically common formatting of braced initialization of aggregate
variables does not mix cleanly with deep nesting, general expression contexts,
function arguments, and lambdas. We suggest new code use a simple rule for
formatting braced initialization lists: act as-if the braces were parentheses
in a function call. The formatting rules exactly match those already well
understood for formatting nested function calls. Examples:

.. code-block:: c++

  foo({a, b, c}, {1, 2, 3});

  llvm::Constant *Mask[] = {
      llvm::ConstantInt::get(llvm::Type::getInt32Ty(getLLVMContext()), 0),
      llvm::ConstantInt::get(llvm::Type::getInt32Ty(getLLVMContext()), 1),
      llvm::ConstantInt::get(llvm::Type::getInt32Ty(getLLVMContext()), 2)};

This formatting scheme also makes it particularly easy to get predictable,
consistent, and automatic formatting with tools like `Clang Format`_.
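
The "option struct" case mentioned above might look like the following sketch.
The types ``Size`` and ``WindowOptions`` and the functions ``paddedArea`` and
``defaultPaddedArea`` are hypothetical names invented for this example; the
nested braces at the call site are laid out just as nested function calls
would be.

.. code-block:: c++

  // Hypothetical aggregates, used only to illustrate the formatting rule.
  struct Size {
    int Width;
    int Height;
  };

  struct WindowOptions {
    Size Extent;
    int Margin;
  };

  // Computes the area of the extent grown by the margin on every side.
  int paddedArea(const WindowOptions &Opts) {
    return (Opts.Extent.Width + 2 * Opts.Margin) *
           (Opts.Extent.Height + 2 * Opts.Margin);
  }

  int defaultPaddedArea() {
    // Read the braces as if they were the parentheses of nested calls.
    return paddedArea({{640, 480}, 8});
  }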

.. _Clang Format: http://clang.llvm.org/docs/ClangFormat.html

Language and Compiler Issues
----------------------------

Treat Compiler Warnings Like Errors
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If your code has compiler warnings in it, something is wrong --- you aren't
casting values correctly, you have "questionable" constructs in your code, or
you are doing something legitimately wrong. Compiler warnings can cover up
legitimate errors in output and make dealing with a translation unit difficult.

It is not possible to prevent all warnings from all compilers, nor is it
desirable. Instead, pick a standard compiler (like ``gcc``) that provides a
good, thorough set of warnings, and stick to it. At least in the case of
``gcc``, it is possible to work around any spurious warnings by changing the
syntax of the code slightly. For example, a warning that annoys me occurs when
I write code like this:

.. code-block:: c++

  if (V = getValue()) {
    ...
  }

``gcc`` will warn me that I probably want to use the ``==`` operator, and that I
probably mistyped it. In most cases, I haven't, and I really don't want the
spurious warnings. To fix this particular problem, I rewrite the code like
this:

.. code-block:: c++

  if ((V = getValue())) {
    ...
  }

which shuts ``gcc`` up. Any ``gcc`` warning that annoys you can be fixed by
massaging the code appropriately.

Write Portable Code
^^^^^^^^^^^^^^^^^^^

In almost all cases, it is possible and within reason to write completely
portable code. If there are cases where it isn't possible to write portable
code, isolate it behind a well-defined (and well-documented) interface.

In practice, this means that you shouldn't assume much about the host compiler
(and Visual Studio tends to be the lowest common denominator). If advanced
features are used, they should only be an implementation detail of a library
which has a simple exposed API, and preferably be buried in ``libSystem``.
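
Isolating a platform difference behind a small interface could look like the
sketch below. The functions ``pathSeparator`` and ``joinPath`` are hypothetical
and are not taken from LLVM's support libraries; the only platform test lives
inside ``pathSeparator``, so callers stay free of preprocessor checks.

.. code-block:: c++

  #include <string>

  /// \brief Returns the preferred path separator of the host platform.
  ///
  /// Hypothetical shim: this is the single place that inspects a platform
  /// macro.
  char pathSeparator() {
  #ifdef _WIN32
    return '\\';
  #else
    return '/';
  #endif
  }

  /// \brief Joins \p Dir and \p File with the host's path separator.
  std::string joinPath(const std::string &Dir, const std::string &File) {
    return Dir + pathSeparator() + File;
  }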

Do not use RTTI or Exceptions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In an effort to reduce code and executable size, LLVM does not use RTTI
(e.g. ``dynamic_cast<>``) or exceptions. These two language features violate
the general C++ principle of *"you only pay for what you use"*, causing
executable bloat even if exceptions are never used in the code base, or if RTTI
is never used for a class. Because of this, we turn them off globally in the
code.

That said, LLVM does make extensive use of a hand-rolled form of RTTI that uses
templates like :ref:`isa\<>, cast\<>, and dyn_cast\<> <isa>`.
This form of RTTI is opt-in and can be
:doc:`added to any class <HowToSetUpLLVMStyleRTTI>`. It is also
substantially more efficient than ``dynamic_cast<>``.
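
To give a feel for how such opt-in RTTI works, here is a self-contained sketch.
It does **not** use LLVM's real templates: ``isaSketch`` and ``dynCastSketch``
are simplified, hypothetical stand-ins written for this example, and the
``Shape`` hierarchy is invented. The idea shown is that each class stores a
kind tag and answers a static ``classof`` query, so no compiler RTTI is needed.

.. code-block:: c++

  // Hypothetical hierarchy: the base class carries a kind discriminator.
  class Shape {
  public:
    enum ShapeKind { SK_Square, SK_Circle };
    explicit Shape(ShapeKind K) : Kind(K) {}
    ShapeKind getKind() const { return Kind; }

  private:
    const ShapeKind Kind;
  };

  class Square : public Shape {
  public:
    explicit Square(int SideLength) : Shape(SK_Square), Side(SideLength) {}
    int getSide() const { return Side; }
    static bool classof(const Shape *S) { return S->getKind() == SK_Square; }

  private:
    int Side;
  };

  class Circle : public Shape {
  public:
    explicit Circle(int R) : Shape(SK_Circle), Radius(R) {}
    int getRadius() const { return Radius; }
    static bool classof(const Shape *S) { return S->getKind() == SK_Circle; }

  private:
    int Radius;
  };

  // Simplified stand-ins for the real templates (assumptions of this sketch).
  template <typename To, typename From> bool isaSketch(const From *Val) {
    return To::classof(Val);
  }

  template <typename To, typename From>
  const To *dynCastSketch(const From *Val) {
    return isaSketch<To>(Val) ? static_cast<const To *>(Val) : nullptr;
  }

  // Returns the side length for squares and 0 for every other shape.
  int sideOrZero(const Shape *S) {
    if (const Square *Sq = dynCastSketch<Square>(S))
      return Sq->getSide();
    return 0;
  }

  int demoSquare() {
    Square Sq(3);
    return sideOrZero(&Sq);
  }

  int demoCircle() {
    Circle C(2);
    return sideOrZero(&C);
  }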
637
638.. _static constructor:
639
640Do not use Static Constructors
641^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
642
643Static constructors and destructors (e.g. global variables whose types have a
644constructor or destructor) should not be added to the code base, and should be
645removed wherever possible. Besides `well known problems
646<http://yosefk.com/c++fqa/ctors.html#fqa-10.12>`_ where the order of
647initialization is undefined between globals in different source files, the
648entire concept of static constructors is at odds with the common use case of
649LLVM as a library linked into a larger application.
650
651Consider the use of LLVM as a JIT linked into another application (perhaps for
652`OpenGL, custom languages <http://llvm.org/Users.html>`_, `shaders in movies
653<http://llvm.org/devmtg/2010-11/Gritz-OpenShadingLang.pdf>`_, etc). Due to the
654design of static constructors, they must be executed at startup time of the
655entire application, regardless of whether or how LLVM is used in that larger
656application. There are two problems with this:
657
658* The time to run the static constructors impacts startup time of applications
659 --- a critical time for GUI apps, among others.
660
661* The static constructors cause the app to pull many extra pages of memory off
662 the disk: both the code for the constructor in each ``.o`` file and the small
663 amount of data that gets touched. In addition, touched/dirty pages put more
664 pressure on the VM system on low-memory machines.
665
666We would really like for there to be zero cost for linking in an additional LLVM
667target or other library into an application, but static constructors violate
668this goal.
669
670That said, LLVM unfortunately does contain static constructors. It would be a
671`great project <http://llvm.org/PR11944>`_ for someone to purge all static
672constructors from LLVM, and then enable the ``-Wglobal-constructors`` warning
673flag (when building with Clang) to ensure we do not regress in the future.
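
As a rough illustration of the difference, the sketch below contrasts a global
with a static constructor against lazy initialization through a function-local
static. It uses only standard C++; the names ``KeywordTable`` and
``getKeywordTable`` are hypothetical and are not taken from LLVM.

.. code-block:: c++

  #include <string>
  #include <vector>

  // Problematic (shown commented out): a namespace-scope object with a
  // non-trivial constructor. Its constructor would run at startup in every
  // program that links this file in, whether or not the table is ever used.
  //
  //   static const std::vector<std::string> KeywordTable = {"if", "else", "for"};

  // One possible alternative: a function-local static is constructed the first
  // time the function is called, so programs that never call it pay nothing at
  // startup.
  const std::vector<std::string> &getKeywordTable() {
    static const std::vector<std::string> Table = {"if", "else", "for"};
    return Table;
  }

Note that this sketch only removes the static *constructor*: the function-local
object, once created, is still destroyed at program exit, so it does not address
the static destructor half of the rule.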

Use of ``class`` and ``struct`` Keywords
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In C++, the ``class`` and ``struct`` keywords can be used almost
interchangeably. The only difference is when they are used to declare a class:
``class`` makes all members private by default while ``struct`` makes all
members public by default.

Unfortunately, not all compilers follow the rules and some will generate
different symbols based on whether ``class`` or ``struct`` was used to declare
the symbol (e.g., MSVC). This can lead to problems at link time.

* All declarations and definitions of a given ``class`` or ``struct`` must use
  the same keyword. For example:

.. code-block:: c++

  class Foo;

  // Breaks mangling in MSVC.
  struct Foo { int Data; };

* As a rule of thumb, ``struct`` should be kept to structures where *all*
  members are declared public.

.. code-block:: c++

  // Foo feels like a class... this is strange.
  struct Foo {
  private:
    int Data;
  public:
    Foo() : Data(0) { }
    int getData() const { return Data; }
    void setData(int D) { Data = D; }
  };

  // Bar isn't POD, but it does look like a struct.
  struct Bar {
    int Data;
    Bar() : Data(0) { }
  };

Do not use Braced Initializer Lists to Call a Constructor
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In C++11 there is a "generalized initialization syntax" which allows calling
constructors using braced initializer lists. Do not use these to call
constructors with any interesting logic or if you care that you're calling some
*particular* constructor. Those should look like function calls using
parentheses rather than like aggregate initialization. Similarly, if you need
to explicitly name the type and call its constructor to create a temporary,
don't use a braced initializer list. Instead, use a braced initializer list
(without any type for temporaries) when doing aggregate initialization or
something notionally equivalent. Examples:

.. code-block:: c++

  class Foo {
  public:
    // Construct a Foo by reading data from the disk in the whizbang format, ...
    Foo(std::string filename);

    // Construct a Foo by looking up the Nth element of some global data ...
    Foo(int N);

    // ...
  };

  // The Foo constructor call is very deliberate, no braces.
  std::fill(foo.begin(), foo.end(), Foo("name"));

  // The pair is just being constructed like an aggregate, use braces.
  bar_map.insert({my_key, my_value});

If you use a braced initializer list when initializing a variable, use an equals
sign before the opening curly brace:

.. code-block:: c++

  int data[] = {0, 1, 2, 3};

Use ``auto`` Type Deduction to Make Code More Readable
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Some are advocating a policy of "almost always ``auto``" in C++11; however, LLVM
takes a more moderate stance. Use ``auto`` if and only if it makes the code more
readable or easier to maintain. Don't "almost always" use ``auto``, but do use
``auto`` with initializers like ``cast<Foo>(...)`` or other places where the
type is already obvious from the context. Another time when ``auto`` works well
for these purposes is when the type would have been abstracted away anyway,
often behind a container's typedef such as ``std::vector<T>::iterator``.
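
A small sketch of the "abstracted away anyway" case follows. It uses only the
standard library; the helper names ``countPositive`` and
``countPositiveInSample`` are made up for illustration.

.. code-block:: c++

  #include <map>
  #include <string>

  // Hypothetical helper: counts how many entries have a positive value.
  unsigned countPositive(const std::map<std::string, int> &Scores) {
    unsigned Count = 0;
    // The iterator type hides behind a container typedef anyway, so 'auto'
    // removes noise without hiding anything the reader needs to know.
    for (auto I = Scores.begin(), E = Scores.end(); I != E; ++I)
      if (I->second > 0)
        ++Count;
    return Count;
  }

  // Hypothetical caller: the result type of countPositive() is not obvious
  // from the call itself, so it is spelled out rather than deduced.
  unsigned countPositiveInSample() {
    std::map<std::string, int> Scores;
    Scores["a"] = 3;
    Scores["b"] = -1;
    Scores["c"] = 7;
    unsigned NumPositive = countPositive(Scores);
    return NumPositive;
  }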

Beware unnecessary copies with ``auto``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The convenience of ``auto`` makes it easy to forget that its default behavior
is a copy. Particularly in range-based ``for`` loops, careless copies are
expensive.

As a rule of thumb, use ``auto &`` unless you need to copy the result, and use
``auto *`` when copying pointers.

.. code-block:: c++

  // Typically there's no reason to copy.
  for (const auto &Val : Container) { observe(Val); }
  for (auto &Val : Container) { Val.change(); }

  // Remove the reference if you really want a new copy.
  for (auto Val : Container) { Val.change(); saveSomewhere(Val); }

  // Copy pointers, but make it clear that they're pointers.
  for (const auto *Ptr : Container) { observe(*Ptr); }
  for (auto *Ptr : Container) { Ptr->change(); }

Style Issues
============

The High-Level Issues
---------------------

A Public Header File **is** a Module
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

C++ doesn't do too well in the modularity department. There is no real
encapsulation or data hiding (unless you use expensive protocol classes), but it
is what we have to work with. When you write a public header file (in the LLVM
source tree, they live in the top level "``include``" directory), you are
defining a module of functionality.

Ideally, modules should be completely independent of each other, and their
header files should only ``#include`` the absolute minimum number of headers
possible. A module is not just a class, a function, or a namespace: it's a
collection of these that defines an interface. This interface may be several
functions, classes, or data structures, but the important issue is how they work
together.

In general, a module should be implemented by one or more ``.cpp`` files. Each
of these ``.cpp`` files should include the header that defines their interface
first. This ensures that all of the dependences of the module header have been
properly added to the module header itself, and are not implicit. System
headers should be included after user headers for a translation unit.

.. _minimal list of #includes:

``#include`` as Little as Possible
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``#include`` hurts compile time performance. Don't do it unless you have to,
especially in header files.

But wait! Sometimes you need to have the definition of a class to use it, or to
inherit from it. In these cases go ahead and ``#include`` that header file. Be
aware however that there are many cases where you don't need to have the full
definition of a class. If you are using a pointer or reference to a class, you
don't need the header file. If you are simply returning a class instance from a
prototyped function or method, you don't need it. In fact, for most cases, you
simply don't need the definition of a class. And not ``#include``\ing speeds up
compilation.
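
The cases above can be sketched in a single file. The comments mark where the
header and the implementation file would normally be split. ``Widget``,
``getWidgetId`` and ``makeDefaultWidget`` are hypothetical names chosen for this
example only.

.. code-block:: c++

  // --- What a hypothetical public header might contain ---
  // A forward declaration is enough here: the header only mentions Widget by
  // reference and as the return type of a function declaration.
  class Widget;

  int getWidgetId(const Widget &W);
  Widget makeDefaultWidget();

  // --- What the matching implementation file would contain ---
  // Only the implementation needs the full definition, which would normally
  // arrive through an #include of the header that defines Widget.
  class Widget {
  public:
    explicit Widget(int Id) : Id(Id) {}
    int getId() const { return Id; }
  private:
    int Id;
  };

  int getWidgetId(const Widget &W) { return W.getId(); }
  Widget makeDefaultWidget() { return Widget(42); }

Code that calls ``makeDefaultWidget()`` or touches the members of ``Widget``
does need the full definition, which is why the implementation side includes it.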

It is easy to go overboard on this recommendation, however. You
**must** include all of the header files that you are using --- you can include
them either directly or indirectly through another header file. To make sure
that you don't accidentally forget to include a header file in your module
header, make sure to include your module header **first** in the implementation
file (as mentioned above). This way there won't be any hidden dependencies that
you'll find out about later.

Keep "Internal" Headers Private
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Many modules have a complex implementation that causes them to use more than one
implementation (``.cpp``) file. It is often tempting to put the internal
communication interface (helper classes, extra functions, etc) in the public
module header file. Don't do this!

If you really need to do something like this, put a private header file in the
same directory as the source files, and include it locally. This ensures that
your private interface remains private and undisturbed by outsiders.

.. note::

    It's okay to put extra implementation methods in a public class itself. Just
    make them private (or protected) and all is well.

.. _early exits:

Use Early Exits and ``continue`` to Simplify Code
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When reading code, keep in mind how much state and how many previous decisions
have to be remembered by the reader to understand a block of code. Aim to
reduce indentation where possible when it doesn't make it more difficult to
understand the code. One great way to do this is by making use of early exits
and the ``continue`` keyword in long loops. As an example of using an early
exit from a function, consider this "bad" code:

.. code-block:: c++

  Value *doSomething(Instruction *I) {
    if (!isa<TerminatorInst>(I) &&
        I->hasOneUse() && doOtherThing(I)) {
      ... some long code ....
    }

    return 0;
  }

This code has several problems if the body of the ``'if'`` is large. First,
when you're looking at the top of the function, it isn't immediately clear that
this *only* does interesting things with non-terminator instructions, and only
applies to things with the other predicates. Second, it is relatively difficult
to describe (in comments) why these predicates are important because the ``if``
statement makes it difficult to lay out the comments. Third, when you're deep
within the body of the code, it is indented an extra level. Finally, when
reading the top of the function, it isn't clear what the result is if the
predicate isn't true; you have to read to the end of the function to know that
it returns null.

It is much preferred to format the code like this:

.. code-block:: c++

  Value *doSomething(Instruction *I) {
    // Terminators never need 'something' done to them because ...
    if (isa<TerminatorInst>(I))
      return 0;

    // We conservatively avoid transforming instructions with multiple uses
    // because goats like cheese.
    if (!I->hasOneUse())
      return 0;

    // This is really just here for example.
    if (!doOtherThing(I))
      return 0;

    ... some long code ....
  }

This fixes these problems. A similar problem frequently happens in ``for``
loops. A silly example is something like this:

.. code-block:: c++

  for (BasicBlock::iterator II = BB->begin(), E = BB->end(); II != E; ++II) {
    if (BinaryOperator *BO = dyn_cast<BinaryOperator>(II)) {
      Value *LHS = BO->getOperand(0);
      Value *RHS = BO->getOperand(1);
      if (LHS != RHS) {
        ...
      }
    }
  }

When you have very, very small loops, this sort of structure is fine. But if it
exceeds 10-15 lines, it becomes difficult for people to read and understand at
a glance. The problem with this sort of code is that it gets very nested very
quickly, meaning that the reader of the code has to keep a lot of context in
their brain to remember what is going on in the loop, because they don't know
if/when the ``if`` conditions will have ``else``\s etc. It is strongly
preferred to structure the loop like this:

.. code-block:: c++

  for (BasicBlock::iterator II = BB->begin(), E = BB->end(); II != E; ++II) {
    BinaryOperator *BO = dyn_cast<BinaryOperator>(II);
    if (!BO) continue;

    Value *LHS = BO->getOperand(0);
    Value *RHS = BO->getOperand(1);
    if (LHS == RHS) continue;

    ...
  }

This has all the benefits of using early exits for functions: it reduces nesting
of the loop, it makes it easier to describe why the conditions are true, and it
makes it obvious to the reader that there is no ``else`` coming up that they
have to push context into their brain for. If a loop is large, this can be a
big understandability win.

Don't use ``else`` after a ``return``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For reasons similar to those above (reduction of indentation and easier
reading), please do not use ``'else'`` or ``'else if'`` after something that
interrupts control flow --- like ``return``, ``break``, ``continue``, ``goto``,
etc. For example, this is *bad*:

.. code-block:: c++

  case 'J': {
    if (Signed) {
      Type = Context.getsigjmp_bufType();
      if (Type.isNull()) {
        Error = ASTContext::GE_Missing_sigjmp_buf;
        return QualType();
      } else {
        break;
      }
    } else {
      Type = Context.getjmp_bufType();
      if (Type.isNull()) {
        Error = ASTContext::GE_Missing_jmp_buf;
        return QualType();
      } else {
        break;
      }
    }
  }

It is better to write it like this:

.. code-block:: c++

  case 'J':
    if (Signed) {
      Type = Context.getsigjmp_bufType();
      if (Type.isNull()) {
        Error = ASTContext::GE_Missing_sigjmp_buf;
        return QualType();
      }
    } else {
      Type = Context.getjmp_bufType();
      if (Type.isNull()) {
        Error = ASTContext::GE_Missing_jmp_buf;
        return QualType();
      }
    }
    break;

Or better yet (in this case) as:

.. code-block:: c++

  case 'J':
    if (Signed)
      Type = Context.getsigjmp_bufType();
    else
      Type = Context.getjmp_bufType();

    if (Type.isNull()) {
      Error = Signed ? ASTContext::GE_Missing_sigjmp_buf :
                       ASTContext::GE_Missing_jmp_buf;
      return QualType();
    }
    break;

The idea is to reduce indentation and the amount of code you have to keep track
of when reading the code.

Turn Predicate Loops into Predicate Functions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It is very common to write small loops that just compute a boolean value. There
are a number of ways that people commonly write these, but an example of this
sort of thing is:

.. code-block:: c++

  bool FoundFoo = false;
  for (unsigned I = 0, E = BarList.size(); I != E; ++I)
    if (BarList[I]->isFoo()) {
      FoundFoo = true;
      break;
    }

  if (FoundFoo) {
    ...
  }

This sort of code is awkward to write, and is almost always a bad sign. Instead
of this sort of loop, we strongly prefer to use a predicate function (which may
be `static`_) that uses `early exits`_ to compute the predicate. We prefer the
code to be structured like this:

.. code-block:: c++

  /// \returns true if the specified list has an element that is a foo.
  static bool containsFoo(const std::vector<Bar*> &List) {
    for (unsigned I = 0, E = List.size(); I != E; ++I)
      if (List[I]->isFoo())
        return true;
    return false;
  }
  ...

  if (containsFoo(BarList)) {
    ...
  }

There are many reasons for doing this: it reduces indentation and factors out
code which can often be shared by other code that checks for the same predicate.
More importantly, it *forces you to pick a name* for the function, and forces
you to write a comment for it. In this silly example, this doesn't add much
value. However, if the condition is complex, this can make it a lot easier for
the reader to understand the code that queries for this predicate. Instead of
being faced with the in-line details of how we check to see if the BarList
contains a foo, we can trust the function name and continue reading with better
locality.

The Low-Level Issues
--------------------

Name Types, Functions, Variables, and Enumerators Properly
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Poorly-chosen names can mislead the reader and cause bugs. We cannot stress
enough how important it is to use *descriptive* names. Pick names that match
the semantics and role of the underlying entities, within reason. Avoid
abbreviations unless they are well known. After picking a good name, make sure
to use consistent capitalization for the name, as inconsistency requires clients
to either memorize the APIs or to look it up to find the exact spelling.

In general, names should be in camel case (e.g. ``TextFileReader`` and
``isLValue()``). Different kinds of declarations have different rules:

* **Type names** (including classes, structs, enums, typedefs, etc) should be
  nouns and start with an upper-case letter (e.g. ``TextFileReader``).

* **Variable names** should be nouns (as they represent state). The name should
  be camel case, and start with an upper case letter (e.g. ``Leader`` or
  ``Boats``).

* **Function names** should be verb phrases (as they represent actions), and
  command-like functions should be imperative. The name should be camel case,
  and start with a lower case letter (e.g. ``openFile()`` or ``isFoo()``).

* **Enum declarations** (e.g. ``enum Foo {...}``) are types, so they should
  follow the naming conventions for types. A common use for enums is as a
  discriminator for a union, or an indicator of a subclass. When an enum is
  used for something like this, it should have a ``Kind`` suffix
  (e.g. ``ValueKind``).

* **Enumerators** (e.g. ``enum { Foo, Bar }``) and **public member variables**
  should start with an upper-case letter, just like types. Unless the
  enumerators are defined in their own small namespace or inside a class,
  enumerators should have a prefix corresponding to the enum declaration name.
  For example, ``enum ValueKind { ... };`` may contain enumerators like
  ``VK_Argument``, ``VK_BasicBlock``, etc. Enumerators that are just
  convenience constants are exempt from the requirement for a prefix. For
  instance:

  .. code-block:: c++

    enum {
      MaxSize = 42,
      Density = 12
    };

As an exception, classes that mimic STL classes can have member names in STL's
style of lower-case words separated by underscores (e.g. ``begin()``,
``push_back()``, and ``empty()``). Classes that provide multiple
iterators should add a singular prefix to ``begin()`` and ``end()``
(e.g. ``global_begin()`` and ``use_begin()``).

Here are some examples of good and bad names:

.. code-block:: c++

  class VehicleMaker {
    ...
    Factory<Tire> F;            // Bad -- abbreviation and non-descriptive.
    Factory<Tire> Factory;      // Better.
    Factory<Tire> TireFactory;  // Even better -- if VehicleMaker has more than one
                                // kind of factory.
  };

  Vehicle makeVehicle(VehicleType Type) {
    VehicleMaker M;                         // Might be OK if having a short life-span.
    Tire Tmp1 = M.makeTire();               // Bad -- 'Tmp1' provides no information.
    Light Headlight = M.makeLight("head");  // Good -- descriptive.
    ...
  }

Assert Liberally
^^^^^^^^^^^^^^^^

Use the "``assert``" macro to its fullest. Check all of your preconditions and
assumptions; you never know when a bug (not necessarily even yours) might be
caught early by an assertion, which reduces debugging time dramatically. The
"``<cassert>``" header file is probably already included by the header files you
are using, so it doesn't cost anything to use it.

To further assist with debugging, make sure to put some kind of error message in
the assertion statement, which is printed if the assertion is tripped. This
helps the poor debugger make sense of why an assertion is being made and
enforced, and hopefully what to do about it. Here is one complete example:

.. code-block:: c++

  inline Value *getOperand(unsigned I) {
    assert(I < Operands.size() && "getOperand() out of range!");
    return Operands[I];
  }

Here are more examples:

.. code-block:: c++

  assert(Ty->isPointerType() && "Can't allocate a non-pointer type!");

  assert((Opcode == Shl || Opcode == Shr) && "ShiftInst Opcode invalid!");

  assert(idx < getNumSuccessors() && "Successor # out of range!");

  assert(V1.getType() == V2.getType() && "Constant types must be identical!");

  assert(isa<PHINode>(Succ->front()) && "Only works on PHId BBs!");

You get the idea.

In the past, asserts were used to indicate a piece of code that should not be
reached. These were typically of the form:

.. code-block:: c++

  assert(0 && "Invalid radix for integer literal");

This has a few issues, the main one being that some compilers might not
understand the assertion, or warn about a missing return in builds where
assertions are compiled out.

Today, we have something much better: ``llvm_unreachable``:

.. code-block:: c++

  llvm_unreachable("Invalid radix for integer literal");

When assertions are enabled, this will print the message if it's ever reached
and then exit the program. When assertions are disabled (i.e. in release
builds), ``llvm_unreachable`` becomes a hint to compilers to skip generating
code for this branch. If the compiler does not support this, it will fall back
to the "abort" implementation.

Another issue is that values used only by assertions will produce an "unused
value" warning when assertions are disabled. For example, this code will warn:

.. code-block:: c++

  unsigned Size = V.size();
  assert(Size > 42 && "Vector smaller than it should be");

  bool NewToSet = Myset.insert(Value);
  assert(NewToSet && "The value shouldn't be in the set yet");

These are two interestingly different cases. In the first case, the call to
``V.size()`` is only useful for the assert, and we don't want it executed when
assertions are disabled. Code like this should move the call into the assert
itself. In the second case, the side effects of the call must happen whether
the assert is enabled or not. In this case, the value should be cast to void to
disable the warning. To be specific, it is preferred to write the code like
this:

.. code-block:: c++

  assert(V.size() > 42 && "Vector smaller than it should be");

  bool NewToSet = Myset.insert(Value); (void)NewToSet;
  assert(NewToSet && "The value shouldn't be in the set yet");

Do Not Use ``using namespace std``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In LLVM, we prefer to explicitly prefix all identifiers from the standard
namespace with an "``std::``" prefix, rather than rely on "``using namespace
std;``".

In header files, adding a ``'using namespace XXX'`` directive pollutes the
namespace of any source file that ``#include``\s the header. This is clearly a
bad thing.

In implementation files (e.g. ``.cpp`` files), the rule is more of a stylistic
rule, but is still important. Basically, using explicit namespace prefixes
makes the code **clearer**, because it is immediately obvious what facilities
are being used and where they are coming from, and **more portable**, because
namespace clashes cannot occur between LLVM code and other namespaces. The
portability rule is important because different standard library implementations
expose different symbols (potentially ones they shouldn't), and future revisions
to the C++ standard will add more symbols to the ``std`` namespace. As such, we
never use ``'using namespace std;'`` in LLVM.

The exception to the general rule (i.e. it's not an exception for the ``std``
namespace) is for implementation files. For example, all of the code in the
LLVM project implements code that lives in the 'llvm' namespace. As such, it is
ok, and actually clearer, for the ``.cpp`` files to have a ``'using namespace
llvm;'`` directive at the top, after the ``#include``\s. This reduces
indentation in the body of the file for source editors that indent based on
braces, and keeps the conceptual context cleaner. The general form of this rule
is that any ``.cpp`` file that implements code in any namespace may use that
namespace (and its parents'), but should not use any others.
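
The general form of the rule might look like the sketch below, where the
hypothetical namespace ``mylib`` stands in for ``llvm`` and the class
``NameList`` is invented for the example. Header and implementation are shown
in one file for brevity.

.. code-block:: c++

  #include <cstddef>
  #include <string>
  #include <vector>

  // --- What the header would contain: no using-directives at all ---
  namespace mylib {
  class NameList {
  public:
    void add(const std::string &Name) { Names.push_back(Name); }
    std::size_t countLongerThan(std::size_t Len) const;
  private:
    std::vector<std::string> Names;
  };
  } // end namespace mylib

  // --- What the implementation file would contain ---
  // The file implements code that lives in 'mylib', so it may use that
  // namespace. Standard library names keep their explicit std:: prefix.
  using namespace mylib;

  std::size_t NameList::countLongerThan(std::size_t Len) const {
    std::size_t Count = 0;
    for (const auto &Name : Names)
      if (Name.size() > Len)
        ++Count;
    return Count;
  }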

Provide a Virtual Method Anchor for Classes in Headers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If a class is defined in a header file and has a vtable (either it has virtual
methods or it derives from classes with virtual methods), it must always have at
least one out-of-line virtual method in the class. Without this, the compiler
will copy the vtable and RTTI into every ``.o`` file that ``#include``\s the
header, bloating ``.o`` file sizes and increasing link times.
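
One way to follow this rule is to declare a private virtual method in the class
and define it in exactly one implementation file. The sketch below shows the
idea in a single file. ``Shape`` and ``Square`` are hypothetical classes, and
where the vtable actually ends up depends on the platform ABI.

.. code-block:: c++

  // --- What a hypothetical header might contain ---
  class Shape {
  public:
    virtual ~Shape() {}
    virtual double area() const = 0;
  private:
    // Declared here, defined out-of-line in one .cpp file. Every other
    // virtual method of Shape is either inline or pure.
    virtual void anchor();
  };

  class Square : public Shape {
  public:
    explicit Square(double Side) : Side(Side) {}
    double area() const override { return Side * Side; }
  private:
    void anchor() override;
    double Side;
  };

  // --- What the single implementation file would contain ---
  void Shape::anchor() {}
  void Square::anchor() {}

The anchor methods do nothing at run time; they exist only so that each class
has one out-of-line virtual method.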

Don't use default labels in fully covered switches over enumerations
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``-Wswitch`` warns if a switch, without a default label, over an enumeration
does not cover every enumeration value. If you write a default label on a fully
covered switch over an enumeration then the ``-Wswitch`` warning won't fire
when new elements are added to that enumeration. To help avoid adding these
kinds of defaults, Clang has the warning ``-Wcovered-switch-default`` which is
off by default but turned on when building LLVM with a version of Clang that
supports the warning.

A knock-on effect of this stylistic requirement is that when building LLVM with
GCC you may get warnings related to "control may reach end of non-void function"
if you return from each case of a covered switch-over-enum because GCC assumes
that the enum expression may take any representable value, not just those of
individual enumerators. To suppress this warning, use ``llvm_unreachable`` after
the switch.
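
A sketch of the resulting shape is shown below. To keep the example free of
LLVM headers it calls ``std::abort()`` where LLVM code would use
``llvm_unreachable``; the enum ``ColorKind`` and the function ``getColorName``
are hypothetical.

.. code-block:: c++

  #include <cstdlib>

  enum ColorKind { CK_Red, CK_Green, CK_Blue };

  const char *getColorName(ColorKind Kind) {
    // No default label: if an enumerator is added later, -Wswitch can point
    // at this switch.
    switch (Kind) {
    case CK_Red:   return "red";
    case CK_Green: return "green";
    case CK_Blue:  return "blue";
    }
    // Stand-in for llvm_unreachable("Unknown ColorKind!"); it also keeps
    // compilers from warning that control may reach the end of the function.
    std::abort();
  }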

Use ``LLVM_DELETED_FUNCTION`` to mark uncallable methods
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Prior to C++11, a common pattern to make a class uncopyable was to declare an
unimplemented copy constructor and copy assignment operator and make them
private. This would give a compiler error for accessing a private method or a
linker error because it wasn't implemented.

With C++11, we can mark methods that won't be implemented with ``= delete``.
This will trigger a much better error message and tell the compiler that the
method will never be implemented. This enables other checks like
``-Wunused-private-field`` to run correctly on classes that contain these
methods.

For compatibility with MSVC, ``LLVM_DELETED_FUNCTION`` should be used which
will expand to ``= delete`` on compilers that support it. These methods should
still be declared private. Example of the uncopyable pattern:

.. code-block:: c++

  class DontCopy {
  private:
    DontCopy(const DontCopy&) LLVM_DELETED_FUNCTION;
    DontCopy &operator =(const DontCopy&) LLVM_DELETED_FUNCTION;
  public:
    ...
  };

Don't evaluate ``end()`` every time through a loop
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Because C++ doesn't have a standard "``foreach``" loop (though it can be
emulated with macros and may be coming in C++0x) we end up writing a lot of
loops that manually iterate from begin to end on a variety of containers or
through other data structures. One common mistake is to write a loop in this
style:

.. code-block:: c++

  BasicBlock *BB = ...
  for (BasicBlock::iterator I = BB->begin(); I != BB->end(); ++I)
    ... use I ...

The problem with this construct is that it evaluates "``BB->end()``" every time
through the loop. Instead of writing the loop like this, we strongly prefer
loops to be written so that they evaluate it once before the loop starts. A
convenient way to do this is like so:

.. code-block:: c++

  BasicBlock *BB = ...
  for (BasicBlock::iterator I = BB->begin(), E = BB->end(); I != E; ++I)
    ... use I ...

The observant may quickly point out that these two loops may have different
semantics: if the container (a basic block in this case) is being mutated, then
"``BB->end()``" may change its value every time through the loop and the second
loop may not in fact be correct. If you actually do depend on this behavior,
please write the loop in the first form and add a comment indicating that you
did it intentionally.

Why do we prefer the second form (when correct)? Writing the loop in the first
form has two problems. First, it may be less efficient than evaluating it at the
start of the loop. In this case, the cost is probably minor --- a few extra
loads every time through the loop. However, if the base expression is more
complex, then the cost can rise quickly. I've seen loops where the end
expression was actually something like: "``SomeMap[X]->end()``" and map lookups
really aren't cheap. By writing it in the second form consistently, you
eliminate the issue entirely and don't even have to think about it.

The second (even bigger) issue is that writing the loop in the first form hints
to the reader that the loop is mutating the container (a fact that a comment
would handily confirm!). If you write the loop in the second form, it is
immediately obvious without even looking at the body of the loop that the
container isn't being modified, which makes it easier to read the code and
understand what it does.

While the second form of the loop is a few extra keystrokes, we do strongly
prefer it.

``#include <iostream>`` is Forbidden
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The use of ``#include <iostream>`` in library files is hereby **forbidden**,
because many common implementations transparently inject a `static constructor`_
into every translation unit that includes it.

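To make the cost concrete, the sketch below mimics the pattern in standard C++.
``StreamInit``, ``TheStreamInit``, and ``NumStaticCtorsRun`` are hypothetical
names chosen for illustration; they are not identifiers from any real
``<iostream>`` implementation.

.. code-block:: c++

  // Counts constructor runs; zero-initialized before any constructor executes.
  static unsigned NumStaticCtorsRun = 0;

  // Hypothetical stand-in for an initializer type that a header might define.
  struct StreamInit {
    StreamInit() { ++NumStaticCtorsRun; } // Runs before main() starts.
  };

  // If a definition like this lives in a header, every translation unit that
  // includes the header gets its own copy of the object (it has internal
  // linkage) and pays for its construction at program startup.
  static StreamInit TheStreamInit;

The work done by each such constructor is small, but it is repeated for every
including translation unit, whether or not the stream is ever used.
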
Note that using the other stream headers (``<sstream>`` for example) is not
problematic in this regard; only ``<iostream>`` is. However, ``raw_ostream``
provides various APIs that are better performing for almost every use than
``std::ostream`` style APIs.

.. note::

  New code should always use `raw_ostream`_ for writing, or the
  ``llvm::MemoryBuffer`` API for reading files.

.. _raw_ostream:

Use ``raw_ostream``
^^^^^^^^^^^^^^^^^^^

LLVM includes a lightweight, simple, and efficient stream implementation in
``llvm/Support/raw_ostream.h``, which provides all of the common features of
``std::ostream``. All new code should use ``raw_ostream`` instead of
``ostream``.

Unlike ``std::ostream``, ``raw_ostream`` is not a template and can be forward
declared as ``class raw_ostream``. Public headers should generally not include
the ``raw_ostream`` header, but use forward declarations and constant references
to ``raw_ostream`` instances.

Avoid ``std::endl``
^^^^^^^^^^^^^^^^^^^

The ``std::endl`` modifier, when used with ``iostreams``, outputs a newline to
the output stream specified. In addition to doing this, however, it also
flushes the output stream. In other words, these are equivalent:

.. code-block:: c++

  std::cout << std::endl;
  std::cout << '\n' << std::flush;

Most of the time, you probably have no reason to flush the output stream, so
it's better to use a literal ``'\n'``.

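The extra flushes can be observed with a small standard C++ sketch that counts
calls to ``sync()`` on a custom stream buffer. ``FlushCountingBuf`` and
``countFlushes`` are hypothetical names introduced only for this illustration.

.. code-block:: c++

  #include <ostream>
  #include <streambuf>

  // Hypothetical buffer that discards output and counts flush requests.
  class FlushCountingBuf : public std::streambuf {
    unsigned NumFlushes;

  protected:
    // Accept and drop every character instead of storing it.
    int_type overflow(int_type C) override { return traits_type::not_eof(C); }

    // Called by std::ostream::flush(), and therefore by std::endl.
    int sync() override {
      ++NumFlushes;
      return 0;
    }

  public:
    FlushCountingBuf() : NumFlushes(0) {}
    unsigned getNumFlushes() const { return NumFlushes; }
  };

  // Writes Lines lines and returns how many flushes the stream requested.
  unsigned countFlushes(bool UseEndl, unsigned Lines) {
    FlushCountingBuf Buf;
    std::ostream OS(&Buf);
    for (unsigned I = 0; I != Lines; ++I) {
      if (UseEndl)
        OS << "line" << std::endl;
      else
        OS << "line" << '\n';
    }
    return Buf.getNumFlushes();
  }

Writing three lines with ``std::endl`` requests three flushes, while writing
them with ``'\n'`` requests none. On a real file or terminal stream, each of
those flushes can turn into a system call.
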
Don't use ``inline`` when defining a function in a class definition
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A member function defined in a class definition is implicitly inline, so don't
put the ``inline`` keyword in this case.

Don't:

.. code-block:: c++

  class Foo {
  public:
    inline void bar() {
      // ...
    }
  };

Do:

.. code-block:: c++

  class Foo {
  public:
    void bar() {
      // ...
    }
  };

Microscopic Details
-------------------

This section describes preferred low-level formatting guidelines along with
reasoning on why we prefer them.

Spaces Before Parentheses
^^^^^^^^^^^^^^^^^^^^^^^^^

We prefer to put a space before an open parenthesis only in control flow
statements, but not in normal function call expressions and function-like
macros. For example, this is good:

.. code-block:: c++

  if (X) ...
  for (I = 0; I != 100; ++I) ...
  while (LLVMRocks) ...

  somefunc(42);
  assert(3 != 4 && "laws of math are failing me");

  A = foo(42, 92) + bar(X);

and this is bad:

.. code-block:: c++

  if(X) ...
  for(I = 0; I != 100; ++I) ...
  while(LLVMRocks) ...

  somefunc (42);
  assert (3 != 4 && "laws of math are failing me");

  A = foo (42, 92) + bar (X);

The reason for doing this is not completely arbitrary. This style makes control
flow operators stand out more, and makes expressions flow better. The function
call operator binds very tightly as a postfix operator. Putting a space after a
function name (as in the last example) makes it appear that the code might bind
the arguments of the left-hand side of a binary operator with the argument list
of a function and the name of the right side. More specifically, it is easy to
misread the "``A``" example as:

.. code-block:: c++

  A = foo ((42, 92) + bar) (X);

when skimming through the code. By avoiding a space in a function call, we
avoid this misinterpretation.

Prefer Preincrement
^^^^^^^^^^^^^^^^^^^

Hard and fast rule: preincrement (``++X``) is no slower than postincrement
(``X++``) and could very well be a lot faster. Use preincrement whenever
possible.

The semantics of postincrement include making a copy of the value being
incremented, returning it, and then preincrementing the "work value". For
primitive types, this isn't a big deal. But for iterators, it can be a huge
issue (for example, some iterators contain stack and set objects, and copying
such an iterator could invoke the copy constructors of these as well). In
general, get in the habit of always using preincrement, and you won't have a
problem.

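The copy made by postincrement can be sketched with a small iterator-like type
that counts its own copies. ``CountingIter`` and ``copiesWhileAdvancing`` are
hypothetical names used only for this illustration, and the ``int`` position
stands in for the heavier state a real iterator might carry.

.. code-block:: c++

  // Hypothetical iterator-like type that counts how often it is copied.
  struct CountingIter {
    static unsigned NumCopies;
    int Pos;

    explicit CountingIter(int P) : Pos(P) {}
    CountingIter(const CountingIter &Other) : Pos(Other.Pos) { ++NumCopies; }

    // Preincrement: advance in place and return a reference; no copy.
    CountingIter &operator++() {
      ++Pos;
      return *this;
    }

    // Postincrement: must copy the old value so that it can be returned.
    CountingIter operator++(int) {
      CountingIter Old(*this);
      ++*this;
      return Old;
    }
  };

  unsigned CountingIter::NumCopies = 0;

  // Advances an iterator Steps times and returns the number of copies made.
  unsigned copiesWhileAdvancing(bool UsePreincrement, int Steps) {
    CountingIter::NumCopies = 0;
    CountingIter I(0);
    for (int N = 0; N != Steps; ++N) {
      if (UsePreincrement)
        ++I;
      else
        I++;
    }
    return CountingIter::NumCopies;
  }

With preincrement the count stays at zero. With postincrement at least one copy
is made per step, even though the returned value is never used; the exact
number depends on whether the compiler elides the copy on return.
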
Namespace Indentation
^^^^^^^^^^^^^^^^^^^^^

In general, we strive to reduce indentation wherever possible. This is useful
because we want code to `fit into 80 columns`_ without wrapping horribly, but
also because it makes it easier to understand the code. To facilitate this and
avoid some insanely deep nesting on occasion, don't indent namespaces. If it
helps readability, feel free to add a comment indicating what namespace is
being closed by a ``}``. For example:

.. code-block:: c++

  namespace llvm {
  namespace knowledge {

  /// This class represents things that Smith can have an intimate
  /// understanding of and contains the data associated with it.
  class Grokable {
  ...
  public:
    explicit Grokable() { ... }
    virtual ~Grokable() = 0;

    ...

  };

  } // end namespace knowledge
  } // end namespace llvm

Feel free to skip the closing comment when the namespace being closed is
obvious for any reason. For example, the outer-most namespace in a header file
is rarely a source of confusion. But namespaces, both anonymous and named, that
are closed halfway through a source file probably could use clarification.

.. _static:

Anonymous Namespaces
^^^^^^^^^^^^^^^^^^^^

After talking about namespaces in general, you may be wondering about anonymous
namespaces in particular. Anonymous namespaces are a great language feature
that tells the C++ compiler that the contents of the namespace are only visible
within the current translation unit, allowing more aggressive optimization and
eliminating the possibility of symbol name collisions. Anonymous namespaces are
to C++ as "``static``" is to C functions and global variables. While
"``static``" is available in C++, anonymous namespaces are more general: they
can make entire classes private to a file.

The problem with anonymous namespaces is that they naturally want to encourage
indentation of their body, and they reduce locality of reference: if you see a
random function definition in a C++ file, it is easy to see if it is marked
static, but seeing if it is in an anonymous namespace requires scanning a big
chunk of the file.

Because of this, we have a simple guideline: make anonymous namespaces as small
as possible, and only use them for class declarations. For example, this is
good:

.. code-block:: c++

  namespace {
  class StringSort {
  ...
  public:
    StringSort(...);
    bool operator<(const char *RHS) const;
  };
  } // end anonymous namespace

  static void runHelper() {
    ...
  }

  bool StringSort::operator<(const char *RHS) const {
    ...
  }

This is bad:

.. code-block:: c++

  namespace {

  class StringSort {
  ...
  public:
    StringSort(...);
    bool operator<(const char *RHS) const;
  };

  void runHelper() {
    ...
  }

  bool StringSort::operator<(const char *RHS) const {
    ...
  }

  } // end anonymous namespace

This is bad specifically because if you're looking at "``runHelper``" in the
middle of a large C++ file, you have no immediate way to tell if it is local to
the file. When it is marked static explicitly, this is immediately obvious.
Also, there is no reason to enclose the definition of "``operator<``" in the
namespace just because it was declared there.

See Also
========

A lot of these comments and recommendations have been culled from other sources.
Two particularly important books for our work are:

#. `Effective C++
   <http://www.amazon.com/Effective-Specific-Addison-Wesley-Professional-Computing/dp/0321334876>`_
   by Scott Meyers. Also interesting and useful are "More Effective C++" and
   "Effective STL" by the same author.

#. `Large-Scale C++ Software Design
   <http://www.amazon.com/Large-Scale-Software-Design-John-Lakos/dp/0201633620/ref=sr_1_1>`_
   by John Lakos.

If you get some free time and haven't read them, do so; you might learn
something.