		Linux kernel coding style

This is a short document describing the preferred coding style for the
linux kernel. Coding style is very personal, and I won't _force_ my
views on anybody, but this is what goes for anything that I have to be
able to maintain, and I'd prefer it for most other things too. Please
at least consider the points made here.

First off, I'd suggest printing out a copy of the GNU coding standards,
and NOT read it. Burn them, it's a great symbolic gesture.

Anyway, here goes:


		Chapter 1: Indentation

Tabs are 8 characters, and thus indentations are also 8 characters.
There are heretic movements that try to make indentations 4 (or even 2!)
characters deep, and that is akin to trying to define the value of PI to
be 3.

Rationale: The whole idea behind indentation is to clearly define where
a block of control starts and ends. Especially when you've been looking
at your screen for 20 straight hours, you'll find it a lot easier to see
how the indentation works if you have large indentations.

Now, some people will claim that having 8-character indentations makes
the code move too far to the right, and makes it hard to read on a
80-character terminal screen. The answer to that is that if you need
more than 3 levels of indentation, you're screwed anyway, and should fix
your program.

In short, 8-char indents make things easier to read, and have the added
benefit of warning you when you're nesting your functions too deep.
Heed that warning.
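
As a rough illustration of heeding that warning (this is a sketch, not
code from the kernel; "struct item" and all function names are made up
for the example), here is the same loop written once with the real work
buried under several levels of indentation, and once flattened with a
small helper and an early "continue":

```c
#include <stddef.h>

/* Hypothetical record type, invented for this illustration. */
struct item {
	int valid;
	int value;
};

/* Nested version: the addition sits under three levels of control flow. */
int sum_positive_nested(const struct item *items, size_t n)
{
	int sum = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (items[i].valid) {
			if (items[i].value > 0) {
				sum += items[i].value;
			}
		}
	}
	return sum;
}

/* Helper with a descriptive name: does this item contribute to the sum? */
int item_is_countable(const struct item *it)
{
	return it->valid && it->value > 0;
}

/* Flat version: same result, one level of indentation inside the loop. */
int sum_positive_flat(const struct item *items, size_t n)
{
	int sum = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!item_is_countable(&items[i]))
			continue;
		sum += items[i].value;
	}
	return sum;
}
```

Both versions compute the same thing; the second one simply leaves more
of the 80 columns for the code that matters.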

Don't put multiple statements on a single line unless you have
something to hide:

	if (condition) do_this;
	  do_something_everytime;

Outside of comments, documentation and except in Kconfig, spaces are never
used for indentation, and the above example is deliberately broken.

Get a decent editor and don't leave whitespace at the end of lines.


		Chapter 2: Breaking long lines and strings

Coding style is all about readability and maintainability using commonly
available tools.

The limit on the length of lines is 80 columns and this is a hard limit.

Statements longer than 80 columns will be broken into sensible chunks.
Descendants are always substantially shorter than the parent and are placed
substantially to the right. The same applies to function headers with a long
argument list. Long strings are broken into shorter strings as well.

void fun(int a, int b, int c)
{
	if (condition)
		printk(KERN_WARNING "Warning this is a long printk with "
						"3 parameters a: %u b: %u "
						"c: %u \n", a, b, c);
	else
		next_statement;
}

		Chapter 3: Placing Braces

The other issue that always comes up in C styling is the placement of
braces. Unlike the indent size, there are few technical reasons to
choose one placement strategy over the other, but the preferred way, as
shown to us by the prophets Kernighan and Ritchie, is to put the opening
brace last on the line, and put the closing brace first, thusly:

	if (x is true) {
		we do y
	}

However, there is one special case, namely functions: they have the
opening brace at the beginning of the next line, thus:

	int function(int x)
	{
		body of function
	}

Heretic people all over the world have claimed that this inconsistency
is ... well ... inconsistent, but all right-thinking people know that
(a) K&R are _right_ and (b) K&R are right. Besides, functions are
special anyway (you can't nest them in C).

Note that the closing brace is empty on a line of its own, _except_ in
the cases where it is followed by a continuation of the same statement,
i.e. a "while" in a do-statement or an "else" in an if-statement, like
this:

	do {
		body of do-loop
	} while (condition);

and

	if (x == y) {
		..
	} else if (x > y) {
		...
	} else {
		....
	}

Rationale: K&R.

Also, note that this brace-placement also minimizes the number of empty
(or almost empty) lines, without any loss of readability. Thus, as the
supply of new-lines on your screen is not a renewable resource (think
25-line terminal screens here), you have more empty lines to put
comments on.


		Chapter 4: Naming

C is a Spartan language, and so should your naming be. Unlike Modula-2
and Pascal programmers, C programmers do not use cute names like
ThisVariableIsATemporaryCounter. A C programmer would call that
variable "tmp", which is much easier to write, and not the least more
difficult to understand.

HOWEVER, while mixed-case names are frowned upon, descriptive names for
global variables are a must. To call a global function "foo" is a
shooting offense.

GLOBAL variables (to be used only if you _really_ need them) need to
have descriptive names, as do global functions. If you have a function
that counts the number of active users, you should call that
"count_active_users()" or similar, you should _not_ call it "cntusr()".

Encoding the type of a function into the name (so-called Hungarian
notation) is brain damaged - the compiler knows the types anyway and can
check those, and it only confuses the programmer. No wonder MicroSoft
makes buggy programs.

LOCAL variable names should be short, and to the point. If you have
some random integer loop counter, it should probably be called "i".
Calling it "loop_counter" is non-productive, if there is no chance of it
being mis-understood. Similarly, "tmp" can be just about any type of
variable that is used to hold a temporary value.
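
To make the contrast concrete, here is a small sketch that puts both
rules side by side: a descriptive name for the globally visible
function, short names for its locals. "struct user" and its "active"
field are assumptions made up for the example, not kernel definitions.

```c
#include <stddef.h>

/* Hypothetical user record, invented for this illustration. */
struct user {
	int active;
};

/*
 * Globally visible, so the name says what it does.
 * Calling this cntusr() would save typing and cost understanding.
 */
size_t count_active_users(const struct user *users, size_t nr_users)
{
	size_t count = 0;
	size_t i;	/* a plain loop counter: "i" is all it needs */

	for (i = 0; i < nr_users; i++)
		if (users[i].active)
			count++;
	return count;
}
```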

If you are afraid to mix up your local variable names, you have another
problem, which is called the function-growth-hormone-imbalance syndrome.
See next chapter.


		Chapter 5: Functions

Functions should be short and sweet, and do just one thing. They should
fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24,
as we all know), and do one thing and do that well.

The maximum length of a function is inversely proportional to the
complexity and indentation level of that function. So, if you have a
conceptually simple function that is just one long (but simple)
case-statement, where you have to do lots of small things for a lot of
different cases, it's OK to have a longer function.

However, if you have a complex function, and you suspect that a
less-than-gifted first-year high-school student might not even
understand what the function is all about, you should adhere to the
maximum limits all the more closely. Use helper functions with
descriptive names (you can ask the compiler to in-line them if you think
it's performance-critical, and it will probably do a better job of it
than you would have done).

Another measure of the function is the number of local variables. They
shouldn't exceed 5-10, or you're doing something wrong. Re-think the
function, and split it into smaller pieces. A human brain can
generally easily keep track of about 7 different things, anything more
and it gets confused. You know you're brilliant, but maybe you'd like
to understand what you did 2 weeks from now.


		Chapter 6: Centralized exiting of functions

Albeit deprecated by some people, the equivalent of the goto statement is
used frequently by compilers in the form of the unconditional jump
instruction.

The goto statement comes in handy when a function exits from multiple
locations and some common work such as cleanup has to be done.

The rationale is:

- unconditional statements are easier to understand and follow
- nesting is reduced
- errors by not updating individual exit points when making
    modifications are prevented
- saves the compiler work to optimize redundant code away ;)

int fun(int a)
{
	int result = 0;
	char *buffer = kmalloc(SIZE, GFP_KERNEL);

	if (buffer == NULL)
		return -ENOMEM;

	if (condition1) {
		while (loop1) {
			...
		}
		result = 1;
		goto out;
	}
	...
out:
	kfree(buffer);
	return result;
}
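
The example above is a skeleton. For something that actually compiles,
here is a userspace sketch of the same pattern, with malloc()/free()
standing in for kmalloc()/kfree(). The function copy_twice() and the
label name are invented for this illustration; the point is only that
every path past the first allocation leaves through one cleanup label.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical example: store a newly allocated string in *out that
 * holds src twice in a row. Returns 0 on success, -ENOMEM on failure.
 */
int copy_twice(const char *src, char **out)
{
	int result = 0;
	size_t len = strlen(src);
	char *tmp;
	char *dst;

	tmp = malloc(len + 1);
	if (tmp == NULL)
		return -ENOMEM;		/* nothing to clean up yet */

	dst = malloc(2 * len + 1);
	if (dst == NULL) {
		result = -ENOMEM;
		goto out_free_tmp;
	}

	memcpy(tmp, src, len + 1);
	memcpy(dst, tmp, len);			/* first copy, no terminator */
	memcpy(dst + len, tmp, len + 1);	/* second copy plus '\0' */
	*out = dst;

out_free_tmp:
	free(tmp);	/* common cleanup for the success and error paths */
	return result;
}
```

Adding a third resource later means adding one more label, not hunting
down every return statement.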

		Chapter 7: Commenting

Comments are good, but there is also a danger of over-commenting. NEVER
try to explain HOW your code works in a comment: it's much better to
write the code so that the _working_ is obvious, and it's a waste of
time to explain badly written code.

Generally, you want your comments to tell WHAT your code does, not HOW.
Also, try to avoid putting comments inside a function body: if the
function is so complex that you need to separately comment parts of it,
you should probably go back to chapter 5 for a while. You can make
small comments to note or warn about something particularly clever (or
ugly), but try to avoid excess. Instead, put the comments at the head
of the function, telling people what it does, and possibly WHY it does
it.

When commenting the kernel API functions, please use the kerneldoc format.
See the files Documentation/kernel-doc-nano-HOWTO.txt and scripts/kernel-doc
for details.
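
As a hedged sketch of what such a comment looks like: the layout below
follows the kernel-doc conventions as they are commonly used (an
opening "/**", a "name - summary" line, one "@argument:" line per
parameter, then free text). clamp_percent() is a made-up function, and
the two files named above remain the authority on the exact format.

```c
/**
 * clamp_percent - limit a value to the range 0..100
 * @value: the number to clamp
 *
 * Callers feed this with user-supplied numbers, so out-of-range input
 * is pulled back to the nearest bound instead of being rejected.
 * Returns @value if it is already within 0..100, otherwise 0 or 100.
 */
int clamp_percent(int value)
{
	if (value < 0)
		return 0;
	if (value > 100)
		return 100;
	return value;
}
```

Note that the comment says WHAT the function does and WHY, and the body
needs no comments at all.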

		Chapter 8: You've made a mess of it

That's OK, we all do. You've probably been told by your long-time Unix
user helper that "GNU emacs" automatically formats the C sources for
you, and you've noticed that yes, it does do that, but the defaults it
uses are less than desirable (in fact, they are worse than random
typing - an infinite number of monkeys typing into GNU emacs would never
make a good program).

So, you can either get rid of GNU emacs, or change it to use saner
values. To do the latter, you can stick the following in your .emacs file:

(defun linux-c-mode ()
  "C mode with adjusted defaults for use with the Linux kernel."
  (interactive)
  (c-mode)
  (c-set-style "K&R")
  (setq tab-width 8)
  (setq indent-tabs-mode t)
  (setq c-basic-offset 8))

This will define the M-x linux-c-mode command. When hacking on a
module, if you put the string -*- linux-c -*- somewhere on the first
two lines, this mode will be automatically invoked. Also, you may want
to add

(setq auto-mode-alist (cons '("/usr/src/linux.*/.*\\.[ch]$" . linux-c-mode)
			auto-mode-alist))

to your .emacs file if you want to have linux-c-mode switched on
automagically when you edit source files under /usr/src/linux.

But even if you fail in getting emacs to do sane formatting, not
everything is lost: use "indent".

Now, again, GNU indent has the same brain-dead settings that GNU emacs
has, which is why you need to give it a few command line options.
However, that's not too bad, because even the makers of GNU indent
recognize the authority of K&R (the GNU people aren't evil, they are
just severely misguided in this matter), so you just give indent the
options "-kr -i8" (stands for "K&R, 8 character indents"), or use
"scripts/Lindent", which indents in the latest style.

"indent" has a lot of options, and especially when it comes to comment
re-formatting you may want to take a look at the man page. But
remember: "indent" is not a fix for bad programming.


		Chapter 9: Configuration-files

For configuration options (arch/xxx/Kconfig, and all the Kconfig files),
somewhat different indentation is used.

Help text is indented with 2 spaces.

if CONFIG_EXPERIMENTAL
	tristate CONFIG_BOOM
	default n
	help
	  Apply nitroglycerine inside the keyboard (DANGEROUS)
	bool CONFIG_CHEER
	depends on CONFIG_BOOM
	default y
	help
	  Output nice messages when you explode
endif

Generally, CONFIG_EXPERIMENTAL should surround all options not considered
stable. All options that are known to trash data (experimental
write-support for file-systems, for instance) should be denoted
(DANGEROUS), other experimental options should be denoted (EXPERIMENTAL).


		Chapter 10: Data structures

Data structures that have visibility outside the single-threaded
environment they are created and destroyed in should always have
reference counts. In the kernel, garbage collection doesn't exist (and
outside the kernel garbage collection is slow and inefficient), which
means that you absolutely _have_ to reference count all your uses.

Reference counting means that you can avoid locking, and allows multiple
users to have access to the data structure in parallel - and not having
to worry about the structure suddenly going away from under them just
because they slept or did something else for a while.

Note that locking is _not_ a replacement for reference counting.
Locking is used to keep data structures coherent, while reference
counting is a memory management technique. Usually both are needed, and
they are not to be confused with each other.

Many data structures can indeed have two levels of reference counting,
when there are users of different "classes". The subclass count counts
the number of subclass users, and decrements the global count just once
when the subclass count goes to zero.

Examples of this kind of "multi-level-reference-counting" can be found in
memory management ("struct mm_struct": mm_users and mm_count), and in
filesystem code ("struct super_block": s_count and s_active).

Remember: if another thread can find your data structure, and you don't
have a reference count on it, you almost certainly have a bug.
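
A minimal userspace sketch of the single-level case follows. It uses
C11 atomics and malloc()/free() as stand-ins and is NOT the kernel's
own API; "struct session" and the session_*() names are invented for
the illustration. It shows only the lifetime rule: whoever drops the
last reference frees the object.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical shared object, invented for this illustration. */
struct session {
	atomic_int refcount;
	int id;
};

/* The creator starts out holding the first reference. */
struct session *session_create(int id)
{
	struct session *s = malloc(sizeof(*s));

	if (s == NULL)
		return NULL;
	atomic_init(&s->refcount, 1);
	s->id = id;
	return s;
}

/* Take an additional reference before handing the object to another user. */
void session_get(struct session *s)
{
	atomic_fetch_add(&s->refcount, 1);
}

/*
 * Drop one reference. Returns 1 if this was the last one and the
 * object has been freed, 0 if other users still hold it.
 */
int session_put(struct session *s)
{
	/* atomic_fetch_sub() returns the value before the decrement */
	if (atomic_fetch_sub(&s->refcount, 1) != 1)
		return 0;
	free(s);
	return 1;
}
```

The count only keeps the memory alive. Keeping the contents of the
structure coherent while several users touch it is still the job of a
lock, as the text above says.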


		Chapter 11: Macros, Enums and RTL

Names of macros defining constants and labels in enums are capitalized.

#define CONSTANT 0x12345

Enums are preferred when defining several related constants.

CAPITALIZED macro names are appreciated but macros resembling functions
may be named in lower case.

Generally, inline functions are preferable to macros resembling functions.

Macros with multiple statements should be enclosed in a do - while block:

#define macrofun(a, b, c)			\
	do {					\
		if (a == 5)			\
			do_this(b, c);		\
	} while (0)
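
The reason for the do - while wrapper is that the macro then expands to
exactly one statement, so it can be followed by a semicolon and dropped
into an unbraced if/else like a function call. The sketch below shows
that use; do_this(), run() and the "calls" counter are stand-ins made
up for the example, and the macro arguments are parenthesized here as a
precaution.

```c
/* Records what do_this() was asked to do, so the effect can be observed. */
static int calls;

void do_this(int b, int c)
{
	calls += b + c;
}

#define macrofun(a, b, c)			\
	do {					\
		if ((a) == 5)			\
			do_this((b), (c));	\
	} while (0)

/*
 * Hypothetical caller. With a bare "if (...) do_this(...);" or a plain
 * "{ ... }" as the macro body, the semicolon after macrofun() would end
 * the outer if-statement and the "else" below would not compile.
 */
int run(int a)
{
	calls = 0;
	if (a > 0)
		macrofun(a, 1, 2);
	else
		calls = -1;
	return calls;
}
```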

Things to avoid when using macros:

1) macros that affect control flow:

#define FOO(x)					\
	do {					\
		if (blah(x) < 0)		\
			return -EBUGGERED;	\
	} while (0)

is a _very_ bad idea. It looks like a function call but exits the "calling"
function; don't break the internal parsers of those who will read the code.

2) macros that depend on having a local variable with a magic name:

#define FOO(val) bar(index, val)

might look like a good thing, but it's confusing as hell when one reads the
code and it's prone to breakage from seemingly innocent changes.

3) macros with arguments that are used as l-values: FOO(x) = y; will
bite you if somebody e.g. turns FOO into an inline function.

4) forgetting about precedence: macros defining constants using expressions
must enclose the expression in parentheses. Beware of similar issues with
macros using parameters.

#define CONSTANT 0x4000
#define CONSTEXP (CONSTANT | 3)
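
To see what goes wrong without the parentheses, compare the definition
above with a deliberately broken twin. CONSTEXP_BAD, DOUBLE(),
DOUBLE_BAD() and the two functions are names invented for this sketch.

```c
#define CONSTANT	0x4000
#define CONSTEXP	(CONSTANT | 3)
#define CONSTEXP_BAD	CONSTANT | 3		/* deliberately broken */

#define DOUBLE(x)	((x) * 2)
#define DOUBLE_BAD(x)	x * 2			/* deliberately broken */

/* Expands to 2 * (0x4000 | 3), i.e. 0x8006. */
int scaled_good(void)
{
	return 2 * CONSTEXP;
}

/* Expands to 2 * 0x4000 | 3, which C parses as (2 * 0x4000) | 3 = 0x8003. */
int scaled_bad(void)
{
	return 2 * CONSTEXP_BAD;
}
```

The same thing happens with parameters: DOUBLE(1 + 2) is 6, while
DOUBLE_BAD(1 + 2) expands to 1 + 2 * 2 and quietly yields 5.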

The cpp manual deals with macros exhaustively. The gcc internals manual also
covers RTL which is used frequently with assembly language in the kernel.


		Chapter 12: Printing kernel messages

Kernel developers like to be seen as literate. Do mind the spelling
of kernel messages to make a good impression. Do not use crippled
words like "dont" and use "do not" or "don't" instead.

Kernel messages do not have to be terminated with a period.

Printing numbers in parentheses (%d) adds no value and should be avoided.


		Chapter 13: Allocating memory

The kernel provides the following general purpose memory allocators:
kmalloc(), kzalloc(), kcalloc(), and vmalloc(). Please refer to the API
documentation for further information about them.

The preferred form for passing a size of a struct is the following:

	p = kmalloc(sizeof(*p), ...);

The alternative form where struct name is spelled out hurts readability and
introduces an opportunity for a bug when the pointer variable type is changed
but the corresponding sizeof that is passed to a memory allocator is not.

Casting the return value which is a void pointer is redundant. The conversion
from void pointer to any other pointer type is guaranteed by the C programming
language.
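
The same idiom works outside the kernel, which makes it easy to try.
The sketch below uses malloc() as a userspace stand-in for kmalloc();
"struct point" and point_alloc() are invented for the example.

```c
#include <stdlib.h>

/* Hypothetical structure, invented for this illustration. */
struct point {
	int x;
	int y;
};

struct point *point_alloc(int x, int y)
{
	struct point *p;

	/*
	 * sizeof(*p) follows the type of p automatically, and the void
	 * pointer returned by malloc() needs no cast.
	 */
	p = malloc(sizeof(*p));
	if (p == NULL)
		return NULL;
	p->x = x;
	p->y = y;
	return p;
}
```

If p is later changed to point at a different structure, the
allocation line stays correct without being touched.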


		Chapter 14: The inline disease

There appears to be a common misperception that gcc has a magic "make me
faster" speedup option called "inline". While the use of inlines can be
appropriate (for example as a means of replacing macros, see Chapter 11), it
very often is not. Abundant use of the inline keyword leads to a much bigger
kernel, which in turn slows the system as a whole down, due to a bigger
icache footprint for the CPU and simply because there is less memory
available for the pagecache. Just think about it; a pagecache miss causes a
disk seek, which easily takes 5 milliseconds. There are a LOT of cpu cycles
that can go into these 5 milliseconds.

A reasonable rule of thumb is to not put inline at functions that have more
than 3 lines of code in them. An exception to this rule is the case where
a parameter is known to be a compile-time constant, and as a result of this
constantness you *know* the compiler will be able to optimize most of your
function away at compile time. For a good example of this latter case, see
the kmalloc() inline function.
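
A much simplified sketch of that exception, assuming nothing about how
kmalloc() itself is written: size_class() below is longer than three
lines, but when it is called with a constant an optimizing compiler can
typically evaluate the comparisons at compile time and reduce the call
to a single number. All names and size limits here are made up, and
whether the folding actually happens depends on the compiler and the
optimization level.

```c
#include <stddef.h>

/* Hypothetical helper: map a request size to one of four size classes. */
static inline int size_class(size_t size)
{
	if (size <= 32)
		return 0;
	if (size <= 64)
		return 1;
	if (size <= 128)
		return 2;
	return 3;
}

/*
 * The argument is a compile-time constant, so after inlining the whole
 * chain of comparisons can collapse to "return 1".
 */
int class_for_header(void)
{
	return size_class(48);
}
```

Called with a size only known at run time, the same inline would just
copy all those comparisons into every caller, which is exactly the
bloat this chapter warns about.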

Often people argue that adding inline to functions that are static and used
only once is always a win since there is no space tradeoff. While this is
technically correct, gcc is capable of inlining these automatically without
help, and the maintenance issue of removing the inline when a second user
appears outweighs the potential value of the hint that tells gcc to do
something it would have done anyway.



		Chapter 15: References

The C Programming Language, Second Edition
by Brian W. Kernighan and Dennis M. Ritchie.
Prentice Hall, Inc., 1988.
ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback).
URL: http://cm.bell-labs.com/cm/cs/cbook/

The Practice of Programming
by Brian W. Kernighan and Rob Pike.
Addison-Wesley, Inc., 1999.
ISBN 0-201-61586-X.
URL: http://cm.bell-labs.com/cm/cs/tpop/

GNU manuals - where in compliance with K&R and this text - for cpp, gcc,
gcc internals and indent, all available from http://www.gnu.org/manual/

WG14 is the international standardization working group for the programming
language C, URL: http://www.open-std.org/JTC1/SC22/WG14/

Kernel CodingStyle, by greg@kroah.com at OLS 2002:
http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/

--
Last updated on 30 December 2005 by a community effort on LKML.