Jonathan Corbet | 75b0214 | 2008-09-30 15:15:56 -0600 | 1 | 2: HOW THE DEVELOPMENT PROCESS WORKS |
| 2 | |
| 3 | Linux kernel development in the early 1990s was a pretty loose affair, |
| 4 | with relatively small numbers of users and developers involved. With a |
| 5 | user base in the millions and with some 2,000 developers involved over the |
| 6 | course of one year, the kernel has since had to evolve a number of |
| 7 | processes to keep development happening smoothly. A solid understanding of |
| 8 | how the process works is required in order to be an effective part of it. |
| 9 | |
| 10 | |
| 11 | 2.1: THE BIG PICTURE |
| 12 | |
| 13 | The kernel developers use a loosely time-based release process, with a new |
| 14 | major kernel release happening every two or three months. The recent |
| 15 | release history looks like this: |
| 16 | |
| 17 | 2.6.26 July 13, 2008 |
| 18 | 2.6.25 April 16, 2008 |
| 19 | 2.6.24 January 24, 2008 |
| 20 | 2.6.23 October 9, 2007 |
| 21 | 2.6.22 July 8, 2007 |
| 22 | 2.6.21 April 25, 2007 |
| 23 | 2.6.20 February 4, 2007 |
| 24 | |
| 25 | Every 2.6.x release is a major kernel release with new features, internal |
| 26 | API changes, and more. A typical 2.6 release can contain over 10,000 |
| 27 | changesets with changes to several hundred thousand lines of code. 2.6 is |
| 28 | thus the leading edge of Linux kernel development; the kernel uses a |
| 29 | rolling development model which is continually integrating major changes. |
| 30 | |
| 31 | A relatively straightforward discipline is followed with regard to the |
| 32 | merging of patches for each release. At the beginning of each development |
| 33 | cycle, the "merge window" is said to be open. At that time, code which is |
| 34 | deemed to be sufficiently stable (and which is accepted by the development |
| 35 | community) is merged into the mainline kernel. The bulk of changes for a |
| 36 | new development cycle (and all of the major changes) will be merged during |
| 37 | this time, at a rate approaching 1,000 changes ("patches," or "changesets") |
| 38 | per day. |
| 39 | |
| 40 | (As an aside, it is worth noting that the changes integrated during the |
| 41 | merge window do not come out of thin air; they have been collected, tested, |
| 42 | and staged ahead of time. How that process works will be described in |
| 43 | detail later on.) |
| 44 | |
| 45 | The merge window lasts for approximately two weeks. At the end of this time, Linus |
| 46 | Torvalds will declare that the window is closed and release the first of |
| 47 | the "rc" kernels. For the kernel which is destined to be 2.6.26, for |
| 48 | example, the release which happens at the end of the merge window will be |
| 49 | called 2.6.26-rc1. The -rc1 release is the signal that the time to merge |
| 50 | new features has passed, and that the time to stabilize the next kernel has |
| 51 | begun. |
| 52 | |
| 53 | Over the next six to ten weeks, only patches which fix problems should be |
| 54 | submitted to the mainline. On occasion a more significant change will be |
| 55 | allowed, but such occasions are rare; developers who try to merge new |
| 56 | features outside of the merge window tend to get an unfriendly reception. |
| 57 | As a general rule, if you miss the merge window for a given feature, the |
| 58 | best thing to do is to wait for the next development cycle. (An occasional |
| 59 | exception is made for drivers for previously-unsupported hardware; if they |
| 60 | touch no in-tree code, they cannot cause regressions and should be safe to |
| 61 | add at any time.) |
| 62 | |
| 63 | As fixes make their way into the mainline, the patch rate will slow over |
| 64 | time. Linus releases new -rc kernels about once a week; a normal series |
| 65 | will get up to somewhere between -rc6 and -rc9 before the kernel is |
| 66 | considered to be sufficiently stable and the final 2.6.x release is made. |
| 67 | At that point the whole process starts over again. |
| 68 | |
| 69 | As an example, here is how the 2.6.25 development cycle went (all dates in |
| 70 | 2008): |
| 71 | |
| 72 | January 24 2.6.24 stable release |
| 73 | February 10 2.6.25-rc1, merge window closes |
| 74 | February 15 2.6.25-rc2 |
| 75 | February 24 2.6.25-rc3 |
| 76 | March 4 2.6.25-rc4 |
| 77 | March 9 2.6.25-rc5 |
| 78 | March 16 2.6.25-rc6 |
| 79 | March 25 2.6.25-rc7 |
| 80 | April 1 2.6.25-rc8 |
| 81 | April 11 2.6.25-rc9 |
| 82 | April 16 2.6.25 stable release |
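
The dates in the table above are enough to work out how long each phase of
the 2.6.25 cycle lasted. What follows is a minimal Python sketch of that
arithmetic; the variable names are illustrative only, and the dates are
copied from the table.

```python
from datetime import date

# Dates copied from the 2.6.25 table above (all in 2008).
previous_release = date(2008, 1, 24)   # 2.6.24 stable release
merge_window_end = date(2008, 2, 10)   # 2.6.25-rc1, merge window closes
final_release = date(2008, 4, 16)      # 2.6.25 stable release

# Subtracting two dates yields a timedelta; .days is the whole-day count.
merge_window_days = (merge_window_end - previous_release).days
stabilization_days = (final_release - merge_window_end).days
cycle_days = (final_release - previous_release).days

print(f"merge window:  {merge_window_days} days")
print(f"stabilization: {stabilization_days} days")
print(f"whole cycle:   {cycle_days} days")
```

For this cycle the merge window ran 17 days, the stabilization period 66
days (a little over nine weeks), and the whole cycle 83 days.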
| 83 | |
| 84 | How do the developers decide when to close the development cycle and create |
| 85 | the stable release? The most significant metric used is the list of |
| 86 | regressions from previous releases. No bugs are welcome, but those which |
| 87 | break systems which worked in the past are considered to be especially |
| 88 | serious. For this reason, patches which cause regressions are looked upon |
| 89 | unfavorably and are quite likely to be reverted during the stabilization |
| 90 | period. |
| 91 | |
| 92 | The developers' goal is to fix all known regressions before the stable |
| 93 | release is made. In the real world, this kind of perfection is hard to |
| 94 | achieve; there are just too many variables in a project of this size. |
| 95 | There comes a point where delaying the final release just makes the problem |
| 96 | worse; the pile of changes waiting for the next merge window will grow |
| 97 | larger, creating even more regressions the next time around. So most 2.6.x |
| 98 | kernels go out with a handful of known regressions, though hopefully none |
| 99 | of them are serious. |
| 100 | |
| 101 | Once a stable release is made, its ongoing maintenance is passed off to the |
| 102 | "stable team," currently made up of Greg Kroah-Hartman and Chris Wright. |
| 103 | The stable team will release occasional updates to the stable release using |
| 104 | the 2.6.x.y numbering scheme. To be considered for an update release, a |
| 105 | patch must (1) fix a significant bug, and (2) already be merged into the |
| 106 | mainline for the next development kernel. Continuing our 2.6.25 example, |
| 107 | the history (as of this writing) is: |
| 108 | |
| 109 | May 1 2.6.25.1 |
| 110 | May 6 2.6.25.2 |
| 111 | May 9 2.6.25.3 |
| 112 | May 15 2.6.25.4 |
| 113 | June 7 2.6.25.5 |
| 114 | June 9 2.6.25.6 |
| 115 | June 16 2.6.25.7 |
| 116 | June 21 2.6.25.8 |
| 117 | June 24 2.6.25.9 |
| 118 | |
| 119 | Stable updates for a given kernel are made for approximately six months; |
| 120 | after that, the maintenance of stable releases is solely the responsibility |
| 121 | of the distributors which have shipped that particular kernel. |
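
The three naming forms used in this section (2.6.x for a release, 2.6.x-rcN
for an rc kernel, and 2.6.x.y for a stable update) can be told apart
mechanically. The following Python sketch is one way to do so; the function
name classify_release and the labels it returns are assumptions made for
illustration, not part of any kernel tooling.

```python
import re

# Name forms described in this section:
#   2.6.x       release
#   2.6.x-rcN   rc kernel
#   2.6.x.y     stable update
_RELEASE_RE = re.compile(r"2\.6\.(\d+)(?:-rc(\d+)|\.(\d+))?")

def classify_release(name):
    """Return (kind, base_release) for a 2.6-era name; raise ValueError otherwise."""
    match = _RELEASE_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"not a 2.6.x release name: {name!r}")
    x, rc, y = match.groups()
    base = f"2.6.{x}"
    if rc is not None:
        return ("rc", base)
    if y is not None:
        return ("stable update", base)
    return ("release", base)

for name in ("2.6.26-rc1", "2.6.25", "2.6.25.9"):
    print(name, "->", classify_release(name))
```

The sketch deliberately rejects anything outside the 2.6 series, since that
is the only numbering scheme this section describes.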
| 122 | |
| 123 | |
| 124 | 2.2: THE LIFECYCLE OF A PATCH |
| 125 | |
| 126 | Patches do not go directly from the developer's keyboard into the mainline |
| 127 | kernel. There is, instead, a somewhat involved (if somewhat informal) |
| 128 | process designed to ensure that each patch is reviewed for quality and that |
| 129 | each patch implements a change which is desirable to have in the mainline. |
| 130 | This process can happen quickly for minor fixes, or, in the case of large |
| 131 | and controversial changes, go on for years. Much developer frustration |
| 132 | comes from a lack of understanding of this process or from attempts to |
| 133 | circumvent it. |
| 134 | |
| 135 | In the hopes of reducing that frustration, this document will describe how |
| 136 | a patch gets into the kernel. What follows below is an introduction which |
| 137 | describes the process in a somewhat idealized way. A much more detailed |
| 138 | treatment will come in later sections. |
| 139 | |
| 140 | The stages that a patch goes through are, generally: |
| 141 | |
| 142 | - Design. This is where the real requirements for the patch - and the way |
| 143 | those requirements will be met - are laid out. Design work is often |
| 144 | done without involving the community, but it is better to do this work |
| 145 | in the open if at all possible; it can save a lot of time redesigning |
| 146 | things later. |
| 147 | |
| 148 | - Early review. Patches are posted to the relevant mailing list, and |
| 149 | developers on that list reply with any comments they may have. This |
| 150 | process should turn up any major problems with a patch if all goes |
| 151 | well. |
| 152 | |
| 153 | - Wider review. When the patch is getting close to ready for mainline |
| 154 | inclusion, it will be accepted by a relevant subsystem maintainer - |
| 155 | though this acceptance is not a guarantee that the patch will make it |
| 156 | all the way to the mainline. The patch will show up in the maintainer's |
| 157 | subsystem tree and in the staging trees (described below). When the |
| 158 | process works, this step leads to more extensive review of the patch and |
| 159 | the discovery of any problems resulting from the integration of this |
| 160 | patch with work being done by others. |
| 161 | |
| 162 | - Merging into the mainline. Eventually, a successful patch will be |
| 163 | merged into the mainline repository managed by Linus Torvalds. More |
| 164 | comments and/or problems may surface at this time; it is important that |
| 165 | the developer be responsive to these and fix any issues which arise. |
| 166 | |
| 167 | - Stable release. The number of users potentially affected by the patch |
| 168 | is now large, so, once again, new problems may arise. |
| 169 | |
| 170 | - Long-term maintenance. While it is certainly possible for a developer |
| 171 | to forget about code after merging it, that sort of behavior tends to |
| 172 | leave a poor impression in the development community. Merging code |
| 173 | eliminates some of the maintenance burden, in that others will fix |
| 174 | problems caused by API changes. But the original developer should |
| 175 | continue to take responsibility for the code if it is to remain useful |
| 176 | in the longer term. |
| 177 | |
| 178 | One of the largest mistakes made by kernel developers (or their employers) |
| 179 | is to try to cut the process down to a single "merging into the mainline" |
| 180 | step. This approach invariably leads to frustration for everybody |
| 181 | involved. |
| 182 | |
| 183 | |
| 184 | 2.3: HOW PATCHES GET INTO THE KERNEL |
| 185 | |
| 186 | There is exactly one person who can merge patches into the mainline kernel |
| 187 | repository: Linus Torvalds. But, of the over 12,000 patches which went |
| 188 | into the 2.6.25 kernel, only 250 (around 2%) were directly chosen by Linus |
| 189 | himself. The kernel project has long since grown to a size where no single |
| 190 | developer could possibly inspect and select every patch unassisted. The |
| 191 | way the kernel developers have addressed this growth is through the use of |
| 192 | a lieutenant system built around a chain of trust. |
| 193 | |
| 194 | The kernel code base is logically broken down into a set of subsystems: |
| 195 | networking, specific architecture support, memory management, video |
| 196 | devices, etc. Most subsystems have a designated maintainer, a developer |
| 197 | who has overall responsibility for the code within that subsystem. These |
| 198 | subsystem maintainers are the gatekeepers (in a loose way) for the portion |
| 199 | of the kernel they manage; they are the ones who will (usually) accept a |
| 200 | patch for inclusion into the mainline kernel. |
| 201 | |
| 202 | Subsystem maintainers each manage their own version of the kernel source |
| 203 | tree, usually (but certainly not always) using the git source management |
| 204 | tool. Tools like git (and related tools like quilt or mercurial) allow |
| 205 | maintainers to track a list of patches, including authorship information |
| 206 | and other metadata. At any given time, the maintainer can identify which |
| 207 | patches in his or her repository are not found in the mainline. |
| 208 | |
| 209 | When the merge window opens, top-level maintainers will ask Linus to "pull" |
| 210 | the patches they have selected for merging from their repositories. If |
| 211 | Linus agrees, the stream of patches will flow up into his repository, |
| 212 | becoming part of the mainline kernel. The amount of attention that Linus |
| 213 | pays to specific patches received in a pull operation varies. It is clear |
| 214 | that, sometimes, he looks quite closely. But, as a general rule, Linus |
| 215 | trusts the subsystem maintainers to not send bad patches upstream. |
| 216 | |
| 217 | Subsystem maintainers, in turn, can pull patches from other maintainers. |
| 218 | For example, the networking tree is built from patches which accumulated |
| 219 | first in trees dedicated to network device drivers, wireless networking, |
| 220 | etc. This chain of repositories can be arbitrarily long, though it rarely |
| 221 | exceeds two or three links. Since each maintainer in the chain trusts |
| 222 | those managing lower-level trees, this process is known as the "chain of |
| 223 | trust." |
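
The chain of repositories can be pictured as a simple map from each tree to
the tree it is pulled into. The Python sketch below models that idea; the
tree names and the PULLED_INTO table are hypothetical, loosely following
the networking example above, and do not describe any real repository
layout.

```python
# Hypothetical map: tree -> the tree that pulls from it.
# The names are illustrative; only the shape follows the text above.
PULLED_INTO = {
    "wireless": "networking",
    "net-drivers": "networking",
    "networking": "mainline",
    "memory-management": "mainline",
}

def path_to_mainline(tree):
    """Return the trees a patch passes through, from `tree` up to mainline.

    Raises KeyError if a tree has no known upstream.
    """
    path = [tree]
    while path[-1] != "mainline":
        path.append(PULLED_INTO[path[-1]])
    return path

route = path_to_mainline("wireless")
print(" -> ".join(route))
print("links:", len(route) - 1)
```

In this toy model a wireless patch reaches the mainline in two links, which
matches the observation that the chain rarely exceeds two or three.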
| 224 | |
| 225 | Clearly, in a system like this, getting patches into the kernel depends on |
| 226 | finding the right maintainer. Sending patches directly to Linus is not |
| 227 | normally the right way to go. |
| 228 | |
| 229 | |
| 230 | 2.4: STAGING TREES |
| 231 | |
| 232 | The chain of subsystem trees guides the flow of patches into the kernel, |
| 233 | but it also raises an interesting question: what if somebody wants to look |
| 234 | at all of the patches which are being prepared for the next merge window? |
| 235 | Developers will be interested in what other changes are pending to see |
| 236 | whether there are any conflicts to worry about; a patch which changes a |
| 237 | core kernel function prototype, for example, will conflict with any other |
| 238 | patches which use the older form of that function. Reviewers and testers |
| 239 | want access to the changes in their integrated form before all of those |
| 240 | changes land in the mainline kernel. One could pull changes from all of |
| 241 | the interesting subsystem trees, but that would be a big and error-prone |
| 242 | job. |
| 243 | |
| 244 | The answer comes in the form of staging trees, where subsystem trees are |
| 245 | collected for testing and review. The older of these trees, maintained by |
| 246 | Andrew Morton, is called "-mm" (for memory management, which is how it got |
| 247 | started). The -mm tree integrates patches from a long list of subsystem |
| 248 | trees; it also has some patches aimed at helping with debugging. |
| 249 | |
| 250 | Beyond that, -mm contains a significant collection of patches which have |
| 251 | been selected by Andrew directly. These patches may have been posted on a |
| 252 | mailing list, or they may apply to a part of the kernel for which there is |
| 253 | no designated subsystem tree. As a result, -mm operates as a sort of |
| 254 | subsystem tree of last resort; if there is no other obvious path for a |
| 255 | patch into the mainline, it is likely to end up in -mm. Miscellaneous |
| 256 | patches which accumulate in -mm will eventually either be forwarded on to |
| 257 | an appropriate subsystem tree or be sent directly to Linus. In a typical |
| 258 | development cycle, approximately 10% of the patches going into the mainline |
| 259 | get there via -mm. |
| 260 | |
| 261 | The current -mm patch can always be reached from the front page of |
| 262 | |
| 263 | http://kernel.org/ |
| 264 | |
| 265 | Those who want to see the current state of -mm can get the "-mm of the |
| 266 | moment" tree, found at: |
| 267 | |
| 268 | http://userweb.kernel.org/~akpm/mmotm/ |
| 269 | |
| 270 | Use of the MMOTM tree is likely to be a frustrating experience, though; |
| 271 | there is a definite chance that it will not even compile. |
| 272 | |
| 273 | The other staging tree, started more recently, is linux-next, maintained by |
| 274 | Stephen Rothwell. The linux-next tree is, by design, a snapshot of what |
| 275 | the mainline is expected to look like after the next merge window closes. |
| 276 | Linux-next trees are announced on the linux-kernel and linux-next mailing |
| 277 | lists when they are assembled; they can be downloaded from: |
| 278 | |
| 279 | http://www.kernel.org/pub/linux/kernel/people/sfr/linux-next/ |
| 280 | |
| 281 | Some information about linux-next has been gathered at: |
| 282 | |
| 283 | http://linux.f-seidel.de/linux-next/pmwiki/ |
| 284 | |
| 285 | How the linux-next tree will fit into the development process is still |
| 286 | changing. As of this writing, the first full development cycle involving |
| 287 | linux-next (2.6.26) is coming to an end; thus far, it has proved to be a |
| 288 | valuable resource for finding and fixing integration problems before the |
| 289 | beginning of the merge window. See http://lwn.net/Articles/287155/ for |
| 290 | more information on how linux-next has worked to set up the 2.6.27 merge |
| 291 | window. |
| 292 | |
| 293 | Some developers have begun to suggest that linux-next should be used as the |
| 294 | target for future development as well. The linux-next tree does tend to be |
| 295 | far ahead of the mainline and is more representative of the tree into which |
| 296 | any new work will be merged. The downside to this idea is that the |
| 297 | volatility of linux-next tends to make it a difficult development target. |
| 298 | See http://lwn.net/Articles/289013/ for more information on this topic, and |
| 299 | stay tuned; much is still in flux where linux-next is involved. |
| 300 | |
| 301 | |
| 302 | 2.5: TOOLS |
| 303 | |
| 304 | As can be seen from the above text, the kernel development process depends |
| 305 | heavily on the ability to herd collections of patches in various |
| 306 | directions. The whole thing would not work anywhere near as well as it |
| 307 | does without suitably powerful tools. Tutorials on how to use these tools |
| 308 | are well beyond the scope of this document, but there is space for a few |
| 309 | pointers. |
| 310 | |
| 311 | By far the dominant source code management system used by the kernel |
| 312 | community is git. Git is one of a number of distributed version control |
| 313 | systems being developed in the free software community. It is well tuned |
| 314 | for kernel development, in that it performs quite well when dealing with |
| 315 | large repositories and large numbers of patches. It also has a reputation |
| 316 | for being difficult to learn and use, though it has gotten better over |
| 317 | time. Some sort of familiarity with git is almost a requirement for kernel |
| 318 | developers; even if they do not use it for their own work, they'll need git |
| 319 | to keep up with what other developers (and the mainline) are doing. |
| 320 | |
| 321 | Git is now packaged by almost all Linux distributions. There is a home |
| 322 | page at |
| 323 | |
| 324 | http://git.or.cz/ |
| 325 | |
| 326 | That page has pointers to documentation and tutorials. One should be |
| 327 | aware, in particular, of the Kernel Hacker's Guide to git, which has |
| 328 | information specific to kernel development: |
| 329 | |
| 330 | http://linux.yyz.us/git-howto.html |
| 331 | |
| 332 | Among the kernel developers who do not use git, the most popular choice is |
| 333 | almost certainly Mercurial: |
| 334 | |
| 335 | http://www.selenic.com/mercurial/ |
| 336 | |
| 337 | Mercurial shares many features with git, but it provides an interface which |
| 338 | many find easier to use. |
| 339 | |
| 340 | The other tool worth knowing about is Quilt: |
| 341 | |
| 342 | http://savannah.nongnu.org/projects/quilt/ |
| 343 | |
| 344 | Quilt is a patch management system, rather than a source code management |
| 345 | system. It does not track history over time; it is, instead, oriented |
| 346 | toward tracking a specific set of changes against an evolving code base. |
| 347 | Some major subsystem maintainers use quilt to manage patches intended to go |
| 348 | upstream. For the management of certain kinds of trees (-mm, for example), |
| 349 | quilt is the best tool for the job. |
| 350 | |
| 351 | |
| 352 | 2.6: MAILING LISTS |
| 353 | |
| 354 | A great deal of Linux kernel development work is done by way of mailing |
| 355 | lists. It is hard to be a fully-functioning member of the community |
| 356 | without joining at least one list somewhere. But Linux mailing lists also |
| 357 | represent a potential hazard to developers, who risk getting buried under a |
| 358 | load of electronic mail, running afoul of the conventions used on the Linux |
| 359 | lists, or both. |
| 360 | |
| 361 | Most kernel mailing lists are run on vger.kernel.org; the master list can |
| 362 | be found at: |
| 363 | |
| 364 | http://vger.kernel.org/vger-lists.html |
| 365 | |
| 366 | There are lists hosted elsewhere, though; a number of them are at |
| 367 | lists.redhat.com. |
| 368 | |
| 369 | The core mailing list for kernel development is, of course, linux-kernel. |
| 370 | This list is an intimidating place to be; volume can reach 500 messages per |
| 371 | day, the amount of noise is high, the conversation can be severely |
| 372 | technical, and participants are not always concerned with showing a high |
| 373 | degree of politeness. But there is no other place where the kernel |
| 374 | development community comes together as a whole; developers who avoid this |
| 375 | list will miss important information. |
| 376 | |
| 377 | There are a few hints which can help with linux-kernel survival: |
| 378 | |
| 379 | - Have the list delivered to a separate folder, rather than your main |
| 380 | mailbox. One must be able to ignore the stream for sustained periods of |
| 381 | time. |
| 382 | |
| 383 | - Do not try to follow every conversation - nobody else does. It is |
| 384 | important to filter on both the topic of interest (though note that |
| 385 | long-running conversations can drift away from the original subject |
| 386 | without changing the email subject line) and the people who are |
| 387 | participating. |
| 388 | |
| 389 | - Do not feed the trolls. If somebody is trying to stir up an angry |
| 390 | response, ignore them. |
| 391 | |
| 392 | - When responding to linux-kernel email (or that on other lists) preserve |
| 393 | the Cc: header for all involved. In the absence of a strong reason (such |
| 394 | as an explicit request), you should never remove recipients. Always make |
| 395 | sure that the person you are responding to is in the Cc: list. This |
| 396 | convention also makes it unnecessary to explicitly ask to be copied on |
| 397 | replies to your postings. |
| 398 | |
| 399 | - Search the list archives (and the net as a whole) before asking |
| 400 | questions. Some developers can get impatient with people who clearly |
| 401 | have not done their homework. |
| 402 | |
| 403 | - Avoid top-posting (the practice of putting your answer above the quoted |
| 404 | text you are responding to). It makes your response harder to read and |
| 405 | makes a poor impression. |
| 406 | |
| 407 | - Ask on the correct mailing list. Linux-kernel may be the general meeting |
| 408 | point, but it is not the best place to find developers from all |
| 409 | subsystems. |
| 410 | |
| 411 | The last point - finding the correct mailing list - is a common place for |
| 412 | beginning developers to go wrong. Somebody who asks a networking-related |
| 413 | question on linux-kernel will almost certainly receive a polite suggestion |
| 414 | to ask on the netdev list instead, as that is the list frequented by most |
| 415 | networking developers. Other lists exist for the SCSI, video4linux, IDE, |
| 416 | and filesystem subsystems, among others. The best place to look for mailing lists is |
| 417 | in the MAINTAINERS file packaged with the kernel source. |
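
Entries in the MAINTAINERS file give the relevant mailing list on lines
beginning with "L:". The Python sketch below pulls those lines out of a
small sample; the sample entries, names, and addresses are invented for
illustration, and the real file has a descriptive preamble and more field
types which this sketch makes no attempt to handle.

```python
# Invented sample in the general shape of MAINTAINERS entries:
# a title line followed by tagged lines, with blank lines between entries.
SAMPLE = """\
EXAMPLE NETWORK DRIVER
M:\tJane Developer <jane@example.org>
L:\tnetdev-example@example.org
S:\tMaintained

EXAMPLE STORAGE DRIVER
M:\tSam Hacker <sam@example.org>
L:\tstorage-example@example.org
S:\tMaintained
"""

def mailing_lists(maintainers_text):
    """Map each entry title to the addresses found on its 'L:' lines."""
    lists = {}
    title = None
    for line in maintainers_text.splitlines():
        if not line.strip():
            title = None              # a blank line ends the current entry
        elif title is None:
            title = line.strip()      # first line of an entry is its title
            lists[title] = []
        elif line.startswith("L:"):
            lists[title].append(line[2:].strip())
    return lists

for title, addresses in mailing_lists(SAMPLE).items():
    print(f"{title}: {', '.join(addresses)}")
```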
| 418 | |
| 419 | |
| 420 | 2.7: GETTING STARTED WITH KERNEL DEVELOPMENT |
| 421 | |
| 422 | Questions about how to get started with the kernel development process are |
| 423 | common - from both individuals and companies. Equally common are missteps |
| 424 | which make the beginning of the relationship harder than it has to be. |
| 425 | |
| 426 | Companies often look to hire well-known developers to get a development |
| 427 | group started. This can, in fact, be an effective technique. But it also |
| 428 | tends to be expensive and does not do much to grow the pool of experienced |
| 429 | kernel developers. It is possible to bring in-house developers up to speed |
| 430 | on Linux kernel development, given the investment of a bit of time. Taking |
| 431 | this time can endow an employer with a group of developers who understand |
| 432 | the kernel and the company both, and who can help to train others as well. |
| 433 | Over the medium term, this is often the more profitable approach. |
| 434 | |
| 435 | Individual developers are often, understandably, at a loss for a place to |
| 436 | start. Beginning with a large project can be intimidating; one often wants |
| 437 | to test the waters with something smaller first. This is the point where |
| 438 | some developers jump into the creation of patches fixing spelling errors or |
| 439 | minor coding style issues. Unfortunately, such patches create a level of |
| 440 | noise which is distracting for the development community as a whole, so, |
| 441 | increasingly, they are looked down upon. New developers wishing to |
| 442 | introduce themselves to the community will not get the sort of reception |
| 443 | they wish for by these means. |
| 444 | |
| 445 | Andrew Morton gives this advice for aspiring kernel developers: |
| 446 | |
| 447 | The #1 project for all kernel beginners should surely be "make sure |
| 448 | that the kernel runs perfectly at all times on all machines which |
| 449 | you can lay your hands on". Usually the way to do this is to work |
| 450 | with others on getting things fixed up (this can require |
| 451 | persistence!) but that's fine - it's a part of kernel development. |
| 452 | |
| 453 | (http://lwn.net/Articles/283982/). |
| 454 | |
| 455 | In the absence of obvious problems to fix, developers are advised to look |
| 456 | at the current lists of regressions and open bugs in general. There is |
| 457 | never any shortage of issues in need of fixing; by addressing these issues, |
| 458 | developers will gain experience with the process while, at the same time, |
| 459 | building respect with the rest of the development community. |