Jonathan Corbet | 75b0214 | 2008-09-30 15:15:56 -0600 | 1 | 7: ADVANCED TOPICS |
| 2 | |
| 3 | At this point, hopefully, you have a handle on how the development process |
| 4 | works. There is still more to learn, however! This section will cover a |
| 5 | number of topics which can be helpful for developers wanting to become a |
| 6 | regular part of the Linux kernel development process. |
| 7 | |
| 8 | 7.1: MANAGING PATCHES WITH GIT |
| 9 | |
| 10 | The use of distributed version control for the kernel began in early 2002, |
| 11 | when Linus first started playing with the proprietary BitKeeper |
| 12 | application. While BitKeeper was controversial, the approach to software |
| 13 | version management it embodied most certainly was not. Distributed version |
| 14 | control enabled an immediate acceleration of the kernel development |
| 15 | project. In current times, there are several free alternatives to |
| 16 | BitKeeper. For better or for worse, the kernel project has settled on git |
| 17 | as its tool of choice. |
| 18 | |
| 19 | Managing patches with git can make life much easier for the developer, |
| 20 | especially as the volume of those patches grows. Git also has its rough |
| 21 | edges and poses certain hazards; it is a young and powerful tool which is |
| 22 | still being civilized by its developers. This document will not attempt to |
| 23 | teach the reader how to use git; that would be sufficient material for a |
| 24 | long document in its own right. Instead, the focus here will be on how git |
| 25 | fits into the kernel development process in particular. Developers who |
| 26 | wish to come up to speed with git will find more information at: |
| 27 | |
| 28 | http://git.or.cz/ |
| 29 | |
| 30 | http://www.kernel.org/pub/software/scm/git/docs/user-manual.html |
| 31 | |
| 32 | and in various tutorials found on the web. |
| 33 | |
| 34 | The first order of business is to read the above sites and get a solid |
| 35 | understanding of how git works before trying to use it to make patches |
| 36 | available to others. A git-using developer should be able to obtain a copy |
| 37 | of the mainline repository, explore the revision history, commit changes to |
| 38 | the tree, use branches, etc. An understanding of git's tools for the |
| 39 | rewriting of history (such as rebase) is also useful. Git comes with its |
| 40 | own terminology and concepts; a new user of git should know about refs, |
| 41 | remote branches, the index, fast-forward merges, pushes and pulls, detached |
| 42 | heads, etc. It can all be a little intimidating at the outset, but the |
| 43 | concepts are not that hard to grasp with a bit of study. |
| 44 | |
| 45 | Using git to generate patches for submission by email can be a good |
| 46 | exercise while coming up to speed. |
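As a minimal sketch of that exercise, the commands below build a throwaway
repository and turn its most recent commit into a patch file with
"git format-patch". The repository, file names, and identity are all
made up for illustration; only the git commands themselves matter.

```shell
#!/bin/sh
set -eu

# Throwaway repository; every name and path here is illustrative.
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Developer"
git config user.email "dev@example.com"

echo "first version" > file.txt
git add file.txt
git commit -q -m "example: add file.txt"

echo "second version" > file.txt
git commit -q -a -m "example: update file.txt"

# Turn the most recent commit into a mailable patch file.
git format-patch -1 -o "$work/outgoing" >/dev/null
patch_file=$(ls "$work/outgoing"/*.patch)

# Show the subject line that would appear in the email.
grep '^Subject:' "$patch_file"
```

The resulting file carries the author, date, changelog, and diff in
mailbox form, so it can be inspected before anything is sent.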
| 47 | |
| 48 | When you are ready to start putting up git trees for others to look at, you |
| 49 | will, of course, need a server that can be pulled from. Setting up such a |
| 50 | server with git-daemon is relatively straightforward if you have a system |
| 51 | which is accessible to the Internet. Otherwise, free, public hosting sites |
| 52 | (GitHub, for example) are starting to appear on the net. Established |
| 53 | developers can get an account on kernel.org, but those are not easy to come |
| 54 | by; see http://kernel.org/faq/ for more information. |
| 55 | |
| 56 | The normal git workflow involves the use of a lot of branches. Each line |
| 57 | of development can be separated into a separate "topic branch" and |
| 58 | maintained independently. Branches in git are cheap; there is no reason |
| 59 | not to make free use of them. And, in any case, you should not do your |
| 60 | development in any branch which you intend to ask others to pull from. |
| 61 | Publicly-available branches should be created with care; merge in patches |
| 62 | from development branches when they are in complete form and ready to go - |
| 63 | not before. |
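One possible shape for this workflow is sketched below. The branch names
"for-next" and "topic/driver-fix" are assumptions chosen for the example,
not names required by git or by the kernel process.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Developer"
git config user.email "dev@example.com"

echo base > README
git add README
git commit -q -m "initial commit"

# "for-next" stands in for the branch others are asked to pull from.
git branch for-next

# Development happens on a separate topic branch.
git checkout -q -b topic/driver-fix
echo "fix" > driver.c
git add driver.c
git commit -q -m "driver: example fix"

# Only once the work is complete is it merged into the public branch.
git checkout -q for-next
git merge -q --no-ff -m "Merge topic/driver-fix" topic/driver-fix
git log --oneline --graph for-next
```

Until the merge, nothing on the topic branch is visible in "for-next", so
the topic branch can still be reworked freely.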
| 64 | |
| 65 | Git provides some powerful tools which can allow you to rewrite your |
| 66 | development history. An inconvenient patch (one which breaks bisection, |
| 67 | say, or which has some other sort of obvious bug) can be fixed in place or |
| 68 | made to disappear from the history entirely. A patch series can be |
| 69 | rewritten as if it had been written on top of today's mainline, even though |
| 70 | you have been working on it for months. Changes can be transparently |
| 71 | shifted from one branch to another. And so on. Judicious use of git's |
| 72 | ability to revise history can help in the creation of clean patch sets with |
| 73 | fewer problems. |
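The "rewritten on top of today's mainline" case can be illustrated with a
small, hedged sketch: a private topic branch is replayed onto a mainline
that has advanced in the meantime. Here "main" plays the part of the
mainline, and all file and branch names are invented for the example.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Developer"
git config user.email "dev@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "initial commit"

# Private work starts from the old mainline.
git checkout -q -b topic
echo feature > feature.txt
git add feature.txt
git commit -q -m "topic: add feature"

# Meanwhile, the mainline moves on.
git checkout -q main
echo update > mainline.txt
git add mainline.txt
git commit -q -m "mainline advances"

# Replay the topic commit on top of the current mainline.
git checkout -q topic
git rebase -q main
git log --oneline
```

After the rebase, the topic commit has a new identity and sits directly on
top of the latest mainline commit, which is exactly why this should only
be done to history that has not been exported.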
| 74 | |
| 75 | Excessive use of this capability can lead to other problems, though, beyond |
| 76 | a simple obsession with the creation of the perfect project history. |
| 77 | Rewriting history will rewrite the changes contained in that history, |
| 78 | turning a tested (hopefully) kernel tree into an untested one. But, beyond |
| 79 | that, developers cannot easily collaborate if they do not have a shared |
| 80 | view of the project history; if you rewrite history which other developers |
| 81 | have pulled into their repositories, you will make life much more difficult |
| 82 | for those developers. So a simple rule of thumb applies here: history |
| 83 | which has been exported to others should generally be seen as immutable |
| 84 | thereafter. |
| 85 | |
| 86 | So, once you push a set of changes to your publicly-available server, those |
| 87 | changes should not be rewritten. Git will attempt to enforce this rule if |
| 88 | you try to push changes which do not result in a fast-forward merge |
| 89 | (i.e. changes which do not share the same history). It is possible to |
| 90 | override this check, and there may be times when it is necessary to rewrite |
| 91 | an exported tree. Moving changesets between trees to avoid conflicts in |
| 92 | linux-next is one example. But such actions should be rare. This is one |
| 93 | of the reasons why development should be done in private branches (which |
| 94 | can be rewritten if necessary) and only moved into public branches when |
| 95 | it's in a reasonably advanced state. |
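The fast-forward check can be observed with a local experiment. In this
sketch, a bare repository named "public.git" (a made-up name) stands in
for the public server; a commit is pushed, rewritten with
"git commit --amend", and pushed again.

```shell
#!/bin/sh
set -u

work=$(mktemp -d)
cd "$work"

# A local bare repository stands in for the public server.
git init -q --bare public.git

git init -q dev
cd dev
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Developer"
git config user.email "dev@example.com"
git remote add public "$work/public.git"

echo one > file.txt
git add file.txt
git commit -q -m "first"
git push -q public main

# Rewrite the commit which has already been published.
echo two > file.txt
git commit -q -a --amend -m "first (rewritten)"

# The new commit does not descend from the published one, so this push
# is not a fast-forward and git refuses it.
if git push -q public main 2>/dev/null; then
    push_result=accepted
else
    push_result=rejected
fi
echo "push of rewritten history: $push_result"
```

The override mentioned above is "git push --force"; it is deliberately
left out of the sketch, since it discards the history that others may
already have pulled.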
| 96 | |
| 97 | As the mainline (or other tree upon which a set of changes is based) |
| 98 | advances, it is tempting to merge with that tree to stay on the leading |
| 99 | edge. For a private branch, rebasing can be an easy way to keep up with |
| 100 | another tree, but rebasing is not an option once a tree is exported to the |
| 101 | world. Once that happens, a full merge must be done. Merging occasionally |
| 102 | makes good sense, but overly frequent merges can clutter the history |
| 103 | needlessly. The suggested technique in this case is to merge infrequently, and |
| 104 | generally only at specific release points (such as a mainline -rc |
| 105 | release). If you are nervous about specific changes, you can always |
| 106 | perform test merges in a private branch. The git "rerere" tool can be |
| 107 | useful in such situations; it remembers how merge conflicts were resolved |
| 108 | so that you don't have to do the same work twice. |
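A test merge of this kind might look like the following sketch, which
enables rerere and then merges the mainline into a throwaway branch. The
branch name "test-merge" and the files are assumptions for illustration.
The merge here happens to be conflict-free, so rerere has nothing to
record; it only comes into play when conflicts are resolved by hand.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Developer"
git config user.email "dev@example.com"

# Remember conflict resolutions so they can be reused in later merges.
git config rerere.enabled true

echo base > base.txt
git add base.txt
git commit -q -m "initial commit"

git checkout -q -b topic
echo feature > feature.txt
git add feature.txt
git commit -q -m "topic: add feature"

git checkout -q main
echo update > mainline.txt
git add mainline.txt
git commit -q -m "mainline advances"

# Try the merge in a throwaway branch so that "topic" stays untouched.
topic_before=$(git rev-parse topic)
git checkout -q -b test-merge topic
git merge -q -m "test merge of main into topic" main
ls

# Build and test here as needed, then discard the experiment.
git checkout -q topic
git branch -q -D test-merge
topic_after=$(git rev-parse topic)
```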
| 109 | |
| 110 | One of the biggest recurring complaints about tools like git is this: the |
| 111 | mass movement of patches from one repository to another makes it easy to |
| 112 | slip in ill-advised changes which go into the mainline below the review |
| 113 | radar. Kernel developers tend to get unhappy when they see that kind of |
| 114 | thing happening; putting up a git tree with unreviewed or off-topic patches |
| 115 | can affect your ability to get trees pulled in the future. Quoting Linus: |
| 116 | |
| 117 | You can send me patches, but for me to pull a git patch from you, I |
| 118 | need to know that you know what you're doing, and I need to be able |
| 119 | to trust things *without* then having to go and check every |
| 120 | individual change by hand. |
| 121 | |
| 122 | (http://lwn.net/Articles/224135/). |
| 123 | |
| 124 | To avoid this kind of situation, ensure that all patches within a given |
| 125 | branch stick closely to the associated topic; a "driver fixes" branch |
| 126 | should not be making changes to the core memory management code. And, most |
| 127 | importantly, do not use a git tree to bypass the review process. Post an |
| 128 | occasional summary of the tree to the relevant list, and, when the time is |
| 129 | right, request that the tree be included in linux-next. |
| 130 | |
| 131 | If and when others start to send patches for inclusion into your tree, |
| 132 | don't forget to review them. Also ensure that you maintain the correct |
| 133 | authorship information; the git "am" tool does its best in this regard, but |
| 134 | you may have to add a "From:" line to the patch if it has been relayed to |
| 135 | you via a third party. |
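The way "git am" keeps authorship separate from the person applying the
patch can be seen in a small sketch. The two identities and repository
names below are invented; the patch travels through a directory rather
than through email.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# The maintainer's tree.
git init -q maintainer
cd maintainer
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Maintainer"
git config user.email "maintainer@example.com"
echo base > file.txt
git add file.txt
git commit -q -m "initial commit"

# A contributor clones the tree and prepares a patch.
cd "$work"
git clone -q maintainer contributor
cd contributor
git config user.name "Example Contributor"
git config user.email "contributor@example.com"
echo change >> file.txt
git commit -q -a -m "example: contributed change"
git format-patch -1 -o "$work/inbox" >/dev/null

# The maintainer applies it; the author comes from the patch headers,
# while the committer is whoever runs "git am".
cd "$work/maintainer"
git am -q "$work/inbox"/*.patch
git log -1 --format='author:    %an <%ae>%ncommitter: %cn <%ce>'
```

If the patch had been forwarded by somebody else, the "From:" header would
name the forwarder, which is the case where a "From:" line naming the real
author needs to be added at the top of the patch body.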
| 136 | |
| 137 | When requesting a pull, be sure to give all the relevant information: where |
| 138 | your tree is, what branch to pull, and what changes will result from the |
| 139 | pull. The git request-pull command can be helpful in this regard; it will |
| 140 | format the request as other developers expect, and will also check to be |
| 141 | sure that you have remembered to push those changes to the public server. |
| 142 | |
| 143 | |
| 144 | 7.2: REVIEWING PATCHES |
| 145 | |
| 146 | Some readers will certainly object to putting this section with "advanced |
| 147 | topics" on the grounds that even beginning kernel developers should be |
| 148 | reviewing patches. It is certainly true that there is no better way to |
| 149 | learn how to program in the kernel environment than by looking at code |
| 150 | posted by others. In addition, reviewers are forever in short supply; by |
| 151 | looking at code you can make a significant contribution to the process as a |
| 152 | whole. |
| 153 | |
| 154 | Reviewing code can be an intimidating prospect, especially for a new kernel |
| 155 | developer who may well feel nervous about questioning code - in public - |
| 156 | which has been posted by those with more experience. Even code written by |
| 157 | the most experienced developers can be improved, though. Perhaps the best |
| 158 | piece of advice for reviewers (all reviewers) is this: phrase review |
| 159 | comments as questions rather than criticisms. Asking "how does the lock |
| 160 | get released in this path?" will always work better than stating "the |
| 161 | locking here is wrong." |
| 162 | |
| 163 | Different developers will review code from different points of view. Some |
| 164 | are mostly concerned with coding style and whether code lines have trailing |
| 165 | white space. Others will focus primarily on whether the change implemented |
| 166 | by the patch as a whole is a good thing for the kernel or not. Yet others |
| 167 | will check for problematic locking, excessive stack usage, possible |
| 168 | security issues, duplication of code found elsewhere, adequate |
| 169 | documentation, adverse effects on performance, user-space ABI changes, etc. |
| 170 | All types of review, if they lead to better code going into the kernel, are |
| 171 | welcome and worthwhile. |
| 172 | |
| 173 | |