Michael Platings | 7aecb64 | 2019-03-28 14:42:21 +0000 | [diff] [blame] | 1 | =================== |
| 2 | Variable Names Plan |
| 3 | =================== |
| 4 | |
| 5 | .. contents:: |
| 6 | :local: |
| 7 | |
| 8 | This plan is *provisional*. It is not agreed upon. It is written with the |
| 9 | intention of capturing the desires and concerns of the LLVM community, and |
| 10 | forming them into a plan that can be agreed upon. |
| 11 | The original author is somewhat naïve in the ways of LLVM so there will |
| 12 | inevitably be some details that are flawed. You can help - you can edit this |
| 13 | page (preferably with a Phabricator review for larger changes) or reply to the |
| 14 | `Request For Comments thread |
| 15 | <http://lists.llvm.org/pipermail/llvm-dev/2019-February/130083.html>`_. |
| 16 | |
| 17 | Too Long; Didn't Read |
| 18 | ===================== |
| 19 | |
| 20 | Improve the readability of LLVM code. |
| 21 | |
| 22 | Introduction |
| 23 | ============ |
| 24 | |
| 25 | The current `variable naming rule |
| 26 | <../CodingStandards.html#name-types-functions-variables-and-enumerators-properly>`_ |
| 27 | states: |
| 28 | |
| 29 | Variable names should be nouns (as they represent state). The name should be |
| 30 | camel case, and start with an upper case letter (e.g. Leader or Boats). |
| 31 | |
| 32 | This rule is the same as that for type names. This is a problem because the |
| 33 | type name cannot be reused for a variable name [*]_. LLVM developers tend to |
| 34 | work around this by either prepending ``The`` to the type name:: |
| 35 | |
| 36 | Triple TheTriple; |
| 37 | |
| 38 | ... or more commonly use an acronym, despite the coding standard stating "Avoid |
| 39 | abbreviations unless they are well known":: |
| 40 | |
| 41 | Triple T; |
| 42 | |
| 43 | The proliferation of acronyms leads to hard-to-read code such as `this |
| 44 | <https://github.com/llvm/llvm-project/blob/0a8bc14ad7f3209fe702d18e250194cd90188596/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp#L7445>`_:: |
| 45 | |
| 46 | InnerLoopVectorizer LB(L, PSE, LI, DT, TLI, TTI, AC, ORE, VF.Width, IC, |
| 47 | &LVL, &CM); |
| 48 | |
| 49 | Many other coding guidelines [LLDB]_ [Google]_ [WebKit]_ [Qt]_ [Rust]_ [Swift]_ |
| 50 | [Python]_ require that variable names begin with a lower case letter in contrast |
| 51 | to class names which begin with a capital letter. This convention means that the |
| 52 | most readable variable name also requires the least thought:: |
| 53 | |
| 54 | Triple triple; |
| 55 | |
| 56 | There is some agreement that the current rule is broken [LattnerAgree]_ |
| 57 | [ArsenaultAgree]_ [RobinsonAgree]_ and that acronyms are an obstacle to reading |
| 58 | new code [MalyutinDistinguish]_ [CarruthAcronym]_ [PicusAcronym]_. There are |
| 59 | some opposing views [ParzyszekAcronym2]_ [RicciAcronyms]_. |
| 60 | |
| 61 | This work-in-progress proposal is to change the coding standard for variable |
| 62 | names to require that they start with a lower case letter. |
| 63 | |
| 64 | .. [*] In `some cases |
| 65 | <https://github.com/llvm/llvm-project/blob/8b72080d4d7b13072f371712eed333f987b7a18e/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp#L2727>`_ |
| 66 | the type name *is* reused as a variable name, but this shadows the type name |
| 67 | and confuses many debuggers [DenisovCamelBack]_. |
| 68 | |
| 69 | Variable Names Coding Standard Options |
| 70 | ====================================== |
| 71 | |
| 72 | There are two main options for variable names that begin with a lower case |
| 73 | letter: ``camelBack`` and ``lower_case``. (These are also known by other names |
| 74 | but here we use the terminology from clang-tidy). |
| 75 | |
| 76 | ``camelBack`` is consistent with [WebKit]_, [Qt]_ and [Swift]_ while |
| 77 | ``lower_case`` is consistent with [LLDB]_, [Google]_, [Rust]_ and [Python]_. |
| 78 | |
| 79 | ``camelBack`` is already used for function names, which may be considered an |
| 80 | advantage [LattnerFunction]_ or a disadvantage [CarruthFunction]_. |
| 81 | |
| 82 | Approval for ``camelBack`` was expressed by [DenisovCamelBack]_ |
| 83 | [LattnerFunction]_ [IvanovicDistinguish]_. |
| 84 | Opposition to ``camelBack`` was expressed by [CarruthCamelBack]_ |
| 85 | [TurnerCamelBack]_. |
| 86 | Approval for ``lower_case`` was expressed by [CarruthLower]_ |
| 87 | [CarruthCamelBack]_ [TurnerLLDB]_. |
| 88 | Opposition to ``lower_case`` was expressed by [LattnerLower]_. |
| 89 | |
| 90 | Differentiating variable kinds |
| 91 | ------------------------------ |
| 92 | |
| 93 | An additional requested change is to distinguish between different kinds of |
| 94 | variables [RobinsonDistinguish]_ [RobinsonDistinguish2]_ [JonesDistinguish]_ |
| 95 | [IvanovicDistinguish]_ [CarruthDistinguish]_ [MalyutinDistinguish]_. |
| 96 | |
| 97 | Others oppose this idea [HähnleDistinguish]_ [GreeneDistinguish]_ |
| 98 | [HendersonPrefix]_. |
| 99 | |
| 100 | A possibility is for member variables to be prefixed with ``m_`` and for global |
| 101 | variables to be prefixed with ``g_`` to distinguish them from local variables. |
| 102 | This is consistent with [LLDB]_. The ``m_`` prefix is consistent with [WebKit]_. |
| 103 | |
| 104 | A variation is for member variables to be prefixed with ``m`` |
| 105 | [IvanovicDistinguish]_ [BeylsDistinguish]_. This is consistent with [Mozilla]_. |
| 106 | |
| 107 | Another option is for member variables to be suffixed with ``_`` which is |
| 108 | consistent with [Google]_ and similar to [Python]_. Opposed by |
| 109 | [ParzyszekDistinguish]_. |
| 110 | |
| 111 | Reducing the number of acronyms |
| 112 | =============================== |
| 113 | |
| 114 | While switching coding standard will make it easier to use non-acronym names for |
| 115 | new code, it doesn't improve the existing large body of code that uses acronyms |
| 116 | extensively to the detriment of its readability. Further, it is natural and |
| 117 | generally encouraged that new code be written in the style of the surrounding |
| 118 | code. Therefore it is likely that much newly written code will also use |
| 119 | acronyms despite what the coding standard says, much as it is today. |
| 120 | |
| 121 | As well as changing the case of variable names, they could also be expanded to |
| 122 | their non-acronym form e.g. ``Triple T`` → ``Triple triple``. |
| 123 | |
| 124 | There is support for expanding many acronyms [CarruthAcronym]_ [PicusAcronym]_ |
| 125 | but there is a preference that expanding acronyms be deferred |
| 126 | [ParzyszekAcronym]_ [CarruthAcronym]_. |
| 127 | |
| 128 | The consensus within the community seems to be that at least some acronyms are |
| 129 | valuable [ParzyszekAcronym]_ [LattnerAcronym]_. The most commonly cited acronym |
| 130 | is ``TLI`` however that is used to refer to both ``TargetLowering`` and |
| 131 | ``TargetLibraryInfo`` [GreeneDistinguish]_. |
| 132 | |
| 133 | The following is a list of acronyms considered sufficiently useful that the |
| 134 | benefit of using them outweighs the cost of learning them. Acronyms that are |
| 135 | either not on the list or are used to refer to a different type should be |
| 136 | expanded. |
| 137 | |
| 138 | ============================ ============= |
| 139 | Class name Variable name |
| 140 | ============================ ============= |
| 141 | DeterministicFiniteAutomaton dfa |
| 142 | DominatorTree dt |
| 143 | LoopInfo li |
| 144 | MachineFunction mf |
| 145 | MachineInstr mi |
| 146 | MachineRegisterInfo mri |
| 147 | ScalarEvolution se |
| 148 | TargetInstrInfo tii |
| 149 | TargetLibraryInfo tli |
| 150 | TargetRegisterInfo tri |
| 151 | ============================ ============= |
| 152 | |
| 153 | In some cases renaming acronyms to the full type name will result in overly |
| 154 | verbose code. Unlike most classes, a variable's scope is limited and therefore |
| 155 | some of its purpose can implied from that scope, meaning that fewer words are |
Kazuaki Ishizaki | f65d4aa | 2020-01-22 11:30:57 +0800 | [diff] [blame] | 156 | necessary to give it a clear name. For example, in an optimization pass the reader |
Michael Platings | 7aecb64 | 2019-03-28 14:42:21 +0000 | [diff] [blame] | 157 | can assume that a variable's purpose relates to optimization and therefore an |
| 158 | ``OptimizationRemarkEmitter`` variable could be given the name ``remarkEmitter`` |
| 159 | or even ``remarker``. |
| 160 | |
| 161 | The following is a list of longer class names and the associated shorter |
| 162 | variable name. |
| 163 | |
| 164 | ========================= ============= |
| 165 | Class name Variable name |
| 166 | ========================= ============= |
| 167 | BasicBlock block |
| 168 | ConstantExpr expr |
| 169 | ExecutionEngine engine |
| 170 | MachineOperand operand |
| 171 | OptimizationRemarkEmitter remarker |
| 172 | PreservedAnalyses analyses |
| 173 | PreservedAnalysesChecker checker |
| 174 | TargetLowering lowering |
| 175 | TargetMachine machine |
| 176 | ========================= ============= |
| 177 | |
| 178 | Transition Options |
| 179 | ================== |
| 180 | |
| 181 | There are three main options for transitioning: |
| 182 | |
| 183 | 1. Keep the current coding standard |
| 184 | 2. Laissez faire |
| 185 | 3. Big bang |
| 186 | |
| 187 | Keep the current coding standard |
| 188 | -------------------------------- |
| 189 | |
| 190 | Proponents of keeping the current coding standard (i.e. not transitioning at |
| 191 | all) question whether the cost of transition outweighs the benefit |
| 192 | [EmersonConcern]_ [ReamesConcern]_ [BradburyConcern]_. |
| 193 | The costs are that ``git blame`` will become less usable; and that merging the |
| 194 | changes will be costly for downstream maintainers. See `Big bang`_ for potential |
| 195 | mitigations. |
| 196 | |
| 197 | Laissez faire |
| 198 | ------------- |
| 199 | |
| 200 | The coding standard could allow both ``CamelCase`` and ``camelBack`` styles for |
| 201 | variable names [LattnerTransition]_. |
| 202 | |
| 203 | A code review to implement this is at https://reviews.llvm.org/D57896. |
| 204 | |
| 205 | Advantages |
| 206 | ********** |
| 207 | |
| 208 | * Very easy to implement initially. |
| 209 | |
| 210 | Disadvantages |
| 211 | ************* |
| 212 | |
| 213 | * Leads to inconsistency [BradburyConcern]_ [AminiInconsistent]_. |
| 214 | * Inconsistency means it will be hard to know at a guess what name a variable |
| 215 | will have [DasInconsistent]_ [CarruthInconsistent]_. |
| 216 | * Some large-scale renaming may happen anyway, leading to its disadvantages |
| 217 | without any mitigations. |
| 218 | |
| 219 | Big bang |
| 220 | -------- |
| 221 | |
| 222 | With this approach, variables will be renamed by an automated script in a series |
| 223 | of large commits. |
| 224 | |
| 225 | The principle advantage of this approach is that it minimises the cost of |
| 226 | inconsistency [BradburyTransition]_ [RobinsonTransition]_. |
| 227 | |
| 228 | It goes against a policy of avoiding large-scale reformatting of existing code |
| 229 | [GreeneDistinguish]_. |
| 230 | |
| 231 | It has been suggested that LLD would be a good starter project for the renaming |
| 232 | [Ueyama]_. |
| 233 | |
| 234 | Keeping git blame usable |
| 235 | ************************ |
| 236 | |
| 237 | ``git blame`` (or ``git annotate``) permits quickly identifying the commit that |
| 238 | changed a given line in a file. After renaming variables, many lines will show |
| 239 | as being changed by that one commit, requiring a further invocation of ``git |
| 240 | blame`` to identify prior, more interesting commits [GreeneGitBlame]_ |
| 241 | [RicciAcronyms]_. |
| 242 | |
| 243 | **Mitigation**: `git-hyper-blame |
| 244 | <https://commondatastorage.googleapis.com/chrome-infra-docs/flat/depot_tools/docs/html/git-hyper-blame.html>`_ |
| 245 | can ignore or "look through" a given set of commits. |
| 246 | A ``.git-blame-ignore-revs`` file identifying the variable renaming commits |
| 247 | could be added to the LLVM git repository root directory. |
| 248 | It is being `investigated |
| 249 | <https://public-inbox.org/git/20190324235020.49706-1-michael@platin.gs/>`_ |
| 250 | whether similar functionality could be added to ``git blame`` itself. |
| 251 | |
| 252 | Minimising cost of downstream merges |
| 253 | ************************************ |
| 254 | |
| 255 | There are many forks of LLVM with downstream changes. Merging a large-scale |
| 256 | renaming change could be difficult for the fork maintainers. |
| 257 | |
| 258 | **Mitigation**: A large-scale renaming would be automated. A fork maintainer can |
| 259 | merge from the commit immediately before the renaming, then apply the renaming |
| 260 | script to their own branch. They can then merge again from the renaming commit, |
| 261 | resolving all conflicts by choosing their own version. This could be tested on |
| 262 | the [SVE]_ fork. |
| 263 | |
| 264 | Provisional Plan |
| 265 | ================ |
| 266 | |
| 267 | This is a provisional plan for the `Big bang`_ approach. It has not been agreed. |
| 268 | |
| 269 | #. Investigate improving ``git blame``. The extent to which it can be made to |
| 270 | "look through" commits may impact how big a change can be made. |
| 271 | |
| 272 | #. Write a script to expand acronyms. |
| 273 | |
| 274 | #. Experiment and perform dry runs of the various refactoring options. |
| 275 | Results can be published in forks of the LLVM Git repository. |
| 276 | |
| 277 | #. Consider the evidence and agree on the new policy. |
| 278 | |
| 279 | #. Agree & announce a date for the renaming of the starter project (LLD). |
| 280 | |
| 281 | #. Update the `policy page <../CodingStandards.html>`_. This will explain the |
| 282 | old and new rules and which projects each applies to. |
| 283 | |
| 284 | #. Refactor the starter project in two commits: |
| 285 | |
| 286 | 1. Add or change the project's .clang-tidy to reflect the agreed rules. |
| 287 | (This is in a separate commit to enable the merging process described in |
| 288 | `Minimising cost of downstream merges`_). |
| 289 | Also update the project list on the policy page. |
| 290 | 2. Apply ``clang-tidy`` to the project's files, with only the |
| 291 | ``readability-identifier-naming`` rules enabled. ``clang-tidy`` will also |
| 292 | reformat the affected lines according to the rules in ``.clang-format``. |
| 293 | It is anticipated that this will be a good dog-fooding opportunity for |
| 294 | clang-tidy, and bugs should be fixed in the process, likely including: |
| 295 | |
| 296 | * `readability-identifier-naming incorrectly fixes lambda capture |
| 297 | <https://bugs.llvm.org/show_bug.cgi?id=41119>`_. |
| 298 | * `readability-identifier-naming incorrectly fixes variables which |
| 299 | become keywords <https://bugs.llvm.org/show_bug.cgi?id=41120>`_. |
| 300 | * `readability-identifier-naming misses fixing member variables in |
| 301 | destructor <https://bugs.llvm.org/show_bug.cgi?id=41122>`_. |
| 302 | |
| 303 | #. Gather feedback and refine the process as appropriate. |
| 304 | |
| 305 | #. Apply the process to the following projects, with a suitable delay between |
| 306 | each (at least 4 weeks after the first change, at least 2 weeks subsequently) |
| 307 | to allow gathering further feedback. |
| 308 | This list should exclude projects that must adhere to an externally defined |
| 309 | standard e.g. libcxx. |
| 310 | The list is roughly in chronological order of renaming. |
| 311 | Some items may not make sense to rename individually - it is expected that |
| 312 | this list will change following experimentation: |
| 313 | |
| 314 | * TableGen |
| 315 | * llvm/tools |
| 316 | * clang-tools-extra |
| 317 | * clang |
| 318 | * ARM backend |
| 319 | * AArch64 backend |
| 320 | * AMDGPU backend |
| 321 | * ARC backend |
| 322 | * AVR backend |
| 323 | * BPF backend |
| 324 | * Hexagon backend |
| 325 | * Lanai backend |
| 326 | * MIPS backend |
| 327 | * NVPTX backend |
| 328 | * PowerPC backend |
| 329 | * RISC-V backend |
| 330 | * Sparc backend |
| 331 | * SystemZ backend |
| 332 | * WebAssembly backend |
| 333 | * X86 backend |
| 334 | * XCore backend |
| 335 | * libLTO |
| 336 | * Debug Information |
| 337 | * Remainder of llvm |
| 338 | * compiler-rt |
| 339 | * libunwind |
| 340 | * openmp |
| 341 | * parallel-libs |
| 342 | * polly |
| 343 | * lldb |
| 344 | |
| 345 | #. Remove the old variable name rule from the policy page. |
| 346 | |
| 347 | #. Repeat many of the steps in the sequence, using a script to expand acronyms. |
| 348 | |
| 349 | References |
| 350 | ========== |
| 351 | |
| 352 | .. [LLDB] LLDB Coding Conventions https://llvm.org/svn/llvm-project/lldb/branches/release_39/www/lldb-coding-conventions.html |
| 353 | .. [Google] Google C++ Style Guide https://google.github.io/styleguide/cppguide.html#Variable_Names |
| 354 | .. [WebKit] WebKit Code Style Guidelines https://webkit.org/code-style-guidelines/#names |
| 355 | .. [Qt] Qt Coding Style https://wiki.qt.io/Qt_Coding_Style#Declaring_variables |
| 356 | .. [Rust] Rust naming conventions https://doc.rust-lang.org/1.0.0/style/style/naming/README.html |
| 357 | .. [Swift] Swift API Design Guidelines https://swift.org/documentation/api-design-guidelines/#general-conventions |
| 358 | .. [Python] Style Guide for Python Code https://www.python.org/dev/peps/pep-0008/#function-and-variable-names |
Sylvestre Ledru | 2026d7b | 2019-12-24 13:38:59 +0100 | [diff] [blame] | 359 | .. [Mozilla] Mozilla Coding style: Prefixes https://firefox-source-docs.mozilla.org/tools/lint/coding-style/coding_style_cpp.html#prefixes |
Michael Platings | 7aecb64 | 2019-03-28 14:42:21 +0000 | [diff] [blame] | 360 | .. [SVE] LLVM with support for SVE https://github.com/ARM-software/LLVM-SVE |
| 361 | .. [AminiInconsistent] Mehdi Amini, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130329.html |
| 362 | .. [ArsenaultAgree] Matt Arsenault, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129934.html |
| 363 | .. [BeylsDistinguish] Kristof Beyls, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130292.html |
| 364 | .. [BradburyConcern] Alex Bradbury, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130266.html |
| 365 | .. [BradburyTransition] Alex Bradbury, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130388.html |
| 366 | .. [CarruthAcronym] Chandler Carruth, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130313.html |
| 367 | .. [CarruthCamelBack] Chandler Carruth, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130214.html |
| 368 | .. [CarruthDistinguish] Chandler Carruth, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130310.html |
| 369 | .. [CarruthFunction] Chandler Carruth, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130309.html |
| 370 | .. [CarruthInconsistent] Chandler Carruth, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130312.html |
| 371 | .. [CarruthLower] Chandler Carruth, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130430.html |
| 372 | .. [DasInconsistent] Sanjoy Das, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130304.html |
| 373 | .. [DenisovCamelBack] Alex Denisov, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130179.html |
| 374 | .. [EmersonConcern] Amara Emerson, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129894.html |
| 375 | .. [GreeneDistinguish] David Greene, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130425.html |
| 376 | .. [GreeneGitBlame] David Greene, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130228.html |
| 377 | .. [HendersonPrefix] James Henderson, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130465.html |
| 378 | .. [HähnleDistinguish] Nicolai Hähnle, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129923.html |
| 379 | .. [IvanovicDistinguish] Nemanja Ivanovic, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130249.html |
| 380 | .. [JonesDistinguish] JD Jones, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129926.html |
| 381 | .. [LattnerAcronym] Chris Lattner, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130353.html |
| 382 | .. [LattnerAgree] Chris Latter, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129907.html |
| 383 | .. [LattnerFunction] Chris Lattner, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130630.html |
| 384 | .. [LattnerLower] Chris Lattner, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130629.html |
| 385 | .. [LattnerTransition] Chris Lattner, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130355.html |
| 386 | .. [MalyutinDistinguish] Danila Malyutin, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130320.html |
| 387 | .. [ParzyszekAcronym] Krzysztof Parzyszek, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130306.html |
| 388 | .. [ParzyszekAcronym2] Krzysztof Parzyszek, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130323.html |
| 389 | .. [ParzyszekDistinguish] Krzysztof Parzyszek, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129941.html |
| 390 | .. [PicusAcronym] Diana Picus, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130318.html |
| 391 | .. [ReamesConcern] Philip Reames, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130181.html |
| 392 | .. [RicciAcronyms] Bruno Ricci, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130328.html |
| 393 | .. [RobinsonAgree] Paul Robinson, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130111.html |
| 394 | .. [RobinsonDistinguish] Paul Robinson, http://lists.llvm.org/pipermail/llvm-dev/2019-February/129920.html |
| 395 | .. [RobinsonDistinguish2] Paul Robinson, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130229.html |
| 396 | .. [RobinsonTransition] Paul Robinson, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130415.html |
| 397 | .. [TurnerCamelBack] Zachary Turner, https://reviews.llvm.org/D57896#1402264 |
| 398 | .. [TurnerLLDB] Zachary Turner, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130213.html |
| 399 | .. [Ueyama] Rui Ueyama, http://lists.llvm.org/pipermail/llvm-dev/2019-February/130435.html |