====
YAPF
====

.. image:: https://travis-ci.org/google/yapf.svg?branch=master
    :target: https://travis-ci.org/google/yapf
    :alt: Build status

Introduction
============

Most of the current formatters for Python (e.g., autopep8 and pep8ify) are made
to remove lint errors from code. This has some obvious limitations. For
instance, code that already conforms to the PEP 8 guidelines may not be
reformatted at all, but conforming to PEP 8 doesn't mean that the code looks
good.

YAPF takes a different approach. It's based on 'clang-format', developed by
Daniel Jasper. In essence, the algorithm takes the code and reformats it to the
best formatting that conforms to the style guide, even if the original code
didn't violate the style guide. The idea is also similar to the 'gofmt' tool for
the Go programming language: end all holy wars about formatting. If the whole
code base of a project is simply piped through YAPF whenever modifications are
made, the style remains consistent throughout the project and there's no point
arguing about style in every code review.

The ultimate goal is that the code YAPF produces is as good as the code that a
programmer would write if they were following the style guide.

.. footer::

    YAPF is not an official Google product (experimental or otherwise); it is
    just code that happens to be owned by Google.

.. contents::

Installation
============

We consider YAPF to be "alpha" quality at this time. Therefore, we don't yet
publish official releases to PyPI, and the most stable and correct version is
at the tip of the ``master`` branch in this repository. We plan to make a
first beta release (including to PyPI) in the next few weeks.

If you intend to use YAPF as a command-line tool rather than as a library,
installation is not necessary. YAPF supports being run as a directory by the
Python interpreter. If you cloned/unzipped yapf into ``DIR``, it's possible to
run::

    $ PYTHONPATH=DIR python DIR/yapf [options] ...

Python versions
===============

YAPF supports Python 2.7 and 3.4+.

YAPF requires the code it formats to be valid Python for the version YAPF itself
runs under. Therefore, if you format Python 3 code with YAPF, run YAPF itself
under Python 3 (and similarly for Python 2).

Usage
=====

Options::

    usage: yapf [-h] [--style STYLE] [--noverify] [-d | -i] [-l START-END | -r]
                ...

    Formatter for Python code.

    positional arguments:
      files

    optional arguments:
      -h, --help            show this help message and exit
      --style STYLE         specify formatting style: either a style name (for
                            example "pep8" or "google"), or the name of a file
                            with style settings. pep8 is the default.
      --noverify            do not verify reformatted code for syntax errors
      -d, --diff            print the diff for the fixed source
      -i, --in-place        make changes to files in place
      -l START-END, --lines START-END
                            range of lines to reformat, one-based
      -r, --recursive       run recursively over directories

.. note::

    After reformatting a chunk of code, YAPF verifies that it's correct (can be
    parsed by Python itself). This means that if you're reformatting Python 3
    code, it's best to run YAPF itself under Python 3. The same goes for Python
    2.

    It's possible to disable verification with the ``--noverify`` flag.

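The verification step can be pictured as a simple parse check. The sketch below
only illustrates the general idea and is not taken from YAPF's source;
``is_valid_python`` is a hypothetical helper name. It relies on the standard
``ast`` module, which parses with the grammar of the interpreter it runs under.
That is why the Python version used to run the check matters.

.. code-block:: python

    import ast


    def is_valid_python(source):
        """Return True if the running interpreter can parse `source`."""
        try:
            ast.parse(source)
        except SyntaxError:
            return False
        return True

For instance, ``is_valid_python("print 'hello'\n")`` is true under Python 2 but
false under Python 3, where ``print`` is a function.
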
Formatting style
================

The formatting style used by YAPF is configurable and there are many "knobs"
that can be used to tune how YAPF does formatting. See the ``style.py`` module
for the full list.

To control the style, run YAPF with the ``--style`` argument. It accepts either
one of the predefined styles (currently "pep8" or "google"), or a path to a
configuration file that specifies the desired style. The file is a simple
listing of (case-insensitive) ``key = value`` pairs with a ``[style]`` heading.
For example::

    [style]
    based_on_style = pep8
    spaces_before_comment = 4
    split_before_logical_operator = true

The ``based_on_style`` setting determines which of the predefined styles this
custom style is based on (think of it like subclassing).

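To make the "subclassing" analogy concrete, here is a minimal sketch of how
such a file could be read and layered over a base style. It is an illustration
under stated assumptions, not YAPF's own loader: ``load_style`` and
``PREDEFINED_STYLES`` are hypothetical names, the base values are placeholders
rather than the real contents of ``style.py``, and the code uses the Python 3
``configparser`` module, which lowercases keys by default.

.. code-block:: python

    import configparser

    # Placeholder base style; the real settings live in YAPF's style.py.
    PREDEFINED_STYLES = {
        'pep8': {
            'spaces_before_comment': '2',
            'split_before_logical_operator': 'false',
        },
    }

    STYLE_FILE = """
    [style]
    based_on_style = pep8
    SPACES_BEFORE_COMMENT = 4
    split_before_logical_operator = true
    """.replace('\n    ', '\n')  # drop the indentation used for display here


    def load_style(text):
        """Merge the [style] section of `text` over its base style."""
        parser = configparser.ConfigParser()  # keys are lowercased on read
        parser.read_string(text)
        overrides = dict(parser['style'])
        base_name = overrides.pop('based_on_style')
        style = dict(PREDEFINED_STYLES[base_name])  # copy the base settings
        style.update(overrides)  # the file's settings win, like a subclass
        return style


    style = load_style(STYLE_FILE)

Here ``style['spaces_before_comment']`` ends up as ``'4'`` even though the file
spells the key in upper case, because keys are matched case-insensitively.
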
Example
=======

This ugly code:

.. code-block:: python

    x = { 'a':37,'b':42,

    'c':927}

    y = 'hello ''world'
    z = 'hello '+'world'
    a = 'hello {}'.format('world')
    class foo ( object ):
      def f (self ):
        return 37*-+2
      def g(self, x,y=42):
          return y
    def f ( a ) :
      return 37+-+a[42-x : y**3]

Is reformatted into:

.. code-block:: python

    x = {'a': 37, 'b': 42, 'c': 927}

    y = 'hello ' 'world'
    z = 'hello ' + 'world'
    a = 'hello {}'.format('world')


    class foo(object):
        def f(self):
            return 37 * -+2

        def g(self, x, y=42):
            return y


    def f(a):
        return 37 + -+a[42 - x:y ** 3]

Why Not Improve Existing Tools?
===============================

We wanted to use clang-format's reformatting algorithm. It's very powerful and
designed to come up with the best formatting possible. Existing tools were
created with different goals in mind, and would require extensive modifications
to convert to using clang-format's algorithm.

Can I Use YAPF In My Program?
=============================

Please do! YAPF was designed to be used as a library as well as a command-line
tool. This means that a tool or IDE plugin is free to use YAPF.

Gory Details
============

Algorithm Design
----------------

The main data structure in YAPF is the ``UnwrappedLine`` object. It holds a list
of ``FormatToken``\s that we would want to place on a single line if there were
no column limit. The exception is a comment in the middle of an expression
statement, which forces the line to be formatted on more than one line. The
formatter works on one ``UnwrappedLine`` object at a time.

An ``UnwrappedLine`` typically won't affect the formatting of lines before or
after it. There is a part of the algorithm that may join two or more
``UnwrappedLine``\s into one line. For instance, an if-then statement with a
short body can be placed on a single line:

.. code-block:: python

    if a == 42: continue

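The joining step can be sketched as follows. This is a toy model, not YAPF's
line joiner: plain lists of token strings stand in for ``UnwrappedLine``
objects, ``render`` and ``can_join`` are hypothetical helpers, and the rule
(join only an ``if`` header with its body when the result fits an assumed
column limit) is a simplification chosen for illustration.

.. code-block:: python

    COLUMN_LIMIT = 79  # assumed column limit


    def render(tokens):
        """Join tokens with single spaces, except before a colon."""
        out = ''
        for tok in tokens:
            if out and tok != ':':
                out += ' '
            out += tok
        return out


    def can_join(first, second, limit=COLUMN_LIMIT):
        """Check whether an `if` header and its body fit on one line."""
        if not first or first[0] != 'if' or first[-1] != ':':
            return False
        joined = render(first) + ' ' + render(second)
        return len(joined) <= limit

With ``first = ['if', 'a', '==', '42', ':']`` and ``second = ['continue']``,
``can_join(first, second)`` is true, while a very small ``limit`` makes it
false and the two lines stay separate.
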
YAPF's formatting algorithm creates a weighted tree that acts as the solution
space for the algorithm. Each node in the tree represents the result of a
formatting decision, i.e., whether or not to split before a token. Each
formatting decision has a cost associated with it. Conceptually, the cost is
realized on the edge between two nodes. (In reality, the weighted tree doesn't
have separate edge objects, so the cost resides on the nodes themselves.)

For example, take the following Python code snippet. For the sake of this
example, assume that line (1) violates the column limit restriction and needs to
be reformatted.

.. code-block:: python

    def xxxxxxxxxxx(aaaaaaaaaaaa, bbbbbbbbb, cccccccc, dddddddd, eeeeee):  # 1
        pass  # 2

For line (1), the algorithm will build a tree where each node (a
``FormatDecisionState`` object) is the state of the line at that token given
the decision to split before the token or not. Note: the ``FormatDecisionState``
objects are copied by value, so each node in the tree is unique and a change in
one doesn't affect other nodes.

Heuristics are used to determine the costs of splitting or not splitting.
Because a node holds the state of the tree up to a token's insertion, it can
easily determine if a splitting decision will violate one of the style
requirements. For instance, the heuristic is able to apply an extra penalty to
the edge when not splitting between the previous token and the one being added.

There are some instances where we will never want to split the line, because
doing so will always be detrimental (i.e., it will require a backslash-newline,
which is very rarely desirable). For line (1), we will never want to split
between the first three tokens: ``def``, ``xxxxxxxxxxx``, and ``(``. Nor will
we want to split between the ``)`` and the ``:`` at the end. These regions are
said to be "unbreakable." This is reflected in the tree by there not being a
"split" decision (left-hand branch) within the unbreakable region.

Now that we have the tree, we determine what the "best" formatting is by finding
the path through the tree with the lowest cost.

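The search described above can be sketched in a few lines. This is a toy under
stated assumptions rather than YAPF's implementation: the ``Token`` fields, the
penalty constants, the 40-column limit, and the exhaustive recursion are all
made up for illustration. Each call to ``visit`` is one node of the tree, the
cost sits on the node, and unbreakable regions simply have no "split" branch.

.. code-block:: python

    import collections

    Token = collections.namedtuple(
        'Token', 'text space_before can_split_before')

    COLUMN_LIMIT = 40        # assumed limit, small enough to force a split
    SPLIT_PENALTY = 10       # assumed cost of each line break
    OVERFLOW_PENALTY = 100   # assumed cost per character past the limit
    CONTINUATION_INDENT = 4  # assumed indent after a split


    def make_tokens():
        """Tokens of line (1); splits are allowed only before parameters."""
        params = ['aaaaaaaaaaaa', 'bbbbbbbbb', 'cccccccc', 'dddddddd',
                  'eeeeee']
        tokens = [Token('def', False, False),
                  Token('xxxxxxxxxxx', True, False),
                  Token('(', False, False)]
        for i, name in enumerate(params):
            if i:
                tokens.append(Token(',', False, False))
            tokens.append(Token(name, i > 0, True))
        tokens += [Token(')', False, False), Token(':', False, False)]
        return tokens


    def best_format(tokens):
        """Return (cost, text) for the cheapest path through the tree."""

        def visit(index, column):
            if index == len(tokens):
                return 0, ''
            tok = tokens[index]
            choices = []

            # Right-hand branch: keep the token on the current line.
            sep = ' ' if tok.space_before else ''
            end = column + len(sep) + len(tok.text)
            over = max(0, end - max(column, COLUMN_LIMIT))
            cost, rest = visit(index + 1, end)
            choices.append((over * OVERFLOW_PENALTY + cost,
                            sep + tok.text + rest))

            # Left-hand branch: split before the token, unless unbreakable.
            if tok.can_split_before:
                end = CONTINUATION_INDENT + len(tok.text)
                over = max(0, end - COLUMN_LIMIT)
                cost, rest = visit(index + 1, end)
                choices.append(
                    (SPLIT_PENALTY + over * OVERFLOW_PENALTY + cost,
                     '\n' + ' ' * CONTINUATION_INDENT + tok.text + rest))

            return min(choices, key=lambda choice: choice[0])

        return visit(0, 0)


    cost, text = best_format(make_tokens())

With these assumed penalties, the cheapest path contains a single split, placed
before ``cccccccc``, so both resulting lines fit within the limit.
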
And that's it!