====
YAPF
====

.. image:: https://travis-ci.org/google/yapf.svg?branch=master
    :target: https://travis-ci.org/google/yapf
    :alt: Build status

Introduction
============

Most of the current formatters for Python (e.g., autopep8 and pep8ify) are
made to remove lint errors from code. This has some obvious limitations. For
instance, code that already conforms to the PEP 8 guidelines is left
untouched, but that doesn't mean the code looks good.

YAPF takes a different approach. It is based on ``clang-format``, developed by
Daniel Jasper. In essence, the algorithm takes the code and reformats it to the
best formatting that conforms to the style guide, even if the original code
didn't violate the style guide. The idea is also similar to the ``gofmt`` tool
for the Go programming language: end all holy wars about formatting. If the
whole code base of a project is simply piped through YAPF whenever
modifications are made, the style remains consistent throughout the project and
there's no point arguing about style in every code review.

The ultimate goal is that the code YAPF produces is as good as the code that a
programmer would write if they were following the style guide. It takes away
some of the drudgery of maintaining your code.

.. footer::

    YAPF is not an official Google product (experimental or otherwise); it is
    just code that happens to be owned by Google.

.. contents::

Installation
============

We consider YAPF to be "alpha" quality at this time. Therefore, we don't yet
support official releases to PyPI, and the most stable and correct version is
at the tip of the ``master`` branch in this repository. We plan to make a
first beta release (including to PyPI) in the next few weeks.

If you intend to use YAPF as a command-line tool rather than as a library,
installation is not necessary. YAPF supports being run as a directory by the
Python interpreter. If you cloned/unzipped yapf into ``DIR``, it's possible to
run::

    $ PYTHONPATH=DIR python DIR/yapf [options] ...

Python versions
===============

YAPF supports Python 2.7 and 3.4+.

YAPF requires the code it formats to be valid Python for the version YAPF itself
runs under. Therefore, if you format Python 3 code with YAPF, run YAPF itself
under Python 3 (and similarly for Python 2).

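As a rough illustration of the constraint above, the following sketch checks
whether a piece of source parses under the interpreter that runs it. It uses
the standard ``ast`` module purely as a stand-in; the helper name
``parses_under_current_python`` is hypothetical and this is not how YAPF itself
parses code.

.. code-block:: python

    import ast
    import sys


    def parses_under_current_python(source):
        """Return True if `source` is valid syntax for the running interpreter.

        Hypothetical helper for illustration only; YAPF does not expose it.
        """
        try:
            ast.parse(source)
        except SyntaxError:
            return False
        return True


    # The result depends on which interpreter runs this script.
    print(sys.version_info[:2])
    print(parses_under_current_python("print('hello')"))
    # The Python 2 print statement is rejected when run under Python 3.
    print(parses_under_current_python("print 'hello'"))

A check like this, run before formatting, gives an early hint that the
formatter is being run under the wrong Python version for the code base.
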
Usage
=====

Options::

    usage: yapf [-h] [--style STYLE] [-d | -i] [-l START-END | -r] ...

    Formatter for Python code.

    positional arguments:
      files

    optional arguments:
      -h, --help            show this help message and exit
      --style STYLE         specify formatting style: either a style name (for
                            example "pep8" or "google"), or the name of a file
                            with style settings. pep8 is the default.
      -d, --diff            print the diff for the fixed source
      -i, --in-place        make changes to files in place
      -l START-END, --lines START-END
                            range of lines to reformat, one-based
      -r, --recursive       run recursively over directories

Formatting style
================

The formatting style used by YAPF is configurable and there are many "knobs"
that can be used to tune how YAPF does formatting. See the ``style.py`` module
for the full list.

To control the style, run YAPF with the ``--style`` argument. It accepts one of
the predefined styles (e.g., ``pep8`` or ``google``), a path to a configuration
file that specifies the desired style, or a dictionary of key/value pairs.

The config file is a simple listing of (case-insensitive) ``key = value`` pairs
with a ``[style]`` heading. For example::

    [style]
    based_on_style = pep8
    spaces_before_comment = 4
    split_before_logical_operator = true

The ``based_on_style`` setting determines which of the predefined styles this
custom style is based on (think of it like subclassing).

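To make the file format concrete, here is a minimal sketch that reads such a
``[style]`` section with the standard-library ``configparser`` module
(Python 3). It only illustrates the INI-like layout and the case-insensitive
keys; the ``read_style`` helper is hypothetical and is not YAPF's own loader.

.. code-block:: python

    import configparser

    # The example style file, inlined so the sketch is self-contained.
    # One key is upper-cased on purpose to show case-insensitive lookup.
    STYLE_TEXT = """
    [style]
    based_on_style = pep8
    SPACES_BEFORE_COMMENT = 4
    split_before_logical_operator = true
    """


    def read_style(text):
        """Return the [style] section as a dict with lower-cased keys."""
        parser = configparser.ConfigParser()
        parser.read_string(text)
        # ConfigParser lower-cases option names by default, which gives the
        # case-insensitive key behaviour described above.
        return dict(parser["style"])


    settings = read_style(STYLE_TEXT)
    print(settings["based_on_style"])
    print(settings["spaces_before_comment"])

Note that every value comes back as a string in this sketch; converting
``"4"`` or ``"true"`` to an integer or boolean would be a separate step.
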
It's also possible to do the same on the command line with a dictionary. For
example::

    --style='{based_on_style: google, indent_width: 4}'

This will take the ``google`` base style and modify it to use four-space
indentation.

Example
=======

As an example of the type of formatting that YAPF can do, it will take this
ugly code:

.. code-block:: python

    x = { 'a':37,'b':42,

    'c':927}

    y = 'hello ''world'
    z = 'hello '+'world'
    a = 'hello {}'.format('world')
    class foo ( object ):
      def f (self ):
        return 37*-+2
      def g(self, x,y=42):
          return y
    def f ( a ) :
      return 37+-+a[42-x : y**3]

and reformat it into:

.. code-block:: python

    x = {'a': 37, 'b': 42, 'c': 927}

    y = 'hello ' 'world'
    z = 'hello ' + 'world'
    a = 'hello {}'.format('world')


    class foo(object):
        def f(self):
            return 37 * -+2

        def g(self, x, y=42):
            return y


    def f(a):
        return 37 + -+a[42 - x:y ** 3]

(Potentially) Frequently Asked Questions
========================================

Why does YAPF destroy my awesome formatting?
--------------------------------------------

YAPF tries very hard to get the formatting correct. But for some code, it won't
be as good as hand-formatting. In particular, large data literals may become
horribly disfigured under YAPF.

The reasons for this are manifold. But in essence YAPF is simply a tool to help
with development. It will format things to coincide with the style guide, but
that may not equate with readability.

To alleviate this, you can mark regions that YAPF should ignore when
reformatting:

.. code-block:: python

    # yapf: disable
    FOO = {
        # ... some very large, complex data literal.
    }

    BAR = [
        # ... another large data literal.
    ]
    # yapf: enable

You can also disable formatting for a single literal like this:

.. code-block:: python

    BAZ = {
        (1, 2, 3, 4),
        (5, 6, 7, 8),
        (9, 10, 11, 12)
    }  # yapf: disable

Why Not Improve Existing Tools?
-------------------------------

We wanted to use clang-format's reformatting algorithm. It's very powerful and
designed to come up with the best formatting possible. Existing tools were
created with different goals in mind, and would require extensive modifications
to convert to using clang-format's algorithm.

Can I Use YAPF In My Program?
-----------------------------

Please do! YAPF was designed to be used as a library as well as a command-line
tool. This means that a tool or IDE plugin is free to use YAPF.

Gory Details
============

Algorithm Design
----------------

The main data structure in YAPF is the ``UnwrappedLine`` object. It holds a list
of ``FormatToken``\s that we would want to place on a single line if there were
no column limit. The exception is a comment in the middle of an expression
statement, which forces the line to be formatted on more than one line. The
formatter works on one ``UnwrappedLine`` object at a time.

An ``UnwrappedLine`` typically won't affect the formatting of lines before or
after it. There is a part of the algorithm that may join two or more
``UnwrappedLine``\s into one line. For instance, an ``if`` statement with a
short body can be placed on a single line:

.. code-block:: python

    if a == 42: continue

YAPF's formatting algorithm creates a weighted tree that acts as the solution
space for the algorithm. Each node in the tree represents the result of a
formatting decision (i.e., whether to split or not to split before a token).
Each formatting decision has a cost associated with it. Therefore, the cost is
realized on the edge between two nodes. (In reality, the weighted tree doesn't
have separate edge objects, so the cost resides on the nodes themselves.)

For example, take the following Python code snippet. For the sake of this
example, assume that line (1) violates the column limit restriction and needs to
be reformatted.

.. code-block:: python

    def xxxxxxxxxxx(aaaaaaaaaaaa, bbbbbbbbb, cccccccc, dddddddd, eeeeee):  # 1
        pass  # 2

For line (1), the algorithm will build a tree where each node (a
``FormatDecisionState`` object) is the state of the line at that token given
the decision to split before the token or not. Note: the ``FormatDecisionState``
objects are copied by value, so each node in the tree is unique and a change in
one doesn't affect other nodes.

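The copy-by-value behaviour can be sketched with a toy node class. Everything
here (``ToyDecisionState``, its attributes, and ``add_token``) is hypothetical
and far simpler than the real ``FormatDecisionState``; it only shows how each
decision yields an independent child node that carries its own cost.

.. code-block:: python

    import copy


    class ToyDecisionState(object):
        """Hypothetical stand-in for one node of the decision tree."""

        def __init__(self, tokens):
            self.tokens = tokens   # all tokens of the unwrapped line
            self.next_index = 0    # index of the next token to place
            self.lines = [""]      # output lines built so far
            self.cost = 0          # accumulated penalty, stored on the node

        def add_token(self, split, penalty=0):
            """Return a new child node; the receiver is left untouched."""
            child = copy.deepcopy(self)
            token = child.tokens[child.next_index]
            if split:
                child.lines.append(token)        # start a new output line
            elif child.lines[-1]:
                child.lines[-1] += " " + token   # continue the current line
            else:
                child.lines[-1] = token          # very first token
            child.next_index += 1
            child.cost += penalty
            return child


    root = ToyDecisionState(["a", "+", "b"])
    first = root.add_token(split=False)
    no_split = first.add_token(split=False)
    with_split = first.add_token(split=True, penalty=1)

    print(no_split.lines, no_split.cost)
    print(with_split.lines, with_split.cost)
    print(first.lines)  # the parent node is unchanged by either child

Because ``add_token`` works on a deep copy, the two children of ``first`` can
diverge freely, which is the property the paragraph above relies on.
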
Heuristics are used to determine the costs of splitting or not splitting.
Because a node holds the state of the tree up to a token's insertion, it can
easily determine if a splitting decision will violate one of the style
requirements. For instance, the heuristic is able to apply an extra penalty to
the edge when not splitting between the previous token and the one being added.

There are some instances where we will never want to split the line, because
doing so will always be detrimental (i.e., it will require a backslash-newline,
which is very rarely desirable). For line (1), we will never want to split the
first three tokens: ``def``, ``xxxxxxxxxxx``, and ``(``. Nor will we want to
split between the ``)`` and the ``:`` at the end. These regions are said to be
"unbreakable." This is reflected in the tree by there not being a "split"
decision (left-hand branch) within the unbreakable region.

Now that we have the tree, we determine what the "best" formatting is by finding
the path through the tree with the lowest cost.

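The search for the lowest-cost path can be sketched as follows. This is a toy
model under stated assumptions: the penalty values, the column limit, the
continuation indent, and the ``best_format`` helper are all made up for
illustration, and the sketch explores every branch exhaustively rather than
using whatever pruning a real implementation would need.

.. code-block:: python

    COLUMN_LIMIT = 16     # assumed limit for the toy example
    SPLIT_PENALTY = 1     # assumed cost of each line break
    EXCESS_PENALTY = 100  # assumed cost per character past the limit
    INDENT = 4            # assumed continuation indent


    def line_cost(line):
        """Penalty for the characters of `line` beyond the column limit."""
        return EXCESS_PENALTY * max(0, len(line) - COLUMN_LIMIT)


    def best_format(tokens, must_not_split):
        """Return (cost, lines) for the cheapest layout of `tokens`.

        `must_not_split` holds the indices of tokens that may not start a
        new line, which models the "unbreakable" regions.
        """
        def explore(index, lines, cost):
            if index == len(tokens):
                return cost + line_cost(lines[-1]), lines
            token = tokens[index]
            # Right-hand branch: keep the token on the current line.
            joined = lines[:-1] + [lines[-1] + token]
            candidates = [explore(index + 1, joined, cost)]
            # Left-hand branch: split before the token, unless unbreakable.
            if index not in must_not_split:
                broken = lines + [" " * INDENT + token.lstrip()]
                split_cost = cost + SPLIT_PENALTY + line_cost(lines[-1])
                candidates.append(explore(index + 1, broken, split_cost))
            return min(candidates, key=lambda candidate: candidate[0])

        return explore(1, [tokens[0]], 0)


    # Tokens carry the whitespace they would have when not split.
    TOKENS = ["def", " f", "(", "aaaa,", " bbbb,", " cccc", ")", ":"]
    # Never split before the name, the "(", or the final ":".
    cost, lines = best_format(TOKENS, must_not_split={1, 2, 7})
    print(cost)
    print("\n".join(lines))

With these assumed penalties, a single split before ``bbbb,`` is the only
layout that keeps both lines within the limit, so it has the lowest total cost.
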
And that's it!