====
YAPF
====

.. image:: https://travis-ci.org/google/yapf.svg?branch=master
    :target: https://travis-ci.org/google/yapf
    :alt: Build status

Introduction
============

Most of the current formatters for Python (e.g., autopep8 and pep8ify) are made
to remove lint errors from code. This has some obvious limitations. For
instance, code that conforms to the PEP 8 guidelines may not be reformatted,
but that doesn't mean the code looks good.

YAPF takes a different approach. It's based on 'clang-format', developed by
Daniel Jasper. In essence, the algorithm takes the code and reformats it to the
best formatting that conforms to the style guide, even if the original code
didn't violate the style guide. The idea is also similar to the 'gofmt' tool for
the Go programming language: end all holy wars about formatting. If the whole
code base of a project is simply piped through YAPF whenever modifications are
made, the style remains consistent throughout the project and there's no point
arguing about style in every code review.

The ultimate goal is that the code YAPF produces is as good as the code that a
programmer would write if they were following the style guide. It takes away
some of the drudgery of maintaining your code.

.. footer::

    YAPF is not an official Google product (experimental or otherwise), it is
    just code that happens to be owned by Google.

.. contents::

Installation
============

We consider YAPF to be "alpha" quality at this time. Therefore, we don't yet
support official releases to PyPI and the most stable and correct version is
at the tip of the ``master`` branch in this repository. We plan to make a
first beta release (including to PyPI) in the next few weeks.

If you intend to use YAPF as a command-line tool rather than as a library,
installation is not necessary. YAPF supports being run as a directory by the
Python interpreter. If you cloned/unzipped yapf into ``DIR``, it's possible to
run::

    $ PYTHONPATH=DIR python DIR/yapf [options] ...

Python versions
===============

YAPF supports Python 2.7 and 3.4+.

YAPF requires the code it formats to be valid Python for the version YAPF itself
runs under. Therefore, if you format Python 3 code with YAPF, run YAPF itself
under Python 3 (and similarly for Python 2).

Usage
=====

Options::

    usage: yapf [-h] [--style STYLE] [-d | -i] [-l START-END | -r] ...

    Formatter for Python code.

    positional arguments:
      files

    optional arguments:
      -h, --help            show this help message and exit
      --style STYLE         specify formatting style: either a style name (for
                            example "pep8" or "google"), or the name of a file
                            with style settings. pep8 is the default.
      -d, --diff            print the diff for the fixed source
      -i, --in-place        make changes to files in place
      -l START-END, --lines START-END
                            range of lines to reformat, one-based
      -r, --recursive       run recursively over directories
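
The usage line above implies two pairs of mutually exclusive options
(``-d | -i`` and ``-l START-END | -r``). As a rough illustration only, and not
YAPF's actual command-line code, an interface with the same shape could be
declared with ``argparse`` as follows. The helper names ``parse_line_range``
and ``build_parser`` are hypothetical, and treating ``START-END`` as an
inclusive range is an assumption.

.. code-block:: python

    import argparse


    def parse_line_range(text):
        """Parse a one-based 'START-END' string into a (start, end) tuple."""
        start_text, separator, end_text = text.partition('-')
        try:
            if not separator:
                raise ValueError(text)
            start, end = int(start_text), int(end_text)
        except ValueError:
            raise argparse.ArgumentTypeError('expected START-END, got %r' % text)
        if start < 1 or end < start:
            raise argparse.ArgumentTypeError('invalid line range %r' % text)
        return start, end


    def build_parser():
        """Build a parser that mirrors the usage text (a sketch, not YAPF's code)."""
        parser = argparse.ArgumentParser(
            prog='yapf', description='Formatter for Python code.')
        parser.add_argument('--style', default='pep8')

        # -d and -i cannot be combined.
        output = parser.add_mutually_exclusive_group()
        output.add_argument('-d', '--diff', action='store_true')
        output.add_argument('-i', '--in-place', action='store_true')

        # -l and -r cannot be combined.
        scope = parser.add_mutually_exclusive_group()
        scope.add_argument('-l', '--lines', metavar='START-END',
                           type=parse_line_range)
        scope.add_argument('-r', '--recursive', action='store_true')

        parser.add_argument('files', nargs='*')
        return parser


    args = build_parser().parse_args(['--style', 'google', '-l', '3-10', 'foo.py'])
    print(args.style, args.lines, args.files)

With this sketch, passing both ``-d`` and ``-i`` makes ``argparse`` exit with a
usage error, which matches the ``[-d | -i]`` notation in the help text.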

Formatting style
================

The formatting style used by YAPF is configurable and there are many "knobs"
that can be used to tune how YAPF does formatting. See the ``style.py`` module
for the full list.

To control the style, run YAPF with the ``--style`` argument. It accepts either
one of the predefined styles (currently "pep8" or "google"), or a path to a
configuration file that specifies the desired style. The file is a simple
listing of (case-insensitive) ``key = value`` pairs with a ``[style]`` heading.
For example::

    [style]
    based_on_style = pep8
    spaces_before_comment = 4
    split_before_logical_operator = true

The ``based_on_style`` setting determines which of the predefined styles this
custom style is based on (think of it like subclassing).
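
To make the "subclassing" analogy concrete, here is a minimal sketch of how
such a file could be layered over a predefined style, using the standard
library ``configparser`` module (Python 3). It is not YAPF's own loader: the
``PREDEFINED_STYLES`` table, its values, and the ``load_style`` helper are
made up for illustration.

.. code-block:: python

    import configparser

    # Hypothetical base styles. The keys come from the example above; the
    # values are placeholders, not YAPF's real defaults.
    PREDEFINED_STYLES = {
        'pep8': {
            'spaces_before_comment': '2',
            'split_before_logical_operator': 'false',
        },
        'google': {
            'spaces_before_comment': '2',
            'split_before_logical_operator': 'false',
        },
    }


    def load_style(config_text):
        """Return the base style with the [style] section's overrides applied."""
        parser = configparser.ConfigParser()
        parser.read_string(config_text)
        # ConfigParser lowercases option names, so keys are case-insensitive.
        overrides = dict(parser['style'])
        base_name = overrides.pop('based_on_style', 'pep8')
        style = dict(PREDEFINED_STYLES[base_name])  # copy, then override
        style.update(overrides)
        return style


    CONFIG = '\n'.join([
        '[style]',
        'based_on_style = pep8',
        'SPACES_BEFORE_COMMENT = 4',
        'split_before_logical_operator = true',
    ])
    print(load_style(CONFIG))

Settings that the file does not mention keep the value from the base style,
which is the behaviour the subclassing comparison suggests.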

Example
=======

As an example of the type of formatting that YAPF can do, it will take this
ugly code:

.. code-block:: python

    x = { 'a':37,'b':42,

    'c':927}

    y = 'hello ''world'
    z = 'hello '+'world'
    a = 'hello {}'.format('world')
    class foo ( object ):
      def f (self ):
        return 37*-+2
      def g(self, x,y=42):
          return y
    def f ( a ) :
      return 37+-+a[42-x : y**3]

and reformat it into:

.. code-block:: python

    x = {'a': 37, 'b': 42, 'c': 927}

    y = 'hello ' 'world'
    z = 'hello ' + 'world'
    a = 'hello {}'.format('world')


    class foo(object):
        def f(self):
            return 37 * -+2

        def g(self, x, y=42):
            return y


    def f(a):
        return 37 + -+a[42 - x:y ** 3]

(Potentially) Frequently Asked Questions
========================================

Why does YAPF destroy my awesome formatting?
--------------------------------------------

YAPF tries very hard to get the formatting correct. But for some code, it won't
be as good as hand-formatting. In particular, large data literals may become
horribly disfigured under YAPF.

The reasons for this are manifold. But in essence YAPF is simply a tool to help
with development. It will format things to conform to the style guide, but
that may not equate with readability.

To alleviate this situation, you can indicate regions that YAPF should ignore
when reformatting:

.. code-block:: python

    # yapf: disable
    FOO = {
        # ... some very large, complex data literal.
    }

    BAR = [
        # ... another large data literal.
    ]
    # yapf: enable

You can also disable formatting for a single literal like this:

.. code-block:: python

    BAZ = {
        (1, 2, 3, 4),
        (5, 6, 7, 8),
        (9, 10, 11, 12)
    }  # yapf: disable
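
For tool authors, the two comment forms above can be recognised with a small
scan. The following is a hypothetical sketch, not YAPF's implementation: the
``disabled_lines`` helper is invented here, it needs Python 3.8+ for
``end_lineno``, and it only handles the trailing form on top-level statements.

.. code-block:: python

    import ast

    DISABLE = '# yapf: disable'
    ENABLE = '# yapf: enable'


    def disabled_lines(source):
        """Return the one-based line numbers a formatter should leave untouched."""
        lines = source.splitlines()
        disabled = set()

        # Region form: standalone disable/enable comments bracket a block.
        in_region = False
        for number, line in enumerate(lines, start=1):
            text = line.strip()
            if text == DISABLE:
                in_region = True
            if in_region:
                disabled.add(number)
            if text == ENABLE:
                in_region = False

        # Trailing form: a comment at the end of a statement's last line
        # covers the whole statement (top-level statements only here).
        for node in ast.parse(source).body:
            last_line = lines[node.end_lineno - 1].strip()
            if last_line != DISABLE and last_line.endswith(DISABLE):
                disabled.update(range(node.lineno, node.end_lineno + 1))

        return disabled


    SOURCE = '\n'.join([
        'A = 1',
        '# yapf: disable',
        'FOO = {',
        "    'k': 1,",
        '}',
        '# yapf: enable',
        'B = 2',
        'BAZ = [',
        '    [1, 2],',
        '    [3, 4],',
        ']  # yapf: disable',
    ])
    print(sorted(disabled_lines(SOURCE)))  # [2, 3, 4, 5, 6, 8, 9, 10, 11]

Lines 1 and 7 stay eligible for reformatting, while the bracketed region and
the whole ``BAZ`` literal are skipped.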

Why Not Improve Existing Tools?
-------------------------------

We wanted to use clang-format's reformatting algorithm. It's very powerful and
designed to come up with the best formatting possible. Existing tools were
created with different goals in mind, and would require extensive modifications
to convert to using clang-format's algorithm.

Can I Use YAPF In My Program?
-----------------------------

Please do! YAPF was designed to be used as a library as well as a command line
tool. This means that a tool or IDE plugin is free to use YAPF.

Gory Details
============

Algorithm Design
----------------

The main data structure in YAPF is the ``UnwrappedLine`` object. It holds a list
of ``FormatToken``\s that we would want to place on a single line if there were
no column limit. The exception is a comment in the middle of an expression
statement, which forces the line to be formatted on more than one line. The
formatter works on one ``UnwrappedLine`` object at a time.

An ``UnwrappedLine`` typically won't affect the formatting of lines before or
after it. However, one part of the algorithm may join two or more
``UnwrappedLine``\s into one line. For instance, an if-then statement with a
short body can be placed on a single line:

.. code-block:: python

    if a == 42: continue

YAPF's formatting algorithm creates a weighted tree that acts as the solution
space for the algorithm. Each node in the tree represents the result of a
formatting decision (i.e., whether to split or not to split before a token).
Each formatting decision has a cost associated with it. Therefore, the cost is
realized on the edge between two nodes. (In reality, the weighted tree doesn't
have separate edge objects, so the cost resides on the nodes themselves.)

For example, take the following Python code snippet. For the sake of this
example, assume that line (1) violates the column limit restriction and needs to
be reformatted.

.. code-block:: python

    def xxxxxxxxxxx(aaaaaaaaaaaa, bbbbbbbbb, cccccccc, dddddddd, eeeeee):  # 1
        pass                                                               # 2

For line (1), the algorithm will build a tree where each node (a
``FormatDecisionState`` object) is the state of the line at that token given
the decision to split before the token or not. Note: the ``FormatDecisionState``
objects are copied by value, so each node in the tree is unique and a change in
one doesn't affect other nodes.

Heuristics are used to determine the costs of splitting or not splitting.
Because a node holds the state of the tree up to a token's insertion, it can
easily determine if a splitting decision will violate one of the style
requirements. For instance, the heuristic is able to apply an extra penalty to
the edge when not splitting between the previous token and the one being added.

There are some instances where we will never want to split the line, because
doing so will always be detrimental (i.e., it will require a backslash-newline,
which is very rarely desirable). For line (1), we will never want to split the
first three tokens: ``def``, ``xxxxxxxxxxx``, and ``(``. Nor will we want to
split between the ``)`` and the ``:`` at the end. These regions are said to be
"unbreakable." This is reflected in the tree by there not being a "split"
decision (left hand branch) within the unbreakable region.

Now that we have the tree, we determine what the "best" formatting is by finding
the path through the tree with the lowest cost.
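
The idea can be sketched in a few lines. The following toy model is not YAPF's
code: it explores the split / no-split decisions for line (1) with a
uniform-cost (Dijkstra-style) search, and everything in it is an assumption
made for illustration, including the penalty values, the 40-column limit, the
continuation indent, the token tuples, and the choice to forbid splits before
commas and the closing parenthesis.

.. code-block:: python

    import heapq

    COLUMN_LIMIT = 40        # assumed limit, kept small for the demo
    SPLIT_PENALTY = 10       # assumed cost of each line break
    EXCESS_PENALTY = 100     # assumed cost per character past the limit
    CONTINUATION_INDENT = 4  # assumed indent after a split

    # (text, space_before, unbreakable): unbreakable tokens never start a line.
    TOKENS = [
        ('def', False, True), ('xxxxxxxxxxx', True, True), ('(', False, True),
        ('aaaaaaaaaaaa', False, False), (',', False, True),
        ('bbbbbbbbb', True, False), (',', False, True),
        ('cccccccc', True, False), (',', False, True),
        ('dddddddd', True, False), (',', False, True),
        ('eeeeee', True, False), (')', False, True), (':', False, True),
    ]


    def best_format(tokens):
        """Return (cost, decisions) for the cheapest sequence of split choices."""
        # Heap entries: (cost, tie_breaker, next_token_index, column, decisions).
        heap = [(0, 0, 1, len(tokens[0][0]), (False,))]
        counter = 1
        seen = set()
        while heap:
            cost, _, index, column, decisions = heapq.heappop(heap)
            if index == len(tokens):
                return cost, decisions
            if (index, column) in seen:
                continue  # already reached this state at a lower or equal cost
            seen.add((index, column))
            text, space_before, unbreakable = tokens[index]
            for split in ([False] if unbreakable else [False, True]):
                if split:
                    start = CONTINUATION_INDENT
                    step = SPLIT_PENALTY
                else:
                    start = column + (1 if space_before else 0)
                    step = 0
                new_column = start + len(text)
                # Charge only the characters this token puts past the limit.
                excess = max(0, new_column - max(start, COLUMN_LIMIT))
                step += EXCESS_PENALTY * excess
                heapq.heappush(heap, (cost + step, counter, index + 1,
                                      new_column, decisions + (split,)))
                counter += 1
        raise ValueError('tokens must not be empty')


    def render(tokens, decisions):
        """Turn a sequence of split decisions back into text."""
        out = tokens[0][0]
        for (text, space_before, _), split in zip(tokens[1:], decisions[1:]):
            if split:
                out += '\n' + ' ' * CONTINUATION_INDENT + text
            else:
                out += (' ' if space_before else '') + text
        return out


    cost, decisions = best_format(TOKENS)
    print(cost)
    print(render(TOKENS, decisions))

With these made-up penalties the cheapest path contains a single split, placed
before ``cccccccc``, so the first line ends exactly at the 40-column limit.
Changing the penalties changes which path wins, which is the role the
heuristics play in the description above.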

And that's it!