====
YAPF
====

.. image:: https://badge.fury.io/py/yapf.svg
    :target: http://badge.fury.io/py/yapf
    :alt: PyPI version

.. image:: https://travis-ci.org/google/yapf.svg?branch=master
    :target: https://travis-ci.org/google/yapf
    :alt: Build status

.. image:: https://coveralls.io/repos/google/yapf/badge.svg?branch=master
    :target: https://coveralls.io/r/google/yapf?branch=master
    :alt: Coverage status

Introduction
============

Most of the current formatters for Python (e.g., autopep8 and pep8ify) are made
to remove lint errors from code. This has some obvious limitations. For
instance, code that already conforms to the PEP 8 guidelines may not be
reformatted, but that doesn't mean the code looks good.

YAPF takes a different approach. It's based on ``clang-format``, developed by
Daniel Jasper. In essence, the algorithm takes the code and reformats it to the
best formatting that conforms to the style guide, even if the original code
didn't violate the style guide. The idea is also similar to the ``gofmt`` tool
for the Go programming language: end all holy wars about formatting. If the
whole code base of a project is simply piped through YAPF whenever modifications
are made, the style remains consistent throughout the project and there's no
point arguing about style in every code review.

The ultimate goal is that the code YAPF produces is as good as the code that a
programmer would write if they were following the style guide. It takes away
some of the drudgery of maintaining your code.

.. footer::

    YAPF is not an official Google product (experimental or otherwise); it is
    just code that happens to be owned by Google.

.. contents::

Installation
============

To install YAPF from PyPI::

    $ pip install yapf

YAPF is still considered to be in the "alpha" stage, and the released version
may change often; therefore, the best way to keep up to date with the latest
development is to clone this repository.

Note that if you intend to use YAPF as a command-line tool rather than as a
library, installation is not necessary. YAPF supports being run as a directory
by the Python interpreter. If you cloned/unzipped YAPF into ``DIR``, it's
possible to run::

    $ PYTHONPATH=DIR python DIR/yapf [options] ...

Python versions
===============

YAPF supports Python 2.7 and 3.4.1+.

YAPF requires the code it formats to be valid Python for the version YAPF itself
runs under. Therefore, if you format Python 3 code with YAPF, run YAPF itself
under Python 3 (and similarly for Python 2).

Usage
=====

Options::

    usage: yapf [-h] [--version] [--style-help] [--style STYLE] [--verify]
                [-d | -i] [-l START-END | -r]
                [files [files ...]]

    Formatter for Python code.

    positional arguments:
      files

    optional arguments:
      -h, --help            show this help message and exit
      --version             show version number and exit
      --style-help          show style settings and exit
      --style STYLE         specify formatting style: either a style name (for
                            example "pep8" or "google"), or the name of a file
                            with style settings. The default is pep8 unless a
                            .style.yapf file located in one of the parent
                            directories of the source file (or current directory
                            for stdin)
      --verify              try to verify refomatted code for syntax errors
      -d, --diff            print the diff for the fixed source
      -i, --in-place        make changes to files in place
      -l START-END, --lines START-END
                            range of lines to reformat, one-based
      -r, --recursive       run recursively over directories

Formatting style
================

The formatting style used by YAPF is configurable, and there are many "knobs"
that can be used to tune how YAPF does formatting. See the ``style.py`` module
for the full list.

To control the style, run YAPF with the ``--style`` argument. It accepts one of
the predefined styles (e.g., ``pep8`` or ``google``), a path to a configuration
file that specifies the desired style, or a dictionary of key/value pairs.

The config file is a simple listing of (case-insensitive) ``key = value`` pairs
with a ``[style]`` heading. For example::

    [style]
    based_on_style = pep8
    spaces_before_comment = 4
    split_before_logical_operator = true

The ``based_on_style`` setting determines which of the predefined styles this
custom style is based on (think of it like subclassing).
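
Because this layout follows the common INI convention, a file like the one
above can be inspected with Python 3's standard ``configparser`` module. The
following is only a sketch of that idea: the ``read_style_section`` helper is
hypothetical, and YAPF's own loading code may work differently.

.. code-block:: python

    import configparser

    EXAMPLE = """\
    [style]
    based_on_style = pep8
    spaces_before_comment = 4
    SPLIT_BEFORE_LOGICAL_OPERATOR = true
    """.replace("\n    ", "\n")  # undo the indentation used in this listing


    def read_style_section(text):
        """Return the ``[style]`` section as a dict with lower-cased keys."""
        parser = configparser.ConfigParser()
        parser.read_string(text)
        # ConfigParser lower-cases option names by default, which mirrors the
        # case-insensitive keys described above. Values stay plain strings.
        return dict(parser['style'])


    settings = read_style_section(EXAMPLE)
    print(settings['based_on_style'])

All values come back as strings here, so a real consumer would still need to
convert entries such as ``spaces_before_comment`` to the appropriate type.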

It's also possible to do the same on the command line with a dictionary. For
example::

    --style='{based_on_style: google, indent_width: 4}'

This will take the ``google`` base style and modify it to use four-space
indentation.

Example
=======

As an example of the type of formatting that YAPF can do, it will take this
ugly code:

.. code-block:: python

    x = { 'a':37,'b':42,

    'c':927}

    y = 'hello ''world'
    z = 'hello '+'world'
    a = 'hello {}'.format('world')
    class foo ( object ):
      def f (self ):
        return 37*-+2
      def g(self, x,y=42):
          return y
    def f ( a ) :
      return 37+-+a[42-x : y**3]

and reformat it into:

.. code-block:: python

    x = {'a': 37, 'b': 42, 'c': 927}

    y = 'hello ' 'world'
    z = 'hello ' + 'world'
    a = 'hello {}'.format('world')


    class foo(object):
        def f(self):
            return 37 * -+2

        def g(self, x, y=42):
            return y


    def f(a):
        return 37 + -+a[42 - x:y ** 3]

Example as a module
===================

The two main APIs for calling YAPF are ``FormatCode`` and ``FormatFile``:

.. code-block:: python

    >>> from yapf.yapf_api import FormatCode  # reformat a string of code
    >>> from yapf.yapf_api import FormatFile  # reformat a file

    >>> FormatCode("f ( a = 1, b = 2 )")
    'f(a=1, b=2)\n'

Both functions also accept a ``style_config`` argument: either a style name or
a path to a file that contains formatting style settings. If ``None`` is
specified, the default style set in ``style.DEFAULT_STYLE_FACTORY`` is used.

.. code-block:: python

    >>> FormatCode("def g():\n  return True", style_config='pep8')
    'def g():\n    return True\n'

The ``lines`` argument is a list of ``(start, end)`` tuples of line numbers
(ints) that we want to format. The lines are 1-based. It can be used by
third-party code (e.g., IDEs) when reformatting a snippet of code rather
than a whole file.

.. code-block:: python

    >>> FormatCode("def g( ):\n    a=1\n    b = 2\n    return a==b", lines=[(1, 1), (2, 3)])
    'def g():\n    a = 1\n    b = 2\n    return a==b\n'
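
The command-line ``-l START-END`` option and the ``lines`` argument describe
the same kind of range. A caller that receives ranges as strings could convert
them along the lines sketched below. ``parse_line_ranges`` is a hypothetical
helper written for this illustration; it is not part of YAPF.

.. code-block:: python

    def parse_line_ranges(specs):
        """Convert 'START-END' strings into (start, end) tuples of 1-based ints."""
        ranges = []
        for spec in specs:
            start_text, end_text = spec.split('-', 1)
            start, end = int(start_text), int(end_text)
            # Line numbers are 1-based and a range may not run backwards.
            if start < 1 or end < start:
                raise ValueError('invalid line range: %r' % spec)
            ranges.append((start, end))
        return ranges


    print(parse_line_ranges(['1-1', '2-3']))

The resulting list has the shape that the ``lines`` argument expects.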

The ``print_diff`` argument (bool) makes the function return a diff that turns
the original source into the reformatted source, instead of returning the
reformatted source itself.

.. code-block:: python

    >>> print(FormatCode("a==b", filename="foo.py", print_diff=True))
    --- foo.py (original)
    +++ foo.py (reformatted)
    @@ -1 +1 @@
    -a==b
    +a == b

Note: the ``filename`` argument is inserted into the diff; by default,
``<unknown>`` is used.
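
A diff in this shape can be produced with the standard library's ``difflib``
module. The sketch below only illustrates the output format; ``make_diff`` is a
hypothetical helper, and this is not necessarily how YAPF builds its diff.

.. code-block:: python

    import difflib


    def make_diff(before, after, filename='<unknown>'):
        """Return a unified diff that turns ``before`` into ``after``."""
        diff_lines = difflib.unified_diff(
            before.splitlines(True),  # keep line endings on each line
            after.splitlines(True),
            fromfile=filename,
            tofile=filename,
            fromfiledate='(original)',
            tofiledate='(reformatted)')
        return ''.join(diff_lines)


    print(make_diff('a==b\n', 'a == b\n', filename='foo.py'))

Note that ``difflib`` separates the file name from the ``(original)`` and
``(reformatted)`` labels with a tab character.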

``FormatFile`` returns reformatted code from the passed file along with its encoding:

.. code-block:: python

    >>> print(open("foo.py").read())  # contents of file
    a==b

    >>> FormatFile("foo.py")
    (u'a == b\n', 'utf-8')

(Potentially) Frequently Asked Questions
========================================

Why does YAPF destroy my awesome formatting?
--------------------------------------------

YAPF tries very hard to get the formatting correct. But for some code, it won't
be as good as hand-formatting. In particular, large data literals may become
horribly disfigured under YAPF.

There are many reasons for this. But in essence YAPF is simply a tool to help
with development. It will format things to coincide with the style guide, but
that may not equate with readability.

To alleviate this situation, you can indicate regions that YAPF should ignore
when reformatting:

.. code-block:: python

    # yapf: disable
    FOO = {
        # ... some very large, complex data literal.
    }

    BAR = [
        # ... another large data literal.
    ]
    # yapf: enable

You can also disable formatting for a single literal like this:

.. code-block:: python

    BAZ = {
        (1, 2, 3, 4),
        (5, 6, 7, 8),
        (9, 10, 11, 12)
    }  # yapf: disable

Why Not Improve Existing Tools?
-------------------------------

We wanted to use clang-format's reformatting algorithm. It's very powerful and
designed to come up with the best formatting possible. Existing tools were
created with different goals in mind, and would require extensive modifications
to convert to using clang-format's algorithm.

Can I Use YAPF In My Program?
-----------------------------

Please do! YAPF was designed to be used as a library as well as a command-line
tool. This means that a tool or IDE plugin is free to use YAPF.

Gory Details
============

Algorithm Design
----------------

The main data structure in YAPF is the ``UnwrappedLine`` object. It holds a list
of ``FormatToken``\s that we would want to place on a single line if there were
no column limit. The exception is a comment in the middle of an expression
statement, which forces the line to be formatted on more than one line. The
formatter works on one ``UnwrappedLine`` object at a time.

An ``UnwrappedLine`` typically won't affect the formatting of lines before or
after it. There is a part of the algorithm that may join two or more
``UnwrappedLine``\s into one line. For instance, an ``if`` statement with a
short body can be placed on a single line:

.. code-block:: python

    if a == 42: continue

YAPF's formatting algorithm creates a weighted tree that acts as the solution
space for the algorithm. Each node in the tree represents the result of a
formatting decision, i.e., whether to split or not to split before a token.
Each formatting decision has a cost associated with it. Therefore, the cost is
realized on the edge between two nodes. (In reality, the weighted tree doesn't
have separate edge objects, so the cost resides on the nodes themselves.)

For example, take the following Python code snippet. For the sake of this
example, assume that line (1) violates the column limit restriction and needs to
be reformatted.

.. code-block:: python

    def xxxxxxxxxxx(aaaaaaaaaaaa, bbbbbbbbb, cccccccc, dddddddd, eeeeee):  # 1
        pass                                                               # 2

For line (1), the algorithm will build a tree where each node (a
``FormatDecisionState`` object) is the state of the line at that token given
the decision to split before the token or not. Note: the ``FormatDecisionState``
objects are copied by value, so each node in the tree is unique and a change in
one doesn't affect other nodes.

Heuristics are used to determine the costs of splitting or not splitting.
Because a node holds the state of the tree up to a token's insertion, it can
easily determine if a splitting decision will violate one of the style
requirements. For instance, the heuristic is able to apply an extra penalty to
the edge when not splitting between the previous token and the one being added.

There are some instances where we will never want to split the line, because
doing so will always be detrimental (i.e., it will require a backslash-newline,
which is very rarely desirable). For line (1), we will never want to split
between the first three tokens: ``def``, ``xxxxxxxxxxx``, and ``(``. Nor will we
want to split between the ``)`` and the ``:`` at the end. These regions are said
to be "unbreakable." This is reflected in the tree by there not being a "split"
decision (left-hand branch) within the unbreakable region.

Now that we have the tree, we determine what the "best" formatting is by finding
the path through the tree with the lowest cost.
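
The idea of a lowest-cost path through a split/no-split decision tree can be
sketched with a toy model. Everything in the block below is an assumption made
for illustration (the penalty constants, the ``best_layout`` helper, the
pre-grouped token strings, and the exhaustive depth-first search); it is not
YAPF's actual implementation. The token list is assumed to be non-empty.

.. code-block:: python

    SPLIT_PENALTY = 10        # assumed cost of each line break
    EXCESS_PENALTY = 100      # assumed cost per character beyond the limit
    CONTINUATION_INDENT = 4   # assumed indent for continuation lines


    def best_layout(tokens, column_limit, unbreakable=frozenset()):
        """Return (cost, text) for the cheapest layout of ``tokens``.

        ``unbreakable`` holds indices of tokens that may not start a new line.
        """
        best = None

        def line_cost(line):
            return EXCESS_PENALTY * max(0, len(line) - column_limit)

        def visit(index, done, current, cost):
            nonlocal best
            # Costs never decrease, so prune branches that cannot win.
            if best is not None and cost >= best[0]:
                return
            if index == len(tokens):
                total = cost + line_cost(current)
                if best is None or total < best[0]:
                    best = (total, '\n'.join(done + [current]))
                return
            token = tokens[index]
            # "No split" branch: keep the token on the current line.
            visit(index + 1, done, current + ' ' + token, cost)
            # "Split" branch: start a new line, unless the region is unbreakable.
            if index not in unbreakable:
                visit(index + 1, done + [current],
                      ' ' * CONTINUATION_INDENT + token,
                      cost + SPLIT_PENALTY + line_cost(current))

        visit(1, [], tokens[0], 0)
        return best


    cost, text = best_layout(['def f(aaaa,', 'bbbb,', 'cccc):'], column_limit=16)
    print(cost)
    print(text)

With a column limit of 16, the cheapest layout in this toy model splits once,
before ``bbbb,``. Passing ``unbreakable={1}`` removes that branch from the tree
and forces the search to settle for a more expensive layout.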

And that's it!