====
YAPF
====

.. image:: https://badge.fury.io/py/yapf.svg
    :target: http://badge.fury.io/py/yapf
    :alt: PyPI version

.. image:: https://travis-ci.org/google/yapf.svg?branch=master
    :target: https://travis-ci.org/google/yapf
    :alt: Build status

.. image:: https://coveralls.io/repos/google/yapf/badge.svg?branch=master
    :target: https://coveralls.io/r/google/yapf?branch=master
    :alt: Coverage status

Introduction
============

Most of the current formatters for Python (e.g., autopep8 and pep8ify) are
made to remove lint errors from code. This has some obvious limitations. For
instance, code that already conforms to the PEP 8 guidelines will not be
reformatted, but that doesn't mean the code looks good.

YAPF takes a different approach. It's based on 'clang-format', developed by
Daniel Jasper. In essence, the algorithm takes the code and reformats it to the
best formatting that conforms to the style guide, even if the original code
didn't violate the style guide. The idea is also similar to the 'gofmt' tool for
the Go programming language: end all holy wars about formatting. If the whole
code base of a project is simply piped through YAPF whenever modifications are
made, the style remains consistent throughout the project and there's no point
arguing about style in every code review.

The ultimate goal is that the code YAPF produces is as good as the code that a
programmer would write if they were following the style guide. It takes away
some of the drudgery of maintaining your code.

.. footer::

    YAPF is not an official Google product (experimental or otherwise), it is
    just code that happens to be owned by Google.

.. contents::

Installation
============

To install YAPF from PyPI::

    $ pip install yapf

YAPF is still considered to be in the "alpha" stage, and the released version
may change often; therefore, the best way to keep up to date with the latest
development is to clone this repository.

Note that if you intend to use YAPF as a command-line tool rather than as a
library, installation is not necessary. YAPF supports being run as a directory
by the Python interpreter. If you cloned/unzipped YAPF into ``DIR``, it's
possible to run::

    $ PYTHONPATH=DIR python DIR/yapf [options] ...

Python versions
===============

YAPF supports Python 2.7 and 3.4.1+.

YAPF requires the code it formats to be valid Python for the version YAPF itself
runs under. Therefore, if you format Python 3 code with YAPF, run YAPF itself
under Python 3 (and similarly for Python 2).

Usage
=====

Options::

    usage: yapf [-h] [--version] [--style-help] [--style STYLE] [--verify]
                [-d | -i] [-l START-END | -r]
                [files [files ...]]

    Formatter for Python code.

    positional arguments:
      files

    optional arguments:
      -h, --help            show this help message and exit
      --version             show version number and exit
      --style-help          show style settings and exit
      --style STYLE         specify formatting style: either a style name (for
                            example "pep8" or "google"), or the name of a file
                            with style settings. The default is pep8 unless a
                            .style.yapf file located in one of the parent
                            directories of the source file (or current directory
                            for stdin)
      --verify              try to verify reformatted code for syntax errors
      -d, --diff            print the diff for the fixed source
      -i, --in-place        make changes to files in place
      -l START-END, --lines START-END
                            range of lines to reformat, one-based
      -r, --recursive       run recursively over directories

Formatting style
================

The formatting style used by YAPF is configurable, and there are many "knobs"
that can be used to tune how YAPF does formatting. See the ``style.py`` module
for the full list.

To control the style, run YAPF with the ``--style`` argument. It accepts one of
the predefined styles (e.g., ``pep8`` or ``google``), a path to a configuration
file that specifies the desired style, or a dictionary of key/value pairs.

The config file is a simple listing of (case-insensitive) ``key = value`` pairs
with a ``[style]`` heading. For example::

    [style]
    based_on_style = pep8
    spaces_before_comment = 4
    split_before_logical_operator = true

The ``based_on_style`` setting determines which of the predefined styles this
custom style is based on (think of it like subclassing).
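
To make the "subclassing" idea concrete, here is a minimal Python 3 sketch that
reads a ``[style]`` section with the standard ``configparser`` module and layers
it over a base style. It is not YAPF's own loader: ``BASE_STYLES``,
``load_style`` and the base values are hypothetical names and numbers chosen
for illustration, and all values are kept as plain strings.

```python
import configparser

# Illustrative base values only (an assumption for this sketch); the real
# predefined styles are defined in YAPF's style.py module.
BASE_STYLES = {
    'pep8': {'indent_width': '4', 'spaces_before_comment': '2'},
}

CONFIG_TEXT = """\
[style]
based_on_style = pep8
SPACES_BEFORE_COMMENT = 4
split_before_logical_operator = true
"""


def load_style(text):
    """Hypothetical helper: merge a [style] section over its base style."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # ConfigParser lower-cases option names, so keys are case-insensitive.
    overrides = dict(parser['style'])
    base = BASE_STYLES[overrides.pop('based_on_style')]
    merged = dict(base)        # start from the "parent" style
    merged.update(overrides)   # then apply the custom settings on top
    return merged


style = load_style(CONFIG_TEXT)
print(style)
```

Settings that the custom file does not mention (``indent_width`` here) are
inherited from the base style, which is the behaviour the subclassing analogy
describes.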

It's also possible to do the same on the command line with a dictionary. For
example::

    --style='{based_on_style: google, indent_width: 4}'

This will take the ``google`` base style and modify it to use four-space
indentation.

Example
=======

As an example of the type of formatting that YAPF can do, it will take this
ugly code:

.. code-block:: python

    x = { 'a':37,'b':42,

    'c':927}

    y = 'hello ''world'
    z = 'hello '+'world'
    a = 'hello {}'.format('world')
    class foo ( object ):
      def f (self ):
        return 37*-+2
      def g(self, x,y=42):
          return y
    def f ( a ) :
      return 37+-+a[42-x : y**3]

and reformat it into:

.. code-block:: python

    x = {'a': 37, 'b': 42, 'c': 927}

    y = 'hello ' 'world'
    z = 'hello ' + 'world'
    a = 'hello {}'.format('world')


    class foo(object):
        def f(self):
            return 37 * -+2

        def g(self, x, y=42):
            return y


    def f(a):
        return 37 + -+a[42 - x:y ** 3]

Example as a module
===================

The two main APIs for calling YAPF are ``FormatCode`` and ``FormatFile``. They
share several arguments, which are described below:

.. code-block:: python

    >>> from yapf.yapf_api import FormatCode  # reformat a string of code

    >>> FormatCode("f ( a = 1, b = 2 )")
    'f(a=1, b=2)\n'

A ``style_config`` argument: Either a style name or a path to a file that
contains formatting style settings. If ``None`` is specified, use the default
style as set in ``style.DEFAULT_STYLE_FACTORY``.

.. code-block:: python

    >>> FormatCode("def g():\n return True", style_config='pep8')
    'def g():\n    return True\n'

A ``lines`` argument: A list of tuples of lines (ints), ``(start, end)``,
that we want to format. The lines are 1-based indexed. It can be used by
third-party code (e.g., IDEs) when reformatting a snippet of code rather
than a whole file.

.. code-block:: python

    >>> FormatCode("def g( ):\n    a=1\n    b = 2\n    return a==b", lines=[(1, 1), (2, 3)])
    'def g():\n    a = 1\n    b = 2\n    return a==b\n'
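
An editor usually knows a selection as character offsets rather than line
numbers. The sketch below shows one way third-party code might turn such a
selection into a 1-based, inclusive ``(start, end)`` pair suitable for
``lines``. ``selection_to_lines`` is a hypothetical helper, not part of YAPF,
and the sketch stops short of calling ``FormatCode``.

```python
def selection_to_lines(source, start_offset, end_offset):
    """Map a character selection to a 1-based, inclusive (start, end) pair.

    Hypothetical helper for illustration; offsets index into `source` and
    `end_offset` is exclusive, as in a Python slice.
    """
    if not 0 <= start_offset <= end_offset <= len(source):
        raise ValueError('selection out of range')
    first = source.count('\n', 0, start_offset) + 1
    # Use the last selected character so that a selection ending right
    # after a newline does not spill into the following line.
    last_char = max(start_offset, end_offset - 1)
    last = source.count('\n', 0, last_char) + 1
    return (first, last)


source = "def g( ):\n    a=1\n    b = 2\n    return a==b"
start = source.index('a=1')
end = source.index('b = 2') + len('b = 2')
selected_lines = [selection_to_lines(source, start, end)]
print(selected_lines)
```

The resulting list could then be passed as the ``lines`` argument so that only
the selected lines are reformatted.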

A ``print_diff`` argument (bool): Instead of returning the reformatted source,
return a diff that turns the original source into the reformatted source.

.. code-block:: python

    >>> print(FormatCode("a==b", filename="foo.py", print_diff=True))
    --- foo.py (original)
    +++ foo.py (reformatted)
    @@ -1 +1 @@
    -a==b
    +a == b

Note: the ``filename`` argument for ``FormatCode`` is what is inserted into
the diff; the default is ``<unknown>``.
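
A diff of this shape can be produced with the standard library alone. The
sketch below uses ``difflib.unified_diff``; ``unified_diff_text`` is a
hypothetical helper, and whether YAPF builds its diff in exactly this way is an
assumption. Note that ``difflib`` separates the file name from the
``(original)`` / ``(reformatted)`` label with a tab.

```python
import difflib


def unified_diff_text(before, after, filename='<unknown>'):
    """Return a unified diff that turns `before` into `after`."""
    diff = difflib.unified_diff(
        before.splitlines(True),   # keep line endings so lines join cleanly
        after.splitlines(True),
        fromfile=filename,
        tofile=filename,
        fromfiledate='(original)',
        tofiledate='(reformatted)')
    return ''.join(diff)


diff_text = unified_diff_text('a==b\n', 'a == b\n', filename='foo.py')
print(diff_text)
```

Passing the same name as both ``fromfile`` and ``tofile`` is what makes the two
header lines differ only in their labels.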

``FormatFile`` returns reformatted code from the passed file along with its
encoding:

.. code-block:: python

    >>> from yapf.yapf_api import FormatFile  # reformat a file

    >>> print(open("foo.py").read())  # contents of file
    a==b

    >>> FormatFile("foo.py")
    ('a == b\n', 'utf-8')

The ``in_place`` argument saves the reformatted code back to the file:

.. code-block:: python

    >>> FormatFile("foo.py", in_place=True)
    (None, 'utf-8')

    >>> print(open("foo.py").read())  # contents of file (now fixed)
    a == b
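
The ``(code, encoding)`` return shape can be mimicked in a few lines of
Python 3. This is only a sketch of the documented behaviour, not YAPF's
implementation: ``format_file_sketch`` and ``toy_formatter`` are hypothetical
names, and using ``tokenize.detect_encoding`` to find the encoding is an
assumption made for the example.

```python
import tokenize


def toy_formatter(source):
    """Stand-in for a real formatter: only puts spaces around '=='."""
    return source.replace('==', ' == ')


def format_file_sketch(path, formatter, in_place=False):
    """Return (reformatted_code, encoding), or (None, encoding) if in_place."""
    # Detect the encoding from a BOM or coding cookie; defaults to 'utf-8'.
    with open(path, 'rb') as fd:
        encoding, _ = tokenize.detect_encoding(fd.readline)
    with open(path, encoding=encoding) as fd:
        reformatted = formatter(fd.read())
    if in_place:
        # Write back with the same encoding the file was read with.
        with open(path, 'w', encoding=encoding) as fd:
            fd.write(reformatted)
        return None, encoding
    return reformatted, encoding
```

Returning ``None`` for the code when writing in place mirrors the
``(None, 'utf-8')`` result shown above and makes it obvious to callers that the
file itself was changed.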

(Potentially) Frequently Asked Questions
========================================

Why does YAPF destroy my awesome formatting?
--------------------------------------------

YAPF tries very hard to get the formatting correct. But for some code, it won't
be as good as hand-formatting. In particular, large data literals may become
horribly disfigured under YAPF.

There are many reasons for this, but in essence YAPF is simply a tool to help
with development. It will format things to conform to the style guide, but
that may not equate with readability.

To alleviate this, you can mark regions that YAPF should ignore when
reformatting:

.. code-block:: python

    # yapf: disable
    FOO = {
        # ... some very large, complex data literal.
    }

    BAR = [
        # ... another large data literal.
    ]
    # yapf: enable

You can also disable formatting for a single literal like this:

.. code-block:: python

    BAZ = {
        (1, 2, 3, 4),
        (5, 6, 7, 8),
        (9, 10, 11, 12)
    }  # yapf: disable

Why Not Improve Existing Tools?
-------------------------------

We wanted to use clang-format's reformatting algorithm. It's very powerful and
designed to come up with the best formatting possible. Existing tools were
created with different goals in mind, and would require extensive modifications
to convert to using clang-format's algorithm.

Can I Use YAPF In My Program?
-----------------------------

Please do! YAPF was designed to be used as a library as well as a command line
tool. This means that a tool or IDE plugin is free to use YAPF.

Gory Details
============

Algorithm Design
----------------

The main data structure in YAPF is the ``UnwrappedLine`` object. It holds a list
of ``FormatToken``\s that we would want to place on a single line if there were
no column limit. The exception is a comment in the middle of an expression
statement, which forces the line to be formatted on more than one line. The
formatter works on one ``UnwrappedLine`` object at a time.

An ``UnwrappedLine`` typically won't affect the formatting of lines before or
after it. There is a part of the algorithm that may join two or more
``UnwrappedLine``\s into one line. For instance, an if-then statement with a
short body can be placed on a single line:

.. code-block:: python

    if a == 42: continue

YAPF's formatting algorithm creates a weighted tree that acts as the solution
space for the algorithm. Each node in the tree represents the result of a
formatting decision, i.e., whether to split or not to split before a token.
Each formatting decision has a cost associated with it. Therefore, the cost is
realized on the edge between two nodes. (In reality, the weighted tree doesn't
have separate edge objects, so the cost resides on the nodes themselves.)

For example, take the following Python code snippet. For the sake of this
example, assume that line (1) violates the column limit restriction and needs to
be reformatted.

.. code-block:: python

    def xxxxxxxxxxx(aaaaaaaaaaaa, bbbbbbbbb, cccccccc, dddddddd, eeeeee):  # 1
        pass                                                               # 2

For line (1), the algorithm will build a tree where each node (a
``FormatDecisionState`` object) is the state of the line at that token given
the decision to split before the token or not. Note: the ``FormatDecisionState``
objects are copied by value so each node in the graph is unique and a change in
one doesn't affect other nodes.

Heuristics are used to determine the costs of splitting or not splitting.
Because a node holds the state of the tree up to a token's insertion, it can
easily determine if a splitting decision will violate one of the style
requirements. For instance, the heuristic is able to apply an extra penalty to
the edge when not splitting between the previous token and the one being added.

There are some instances where we will never want to split the line, because
doing so will always be detrimental (i.e., it will require a backslash-newline,
which is very rarely desirable). For line (1), we will never want to split the
first three tokens: ``def``, ``xxxxxxxxxxx``, and ``(``. Nor will we want to
split between the ``)`` and the ``:`` at the end. These regions are said to be
"unbreakable." This is reflected in the tree by there not being a "split"
decision (left hand branch) within the unbreakable region.

Now that we have the tree, we determine what the "best" formatting is by finding
the path through the tree with the lowest cost.
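
The search described above can be sketched as a toy lowest-cost path search in
Python. This is not YAPF's code: ``best_layout``, ``render``, ``gap``, the
pre-merged token chunks and every penalty value are assumptions made for
illustration. Each heap entry plays the role of a node, the two pushes are the
"don't split" and "split" decisions, and the ``unbreakable`` set removes the
split branch for the listed tokens.

```python
import heapq


def gap(prev, token):
    """Toy spacing rule: no space after '(' or before ')'."""
    return 0 if prev.endswith('(') or token.startswith(')') else 1


def render(tokens, decisions, indent):
    """Turn a tuple of split decisions back into physical lines."""
    lines = [tokens[0]]
    for i, split in enumerate(decisions, start=1):
        if split:
            lines.append(' ' * indent + tokens[i])
        else:
            lines[-1] += ' ' * gap(tokens[i - 1], tokens[i]) + tokens[i]
    return lines


def best_layout(tokens, column_limit, unbreakable=frozenset(), indent=4,
                split_penalty=10, overflow_penalty=100):
    """Return (cost, lines) for the cheapest sequence of split decisions."""
    # A node is (cost so far, index of next token, current column, decisions).
    heap = [(0, 1, len(tokens[0]), ())]
    seen = set()
    while heap:
        cost, i, column, decisions = heapq.heappop(heap)
        if i == len(tokens):
            return cost, render(tokens, decisions, indent)
        if (i, column) in seen:
            continue  # a cheaper path already reached this state
        seen.add((i, column))
        token = tokens[i]
        # Decision 1: do not split before the token.
        col = column + gap(tokens[i - 1], token) + len(token)
        extra = overflow_penalty * max(0, col - column_limit)
        heapq.heappush(heap, (cost + extra, i + 1, col, decisions + (False,)))
        # Decision 2: split before the token, unless the region is unbreakable.
        if i not in unbreakable:
            col = indent + len(token)
            extra = split_penalty + overflow_penalty * max(0, col - column_limit)
            heapq.heappush(heap, (cost + extra, i + 1, col, decisions + (True,)))


tokens = ['def f(', 'aaaa,', 'bbbb,', 'cccc', '):']
cost, lines = best_layout(tokens, column_limit=20, unbreakable={4})
print(cost)
print('\n'.join(lines))
```

With a column limit of 20 the unsplit line is too long, so the cheapest path
pays for exactly one split and never separates ``cccc`` from the closing
``):``. Because all costs are non-negative, the first complete layout popped
from the heap is a lowest-cost one.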
360
361And that's it!