====
YAPF
====

.. image:: https://badge.fury.io/py/yapf.svg
    :target: http://badge.fury.io/py/yapf
    :alt: PyPI version

.. image:: https://travis-ci.org/google/yapf.svg?branch=master
    :target: https://travis-ci.org/google/yapf
    :alt: Build status

.. image:: https://coveralls.io/repos/google/yapf/badge.svg?branch=master
    :target: https://coveralls.io/r/google/yapf?branch=master
    :alt: Coverage status

Introduction
============

Most of the current formatters for Python (e.g., autopep8 and pep8ify) are
made to remove lint errors from code. This has some obvious limitations. For
instance, code that already conforms to the PEP 8 guidelines is left
untouched, but that doesn't mean the code looks good.

YAPF takes a different approach. It's based on 'clang-format', developed by
Daniel Jasper. In essence, the algorithm takes the code and reformats it to the
best formatting that conforms to the style guide, even if the original code
didn't violate the style guide. The idea is also similar to the 'gofmt' tool for
the Go programming language: end all holy wars about formatting. If the whole
code base of a project is simply piped through YAPF whenever modifications are
made, the style remains consistent throughout the project and there's no point
arguing about style in every code review.

The ultimate goal is that the code YAPF produces is as good as the code that a
programmer would write if they were following the style guide. It takes away
some of the drudgery of maintaining your code.

.. footer::

    YAPF is not an official Google product (experimental or otherwise), it is
    just code that happens to be owned by Google.

.. contents::

Installation
============

To install YAPF from PyPI::

    $ pip install yapf

YAPF is still considered to be in the "alpha" stage, and the released version
may change often; therefore, the best way to keep up-to-date with the latest
development is to clone this repository.

Note that if you intend to use YAPF as a command-line tool rather than as a
library, installation is not necessary. YAPF supports being run as a directory
by the Python interpreter. If you cloned/unzipped YAPF into ``DIR``, it's
possible to run::

    $ PYTHONPATH=DIR python DIR/yapf [options] ...

Python versions
===============

YAPF supports Python 2.7 and 3.4.1+.

YAPF requires the code it formats to be valid Python for the version YAPF itself
runs under. Therefore, if you format Python 3 code with YAPF, run YAPF itself
under Python 3 (and similarly for Python 2).
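
The version requirement can be checked before any formatting is attempted. The
following is a minimal sketch and not part of YAPF: the helper name
``is_valid_for_this_interpreter`` is made up for illustration. It uses only the
standard-library ``ast`` module to report whether a piece of source parses
under the interpreter that is currently running.

.. code-block:: python

    import ast


    def is_valid_for_this_interpreter(source):
        """Return True if `source` parses under the running Python version.

        Hypothetical helper for illustration; YAPF does not provide it.
        """
        try:
            ast.parse(source)
        except SyntaxError:
            return False
        return True


    # Valid under both Python 2 and Python 3.
    print(is_valid_for_this_interpreter("x = [1, 2, 3]\n"))

    # A Python 2 print statement is only accepted by a Python 2 interpreter,
    # so the result of this call depends on which interpreter runs it.
    print(is_valid_for_this_interpreter("print 'hello'\n"))

If the check fails, running the formatter under the other major Python version
is the likely remedy.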

Usage
=====

Options::

    usage: yapf [-h] [--version] [--style-help] [--style STYLE] [--verify]
                [-d | -i] [-l START-END | -r]
                [files [files ...]]

    Formatter for Python code.

    positional arguments:
      files

    optional arguments:
      -h, --help            show this help message and exit
      --version             show version number and exit
      --style-help          show style settings and exit
      --style STYLE         specify formatting style: either a style name (for
                            example "pep8" or "google"), or the name of a file
                            with style settings. The default is pep8 unless a
                            .style.yapf file located in one of the parent
                            directories of the source file (or current directory
                            for stdin)
      --verify              try to verify reformatted code for syntax errors
      -d, --diff            print the diff for the fixed source
      -i, --in-place        make changes to files in place
      -l START-END, --lines START-END
                            range of lines to reformat, one-based
      -r, --recursive       run recursively over directories

Formatting style
================

The formatting style used by YAPF is configurable and there are many "knobs"
that can be used to tune how YAPF does formatting. See the ``style.py`` module
for the full list.

To control the style, run YAPF with the ``--style`` argument. It accepts one of
the predefined styles (e.g., ``pep8`` or ``google``), a path to a configuration
file that specifies the desired style, or a dictionary of key/value pairs.

The config file is a simple listing of (case-insensitive) ``key = value`` pairs
with a ``[style]`` heading. For example::

    [style]
    based_on_style = pep8
    spaces_before_comment = 4
    split_before_logical_operator = true

The ``based_on_style`` setting determines which of the predefined styles this
custom style is based on (think of it like subclassing).
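
To illustrate the file format, here is a minimal sketch that reads such a
``[style]`` section with the standard-library ``configparser`` module. This is
an assumption about how a file like this could be consumed, not a description
of YAPF's own loader; ``read_style_section`` and ``STYLE_FILE_TEXT`` are
hypothetical names.

.. code-block:: python

    import configparser

    # Same settings as the example above; one key is upper-cased on purpose.
    STYLE_FILE_TEXT = """\
    [style]
    based_on_style = pep8
    SPACES_BEFORE_COMMENT = 4
    split_before_logical_operator = true
    """


    def read_style_section(text):
        """Parse the ``[style]`` section into a dict of string values."""
        parser = configparser.ConfigParser()
        parser.read_string(text)
        # configparser lower-cases option names by default, which matches
        # the case-insensitive keys of the style file.
        return dict(parser["style"])


    settings = read_style_section(STYLE_FILE_TEXT)
    print(settings["based_on_style"])  # pep8
    print(int(settings["spaces_before_comment"]))  # 4

All values come back as strings, so numeric and boolean settings still need to
be converted by whatever consumes them.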

It's also possible to do the same on the command line with a dictionary. For
example::

    --style='{based_on_style: google, indent_width: 4}'

This will take the ``google`` base style and modify it to use four-space
indentation.

Example
=======

As an example of the type of formatting that YAPF can do, it will take this
ugly code:

.. code-block:: python

    x = { 'a':37,'b':42,

    'c':927}

    y = 'hello ''world'
    z = 'hello '+'world'
    a = 'hello {}'.format('world')
    class foo ( object ):
      def f (self ):
        return 37*-+2
      def g(self, x,y=42):
          return y
    def f ( a ) :
      return 37+-+a[42-x : y**3]

and reformat it into:

.. code-block:: python

    x = {'a': 37, 'b': 42, 'c': 927}

    y = 'hello ' 'world'
    z = 'hello ' + 'world'
    a = 'hello {}'.format('world')


    class foo(object):
        def f(self):
            return 37 * -+2

        def g(self, x, y=42):
            return y


    def f(a):
        return 37 + -+a[42 - x:y**3]

Example as a module
===================

The two main APIs for calling YAPF are ``FormatCode`` and ``FormatFile``. They
share several arguments, which are described below:

.. code-block:: python

    >>> from yapf.yapf_api import FormatCode  # reformat a string of code

    >>> FormatCode("f ( a = 1, b = 2 )")
    'f(a=1, b=2)\n'

A ``style_config`` argument: Either a style name or a path to a file that contains
formatting style settings. If None is specified, use the default style
as set in ``style.DEFAULT_STYLE_FACTORY``.

.. code-block:: python

    >>> FormatCode("def g():\n  return True", style_config='pep8')
    'def g():\n    return True\n'

A ``lines`` argument: A list of tuples of lines (ints), [start, end],
that we want to format. The lines are 1-based indexed. It can be used by
third-party code (e.g., IDEs) when reformatting a snippet of code rather
than a whole file.

.. code-block:: python

    >>> FormatCode("def g( ):\n    a=1\n    b = 2\n    return a==b", lines=[(1, 1), (2, 3)])
    'def g():\n    a = 1\n    b = 2\n    return a==b\n'

A ``print_diff`` (bool): Instead of returning the reformatted source, return a
diff that turns the original source into the reformatted source.

.. code-block:: python

    >>> print(FormatCode("a==b", filename="foo.py", print_diff=True))
    --- foo.py (original)
    +++ foo.py (reformatted)
    @@ -1 +1 @@
    -a==b
    +a == b

Note: the ``filename`` argument for ``FormatCode`` is what is inserted into
the diff; the default is ``<unknown>``.
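
A diff of this shape can be produced with the standard-library ``difflib``
module. The sketch below is an illustration only and does not claim to be how
YAPF builds its diff; ``make_diff`` is a hypothetical helper, and the
``(original)`` and ``(reformatted)`` labels are passed through the date fields
of ``difflib.unified_diff``.

.. code-block:: python

    import difflib


    def make_diff(before, after, filename="<unknown>"):
        """Return a unified diff that turns `before` into `after`."""
        diff = difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile=filename,
            tofile=filename,
            # difflib separates the file name and this label with a tab.
            fromfiledate="(original)",
            tofiledate="(reformatted)",
            lineterm="")
        return "\n".join(diff) + "\n"


    print(make_diff("a==b\n", "a == b\n", filename="foo.py"))

The printed result has the same header, hunk marker, and ``-``/``+`` lines as
the ``print_diff`` output shown above.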

``FormatFile`` returns reformatted code from the passed file along with its encoding:

.. code-block:: python

    >>> from yapf.yapf_api import FormatFile  # reformat a file

    >>> print(open("foo.py").read())  # contents of file
    a==b

    >>> FormatFile("foo.py")
    ('a == b\n', 'utf-8')

The ``in_place`` argument saves the reformatted code back to the file:

.. code-block:: python

    >>> FormatFile("foo.py", in_place=True)
    (None, 'utf-8')

    >>> print(open("foo.py").read())  # contents of file (now fixed)
    a == b

(Potentially) Frequently Asked Questions
========================================

Why does YAPF destroy my awesome formatting?
--------------------------------------------

YAPF tries very hard to get the formatting correct. But for some code, it won't
be as good as hand-formatting. In particular, large data literals may become
horribly disfigured under YAPF.

There are several reasons for this. In essence, YAPF is simply a tool to help
with development. It will format things to conform to the style guide, but
that may not equate with readability.

To alleviate this, you can mark regions that YAPF should ignore when
reformatting:

.. code-block:: python

    # yapf: disable
    FOO = {
        # ... some very large, complex data literal.
    }

    BAR = [
        # ... another large data literal.
    ]
    # yapf: enable

You can also disable formatting for a single literal like this:

.. code-block:: python

    BAZ = {
        (1, 2, 3, 4),
        (5, 6, 7, 8),
        (9, 10, 11, 12),
    }  # yapf: disable
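
For tools that want to know which lines such markers cover, the sketch below
scans a source string for the region form of the markers. It is a hypothetical
illustration and not YAPF's implementation: ``disabled_line_ranges`` is a
made-up name, only stand-alone ``# yapf: disable`` / ``# yapf: enable`` comment
lines are recognized, and the trailing-comment form used for a single literal
is deliberately ignored.

.. code-block:: python

    import io
    import tokenize


    def disabled_line_ranges(source):
        """Return 1-based (start, end) line ranges between the markers."""
        ranges = []
        start = None
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            if tok.type != tokenize.COMMENT:
                continue
            # Skip comments that trail code on the same physical line.
            if not tok.line.lstrip().startswith("#"):
                continue
            text = tok.string.lstrip("#").strip().lower()
            if text == "yapf: disable" and start is None:
                start = tok.start[0]
            elif text == "yapf: enable" and start is not None:
                ranges.append((start, tok.start[0]))
                start = None
        if start is not None:
            # Assumption: an unmatched "disable" runs to the end of the file.
            ranges.append((start, len(source.splitlines())))
        return ranges


    EXAMPLE = """\
    A = [1,2]
    # yapf: disable
    FOO = {
        'a':   1,
    }
    # yapf: enable
    B = [3,4]
    """

    print(disabled_line_ranges(EXAMPLE))  # [(2, 6)]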

To preserve the nice dedented closing brackets, use the
``dedent_closing_brackets`` setting in your style. Note that in this case all
brackets, including function definitions and calls, are going to use
that style. This provides consistency across the formatted codebase.

Why Not Improve Existing Tools?
-------------------------------

We wanted to use clang-format's reformatting algorithm. It's very powerful and
designed to come up with the best formatting possible. Existing tools were
created with different goals in mind, and would require extensive modifications
to convert to using clang-format's algorithm.

Can I Use YAPF In My Program?
-----------------------------

Please do! YAPF was designed to be used as a library as well as a command line
tool. This means that a tool or IDE plugin is free to use YAPF.

Gory Details
============

Algorithm Design
----------------

The main data structure in YAPF is the ``UnwrappedLine`` object. It holds a list
of ``FormatToken``\s that we would want to place on a single line if there were
no column limit. An exception is a comment in the middle of an expression
statement, which will force the line to be formatted on more than one line. The
formatter works on one ``UnwrappedLine`` object at a time.

An ``UnwrappedLine`` typically won't affect the formatting of lines before or
after it. There is a part of the algorithm that may join two or more
``UnwrappedLine``\s into one line. For instance, an if-then statement with a
short body can be placed on a single line:

.. code-block:: python

    if a == 42: continue

YAPF's formatting algorithm creates a weighted tree that acts as the solution
space for the algorithm. Each node in the tree represents the result of a
formatting decision (i.e., whether to split or not to split before a token).
Each formatting decision has a cost associated with it. Therefore, the cost is
realized on the edge between two nodes. (In reality, the weighted tree doesn't
have separate edge objects, so the cost resides on the nodes themselves.)

For example, take the following Python code snippet. For the sake of this
example, assume that line (1) violates the column limit restriction and needs to
be reformatted.

.. code-block:: python

    def xxxxxxxxxxx(aaaaaaaaaaaa, bbbbbbbbb, cccccccc, dddddddd, eeeeee):  # 1
        pass  # 2

For line (1), the algorithm will build a tree where each node (a
``FormatDecisionState`` object) is the state of the line at that token given
the decision to split before the token or not. Note: the ``FormatDecisionState``
objects are copied by value so each node in the graph is unique and a change in
one doesn't affect other nodes.

Heuristics are used to determine the costs of splitting or not splitting.
Because a node holds the state of the tree up to a token's insertion, it can
easily determine if a splitting decision will violate one of the style
requirements. For instance, the heuristic is able to apply an extra penalty to
the edge when not splitting between the previous token and the one being added.

There are some instances where we will never want to split the line, because
doing so will always be detrimental (i.e., it will require a backslash-newline,
which is very rarely desirable). For line (1), we will never want to split the
first three tokens: ``def``, ``xxxxxxxxxxx``, and ``(``. Nor will we want to
split between the ``)`` and the ``:`` at the end. These regions are said to be
"unbreakable." This is reflected in the tree by there not being a "split"
decision (left hand branch) within the unbreakable region.

Now that we have the tree, we determine what the "best" formatting is by finding
the path through the tree with the lowest cost.
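
To make the search concrete, here is a toy sketch of finding the lowest-cost
sequence of split decisions. It is not YAPF's implementation. The function
names, the penalty values, and the simplified model (tokens joined by a single
space, a fixed indent after every split, no unbreakable regions) are all
assumptions made for illustration. The decisions are explored with a
uniform-cost (Dijkstra-style) search over a priority queue, which is one common
way to find a lowest-cost path.

.. code-block:: python

    import heapq


    def best_layout(tokens, column_limit, indent=4, split_penalty=10,
                    excess_penalty=1000):
        """Return (cost, decisions); decisions[i] is True for a split
        before tokens[i + 1]. All penalties are made-up values."""
        first_width = len(tokens[0])
        start_cost = excess_penalty * max(0, first_width - column_limit)
        # Each entry is one node: (cost so far, next token index,
        # current column, decisions taken to get here).
        heap = [(start_cost, 1, first_width, ())]
        seen = set()
        while heap:
            cost, index, column, decisions = heapq.heappop(heap)
            if index == len(tokens):
                return cost, decisions  # cheapest complete path
            if (index, column) in seen:
                continue  # already reached this state more cheaply
            seen.add((index, column))
            width = len(tokens[index])
            # Decision 1: do not split; the token follows one space.
            end = column + 1 + width
            over = max(0, end - column_limit)
            heapq.heappush(heap, (cost + excess_penalty * over, index + 1,
                                  end, decisions + (False,)))
            # Decision 2: split; the token starts a new indented line.
            end = indent + width
            over = max(0, end - column_limit)
            heapq.heappush(heap, (cost + split_penalty
                                  + excess_penalty * over, index + 1, end,
                                  decisions + (True,)))
        raise ValueError("tokens must not be empty")


    def render(tokens, decisions, indent=4):
        """Apply the chosen decisions and return the formatted text."""
        lines = [tokens[0]]
        for token, split in zip(tokens[1:], decisions):
            if split:
                lines.append(" " * indent + token)
            else:
                lines[-1] += " " + token
        return "\n".join(lines)


    TOKENS = ["aaaa", "bbbb", "cccc", "dddd"]
    cost, decisions = best_layout(TOKENS, column_limit=10)
    print(cost, decisions)  # 20 (False, True, True)
    print(render(TOKENS, decisions))

With a column limit of 10, keeping everything on one line or splitting only
once would overflow, so the cheapest path keeps ``aaaa bbbb`` together and pays
the split penalty twice. A state here is just the token index and the current
column; the description above indicates that a real node carries the full state
of the line up to that token.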

And that's it!