====
YAPF
====

.. image:: https://travis-ci.org/google/yapf.svg?branch=master
    :target: https://travis-ci.org/google/yapf
    :alt: Build status

Introduction
============

Most of the current formatters for Python, such as autopep8 and pep8ify, are
made to remove lint errors from code. This has some obvious limitations. For
instance, code that already conforms to the PEP 8 guidelines will not be
reformatted, but that doesn't mean the code looks good.

YAPF takes a different approach. It's based on 'clang-format', developed by
Daniel Jasper. In essence, the algorithm takes the code and reformats it to the
best formatting that conforms to the style guide, even if the original code
didn't violate the style guide.

The ultimate goal is that the code YAPF produces is as good as the code that a
programmer would write if they were following the style guide.

.. note::

    YAPF is not an official Google product (experimental or otherwise); it is
    just code that happens to be owned by Google.

.. contents::

Installation
============

YAPF supports Python 2.7 and 3.4+.

To install YAPF from the source directory::

    $ sudo python ./setup.py install

Usage
=====

Options::

    usage: yapf [-h] [--style STYLE] [-d | -i] [-l START-END | -r] ...

    Formatter for Python code.

    positional arguments:
      files

    optional arguments:
      -h, --help            show this help message and exit
      --style STYLE         specify formatting style: either a style name (for
                            example "pep8" or "google"), or the name of a file
                            with style settings
      -d, --diff            print the diff for the fixed source
      -i, --in-place        make changes to files in place
      -l START-END, --lines START-END
                            range of lines to reformat, one-based
      -r, --recursive       run recursively over directories

Note: after reformatting a chunk of code, YAPF verifies that the result is
still correct (i.e., that it can be parsed by Python itself). This means that
if you're reformatting Python 3 code, it's best to run YAPF itself under
Python 3, and likewise for Python 2.

Formatting style
----------------

The formatting style used by YAPF is configurable, and there are many "knobs"
that can be used to tune how YAPF does formatting. See the ``style.py`` module
for the full list.

To control the style, run YAPF with the ``--style`` argument. It accepts either
one of the predefined styles (currently "pep8" or "google"), or a path to a
configuration file that specifies the desired style. The file is a simple
listing of (case-insensitive) ``key = value`` pairs under a ``[style]``
heading. For example::

    [style]
    based_on_style = pep8
    spaces_before_comment = 4
    split_before_logical_operator = true

The ``based_on_style`` setting determines which of the predefined styles this
custom style is based on (think of it like subclassing).

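As a rough illustration of the file format described above, the sketch below
reads a ``[style]`` section with Python's standard ``configparser`` module and
layers the custom keys over a base style. This is not YAPF's own loading code:
the ``PREDEFINED`` table, the values in it, and the ``load_style`` helper are
hypothetical and exist only for this example.

.. code-block:: python

    import configparser

    # Hypothetical stand-ins for the predefined styles. The values are
    # placeholders for this example, not YAPF's real defaults.
    PREDEFINED = {
        "pep8": {
            "spaces_before_comment": "2",
            "split_before_logical_operator": "false",
            "column_limit": "79",
        },
        "google": {
            "spaces_before_comment": "2",
            "split_before_logical_operator": "false",
            "column_limit": "80",
        },
    }


    def load_style(text):
        """Return the base style's settings with the custom keys on top."""
        parser = configparser.ConfigParser()
        parser.read_string(text)
        # configparser lower-cases option names, so keys are case-insensitive.
        custom = dict(parser["style"])
        base_name = custom.pop("based_on_style")
        # Start from a copy of the base style, much like a subclass inherits.
        style = dict(PREDEFINED[base_name.lower()])
        style.update(custom)
        return style


    CONFIG = "\n".join([
        "[style]",
        "based_on_style = pep8",
        "SPACES_BEFORE_COMMENT = 4",
        "split_before_logical_operator = true",
    ])

    style = load_style(CONFIG)
    print(style)

Keys that the file does not mention keep the value from the base style, which
is the "subclassing" behaviour described above.
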
Why Not Improve Existing Tools?
===============================

We wanted to use clang-format's reformatting algorithm. It's very powerful and
designed to come up with the best formatting possible. Existing tools were
created with different goals in mind, and would require extensive modifications
to convert to using clang-format's algorithm.


Can I Use YAPF In My Program?
=============================

Please do! YAPF was designed to be used as a library as well as a command line
tool. This means that a tool or IDE plugin is free to use YAPF.


Gory Details
============

Algorithm Design
----------------

The main data structure in YAPF is the ``UnwrappedLine`` object. It holds a list
of ``FormatToken``\s that we would want to place on a single line if there were
no column limit. The exception is a comment in the middle of an expression
statement, which forces the line to be formatted on more than one line. The
formatter works on one ``UnwrappedLine`` object at a time.

An ``UnwrappedLine`` typically won't affect the formatting of lines before or
after it. There is a part of the algorithm that may join two or more
``UnwrappedLine``\s into one line. For instance, an ``if`` statement with a
short body can be placed on a single line::

    if a == 42: continue

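To make the data structure concrete, here is a deliberately simplified sketch.
The class names mirror the text, but the fields, the ``can_join`` rule, and the
default column limit of 79 are assumptions made for illustration; they are not
YAPF's actual implementation.

.. code-block:: python

    from dataclasses import dataclass, field
    from typing import List


    @dataclass
    class FormatToken:
        value: str
        spaces_before: int = 0  # assumed field: spaces when not split


    @dataclass
    class UnwrappedLine:
        depth: int  # assumed field: indentation level of the statement
        tokens: List[FormatToken] = field(default_factory=list)

        def text(self):
            """Render the tokens as if there were no column limit."""
            return "".join(" " * t.spaces_before + t.value for t in self.tokens)


    def can_join(header, body, column_limit=79):
        """Assumed rule: a body one level deeper may join if the result fits."""
        joined = header.text() + " " + body.text()
        return body.depth == header.depth + 1 and len(joined) <= column_limit


    header = UnwrappedLine(0, [
        FormatToken("if"),
        FormatToken("a", 1),
        FormatToken("==", 1),
        FormatToken("42", 1),
        FormatToken(":"),
    ])
    body = UnwrappedLine(1, [FormatToken("continue")])

    if can_join(header, body):
        print(header.text() + " " + body.text())
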
YAPF's formatting algorithm creates a weighted tree that acts as the solution
space for the algorithm. Each node in the tree represents the result of a
formatting decision (i.e., whether to split or not to split before a token).
Each formatting decision has a cost associated with it. Therefore, the cost is
realized on the edge between two nodes. (In reality, the weighted tree doesn't
have separate edge objects, so the cost resides on the nodes themselves.)

For example, take the following Python code snippet. For the sake of this
example, assume that line (1) violates the column limit restriction and needs to
be reformatted.

.. code-block:: python

    def xxxxxxxxxxx(aaaaaaaaaaaa, bbbbbbbbb, cccccccc, dddddddd, eeeeee):  # 1
        pass  # 2

For line (1), the algorithm will build a tree where each node (a
``FormatDecisionState`` object) is the state of the line at that token given
the decision to split before the token or not. Note: the ``FormatDecisionState``
objects are copied by value, so each node in the tree is unique and a change in
one doesn't affect other nodes.

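The copy-by-value behaviour can be sketched as follows. ``DecisionState`` is a
hypothetical stand-in for ``FormatDecisionState``; its fields, its
``add_token`` method, and the four-space continuation indent are assumptions
for this example only.

.. code-block:: python

    import copy


    class DecisionState:
        """Hypothetical stand-in for a node in the decision tree."""

        def __init__(self, indent=4):
            self.lines = [""]  # text laid out so far
            self.indent = indent  # assumed continuation indent

        def add_token(self, token, split):
            """Return a new child state; the parent is left untouched."""
            child = copy.deepcopy(self)
            if split:
                child.lines.append(" " * self.indent + token)
            else:
                child.lines[-1] += token
            return child


    root = DecisionState().add_token("foo(", split=False)
    no_split = root.add_token("a,", split=False)
    with_split = root.add_token("a,", split=True)

    print(root.lines)
    print(no_split.lines)
    print(with_split.lines)

Because every child starts from its own copy, the two branches below ``root``
can be explored independently.
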
Heuristics are used to determine the costs of splitting or not splitting.
Because a node holds the state of the tree up to a token's insertion, it can
easily determine if a splitting decision will violate one of the style
requirements. For instance, the heuristic is able to apply an extra penalty to
the edge when not splitting between the previous token and the one being added.

There are some instances where we will never want to split the line, because
doing so will always be detrimental (i.e., it will require a backslash-newline,
which is very rarely desirable). For line (1), we will never want to split the
first three tokens: ``def``, ``xxxxxxxxxxx``, and ``(``. Nor will we want to
split between the ``)`` and the ``:`` at the end. These regions are said to be
"unbreakable." This is reflected in the tree by there not being a "split"
decision (left hand branch) within the unbreakable region.

Now that we have the tree, we determine what the "best" formatting is by finding
the path through the tree with the lowest cost.

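The whole idea can be tried out on line (1) with a toy search. Everything
numeric here is an assumption: the column limit of 60 (picked so that the line
is too long), the penalties, and the continuation indent. Commas are also
glued to the preceding argument to keep the token list short. The sketch walks
every branch of the tree, which is only meant to illustrate the cost model.

.. code-block:: python

    COLUMN_LIMIT = 60  # assumed, so that the example line is too long
    CONTINUATION_INDENT = 4  # assumed
    SPLIT_PENALTY = 10  # assumed cost of each line break
    EXCESS_PENALTY = 100  # assumed cost per character past the limit

    # (text, spaces before the token, may we split before it?)
    # The first three tokens and the final ":" are "unbreakable".
    TOKENS = [
        ("def", 0, False),
        ("xxxxxxxxxxx", 1, False),
        ("(", 0, False),
        ("aaaaaaaaaaaa,", 0, True),
        ("bbbbbbbbb,", 1, True),
        ("cccccccc,", 1, True),
        ("dddddddd,", 1, True),
        ("eeeeee", 1, True),
        (")", 0, True),
        (":", 0, False),
    ]


    def line_cost(line):
        """Penalty for the characters of a finished line past the limit."""
        return EXCESS_PENALTY * max(0, len(line) - COLUMN_LIMIT)


    def search(index, lines):
        """Return (cost, lines) for the cheapest layout of TOKENS[index:]."""
        if index == len(TOKENS):
            return line_cost(lines[-1]), lines
        text, spaces, can_split = TOKENS[index]
        # Branch 1: keep the token on the current line (no edge cost).
        branches = [(0, lines[:-1] + [lines[-1] + " " * spaces + text])]
        if can_split:
            # Branch 2: split before the token; the line being closed is
            # charged for any overflow, plus the split penalty.
            edge = SPLIT_PENALTY + line_cost(lines[-1])
            branches.append((edge, lines + [" " * CONTINUATION_INDENT + text]))
        best = None
        for edge_cost, new_lines in branches:
            rest_cost, final = search(index + 1, new_lines)
            total = edge_cost + rest_cost
            if best is None or total < best[0]:
                best = (total, final)
        return best


    cost, layout = search(0, [""])
    print(cost)
    print("\n".join(layout))

Each call builds fresh lists for its children, so the branches never share
state. With only six splittable positions the tree has 64 leaves; an
exhaustive walk like this would not scale to long lines and is not a
description of how YAPF itself searches the tree.
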
And that's it!