.. _pyporting-howto:

*********************************
Porting Python 2 Code to Python 3
*********************************

:author: Brett Cannon

.. topic:: Abstract

   With Python 3 being the future of Python while Python 2 is still in active
   use, it is good to have your project available for both major releases of
   Python. This guide is meant to help you figure out how best to support both
   Python 2 & 3 simultaneously.

   If you are looking to port an extension module instead of pure Python code,
   please see :ref:`cporting-howto`.

   If you would like to read one core Python developer's take on why Python 3
   came into existence, you can read Nick Coghlan's `Python 3 Q & A`_.

   For help with porting, you can email the python-porting_ mailing list with
   questions.

The Short Explanation
=====================

To make your project be single-source Python 2/3 compatible, the basic steps
are:

#. Only worry about supporting Python 2.7
#. Make sure you have good test coverage (coverage.py_ can help;
   ``pip install coverage``)
#. Learn the differences between Python 2 & 3
#. Use Modernize_ or Futurize_ to update your code (``pip install modernize`` or
   ``pip install future``, respectively)
#. Use Pylint_ to help make sure you don't regress on your Python 3 support
   (``pip install pylint``)
#. Use caniusepython3_ to find out which of your dependencies are blocking your
   use of Python 3 (``pip install caniusepython3``)
#. Once your dependencies are no longer blocking you, use continuous integration
   to make sure you stay compatible with Python 2 & 3 (tox_ can help test
   against multiple versions of Python; ``pip install tox``)

If you are dropping support for Python 2 entirely, then after you learn the
differences between Python 2 & 3 you can run 2to3_ over your code and skip the
rest of the steps outlined above.


Details
=======

A key point about supporting Python 2 & 3 simultaneously is that you can start
**today**! Even if your dependencies are not supporting Python 3 yet that does
not mean you can't modernize your code **now** to support Python 3. Most changes
required to support Python 3 lead to cleaner code using newer practices even in
Python 2.

Another key point is that modernizing your Python 2 code to also support
Python 3 is largely automated for you. While you might have to make some API
decisions thanks to Python 3 clarifying text data versus binary data, the
lower-level work is now mostly done for you, and thus you can at least benefit
from the automated changes immediately.

Keep those key points in mind while you read on about the details of porting
your code to support Python 2 & 3 simultaneously.


Drop support for Python 2.6 and older
-------------------------------------

While you can make Python 2.5 work with Python 3, it is **much** easier if you
only have to work with Python 2.7. If dropping Python 2.5 is not an
option then the six_ project can help you support Python 2.5 & 3 simultaneously
(``pip install six``). Do realize, though, that nearly all the projects listed
in this HOWTO will not be available to you.

If you are able to skip Python 2.5 and older, then the required changes
to your code should continue to look and feel like idiomatic Python code. At
worst you will have to use a function instead of a method in some instances or
have to import a function instead of using a built-in one, but otherwise the
overall transformation should not feel foreign to you.

But you should aim for only supporting Python 2.7. Python 2.6 is no longer
supported and thus is not receiving bugfixes. This means **you** will have to
work around any issues you come across with Python 2.6. There are also some
tools mentioned in this HOWTO which do not support Python 2.6 (e.g., Pylint_),
and this will become more commonplace as time goes on. It will simply be easier
for you if you only support the versions of Python that you have to support.

Make sure you specify the proper version support in your ``setup.py`` file
--------------------------------------------------------------------------

In your ``setup.py`` file you should have the proper `trove classifier`_
specifying what versions of Python you support. As your project does not support
Python 3 yet you should at least have
``Programming Language :: Python :: 2 :: Only`` specified. Ideally you should
also specify each major/minor version of Python that you do support, e.g.
``Programming Language :: Python :: 2.7``.
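
As a minimal sketch, the classifiers for a project that currently supports only
Python 2.7 might look like the following. The ``CLASSIFIERS`` name is an
assumption made for illustration; in a real ``setup.py`` the list would be
passed to ``setup()`` through its ``classifiers`` argument::

    # Hypothetical classifier list for a project that runs only on Python 2.7.
    # In a real setup.py this would be passed as setup(classifiers=CLASSIFIERS).
    CLASSIFIERS = [
        'Programming Language :: Python :: 2',
        'Programming Language :: Python :: 2 :: Only',
        'Programming Language :: Python :: 2.7',
    ]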

Have good test coverage
-----------------------

Once you have your code supporting the oldest version of Python 2 you want it
to, you will want to make sure your test suite has good coverage. A good rule of
thumb is that you want to be confident enough in your test suite that any
failures that appear after having tools rewrite your code are actual bugs in the
tools and not in your code. If you want a number to aim for, try to get over 80%
coverage (and don't feel bad if you can't easily get past 90%). If you
don't already have a tool to measure test coverage then coverage.py_ is
recommended.

Learn the differences between Python 2 & 3
-------------------------------------------

Once you have your code well-tested you are ready to begin porting your code to
Python 3! But to fully understand how your code is going to change and what
you want to look out for while you code, you will want to learn what changes
Python 3 makes relative to Python 2. Typically the two best ways of doing that
are reading the `"What's New"`_ doc for each release of Python 3 and the
`Porting to Python 3`_ book (which is free online). There is also a handy
`cheat sheet`_ from the Python-Future project.


Update your code
----------------

Once you feel like you know what is different in Python 3 compared to Python 2,
it's time to update your code! You have a choice between two tools in porting
your code automatically: Modernize_ and Futurize_. Which tool you choose will
depend on how much like Python 3 you want your code to be. Futurize_ does its
best to make Python 3 idioms and practices exist in Python 2, e.g. backporting
the ``bytes`` type from Python 3 so that you have semantic parity between the
major versions of Python. Modernize_, on the other hand, is more conservative
and targets a Python 2/3 subset of Python, relying on six_ to help provide
compatibility.

Regardless of which tool you choose, they will update your code to run under
Python 3 while staying compatible with the version of Python 2 you started with.
Depending on how conservative you want to be, you may want to run the tool over
your test suite first and visually inspect the diff to make sure the
transformation is accurate. After you have transformed your test suite and
verified that all the tests still pass as expected, then you can transform your
application code knowing that any test which fails is a translation failure.

Unfortunately the tools can't automate everything to make your code work under
Python 3 and so there are a handful of things you will need to update manually
to get full Python 3 support (which of these steps are necessary varies between
the tools). Read the documentation for the tool you choose to use to see what it
fixes by default and what it can do optionally to know what will (not) be fixed
for you and what you may have to fix on your own (e.g. using ``io.open()`` over
the built-in ``open()`` function is off by default in Modernize). Luckily,
though, there are only a couple of things to watch out for which can be
considered large issues that may be hard to debug if not watched for.

Division
++++++++

In Python 3, ``5 / 2 == 2.5`` and not ``2``; all division between ``int`` values
results in a ``float``. This change has actually been planned since Python 2.2,
which was released in 2001. Since then users have been encouraged to add
``from __future__ import division`` to any and all files which use the ``/`` and
``//`` operators or to run the interpreter with the ``-Q`` flag. If you
have not been doing this then you will need to go through your code and do two
things:

#. Add ``from __future__ import division`` to your files
#. Update any division operator as necessary to either use ``//`` to use floor
   division or continue using ``/`` and expect a float

The reason that ``/`` isn't simply translated to ``//`` automatically is that if
an object defines a ``__truediv__`` method but not ``__floordiv__`` then your
code would begin to fail (e.g. a user-defined class that uses ``/`` to
signify some operation but not ``//`` for the same thing or at all).
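
The two operators, and the reason a blind ``/`` to ``//`` rewrite is unsafe, can
be sketched as follows. The ``Ratio`` class is a hypothetical example invented
for this illustration; it defines ``__truediv__`` but deliberately leaves out
``__floordiv__``::

    from __future__ import division  # must be the first statement in the module


    class Ratio(object):
        """Hypothetical class that supports ``/`` but not ``//``."""

        def __init__(self, value):
            self.value = value

        def __truediv__(self, other):
            return Ratio(self.value / other)


    true_result = 5 / 2      # true division: a float on both Python 2 and 3
    floor_result = 5 // 2    # floor division: stays an int
    halved = Ratio(3) / 2    # dispatches to __truediv__

    try:
        Ratio(3) // 2        # no __floordiv__, so this raises TypeError
        floor_supported = True
    except TypeError:
        floor_supported = False

Had a tool rewritten ``Ratio(3) / 2`` as ``Ratio(3) // 2``, code that used to
work would start raising ``TypeError``.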

Text versus binary data
+++++++++++++++++++++++

In Python 2 you could use the ``str`` type for both text and binary data.
Unfortunately this conflation of two different concepts could lead to brittle
code which sometimes worked for either kind of data, sometimes not. It also
could lead to confusing APIs if people didn't explicitly state that something
that accepted ``str`` accepted either text or binary data instead of one
specific type. This complicated the situation especially for anyone supporting
multiple languages as APIs wouldn't bother explicitly supporting ``unicode``
when they claimed text data support.

To make the distinction between text and binary data clearer and more
pronounced, Python 3 did what most languages created in the age of the internet
have done and made text and binary data distinct types that cannot blindly be
mixed together (Python predates widespread access to the internet). For any code
that only deals with text or only binary data, this separation doesn't pose an
issue. But for code that has to deal with both, it does mean you might have to
now care about when you are using text compared to binary data, which is why
this cannot be entirely automated.

To start, you will need to decide which APIs take text and which take binary
(it is **highly** recommended you don't design APIs that can take both due to
the difficulty of keeping the code working; as stated earlier it is difficult to
do well). In Python 2 this means making sure the APIs that take text can work
with ``unicode`` and those that work with binary data work with the ``bytes``
type from Python 3 (and thus a subset of ``str`` in Python 2, for which
``bytes`` is an alias). Usually the biggest issue is
realizing which methods exist for which types in Python 2 & 3 simultaneously
(for text that's ``unicode`` in Python 2 and ``str`` in Python 3, for binary
that's ``str``/``bytes`` in Python 2 and ``bytes`` in Python 3). The following
table lists the **unique** methods of each data type across Python 2 & 3
(e.g., the ``decode()`` method is usable on the equivalent binary data type in
either Python 2 or 3, but it can't be used by the text data type consistently
between Python 2 and 3 because ``str`` in Python 3 doesn't have the method). Do
note that as of Python 3.5 the ``__mod__`` method was added to the bytes type.

========================  =====================
**Text data**             **Binary data**
------------------------  ---------------------
\                         decode
------------------------  ---------------------
encode
------------------------  ---------------------
format
------------------------  ---------------------
isdecimal
------------------------  ---------------------
isnumeric
========================  =====================

Making the distinction easier to handle can be accomplished by encoding and
decoding between binary data and text at the edge of your code. This means that
when you receive text in binary data, you should immediately decode it. And if
your code needs to send text as binary data then encode it as late as possible.
This allows your code to work with only text internally and thus eliminates
having to keep track of what type of data you are working with.
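
A minimal sketch of this "decode early, encode late" pattern follows. The
function names and the choice of UTF-8 are assumptions made for the example;
use whatever encoding your data actually has::

    def receive_name(raw):
        """Decode incoming bytes to text right at the boundary."""
        return raw.decode('utf-8')


    def send_name(name):
        """Encode text to bytes only when it is about to leave the program."""
        return name.encode('utf-8')


    received = b'caf\xc3\xa9'        # bytes as they might arrive from a socket
    name = receive_name(received)    # text from here on
    shouted = name.upper()           # internal logic only ever sees text
    outgoing = send_name(shouted)    # bytes again, at the last moment

Everything between ``receive_name()`` and ``send_name()`` works on text only, so
the same code behaves identically on Python 2 and 3.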

The next issue is making sure you know whether the string literals in your code
represent text or binary data. At minimum you should add a ``b`` prefix to any
literal that represents binary data. For text you should either use the
``from __future__ import unicode_literals`` statement or add a ``u`` prefix to
the text literal.
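
For instance, explicitly prefixed literals could look like this (the constant
names and values are made up for the illustration)::

    # Binary data: the ``b`` prefix is accepted by Python 2.6+ and Python 3.
    HEADER = b'\x00\x01\x02'

    # Text: the ``u`` prefix is accepted by Python 2 and by Python 3.3+.
    GREETING = u'caf\xe9'

    # Crossing from text to binary is always an explicit encode.
    GREETING_BYTES = GREETING.encode('utf-8')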

As part of this dichotomy you also need to be careful about opening files.
Unless you have been working on Windows, there is a chance you have not always
bothered to add the ``b`` mode when opening a binary file (e.g., ``rb`` for
binary reading). Under Python 3, binary files and text files are clearly
distinct and mutually incompatible; see the :mod:`io` module for details.
Therefore, you **must** make a decision of whether a file will be used for
binary access (allowing binary data to be read and/or written) or text access
(allowing text data to be read and/or written). You should also use :func:`io.open`
for opening files instead of the built-in :func:`open` function as the :mod:`io`
module is consistent from Python 2 to 3 while the built-in :func:`open` function
is not (in Python 3 it's actually :func:`io.open`).
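
The sketch below shows both kinds of access through :func:`io.open`. The
temporary file and the UTF-8 encoding are assumptions chosen only to keep the
example self-contained::

    import io
    import os
    import tempfile

    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, 'example.txt')

    # Text access: give an explicit encoding and read/write text.
    with io.open(path, 'w', encoding='utf-8') as f:
        f.write(u'caf\xe9')

    with io.open(path, 'r', encoding='utf-8') as f:
        text = f.read()

    # Binary access: add ``b`` to the mode and read/write bytes.
    with io.open(path, 'rb') as f:
        data = f.read()

    # Clean up the temporary file and directory.
    os.remove(path)
    os.rmdir(tmpdir)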

The constructors of both ``str`` and ``bytes`` have different semantics for the
same arguments between Python 2 & 3. Passing an integer to ``bytes`` in Python 2
will give you the string representation of the integer: ``bytes(3) == '3'``.
But in Python 3, an integer argument to ``bytes`` will give you a bytes object
as long as the integer specified, filled with null bytes:
``bytes(3) == b'\x00\x00\x00'``. A similar worry is necessary when passing a
bytes object to ``str``. In Python 2 you just get the bytes object back:
``str(b'3') == b'3'``. But in Python 3 you get the string representation of the
bytes object: ``str(b'3') == "b'3'"``.
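
One way to sidestep both pitfalls, sketched here under the assumption that the
data is plain ASCII, is to avoid the constructors and convert explicitly::

    number = 3

    # Format the integer as text first, then encode it to bytes.
    digits = str(number).encode('ascii')

    # Decode bytes explicitly instead of passing them to str().
    text = b'3'.decode('ascii')

Both conversions give the same result on Python 2 and Python 3.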

Finally, the indexing of binary data requires careful handling (slicing does
**not** require any special handling). In Python 2,
``b'123'[1] == b'2'`` while in Python 3 ``b'123'[1] == 50``. Because binary data
is simply a collection of binary numbers, Python 3 returns the integer value for
the byte you index on. But in Python 2 because ``bytes == str``, indexing
returns a one-item slice of bytes. The six_ project has a function
named ``six.indexbytes()`` which will return an integer like in Python 3:
``six.indexbytes(b'123', 1)``.
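
If you would rather not depend on six for this, two standard-library approaches
behave the same on both major versions. Note that the ``bytearray`` variant is
an alternative offered here for illustration, not something the tools mentioned
in this HOWTO generate::

    data = b'123'

    # Slicing returns a bytes object of length one on Python 2 and Python 3.
    second_as_bytes = data[1:2]

    # Indexing a bytearray yields an integer on Python 2 and Python 3.
    second_as_int = bytearray(data)[1]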

To summarize:

#. Decide which of your APIs take text and which take binary data
#. Make sure that your code that works with text also works with ``unicode`` and
   code for binary data works with ``bytes`` in Python 2 (see the table above
   for what methods you cannot use for each type)
#. Mark all binary literals with a ``b`` prefix, use a ``u`` prefix or
   :mod:`__future__` import statement for text literals
#. Decode binary data to text as soon as possible, encode text as binary data as
   late as possible
#. Open files using :func:`io.open` and make sure to specify the ``b`` mode when
   appropriate
#. Be careful when indexing binary data


Use feature detection instead of version detection
++++++++++++++++++++++++++++++++++++++++++++++++++

Inevitably you will have code that has to choose what to do based on what
version of Python is running. The best way to do this is with feature detection
of whether the version of Python you're running under supports what you need.
If for some reason that doesn't work then you should make the version check
against Python 2 and not Python 3. To help explain this, let's look at an
example.

Let's pretend that you need access to a feature of importlib_ that
is available in Python's standard library since Python 3.3 and available for
Python 2 through importlib2_ on PyPI. You might be tempted to write code to
access e.g. the ``importlib.abc`` module by doing the following::

    import sys

    # sys.version_info[0] is the major version as an integer
    # (sys.version is a string, so comparing it to 3 would never match).
    if sys.version_info[0] == 3:
        from importlib import abc
    else:
        from importlib2 import abc

The problem with this code is what happens when Python 4 comes out? It would
be better to treat Python 2 as the exceptional case instead of Python 3 and
assume that future Python versions will be more compatible with Python 3 than
Python 2::

    import sys

    # Anything newer than Python 2 takes the standard-library branch.
    if sys.version_info[0] > 2:
        from importlib import abc
    else:
        from importlib2 import abc

The best solution, though, is to do no version detection at all and instead rely
on feature detection. That avoids any potential issues of getting the version
detection wrong and helps keep you future-compatible::

    try:
        from importlib import abc
    except ImportError:
        from importlib2 import abc


Prevent compatibility regressions
---------------------------------

Once you have fully translated your code to be compatible with Python 3, you
will want to make sure your code doesn't regress and stop working under
Python 3. This is especially true if you have a dependency which is blocking you
from actually running under Python 3 at the moment.

To help with staying compatible, any new modules you create should have
at least the following block of code at the top of it::

    from __future__ import absolute_import
    from __future__ import division
    from __future__ import print_function
    from __future__ import unicode_literals

You can also run Python 2 with the ``-3`` flag to be warned about various
compatibility issues your code triggers during execution. If you turn warnings
into errors with ``-Werror`` then you can make sure that you don't accidentally
miss a warning.


You can also use the Pylint_ project and its ``--py3k`` flag to lint your code
to receive warnings when your code begins to deviate from Python 3
compatibility. This also prevents you from having to run Modernize_ or Futurize_
over your code regularly to catch compatibility regressions. This does require
you only support Python 2.7 and Python 3.4 or newer as that is Pylint's
minimum Python version support.


Check which dependencies block your transition
----------------------------------------------

**After** you have made your code compatible with Python 3 you should begin to
care about whether your dependencies have also been ported. The caniusepython3_
project was created to help you determine which projects
-- directly or indirectly -- are blocking you from supporting Python 3. There
is both a command-line tool and a web interface at
https://caniusepython3.com.

The project also provides code which you can integrate into your test suite so
that you will have a failing test when you no longer have dependencies blocking
you from using Python 3. This allows you to avoid having to manually check your
dependencies and to be notified quickly when you can start running on Python 3.

Update your ``setup.py`` file to denote Python 3 compatibility
--------------------------------------------------------------

Once your code works under Python 3, you should update the classifiers in
your ``setup.py`` to contain ``Programming Language :: Python :: 3`` and to not
specify sole Python 2 support. This will tell
anyone using your code that you support Python 2 **and** 3. Ideally you will
also want to add classifiers for each major/minor version of Python you now
support.

Use continuous integration to stay compatible
---------------------------------------------

Once you are able to fully run under Python 3 you will want to make sure your
code always works under both Python 2 & 3. Probably the best tool for running
your tests under multiple Python interpreters is tox_. You can then integrate
tox with your continuous integration system so that you never accidentally break
Python 2 or 3 support.
Brett Cannon | 8045d97 | 2011-02-03 22:01:54 +0000 | [diff] [blame] | 392 | |
You may also want to use the ``-bb`` flag with the Python 3 interpreter to
trigger an exception when you are comparing bytes to strings or bytes to an int
(the latter is available starting in Python 3.5). By default, comparisons
between differing types simply return ``False``, so a mistake in separating
text from binary data, or in indexing on bytes, can easily go unnoticed. With
this flag such comparisons raise an exception instead, making the mistake much
easier to track down.

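To see the difference the flag makes, the following sketch runs the same
one-line comparison in two fresh Python 3 interpreters, once with default
settings and once with ``-bb``. The helper name ``run_snippet`` and the
``SNIPPET`` string are made up for this example.

```python
import subprocess
import sys

# A comparison that mixes binary and text data, as a porting bug might.
SNIPPET = 'print(b"spam" == "spam")'


def run_snippet(*interpreter_flags):
    """Run SNIPPET in a fresh interpreter started with the given flags."""
    command = [sys.executable] + list(interpreter_flags) + ["-c", SNIPPET]
    return subprocess.run(command, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE, universal_newlines=True)


default = run_snippet()      # comparison silently evaluates to False
strict = run_snippet("-bb")  # comparison raises BytesWarning

print("default output:", default.stdout.strip())
print("-bb exit status:", strict.returncode)
print("-bb raised BytesWarning:", "BytesWarning" in strict.stderr)
```

Without the flag the comparison quietly prints ``False``; with ``-bb`` the
interpreter exits with an error and a ``BytesWarning`` traceback that points at
the offending line.
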
And that's mostly it! At this point your code base is compatible with both
Python 2 and 3 simultaneously. Your testing will also be set up so that you
don't accidentally break Python 2 or 3 compatibility regardless of which version
you typically run your tests under while developing.


Dropping Python 2 support completely
====================================

If you are able to fully drop support for Python 2, then the steps required
to transition to Python 3 simplify greatly.

#. Update your code to only support Python 2.7
#. Make sure you have good test coverage (coverage.py_ can help)
#. Learn the differences between Python 2 & 3
#. Use 2to3_ to rewrite your code to run only under Python 3

After this your code will run under Python 3, but in a form that Python 2 can
no longer execute. You should also update the classifiers in your
``setup.py`` to contain ``Programming Language :: Python :: 3 :: Only``.

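To give a feel for the kind of rewrite involved, here is a small hand-written
illustration of changes that 2to3's ``print``, ``dict``, ``unicode`` and
``except`` fixers typically make. The ``report`` function is invented for this
example, and the Python 2 original is kept in comments because it is not valid
Python 3.

```python
# Python 2 original:
#
#     def report(counts):
#         for name, n in counts.iteritems():
#             print "%s: %d" % (name, n)
#         try:
#             return counts[u"total"]
#         except KeyError, exc:
#             print "missing", exc
#             return None


def report(counts):
    for name, n in counts.items():     # iteritems() becomes items()
        print("%s: %d" % (name, n))    # print statement becomes a function call
    try:
        return counts["total"]         # the u prefix is dropped
    except KeyError as exc:            # "except E, e" becomes "except E as e"
        print("missing", exc)
        return None
```

The rewritten function behaves the same way under Python 3, but the original
syntax is gone, which is why this route only suits projects that no longer need
Python 2.
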

.. _2to3: https://docs.python.org/3/library/2to3.html
.. _caniusepython3: https://pypi.python.org/pypi/caniusepython3
.. _cheat sheet: http://python-future.org/compatible_idioms.html
.. _coverage.py: https://pypi.python.org/pypi/coverage
.. _Futurize: http://python-future.org/automatic_conversion.html
.. _importlib: https://docs.python.org/3/library/importlib.html#module-importlib
.. _importlib2: https://pypi.python.org/pypi/importlib2
.. _Modernize: https://python-modernize.readthedocs.org/en/latest/
.. _Porting to Python 3: http://python3porting.com/
.. _Pylint: https://pypi.python.org/pypi/pylint
.. _Python 3 Q & A: https://ncoghlan-devs-python-notes.readthedocs.org/en/latest/python3/questions_and_answers.html

.. _python-future: http://python-future.org/
.. _python-porting: https://mail.python.org/mailman/listinfo/python-porting
.. _six: https://pypi.python.org/pypi/six
.. _tox: https://pypi.python.org/pypi/tox
.. _trove classifier: https://pypi.python.org/pypi?%3Aaction=list_classifiers
.. _"What's New": https://docs.python.org/3/whatsnew/index.html