.. _pyporting-howto:

*********************************
Porting Python 2 Code to Python 3
*********************************

:author: Brett Cannon

.. topic:: Abstract

   With Python 3 being the future of Python while Python 2 is still in active
   use, it is good to have your project available for both major releases of
   Python. This guide is meant to help you figure out how best to support both
   Python 2 & 3 simultaneously.

   If you are looking to port an extension module instead of pure Python code,
   please see :ref:`cporting-howto`.

   If you would like to read one core Python developer's take on why Python 3
   came into existence, you can read Nick Coghlan's `Python 3 Q & A`_.

   If you prefer to read a (free) book on porting a project to Python 3,
   consider reading `Porting to Python 3`_ by Lennart Regebro which should cover
   much of what is discussed in this HOWTO.

   For help with porting, you can email the python-porting_ mailing list with
   questions.

The Short Version
=================

* Decide on the oldest version of Python 2 you want to support (if any)
* Make sure you have a thorough test suite and use continuous integration
  testing to make sure you stay compatible with the versions of Python you care
  about
* If you have dependencies, check their Python 3 status using caniusepython3
  (`command-line tool <https://pypi.python.org/pypi/caniusepython3>`__,
  `web app <https://caniusepython3.com/>`__)

With that done, your options are:

* If you are dropping Python 2 support, use :ref:`2to3 <2to3-reference>` to port
  to Python 3

* If you are keeping Python 2 support, then start writing Python 2/3-compatible
  code **TODAY**

  + If you have dependencies that have not been ported, reach out to their
    maintainers about porting while working to make your own code compatible
    with Python 3, so you're ready when your dependencies are all ported
  + If all your dependencies have been ported (or you have none), go ahead and
    port to Python 3

* If you are creating a new project that wants to have 2/3 compatibility,
  code in Python 3 and then backport to Python 2


Before You Begin
================

If your project is on the Cheeseshop_/PyPI_, make sure it has the proper
`trove classifiers`_ to signify what versions of Python it **currently**
supports. At minimum you should specify the major version(s), e.g.
``Programming Language :: Python :: 2`` if your project currently only supports
Python 2. It is preferable that you be as specific as possible by listing every
major/minor version of Python that you support, e.g. if your project supports
Python 2.6 and 2.7, then you want the classifiers of::

    Programming Language :: Python :: 2
    Programming Language :: Python :: 2.6
    Programming Language :: Python :: 2.7

Once your project supports Python 3 you will want to go back and add the
appropriate classifiers for Python 3 as well. This is important as setting the
``Programming Language :: Python :: 3`` classifier will lead to your project
being listed under the `Python 3 Packages`_ section of PyPI.

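As a rough sketch of how these classifiers might be kept in one place, the list
below could be passed as the ``classifiers`` argument of ``setup()`` in a
``setup.py`` file. The ``CLASSIFIERS`` name and the
``supported_major_versions()`` helper are purely illustrative and are not part
of any packaging tool.

```python
# Hypothetical classifier list for a project that supports Python 2.6 and 2.7.
CLASSIFIERS = [
    "Programming Language :: Python :: 2",
    "Programming Language :: Python :: 2.6",
    "Programming Language :: Python :: 2.7",
]


def supported_major_versions(classifiers):
    """Return the sorted major versions declared by the classifiers."""
    prefix = "Programming Language :: Python :: "
    majors = set()
    for classifier in classifiers:
        if classifier.startswith(prefix):
            version = classifier[len(prefix):]
            majors.add(version.split(".")[0])
    return sorted(majors)


# In setup.py the list would typically be handed over as
# setup(..., classifiers=CLASSIFIERS).
```

A small check like ``supported_major_versions(CLASSIFIERS)`` can remind you to
add the Python 3 classifiers once the port is finished.
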
Make sure you have a robust test suite. You need to
make sure everything continues to work, just like when you support a new
minor/feature release of Python. This means making sure your test suite is
thorough and is ported properly between Python 2 & 3 (consider using coverage_
to measure that you have effective test coverage). You will also most likely
want to use something like tox_ to automate testing between all of your
supported versions of Python. You will also want to **port your tests first** so
that you can make sure that you detect breakage during the transition. Tests also
tend to be simpler than the code they are testing, so porting them first gives
you an idea of how easy it can be to port code.

Drop support for older Python versions if possible. Python 2.5
introduced a lot of useful syntax and libraries which have become idiomatic
in Python 3. Python 2.6 introduced future statements which make
compatibility much easier if you are going from Python 2 to 3.
Python 2.7 continues the trend in the stdlib. Choose the newest version
of Python which you believe can be your minimum supported version
and work from there.

Target the newest version of Python 3 that you can. Beyond just the usual
bugfixes, compatibility has continued to improve between Python 2 and 3 as time
has passed. E.g. Python 3.3 added back the ``u`` prefix for
strings, making source-compatible Python code easier to write.


Writing Source-Compatible Python 2/3 Code
=========================================

Over the years the Python community has discovered that the easiest way to
support both Python 2 and 3 in parallel is to write Python code that works in
either version. While this might sound counter-intuitive at first, it actually
is not difficult and typically only requires following some select
(non-idiomatic) practices and using some key projects to help make bridging
between Python 2 and 3 easier.

Projects to Consider
--------------------

The lowest level library for supporting Python 2 & 3 simultaneously is six_.
Reading through its documentation will give you an idea of where exactly the
Python language changed between versions 2 & 3 and thus what you will want the
library to help you continue to support.

To help automate porting your code over to using six, you can use
modernize_. This project will attempt to rewrite your code to be as modern as
possible while using six to smooth out any differences between Python 2 & 3.

If you want to write your compatible code to feel more like Python 3 there is
the future_ project. It tries to provide backports of objects from Python 3 so
that you can use them from Python 2-compatible code, e.g. replacing the
``bytes`` type from Python 2 with the one from Python 3.
It also provides a translation script like modernize (its translation code is
actually partially based on it) to help start working with a pre-existing code
base. It is also unique in that its translation script will also port Python 3
code backwards as well as Python 2 code forwards.


Tips & Tricks
-------------

To help with writing source-compatible code using one of the projects mentioned
in `Projects to Consider`_, consider following the below suggestions. Some of
them are handled by the suggested projects, so if you do use one of them then
read their documentation first to see which suggestions below will be taken
care of for you.

Support Python 2.7
//////////////////

As a first step, make sure that your project is compatible with Python 2.7.
This is just good to do as Python 2.7 is the last release of Python 2 and thus
will be used for a rather long time. It also allows for use of the ``-3`` flag
to Python to help discover places in your code where compatibility might be an
issue (the ``-3`` flag is in Python 2.6 but Python 2.7 adds more warnings).

Try to Support Python 2.6 and Newer Only
////////////////////////////////////////

While not possible for all projects, if you can support Python 2.6 and newer
**only**, your life will be much easier. Various future statements, stdlib
additions, etc. exist only in Python 2.6 and later which greatly assist in
supporting Python 3. But if your project must keep support for Python 2.5 then
it is still possible to simultaneously support Python 3.

Below are the benefits you gain if you only have to support Python 2.6 and
newer. Some of these options are personal choice while others are
**strongly** recommended (the ones that are more for personal choice are
labeled as such). If you continue to support older versions of Python then you
at least need to watch out for situations that these solutions fix and handle
them appropriately (which is where library help from e.g. six_ comes in handy).


``from __future__ import print_function``
'''''''''''''''''''''''''''''''''''''''''

It will not only get you used to typing ``print()`` as a function instead of a
statement, but it will also give you the various benefits the function has over
the Python 2 statement (six_ provides a function if you support Python 2.5 or
older).


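As a small illustration of those benefits, the sketch below uses the ``sep``
and ``file`` keyword arguments of ``print()``. The ``report()`` helper and the
sample items are made up for this example; the code is meant to behave the same
on Python 2.6+ and Python 3.3+.

```python
from __future__ import print_function

import io
import sys


def report(items, stream=None):
    """Write the items on one line, separated by a comma and a space."""
    if stream is None:
        stream = sys.stdout
    # ``sep`` and ``file`` are keyword arguments of the print() function;
    # the print statement offers no equivalent of ``sep``.
    print(*items, sep=u", ", file=stream)


# io.StringIO only accepts text, so the items and the separator are all
# marked as text with the ``u`` prefix.
captured = io.StringIO()
report([u"spam", u"eggs"], stream=captured)
```
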
``from __future__ import unicode_literals``
'''''''''''''''''''''''''''''''''''''''''''

If you choose to use this future statement then all string literals in
Python 2 will be assumed to be Unicode (as is already the case in Python 3).
If you choose not to use this future statement then you should mark all of your
text strings with a ``u`` prefix and only support Python 3.3 or newer. But you
are **strongly** advised to do one or the other (six_ provides a function in
case you don't want to use the future statement **and** you want to support
Python 3.2 or older).


Bytes/string literals
'''''''''''''''''''''

This is a **very** important one. Prefix Python 2 strings that
are meant to contain bytes with a ``b`` prefix to very clearly delineate
what is and is not a Python 3 text string (six_ provides a function to use for
Python 2.5 compatibility).

This point cannot be stressed enough: make sure you know what all of your string
literals in Python 2 are meant to be in Python 3. Any string literal that
should be treated as bytes should have the ``b`` prefix. Any string literal
that should be Unicode/text in Python 2 should either have the ``u`` prefix
(supported, but ignored, in Python 3.3 and later) or you should have
``from __future__ import unicode_literals`` at the top of the file. But the key
point is you should know how Python 3 will treat every one of your string
literals and you should mark them as appropriate.

There are some differences between byte literals in Python 2 and those in
Python 3 thanks to the bytes type just being an alias to ``str`` in Python 2.
See the `Handle Common "Gotchas"`_ section for what to watch out for.

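A minimal sketch of explicitly marked literals follows, assuming Python 2.6+ or
Python 3.3+. The constant and function names are hypothetical and only serve to
show one bytes literal and one text literal side by side.

```python
# Bytes literal: raw data such as a file signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Text literal: human-readable text.
GREETING = u"caf\xe9"


def encode_greeting(text):
    """Convert text to bytes explicitly at the I/O boundary."""
    return text.encode("utf-8")


def looks_like_png(data):
    """Check whether a bytes object starts with the PNG signature."""
    return data[:len(PNG_SIGNATURE)] == PNG_SIGNATURE
```

Keeping the conversion between the two types in clearly named functions makes
it easier to audit where text becomes bytes and vice versa.
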
``from __future__ import absolute_import``
''''''''''''''''''''''''''''''''''''''''''

Discussed in more detail below, but you should use this future statement to
prevent yourself from accidentally using implicit relative imports.


Supporting Python 2.5 and Newer Only
////////////////////////////////////

If you are supporting Python 2.5 and newer there are still some features of
Python that you can utilize.


``from __future__ import absolute_import``
''''''''''''''''''''''''''''''''''''''''''

Implicit relative imports (e.g., importing ``spam.bacon`` from within
``spam.eggs`` with the statement ``import bacon``) do not work in Python 3.
This future statement moves away from that and allows the use of explicit
relative imports (e.g., ``from . import bacon``).

In Python 2.5 you must use the ``__future__`` statement to be able to use
explicit relative imports and to prevent implicit ones. In Python 2.6 and 2.7
explicit relative imports are available without the statement, but you still
want the ``__future__`` statement to prevent implicit relative imports, which
remain the default throughout Python 2. In other words, unless you are
supporting a version earlier than Python 2.5, use this ``__future__``
statement.


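To see an explicit relative import in action without setting up a project, the
following sketch writes a throwaway package to a temporary directory and
imports it. The package name ``spam_demo`` and its modules are invented for the
example; only the contents of ``eggs.py`` show the recommended import style.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package: spam_demo/{__init__,bacon,eggs}.py
pkg_root = tempfile.mkdtemp()
pkg_dir = os.path.join(pkg_root, "spam_demo")
os.mkdir(pkg_dir)

sources = {
    "__init__.py": "",
    "bacon.py": "NAME = 'bacon'\n",
    # The module under discussion: an explicit relative import, guarded by
    # the future statement so that Python 2 rejects implicit ones.
    "eggs.py": (
        "from __future__ import absolute_import\n"
        "from . import bacon\n"
        "INGREDIENT = bacon.NAME\n"
    ),
}
for filename, source in sources.items():
    with open(os.path.join(pkg_dir, filename), "w") as f:
        f.write(source)

# Make the temporary directory importable and load the module.
sys.path.insert(0, pkg_root)
eggs = importlib.import_module("spam_demo.eggs")
```
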
Mark all Unicode strings with a ``u`` prefix
'''''''''''''''''''''''''''''''''''''''''''''

While Python 2.6 has a ``__future__`` statement to automatically cause Python 2
to treat all string literals as Unicode, Python 2.5 does not have that shortcut.
This means you should go through and mark all string literals with a ``u``
prefix to turn them explicitly into text strings where appropriate and only
support Python 3.3 or newer. Otherwise use a project like six_ which provides a
function to pass all text string literals through.


Capturing the Currently Raised Exception
''''''''''''''''''''''''''''''''''''''''

In Python 2.5 and earlier the syntax to access the current exception is::

    try:
        raise Exception()
    except Exception, exc:
        # Current exception is 'exc'.
        pass

This syntax changed in Python 3 (and was backported to Python 2.6 and later)
to::

    try:
        raise Exception()
    except Exception as exc:
        # Current exception is 'exc'.
        # In Python 3, 'exc' is restricted to the block; in Python 2.6/2.7 it will "leak".
        pass

Because neither syntax works in both Python 2.5 and Python 3, code that must
also run on Python 2.5 or earlier has to capture the current exception like
this::

    try:
        raise Exception()
    except Exception:
        import sys
        exc = sys.exc_info()[1]
        # Current exception is 'exc'.
        pass

You can get more information about the raised exception from
:func:`sys.exc_info` than simply the current exception instance, but you most
likely don't need it.

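The :func:`sys.exc_info` idiom can be wrapped in ordinary functions. The
sketch below uses a hypothetical ``parse_port()`` helper and returns the error
message rather than the exception instance, so no reference to the exception
outlives the ``except`` block.

```python
import sys


def parse_port(value):
    """Return (port, error_message); exactly one of the two is None."""
    try:
        return int(value), None
    except ValueError:
        # Same spelling on Python 2.5 and earlier, 2.6+, and Python 3.
        exc = sys.exc_info()[1]
        return None, str(exc)
```
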
.. note::
   In Python 3, the traceback is attached to the exception instance
   through the ``__traceback__`` attribute. If the instance is saved in
   a local variable that persists outside of the ``except`` block, the
   traceback will create a reference cycle with the current frame and its
   dictionary of local variables. This will delay reclaiming dead
   resources until the next cyclic :term:`garbage collection` pass.

   In Python 2, this problem only occurs if you save the traceback itself
   (e.g. the third element of the tuple returned by :func:`sys.exc_info`)
   in a variable.


Handle Common "Gotchas"
///////////////////////

These are things to watch out for no matter what version of Python 2 you are
supporting which are not syntactic considerations.


``from __future__ import division``
'''''''''''''''''''''''''''''''''''

While the exact same outcome can be had by using the ``-Qnew`` argument to
Python, using this future statement lifts the requirement that your users use
the flag to get the expected behavior of division in Python 3
(e.g., ``1/2 == 0.5; 1//2 == 0``).


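A brief sketch of how the two operators are typically used once the future
statement is in place; ``average()`` and ``page_count()`` are illustrative
names only.

```python
from __future__ import division


def average(values):
    """Return the arithmetic mean using true division."""
    return sum(values) / len(values)


def page_count(total_items, page_size):
    """Return how many pages are needed, using explicit floor division."""
    return (total_items + page_size - 1) // page_size
```

Spelling out ``//`` wherever an integer result is intended keeps the code's
behaviour identical on Python 2 and 3.
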
Specify when opening a file as binary
'''''''''''''''''''''''''''''''''''''

Unless you have been working on Windows, there is a chance you have not always
bothered to add the ``b`` mode when opening a binary file (e.g., ``rb`` for
binary reading). Under Python 3, binary files and text files are clearly
distinct and mutually incompatible; see the :mod:`io` module for details.
Therefore, you **must** decide whether a file will be used for
binary access (reading and/or writing bytes data) or text access
(reading and/or writing unicode data).

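For instance, a binary round trip through a temporary file might look like the
sketch below. The payload is arbitrary sample data.

```python
import os
import tempfile

payload = b"\x00\x01\xfe\xff"

handle, path = tempfile.mkstemp()
os.close(handle)
try:
    # "wb" and "rb" state explicitly that the file holds bytes, not text.
    with open(path, "wb") as f:
        f.write(payload)
    with open(path, "rb") as f:
        round_tripped = f.read()
finally:
    os.remove(path)
```
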
Text files
''''''''''

Text files opened with ``open()`` under Python 2 return byte strings when read,
while under Python 3 they return unicode strings. Depending on your porting
strategy, this can be an issue.

If you want text files to return unicode strings in Python 2, you have two
possibilities:

* Under Python 2.6 and higher, use :func:`io.open`. Since :func:`io.open`
  is essentially the same function in both Python 2 and Python 3, it will
  help iron out any issues that might arise.

* If pre-2.6 compatibility is needed, then you should use :func:`codecs.open`
  instead. This will make sure that you get back unicode strings in Python 2.

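The first option can be sketched as follows, assuming Python 2.6 or newer. The
sample text and the choice of UTF-8 are assumptions made for the example.

```python
import io
import os
import tempfile

text = u"na\xefve caf\xe9\n"

handle, path = tempfile.mkstemp()
os.close(handle)
try:
    # io.open() takes an explicit encoding and deals in unicode strings
    # on both Python 2 and Python 3.
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with io.open(path, "r", encoding="utf-8") as f:
        loaded = f.read()
    # Opening the same file in binary mode yields the encoded bytes.
    with io.open(path, "rb") as f:
        raw = f.read()
finally:
    os.remove(path)
```
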
Subclass ``object``
'''''''''''''''''''

New-style classes have been around since Python 2.2. You need to make sure
you are subclassing from ``object`` to avoid odd edge cases involving method
resolution order, etc. This continues to be totally valid in Python 3 (although
unneeded as all classes implicitly inherit from ``object``).


Deal With the Bytes/String Dichotomy
''''''''''''''''''''''''''''''''''''

One of the biggest issues people have when porting code to Python 3 is handling
the bytes/string dichotomy. Because Python 2 allowed the ``str`` type to hold
textual data, people have over the years been rather loose in their delineation
of which ``str`` instances held text and which held bytes. In Python 3 you
cannot be so care-free anymore and need to properly handle the difference. The
key to handling this issue is to make sure that **every** string literal in your
Python 2 code is either syntactically or functionally marked as either bytes or
text data. After this is done you then need to make sure your APIs are designed
to either handle a specific type or made to be properly polymorphic.


Mark Up Python 2 String Literals
********************************

The first thing you must do is designate every single string literal in
Python 2 as either textual or bytes data. If you are only supporting Python 2.6
or newer, this can be accomplished by marking bytes literals with a ``b`` prefix
and then designating textual data with a ``u`` prefix or using the
``unicode_literals`` future statement.

If your project supports versions of Python predating 2.6, then you should use
the six_ project and its ``b()`` function to denote bytes literals. For text
literals you can either use six's ``u()`` function or use a ``u`` prefix.


Decide what APIs Will Accept
****************************

In Python 2 it was very easy to accidentally create an API that accepted both
bytes and textual data. But in Python 3, thanks to the more strict handling of
disparate types, this loose usage of bytes and text together tends to fail.

Take the dict ``{b'a': 'bytes', u'a': 'text'}`` in Python 2.6. It creates the
dict ``{u'a': 'text'}`` since ``b'a' == u'a'``. But in Python 3 the equivalent
dict creates ``{b'a': 'bytes', 'a': 'text'}``, i.e., no lost data. Similar
issues can crop up when transitioning Python 2 code to Python 3.

This means you need to choose what an API is going to accept and create and
consistently stick to that API in both Python 2 and 3.


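One possible way to hold an API to a single type is to normalize input at the
boundary. The sketch below assumes an API that works on text internally and
that incoming bytes are UTF-8 encoded; ``to_text()`` and ``make_header()`` are
hypothetical helpers, and the code relies on the ``bytes`` name available in
Python 2.6+.

```python
def to_text(value, encoding="utf-8"):
    """Return value as text, decoding it first if it is bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def make_header(name, value):
    """Build a header line; the result is always text."""
    return u"%s: %s" % (to_text(name), to_text(value))
```

Callers may pass either type, but everything past the boundary sees only text,
so the behaviour is the same on Python 2 and 3.
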
Bytes / Unicode Comparison
**************************

In Python 3, mixing bytes and unicode is forbidden in most situations; it
will raise a :class:`TypeError` where Python 2 would have attempted an implicit
coercion between types. However, there is one case where it doesn't and
it can be very misleading::

    >>> b"" == ""
    False

This is because an equality comparison is required by the language to always
succeed (and return ``False`` for incompatible types). However, this also
means that code incorrectly ported to Python 3 can display buggy behaviour
if such comparisons are silently executed. To detect such situations,
Python 3 has a ``-b`` flag that will display a warning::

    $ python3 -b
    >>> b"" == ""
    __main__:1: BytesWarning: Comparison between bytes and string
    False

To turn the warning into an exception, use the ``-bb`` flag instead::

    $ python3 -bb
    >>> b"" == ""
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    BytesWarning: Comparison between bytes and string


Indexing bytes objects
''''''''''''''''''''''

Another potentially surprising change is the indexing behaviour of bytes
objects in Python 3::

    >>> b"xyz"[0]
    120

Indeed, Python 3 bytes objects (as well as :class:`bytearray` objects)
are sequences of integers. But code converted from Python 2 will often
assume that indexing a bytestring produces another bytestring, not an
integer. To reconcile both behaviours, use slicing::

    >>> b"xyz"[0:1]
    b'x'
    >>> n = 1
    >>> b"xyz"[n:n+1]
    b'y'

The only remaining gotcha is that an out-of-bounds slice returns an empty
bytes object instead of raising ``IndexError``::

    >>> b"xyz"[3]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    IndexError: index out of range
    >>> b"xyz"[3:4]
    b''


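If the missing ``IndexError`` matters to your code, the slicing idiom can be
wrapped in a small helper. ``byte_at()`` below is a hypothetical function, not
something provided by the standard library.

```python
def byte_at(data, index):
    """Return the byte at ``index`` as a length-1 bytes object.

    Unlike a plain slice, an out-of-range index raises IndexError.
    """
    if not -len(data) <= index < len(data):
        raise IndexError("index out of range")
    if index < 0:
        # Normalize negative indexes; data[-1:0] would be empty otherwise.
        index += len(data)
    return data[index:index + 1]
```

The helper returns the same value on Python 2 and 3, e.g.
``byte_at(b"xyz", 0)`` gives ``b"x"`` on both.
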
| 462 | ``__str__()``/``__unicode__()`` |
| 463 | ''''''''''''''''''''''''''''''' |
| 464 | |
| 465 | In Python 2, objects can specify both a string and unicode representation of |
| 466 | themselves. In Python 3, though, there is only a string representation. This |
| 467 | becomes an issue as people can inadvertently do things in their ``__str__()`` |
| 468 | methods which have unpredictable results (e.g., infinite recursion if you |
| 469 | happen to use the ``unicode(self).encode('utf8')`` idiom as the body of your |
| 470 | ``__str__()`` method). |
| 471 | |
Benjamin Peterson | 841da4e | 2014-03-11 14:28:37 -0500 | [diff] [blame] | 472 | You can use a mixin class to work around this. This allows you to only define a |
Éric Araujo | 52a5a03 | 2011-08-19 01:22:42 +0200 | [diff] [blame] | 473 | ``__unicode__()`` method for your class and let the mixin derive |
| 474 | ``__str__()`` for you (code from |
| 475 | http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/):: |
| 476 | |
| 477 | import sys |
| 478 | |
| 479 | class UnicodeMixin(object): |
| 480 | |
| 481 | """Mixin class to handle defining the proper __str__/__unicode__ |
| 482 | methods in Python 2 or 3.""" |
| 483 | |
| 484 | if sys.version_info[0] >= 3: # Python 3 |
| 485 | def __str__(self): |
| 486 | return self.__unicode__() |
| 487 | else: # Python 2 |
| 488 | def __str__(self): |
| 489 | return self.__unicode__().encode('utf8') |
| 490 | |
| 491 | |
| 492 | class Spam(UnicodeMixin): |
| 493 | |
| 494 | def __unicode__(self): |
| 495 | return u'spam-spam-bacon-spam' # 2to3 will remove the 'u' prefix |
| 496 | |
| 497 | |
| 498 | Don't Index on Exceptions |
| 499 | ''''''''''''''''''''''''' |
| 500 | |
| 501 | In Python 2, the following worked:: |
| 502 | |
| 503 | >>> exc = Exception(1, 2, 3) |
| 504 | >>> exc.args[1] |
| 505 | 2 |
| 506 | >>> exc[1] # Python 2 only! |
| 507 | 2 |
| 508 | |
But in Python 3, indexing directly on an exception is an error. Index only on
the :attr:`BaseException.args` attribute, which is a sequence containing all
arguments passed to the :meth:`__init__` method.
| 512 | |
| 513 | Even better is to use the documented attributes the exception provides. |
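
As an illustration (a sketch only; ``error_code`` is a hypothetical helper),
the following prefers the documented ``errno`` attribute where the exception
type provides one, and otherwise falls back to ``args`` rather than indexing
the exception itself::

    import errno

    def error_code(exc):
        """Return a code for *exc* without indexing the exception."""
        if isinstance(exc, EnvironmentError):
            # Documented attribute; available on Python 2 and 3.
            return exc.errno
        # Generic fallback: index ``args``, never ``exc`` itself.
        return exc.args[0] if exc.args else None

    code = error_code(IOError(errno.ENOENT, "No such file"))
    first = error_code(Exception(1, 2, 3))

Here ``code`` is ``errno.ENOENT`` and ``first`` is ``1`` under either major
version.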
| 514 | |

Don't Use ``__getslice__`` & Friends
''''''''''''''''''''''''''''''''''''
| 518 | |
``__getslice__()``, ``__setslice__()`` and ``__delslice__()`` have been
deprecated for a while, and Python 3 finally drops support for them. Move
completely over to :meth:`__getitem__` and friends.
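
A minimal sketch of a sequence type that handles both single indexes and
slices in :meth:`__getitem__` (the ``Window`` class is made up for this
example)::

    class Window(object):
        """Read-only wrapper around a list of items."""

        def __init__(self, items):
            self._items = list(items)

        def __len__(self):
            return len(self._items)

        def __getitem__(self, index):
            if isinstance(index, slice):
                # obj[a:b] and obj[a:b:c] both arrive as a slice object.
                return Window(self._items[index])
            return self._items[index]

    window = Window(range(10))
    middle = window[2:5]

``middle`` is another ``Window`` holding ``2``, ``3`` and ``4``; no
``__getslice__()`` method is involved on either Python 2 or 3.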
| 522 | |
| 523 | |
| 524 | Updating doctests |
| 525 | ''''''''''''''''' |
| 526 | |
Don't forget to make your doctests Python 2/3 compatible as well. If you wrote
a monolithic set of doctests (e.g., a single docstring containing all of your
doctests), you should at least consider breaking the doctests up into smaller
pieces to make them more manageable to fix. Otherwise it might very well be
worth your time and effort to port your tests to :mod:`unittest`.


Update ``map`` for imbalanced input sequences
'''''''''''''''''''''''''''''''''''''''''''''

With Python 2, when ``map`` was given more than one input sequence it would pad
the shorter sequences with ``None`` values, returning a sequence as long as the
longest input sequence.

With Python 3, if the input sequences to ``map`` are of unequal length, ``map``
will stop at the termination of the shortest of the sequences. For full
compatibility with ``map`` from Python 2.x, wrap the sequence arguments in
:func:`itertools.zip_longest` and apply the function with
:func:`itertools.starmap` so that it still receives one argument per sequence,
e.g. ``map(func, *sequences)`` becomes
``list(itertools.starmap(func, itertools.zip_longest(*sequences)))``.
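
A minimal sketch of the Python 2 padding behaviour that runs on both major
versions. ``map_padded`` and ``pair`` are hypothetical names used only for
this example; :func:`itertools.starmap` is used so that the function receives
one argument per sequence instead of a single tuple::

    import itertools

    try:
        zip_longest = itertools.zip_longest    # Python 3
    except AttributeError:
        zip_longest = itertools.izip_longest   # Python 2 name

    def map_padded(func, *sequences):
        """Apply func across sequences, padding short ones with None."""
        return list(itertools.starmap(func, zip_longest(*sequences)))

    def pair(a, b):
        return (a, b)

    result = map_padded(pair, [1, 2, 3], ["x"])

``result`` is ``[(1, 'x'), (2, None), (3, None)]``, matching what
``map(pair, [1, 2, 3], ["x"])`` returns under Python 2.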
| 546 | |
Eliminate ``-3`` Warnings
-------------------------

When you run your application's test suite under Python 2, pass the ``-3``
flag to Python. This causes warnings to be emitted during execution about
behaviour whose semantics change between Python 2 and 3. Try to eliminate
those warnings to make your code even more portable to Python 3.


Alternative Approaches
======================

While supporting Python 2 & 3 simultaneously is typically the preferred
choice, since it lets people continue to improve code and have it work for the
largest number of users, your life may be easier if you only have to support
one major version of Python going forward.
| 563 | |
| 564 | Supporting Only Python 3 Going Forward From Python 2 Code |
| 565 | --------------------------------------------------------- |
| 566 | |
| 567 | If you have Python 2 code but going forward only want to improve it as Python 3 |
| 568 | code, then you can use 2to3_ to translate your Python 2 code to Python 3 code. |
| 569 | This is only recommended, though, if your current version of your project is |
| 570 | going into maintenance mode and you want all new features to be exclusive to |
| 571 | Python 3. |


Backporting Python 3 Code to Python 2
-------------------------------------

If you have Python 3 code and have little interest in supporting Python 2, you
can use 3to2_ to translate from Python 3 code to Python 2 code. This is only
recommended if you don't plan to heavily support Python 2 users. Otherwise,
write your code for Python 3 and then backport it as far back as you want.
This is typically easier than going from Python 2 to 3, as you will already
have worked out difficulties such as the bytes/strings distinction.


Other Resources
===============

The authors of the following blog posts, wiki pages, and books deserve special
thanks for making public their tips for porting Python 2 code to Python 3 (and
thus helping provide information for this document and its various revisions
over the years):

* https://wiki.python.org/moin/PortingPythonToPy3k
* http://python3porting.com/
* http://docs.pythonsprints.com/python3_porting/py-porting.html
* http://techspot.zzzeek.org/2011/01/24/zzzeek-s-guide-to-python-3-porting/
* http://dabeaz.blogspot.com/2011/01/porting-py65-and-my-superboard-to.html
* http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/
* http://lucumr.pocoo.org/2010/2/11/porting-to-python-3-a-guide/
* https://wiki.ubuntu.com/Python/3

If you feel there is something missing from this document that should be added,
please email the python-porting_ mailing list.
| 604 | |

.. _3to2: https://pypi.python.org/pypi/3to2
.. _Cheeseshop: PyPI_
.. _coverage: https://pypi.python.org/pypi/coverage
.. _future: http://python-future.org/
.. _modernize: https://github.com/mitsuhiko/python-modernize
.. _Porting to Python 3: http://python3porting.com/
.. _PyPI: https://pypi.python.org/pypi
.. _Python 3 Packages: https://pypi.python.org/pypi?:action=browse&c=533&show=all
.. _Python 3 Q & A: http://ncoghlan-devs-python-notes.readthedocs.org/en/latest/python3/questions_and_answers.html
.. _python-porting: https://mail.python.org/mailman/listinfo/python-porting
.. _six: https://pypi.python.org/pypi/six
.. _tox: https://pypi.python.org/pypi/tox
.. _trove classifiers: https://pypi.python.org/pypi?%3Aaction=list_classifiers