.. _pyporting-howto:

*********************************
Porting Python 2 Code to Python 3
*********************************

:author: Brett Cannon

.. topic:: Abstract

   With Python 3 being the future of Python while Python 2 is still in active
   use, it is good to have your project available for both major releases of
   Python. This guide is meant to help you choose which strategy works best
   for your project to support both Python 2 & 3 along with how to execute
   that strategy.

   If you are looking to port an extension module instead of pure Python code,
   please see :ref:`cporting-howto`.


Choosing a Strategy
===================

When a project chooses to support both Python 2 & 3,
a decision needs to be made as to how to go about accomplishing that goal.
The chosen strategy will depend on how large the project's existing
codebase is and how much divergence you want from your current Python 2 codebase
(e.g., changing your code to work simultaneously with Python 2 and 3).

If you would prefer to maintain a codebase which is semantically **and**
syntactically compatible with Python 2 & 3 simultaneously, you can write
:ref:`use_same_source`. While this tends to lead to somewhat non-idiomatic
code, it does mean that you, the developer, keep a rapid development process.

If your project is brand-new or does not have a large codebase, then you may
want to consider writing/porting :ref:`all of your code for Python 3
and use 3to2 <use_3to2>` to port your code for Python 2.

Finally, you do have the option of :ref:`using 2to3 <use_2to3>` to translate
Python 2 code into Python 3 code (with some manual help). This can take the
form of branching your code and using 2to3 to start a Python 3 branch. You can
also have users perform the translation at installation time automatically so
that you only have to maintain a Python 2 codebase.

Regardless of which approach you choose, porting is not as hard or
time-consuming as you might initially think. You can also tackle the problem
piecemeal, as a good portion of porting is simply updating your code to follow
current best practices in a Python 2/3 compatible way.


Universal Bits of Advice
------------------------

Regardless of what strategy you pick, there are a few things you should
consider.

One is to make sure you have a robust test suite. You need to make sure everything
continues to work, just like when you support a new minor/feature release of
Python. This means making sure your test suite is thorough and is ported
properly between Python 2 & 3. You will also most likely want to use something
like tox_ to automate testing between both a Python 2 and Python 3 interpreter.

Two, once your project has Python 3 support, make sure to add the proper
classifier on the Cheeseshop_ (PyPI_). To have your project listed as Python 3
compatible it must have the
`Python 3 classifier <http://pypi.python.org/pypi?:action=browse&c=533>`_
(from
http://techspot.zzzeek.org/2011/01/24/zzzeek-s-guide-to-python-3-porting/)::

   setup(
     name='Your Library',
     version='1.0',
     classifiers=[
         # make sure to use :: Python *and* :: Python :: 3 so
         # that pypi can list the package on the python 3 page
         'Programming Language :: Python',
         'Programming Language :: Python :: 3'
     ],
     packages=['yourlibrary'],
     # make sure to add custom_fixers to the MANIFEST.in
     include_package_data=True,
     # ...
   )


Doing so will cause your project to show up in the
`Python 3 packages list
<http://pypi.python.org/pypi?:action=browse&c=533&show=all>`_. You will know
you set the classifier properly as visiting your project page on the Cheeseshop
will show a Python 3 logo in the upper-left corner of the page.

Three, the six_ project provides a library which helps iron out differences
between Python 2 & 3. If you find there is a sticky point that is a continual
point of contention in your translation or maintenance of code, consider using
a source-compatible solution relying on six. If you have to create your own
Python 2/3 compatible solution, you can use ``sys.version_info[0] >= 3`` as a
guard.
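
A hand-rolled guard of that kind might look like the minimal sketch below.
The names ``PY3``, ``text_type``, ``binary_type`` and ``is_text`` are
illustrative assumptions, not part of any library.

```python
import sys

# True when running under Python 3 or newer.
PY3 = sys.version_info[0] >= 3

if PY3:
    text_type = str        # text is ``str`` in Python 3
    binary_type = bytes
else:
    # This branch is only evaluated under Python 2, where ``unicode`` exists.
    text_type = unicode    # noqa: F821
    binary_type = str


def is_text(value):
    """Return True if *value* is a text (Unicode) string on this interpreter."""
    return isinstance(value, text_type)
```

Keeping every such check in one small compatibility module tends to make the
guards easier to delete once Python 2 support is dropped.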

Four, read all the approaches. Just because some bit of advice applies to one
approach more than another doesn't mean that some advice doesn't apply to other
strategies. This is especially true of whether you decide to use 2to3 or be
source-compatible; tips for one approach almost always apply to the other.

Five, drop support for older Python versions if possible. `Python 2.5`_
introduced a lot of useful syntax and libraries which have become idiomatic
in Python 3. `Python 2.6`_ introduced future statements which make
compatibility much easier if you are going from Python 2 to 3.
`Python 2.7`_ continues the trend in the stdlib. So choose the newest version
of Python which you believe can be your minimum supported version
and work from there.

Six, target the newest version of Python 3 that you can. Beyond just the usual
bugfixes, compatibility has continued to improve between Python 2 and 3 as time
has passed. This is especially true for Python 3.3 where the ``u`` prefix for
strings is allowed, making source-compatible Python code easier.

Seven, make sure to look at the `Other Resources`_ for tips from other people
which may help you out.


.. _tox: http://codespeak.net/tox/
.. _Cheeseshop:
.. _PyPI: http://pypi.python.org/
.. _six: http://packages.python.org/six
.. _Python 2.7: http://www.python.org/2.7.x
.. _Python 2.6: http://www.python.org/2.6.x
.. _Python 2.5: http://www.python.org/2.5.x
.. _Python 2.4: http://www.python.org/2.4.x
.. _Python 2.3: http://www.python.org/2.3.x
.. _Python 2.2: http://www.python.org/2.2.x


.. _use_3to2:

Python 3 and 3to2
=================

If you are starting a new project or your codebase is small enough, you may
want to consider writing your code for Python 3 and backporting to Python 2
using 3to2_. Thanks to Python 3 being more strict about things than Python 2
(e.g., bytes vs. strings), the source translation can be easier and more
straightforward than from Python 2 to 3. Plus it gives you more direct
experience developing in Python 3 which, since it is the future of Python, is a
good thing long-term.

A drawback of this approach is that 3to2 is a third-party project. This means
that the Python core developers (and thus this guide) can make no promises
about how well 3to2 works at any time. There is nothing to suggest, though,
that 3to2 is not a high-quality project.


.. _3to2: https://bitbucket.org/amentajo/lib3to2/overview


.. _use_2to3:

Python 2 and 2to3
=================

Included with Python since 2.6, the 2to3_ tool (and :mod:`lib2to3` module)
helps with porting Python 2 to Python 3 by performing various source
translations. This is a perfect solution for projects which wish to branch
their Python 3 code from their Python 2 codebase and maintain them as
independent codebases. You can even begin preparing to use this approach
today by writing future-compatible Python code which works cleanly in
Python 2 in conjunction with 2to3; all steps outlined below will work
with Python 2 code up to the point when the actual use of 2to3 occurs.

Use of 2to3 as an on-demand translation step at install time is also possible,
preventing the need to maintain a separate Python 3 codebase, but this approach
does come with some drawbacks. While users will only have to pay the
translation cost once at installation, you as a developer will need to pay the
cost regularly during development. If your codebase is sufficiently large,
then the translation step ends up acting like a compilation step,
robbing you of the rapid development process you are used to with Python.
Obviously the time required to translate a project will vary, so do an
experimental translation to see how long it takes, and use that to evaluate
whether you prefer this approach to using :ref:`use_same_source` or simply
keeping a separate Python 3 codebase.

Below are the typical steps taken by a project which tries to support
Python 2 & 3 while keeping the code directly executable by Python 2.


Support Python 2.7
------------------

As a first step, make sure that your project is compatible with `Python 2.7`_.
This is just good to do as Python 2.7 is the last release of Python 2 and thus
will be used for a rather long time. It also allows for use of the ``-3`` flag
to Python to help discover places in your code which 2to3 cannot handle but are
known to cause issues.

Try to Support `Python 2.6`_ and Newer Only
-------------------------------------------

While not possible for all projects, if you can support `Python 2.6`_ and newer
**only**, your life will be much easier. Various future statements, stdlib
additions, etc. exist only in Python 2.6 and later which greatly assist in
porting to Python 3. But if your project must keep support for `Python 2.5`_ (or
even `Python 2.4`_) then it is still possible to port to Python 3.

Below are the benefits you gain if you only have to support Python 2.6 and
newer. Some of these options are personal choice while others are
**strongly** recommended (the ones that are more for personal choice are
labeled as such). If you continue to support older versions of Python then you
at least need to watch out for situations that these solutions fix.


``from __future__ import print_function``
'''''''''''''''''''''''''''''''''''''''''

This is a personal choice. 2to3 handles the translation from the print
statement to the print function rather well so this is an optional step. This
future statement does help, though, with getting used to typing
``print('Hello, World')`` instead of ``print 'Hello, World'``.


``from __future__ import unicode_literals``
'''''''''''''''''''''''''''''''''''''''''''

Another personal choice. You can always mark what you want to be a (unicode)
string with a ``u`` prefix to get the same effect. But regardless of whether
you use this future statement or not, you **must** make sure you know exactly
which Python 2 strings you want to be bytes, and which are to be strings. This
means you should, **at minimum**, mark all strings that are meant to be text
strings with a ``u`` prefix if you do not use this future statement. Python 3.3
allows strings to continue to have the ``u`` prefix (it's a no-op in that case)
to make it easier for code to be source-compatible between Python 2 & 3.


Bytes literals
''''''''''''''

This is a **very** important one. The ability to prefix Python 2 strings that
are meant to contain bytes with a ``b`` prefix helps to very clearly delineate
what is and is not a Python 3 string. When you run 2to3 on code, all Python 2
strings become Python 3 strings **unless** they are prefixed with ``b``.

This point cannot be stressed enough: make sure you know what all of your string
literals in Python 2 are meant to become in Python 3. Any string literal that
should be treated as bytes should have the ``b`` prefix. Any string literal
that should be Unicode/text in Python 2 should either have the ``u`` prefix
(supported, but ignored, in Python 3.3 and later) or you should have
``from __future__ import unicode_literals`` at the top of the file. But the key
point is you should know how Python 3 will treat every one of your string
literals and you should mark them as appropriate.

There are some differences between byte literals in Python 2 and those in
Python 3 thanks to the bytes type just being an alias to ``str`` in Python 2.
Probably the biggest "gotcha" is that indexing results in different values. In
Python 2, the value of ``b'py'[1]`` is ``'y'``, while in Python 3 it's ``121``.
You can avoid this disparity by always slicing at the size of a single element:
``b'py'[1:2]`` is ``'y'`` in Python 2 and ``b'y'`` in Python 3 (i.e., close
enough).

You cannot concatenate bytes and strings in Python 3. But since Python
2 has bytes aliased to ``str``, it will succeed: ``b'a' + u'b'`` works in
Python 2, but ``b'a' + 'b'`` in Python 3 is a :exc:`TypeError`. A similar issue
also comes about when doing comparisons between bytes and strings.
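
The two points above can be sketched as follows. The helper ``join_text``
and its ``encoding`` parameter are hypothetical names chosen for this
illustration; the idea is simply to decode bytes explicitly before mixing
them with text.

```python
data = b'py'

# Slicing yields a one-element bytes object on both Python 2 and Python 3,
# whereas data[1] is 'y' in Python 2 but the integer 121 in Python 3.
second = data[1:2]


def join_text(prefix, suffix, encoding='ascii'):
    """Concatenate *prefix* and *suffix* as text.

    Bytes are decoded explicitly instead of relying on Python 2's
    implicit coercion, which Python 3 rejects with a TypeError.
    """
    if isinstance(prefix, bytes):
        prefix = prefix.decode(encoding)
    return prefix + suffix


combined = join_text(b'a', u'b')
```

Written this way, the code behaves the same on either major version, so the
difference in coercion rules never comes into play.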


Supporting `Python 2.5`_ and Newer Only
---------------------------------------

If you are supporting `Python 2.5`_ and newer there are still some features of
Python that you can utilize.


``from __future__ import absolute_import``
''''''''''''''''''''''''''''''''''''''''''

Implicit relative imports (e.g., importing ``spam.bacon`` from within
``spam.eggs`` with the statement ``import bacon``) do not work in Python 3.
This future statement moves away from that and allows the use of explicit
relative imports (e.g., ``from . import bacon``).

In `Python 2.5`_ you must use
the __future__ statement to get to use explicit relative imports and prevent
implicit ones. In `Python 2.6`_ explicit relative imports are available without
the statement, but you still want the __future__ statement to prevent implicit
relative imports. The same holds for `Python 2.7`_, where implicit relative
imports remain the default. In other words, unless you are supporting a version
earlier than Python 2.5, use the __future__ statement.


Mark all Unicode strings with a ``u`` prefix
'''''''''''''''''''''''''''''''''''''''''''''

While Python 2.6 has a ``__future__`` statement to automatically cause Python 2
to treat all string literals as Unicode, Python 2.5 does not have that shortcut.
This means you should go through and mark all string literals with a ``u``
prefix to turn them explicitly into Unicode strings where appropriate. That
leaves all unmarked string literals to be considered byte literals in Python 3.


Handle Common "Gotchas"
-----------------------

There are a few things that just consistently come up as sticking points for
people which 2to3 cannot handle automatically or can easily be done in Python 2
to help modernize your code.


``from __future__ import division``
'''''''''''''''''''''''''''''''''''

While the exact same outcome can be had by using the ``-Qnew`` argument to
Python, using this future statement lifts the requirement that your users use
the flag to get the expected behavior of division in Python 3
(e.g., ``1/2 == 0.5; 1//2 == 0``).
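
As a small illustration, a module that opts in looks like this. The variable
names are arbitrary; note that a future statement has to come before any other
statement in the module (only comments and the docstring may precede it).

```python
# The future statement must precede all other statements in the module.
from __future__ import division

true_quotient = 1 / 2     # true division on both Python 2 and Python 3
floor_quotient = 1 // 2   # floor division, spelled explicitly
```

Any place where the old truncating behaviour was actually intended should be
rewritten with ``//`` so the intent survives the port.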



Specify when opening a file as binary
'''''''''''''''''''''''''''''''''''''

Unless you have been working on Windows, there is a chance you have not always
bothered to add the ``b`` mode when opening a binary file (e.g., ``rb`` for
binary reading). Under Python 3, binary files and text files are clearly
distinct and mutually incompatible; see the :mod:`io` module for details.
Therefore, you **must** decide whether a file will be used for
binary access (allowing you to read and/or write bytes data) or text access
(allowing you to read and/or write unicode data).
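
A minimal sketch of explicit binary access follows. The temporary file and
the ``payload`` value exist only for the demonstration.

```python
import os
import tempfile

payload = b'\x00\x01\xffspam'

# Create a scratch file; the path is arbitrary and only used for this demo.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, 'wb') as f:   # 'b': write bytes, no newline translation
        f.write(payload)
    with open(path, 'rb') as f:   # 'b': read back bytes, not text
        round_tripped = f.read()
finally:
    os.remove(path)
```

With the ``b`` mode spelled out, the same code reads and writes bytes objects
on Python 2 and Python 3, on every platform.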

Text files
''''''''''

Text files opened with ``open()`` under Python 2 return byte strings when read,
while under Python 3 they return unicode strings. Depending on your porting
strategy, this can be an issue.

If you want text files to return unicode strings in Python 2, you have two
possibilities:

* Under Python 2.6 and higher, use :func:`io.open`. Since :func:`io.open`
  is essentially the same function in both Python 2 and Python 3, it will
  help iron out any issues that might arise.

* If pre-2.6 compatibility is needed, then you should use :func:`codecs.open`
  instead. This will make sure that you get back unicode strings in Python 2.
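
The first option can be sketched like this; the scratch file and the sample
text are assumptions made for the example.

```python
import io
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    # io.open() takes an explicit encoding and deals in unicode strings
    # on Python 2.6+ as well as Python 3.
    with io.open(path, 'w', encoding='utf-8') as f:
        f.write(u'caf\xe9\n')
    with io.open(path, 'r', encoding='utf-8') as f:
        line = f.readline()
finally:
    os.remove(path)
```

Passing ``encoding`` explicitly also avoids depending on the platform's
default encoding, which can differ between machines.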

Subclass ``object``
'''''''''''''''''''

New-style classes have been around since `Python 2.2`_. You need to make sure
you are subclassing from ``object`` to avoid odd edge cases involving method
resolution order, etc. This continues to be totally valid in Python 3 (although
unneeded as all classes implicitly inherit from ``object``).
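
One such edge case is diamond inheritance, sketched below with made-up class
names. With new-style classes the overriding method in ``Right`` is found
before the one in ``Base``; Python 2 classic classes search depth-first and
would return ``'Base'`` instead.

```python
class Base(object):          # explicit new-style class on Python 2
    def who(self):
        return 'Base'


class Left(Base):
    pass


class Right(Base):
    def who(self):
        return 'Right'


class Bottom(Left, Right):
    pass


# New-style classes use the C3 method resolution order:
# Bottom, Left, Right, Base, object.
name = Bottom().who()
```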


Deal With the Bytes/String Dichotomy
''''''''''''''''''''''''''''''''''''

One of the biggest issues people have when porting code to Python 3 is handling
the bytes/string dichotomy. Because Python 2 allowed the ``str`` type to hold
textual data, people have over the years been rather loose in their delineation
of what ``str`` instances held text compared to bytes. In Python 3 you cannot
be so care-free anymore and need to properly handle the difference. The key to
handling this issue is to make sure that **every** string literal in your
Python 2 code is either syntactically or functionally marked as either bytes or
text data. After this is done you then need to make sure your APIs are designed
to either handle a specific type or made to be properly polymorphic.


Mark Up Python 2 String Literals
********************************

The first thing you must do is designate every single string literal in Python 2
as either textual or bytes data. If you are only supporting Python 2.6 or
newer, this can be accomplished by marking bytes literals with a ``b`` prefix
and then designating textual data with a ``u`` prefix or using the
``unicode_literals`` future statement.

If your project supports versions of Python predating 2.6, then you should use
the six_ project and its ``b()`` function to denote bytes literals. For text
literals you can either use six's ``u()`` function or use a ``u`` prefix.
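
To show the idea without requiring six to be installed, the sketch below
defines simplified local stand-ins modelled on six's ``b()`` and ``u()``.
They are an approximation written for this example; a real project would
import the functions from six instead. ``MAGIC`` and ``GREETING`` are
made-up constants.

```python
import sys

if sys.version_info[0] >= 3:
    def b(s):
        """Return bytes for a literal written as a native string."""
        return s.encode('latin-1')

    def u(s):
        """Return text; native strings are already Unicode in Python 3."""
        return s
else:
    def b(s):
        """Native strings are already bytes in Python 2."""
        return s

    def u(s):
        """Decode a native string literal into a unicode object."""
        return s.decode('unicode_escape')


MAGIC = b('\x89PNG')    # bytes on every version
GREETING = u('hello')   # text on every version
```

Because the argument is a plain native string literal, this spelling also
parses on interpreters that predate the ``b`` prefix.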


Decide what APIs Will Accept
****************************

In Python 2 it was very easy to accidentally create an API that accepted both
bytes and textual data. But in Python 3, thanks to the more strict handling of
disparate types, this loose usage of bytes and text together tends to fail.

Take the dict ``{b'a': 'bytes', u'a': 'text'}`` in Python 2.6. It creates the
single-entry dict ``{'a': 'text'}`` since ``b'a' == u'a'``; the ``'bytes'``
value is silently overwritten. But in Python 3 the equivalent literal creates
``{b'a': 'bytes', 'a': 'text'}``, i.e., no lost data. Similar
issues can crop up when transitioning Python 2 code to Python 3.

This means you need to choose what an API is going to accept and create and
consistently stick to that API in both Python 2 and 3.
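
One way to stick to such a decision is to normalize input once at the API
boundary. The function ``shout`` below is a hypothetical text-only API
invented for this sketch: it tolerates bytes on input by decoding them, but
always works on, and returns, text.

```python
def shout(message, encoding='utf-8'):
    """Return *message* upper-cased with a trailing exclamation mark.

    The API is text-only: bytes are decoded once on entry, so the rest
    of the function never has to handle mixed types.
    """
    if isinstance(message, bytes):
        message = message.decode(encoding)
    return message.upper() + u'!'
```

The opposite choice, rejecting bytes outright with a :exc:`TypeError`, is
just as valid; what matters is that the behaviour is the same on both
Python 2 and Python 3.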


Bytes / Unicode Comparison
**************************

In Python 3, mixing bytes and unicode is forbidden in most situations; it
will raise a :exc:`TypeError` where Python 2 would have attempted an implicit
coercion between types. However, there is one case where it doesn't and
it can be very misleading::

   >>> b"" == ""
   False

This is because an equality comparison is required by the language to always
succeed (and return ``False`` for incompatible types). However, this also
means that code incorrectly ported to Python 3 can display buggy behaviour
if such comparisons are silently executed. To detect such situations,
Python 3 has a ``-b`` flag that will display a warning::

   $ python3 -b
   >>> b"" == ""
   __main__:1: BytesWarning: Comparison between bytes and string
   False

To turn the warning into an exception, use the ``-bb`` flag instead::

   $ python3 -bb
   >>> b"" == ""
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   BytesWarning: Comparison between bytes and string


Indexing bytes objects
''''''''''''''''''''''

Another potentially surprising change is the indexing behaviour of bytes
objects in Python 3::

   >>> b"xyz"[0]
   120

Indeed, Python 3 bytes objects (as well as :class:`bytearray` objects)
are sequences of integers. But code converted from Python 2 will often
assume that indexing a bytestring produces another bytestring, not an
integer. To reconcile both behaviours, use slicing::

   >>> b"xyz"[0:1]
   b'x'
   >>> n = 1
   >>> b"xyz"[n:n+1]
   b'y'

The only remaining gotcha is that an out-of-bounds slice returns an empty
bytes object instead of raising :exc:`IndexError`::

   >>> b"xyz"[3]
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   IndexError: index out of range
   >>> b"xyz"[3:4]
   b''
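
If your code relies on the :exc:`IndexError`, a small wrapper can restore it.
The helper ``byte_at`` is a hypothetical name used only for this sketch.

```python
def byte_at(data, index):
    """Return the byte at *index* as a length-1 bytes object.

    Unlike a bare slice, raise IndexError when *index* is out of range.
    """
    if not -len(data) <= index < len(data):
        raise IndexError('index out of range')
    index %= len(data)            # normalise negative indexes
    return data[index:index + 1]
```

For example, ``byte_at(b"xyz", 1)`` gives a one-byte bytes object on both
major versions, while ``byte_at(b"xyz", 3)`` raises instead of quietly
returning an empty value.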
| 457 | |
| 458 | |
``__str__()``/``__unicode__()``
'''''''''''''''''''''''''''''''

In Python 2, objects can specify both a string and unicode representation of
themselves. In Python 3, though, there is only a string representation. This
becomes an issue as people can inadvertently do things in their ``__str__()``
methods which have unpredictable results (e.g., infinite recursion if you
happen to use the ``unicode(self).encode('utf8')`` idiom as the body of your
``__str__()`` method).

There are two ways to solve this issue. One is to use a custom 2to3 fixer. The
blog post at http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/
specifies how to do this. That will allow 2to3 to change all instances of
``def __unicode__(self): ...`` to ``def __str__(self): ...``. This does require
that you define your ``__str__()`` method in Python 2 before your
``__unicode__()`` method.

The other option is to use a mixin class. This allows you to only define a
``__unicode__()`` method for your class and let the mixin derive
``__str__()`` for you (code from
http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/)::

   import sys

   class UnicodeMixin(object):

       """Mixin class to handle defining the proper __str__/__unicode__
       methods in Python 2 or 3."""

       if sys.version_info[0] >= 3:  # Python 3
           def __str__(self):
               return self.__unicode__()
       else:  # Python 2
           def __str__(self):
               return self.__unicode__().encode('utf8')


   class Spam(UnicodeMixin):

       def __unicode__(self):
           return u'spam-spam-bacon-spam'  # 2to3 will remove the 'u' prefix
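As a brief illustration of how such a mixin might be used, the sketch below
repeats the mixin so that it runs on its own and adds a hypothetical
``Greeting`` class that defines only ``__unicode__()``. It assumes an
interpreter that accepts the ``u`` prefix (Python 2, or Python 3.3 and later).

```python
import sys


class UnicodeMixin(object):
    """Derive __str__() from __unicode__() on Python 2 or 3."""

    if sys.version_info[0] >= 3:  # Python 3: str is already text
        def __str__(self):
            return self.__unicode__()
    else:  # Python 2: str must be bytes, so encode the text
        def __str__(self):
            return self.__unicode__().encode('utf8')


class Greeting(UnicodeMixin):  # hypothetical example class
    def __init__(self, name):
        self.name = name

    def __unicode__(self):
        return u'hello, ' + self.name


# str() goes through the mixin, which delegates to __unicode__().
text = str(Greeting(u'world'))
```

Because ``__str__()`` never calls ``unicode(self)``, the infinite recursion
described earlier cannot occur.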
Brett Cannon | 8045d97 | 2011-02-03 22:01:54 +0000 | [diff] [blame] | 500 | |
| 501 | |
Brett Cannon | 8045d97 | 2011-02-03 22:01:54 +0000 | [diff] [blame] | 502 | Don't Index on Exceptions |
| 503 | ''''''''''''''''''''''''' |
Antoine Pitrou | 5c28cfdc | 2011-02-05 11:53:39 +0000 | [diff] [blame] | 504 | |
Brett Cannon | 8045d97 | 2011-02-03 22:01:54 +0000 | [diff] [blame] | 505 | In Python 2, the following worked:: |
| 506 | |
Brett Cannon | 4b0c24a | 2011-02-03 22:14:58 +0000 | [diff] [blame] | 507 | >>> exc = Exception(1, 2, 3) |
| 508 | >>> exc.args[1] |
| 509 | 2 |
| 510 | >>> exc[1] # Python 2 only! |
| 511 | 2 |
Brett Cannon | 8045d97 | 2011-02-03 22:01:54 +0000 | [diff] [blame] | 512 | |
Eli Bendersky | 7ac3419 | 2011-02-07 04:44:19 +0000 | [diff] [blame] | 513 | But in Python 3, indexing directly on an exception is an error. You need to |
| 514 | make sure to only index on the :attr:`BaseException.args` attribute which is a |
Brett Cannon | 8045d97 | 2011-02-03 22:01:54 +0000 | [diff] [blame] | 515 | sequence containing all arguments passed to the :meth:`__init__` method. |
| 516 | |
Eli Bendersky | 7ac3419 | 2011-02-07 04:44:19 +0000 | [diff] [blame] | 517 | Even better is to use the documented attributes the exception provides. |
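A minimal sketch of both approaches follows. It raises an :exc:`IOError` by
hand so the result is deterministic, and it assumes Python 2.6 or newer
because of the ``as`` syntax. The variable names are illustrative only.

```python
import errno

try:
    # Raise a two-argument I/O error by hand so the example is deterministic.
    raise IOError(errno.ENOENT, 'No such file or directory')
except IOError as exc:  # 'as' requires Python 2.6 or newer
    first_arg = exc.args[0]  # portable: index the args sequence
    code = exc.errno         # better: use the documented attribute
    message = exc.strerror
```

Reading ``exc.errno`` and ``exc.strerror`` states the intent directly and does
not depend on the order in which the arguments were passed.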
Brett Cannon | 8045d97 | 2011-02-03 22:01:54 +0000 | [diff] [blame] | 518 | |
Brett Cannon | 8045d97 | 2011-02-03 22:01:54 +0000 | [diff] [blame] | 519 | Don't use ``__getslice__`` & Friends |
| 520 | '''''''''''''''''''''''''''''''''''' |
Antoine Pitrou | 5c28cfdc | 2011-02-05 11:53:39 +0000 | [diff] [blame] | 521 | |
Brett Cannon | 8045d97 | 2011-02-03 22:01:54 +0000 | [diff] [blame] | 522 | Been deprecated for a while, but Python 3 finally drops support for |
| 523 | ``__getslice__()``, etc. Move completely over to :meth:`__getitem__` and |
| 524 | friends. |
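One way this can look in practice is sketched below with a hypothetical
``Playlist`` container. On both Python 2 and 3, a class that does not define
``__getslice__()`` receives a :class:`slice` object in ``__getitem__()``
whenever slicing syntax is used.

```python
class Playlist(object):  # hypothetical container used for illustration
    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self):
        return len(self._tracks)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slices arrive as slice objects; return the same container type.
            return Playlist(self._tracks[index])
        return self._tracks[index]


songs = Playlist(['a', 'b', 'c', 'd'])
first = songs[0]     # plain integer index
middle = songs[1:3]  # slice handled by __getitem__()
```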


Updating doctests
'''''''''''''''''

2to3_ will attempt to generate fixes for doctests that it comes across. It's
not perfect, though. If you wrote a monolithic set of doctests (e.g., a single
docstring containing all of your doctests), you should at least consider
breaking the doctests up into smaller pieces to make it more manageable to fix.
Otherwise it might very well be worth your time and effort to port your tests
to :mod:`unittest`.


Update `map` for imbalanced input sequences
'''''''''''''''''''''''''''''''''''''''''''

With Python 2, `map` would pad input sequences of unequal length with
`None` values, returning a sequence as long as the longest input sequence.

With Python 3, if the input sequences to `map` are of unequal length, `map`
will stop at the termination of the shortest of the sequences. For full
compatibility with `map` from Python 2.x, pad the sequences with
:func:`itertools.zip_longest` and unpack each resulting tuple with
:func:`itertools.starmap`, e.g. ``map(func, *sequences)`` becomes
``list(itertools.starmap(func, itertools.zip_longest(*sequences)))``.
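The difference can be sketched as follows. ``add_or_keep`` is a hypothetical
function written to tolerate the ``None`` padding. Since ``zip_longest``
yields one tuple per step, this sketch uses :func:`itertools.starmap` to
unpack each tuple into separate arguments. The comments describe the results
under Python 3.

```python
import itertools

try:
    zip_longest = itertools.zip_longest   # Python 3
except AttributeError:
    zip_longest = itertools.izip_longest  # Python 2.6+


def add_or_keep(a, b):  # hypothetical function; must tolerate None padding
    if a is None or b is None:
        return a if b is None else b
    return a + b


longer = [1, 2, 3]
shorter = [10, 20]

# Python 3's map() stops at the shortest input: [11, 22]
truncated = list(map(add_or_keep, longer, shorter))

# Padding with None first reproduces the Python 2 result: [11, 22, 3]
padded = list(itertools.starmap(add_or_keep, zip_longest(longer, shorter)))
```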

Eliminate ``-3`` Warnings
-------------------------

When you run your application's test suite, run it using the ``-3`` flag passed
to Python. This will cause various warnings to be raised during execution about
things that 2to3 cannot handle automatically (e.g., modules that have been
removed). Try to eliminate those warnings to make your code even more portable
to Python 3.


Run 2to3
--------

Once you have made your Python 2 code future-compatible with Python 3, it's
time to use 2to3_ to actually port your code.


Manually
''''''''

To manually convert source code using 2to3_, you use the ``2to3`` script that
is installed with Python 2.6 and later::

   2to3 <directory or file to convert>

This will cause 2to3 to write out a diff with all of the fixers applied for the
converted source code. If you would like 2to3 to go ahead and apply the changes
you can pass it the ``-w`` flag::

   2to3 -w <stuff to convert>

There are other flags available to control exactly which fixers are applied,
etc.


During Installation
'''''''''''''''''''

When a user installs your project for Python 3, you can have either
:mod:`distutils` or Distribute_ run 2to3_ on your behalf.
For distutils, use the following idiom::

   try:  # Python 3
       from distutils.command.build_py import build_py_2to3 as build_py
   except ImportError:  # Python 2
       from distutils.command.build_py import build_py

   setup(cmdclass = {'build_py': build_py},
       # ...
   )

For Distribute::

   setup(use_2to3=True,
       # ...
   )

This will allow you to not have to distribute a separate Python 3 version of
your project. It does require, though, that during development you at least
build your project and use the built Python 3 source for testing.


Verify & Test
-------------

At this point you should (hopefully) have your project converted in such a way
that it works in Python 3. Verify it by running your unit tests and making sure
nothing has gone awry. If you miss something then figure out how to fix it in
Python 3, backport to your Python 2 code, and run your code through 2to3 again
to verify the fix transforms properly.


.. _2to3: http://docs.python.org/py3k/library/2to3.html
.. _Distribute: http://packages.python.org/distribute/


.. _use_same_source:

Python 2/3 Compatible Source
============================

While it may seem counter-intuitive, you can write Python code which is
source-compatible between Python 2 & 3. It does lead to code that is not
entirely idiomatic Python (e.g., having to extract the currently raised
exception from ``sys.exc_info()[1]``), but it can be run under Python 2
**and** Python 3 without using 2to3_ as a translation step (although the tool
should be used to help find potential portability problems). This allows you to
continue to have a rapid development process regardless of whether you are
developing under Python 2 or Python 3. Whether this approach or using
:ref:`use_2to3` works best for you will be a per-project decision.

To get a complete idea of what issues you will need to deal with, see the
`What's New in Python 3.0`_. Others have reorganized the data in other formats
such as http://docs.pythonsprints.com/python3_porting/py-porting.html\ .

The following are some steps to take to try to support both Python 2 & 3 from
the same source code.


.. _What's New in Python 3.0: http://docs.python.org/release/3.0/whatsnew/3.0.html


Follow The Steps for Using 2to3_
--------------------------------

All of the steps outlined in how to
:ref:`port Python 2 code with 2to3 <use_2to3>` apply
to creating a Python 2/3 codebase. This includes trying to support only
Python 2.6 or newer (the :mod:`__future__` statements work in Python 3 without
issue), eliminating warnings that are triggered by ``-3``, etc.

You should even consider running 2to3_ over your code (without committing the
changes). This will let you know where potential pain points are within your
code so that you can fix them properly before they become an issue.


Use six_
--------

The six_ project contains many things to help you write portable Python code.
You should make sure to read its documentation from beginning to end and use
any and all features it provides. That way you will minimize any mistakes you
might make in writing cross-version code.


Capturing the Currently Raised Exception
----------------------------------------

One change between Python 2 and 3 that will require changing how you code (if
you support `Python 2.5`_ and earlier) is
accessing the currently raised exception. In Python 2.5 and earlier the syntax
to access the current exception is::

   try:
       raise Exception()
   except Exception, exc:
       # Current exception is 'exc'
       pass

This syntax changed in Python 3 (and was backported to `Python 2.6`_ and
later) to::

   try:
       raise Exception()
   except Exception as exc:
       # Current exception is 'exc'
       # In Python 3, 'exc' is restricted to the block; Python 2.6 will "leak"
       pass

Because of this syntax change, code that also has to run under Python 2.5 or
earlier must capture the current exception like this::

   try:
       raise Exception()
   except Exception:
       import sys
       exc = sys.exc_info()[1]
       # Current exception is 'exc'
       pass

You can get more information about the raised exception from
:func:`sys.exc_info` than simply the current exception instance, but you most
likely don't need it.

.. note::
   In Python 3, the traceback is attached to the exception instance
   through the ``__traceback__`` attribute. If the instance is saved in
   a local variable that persists outside of the ``except`` block, the
   traceback will create a reference cycle with the current frame and its
   dictionary of local variables. This will delay reclaiming dead
   resources until the next cyclic :term:`garbage collection` pass.

   In Python 2, this problem only occurs if you save the traceback itself
   (i.e. the third element of the tuple returned by :func:`sys.exc_info`)
   in a variable.
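One possible way to sidestep the cycle described in the note is sketched
below: pull out only what is needed and delete the local name before leaving
the handler. ``risky`` and ``describe_failure`` are hypothetical names used
for illustration.

```python
import sys


def risky():  # hypothetical function that fails
    raise ValueError('boom')


def describe_failure():
    try:
        risky()
    except ValueError:
        exc = sys.exc_info()[1]  # works on Python 2.5 through 3.x
        summary = '%s: %s' % (type(exc).__name__, exc)
        # Drop the local reference so the exception (and, on Python 3, its
        # attached traceback) does not keep this frame alive in a cycle.
        del exc
        return summary


result = describe_failure()
```

Keeping a plain string instead of the exception instance means nothing that
refers back to the frame outlives the ``except`` block.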


Other Resources
===============

The authors of the following blog posts, wiki pages, and books deserve special
thanks for making public their tips for porting Python 2 code to Python 3 (and
thus helping provide information for this document):

* http://python3porting.com/
* http://docs.pythonsprints.com/python3_porting/py-porting.html
* http://techspot.zzzeek.org/2011/01/24/zzzeek-s-guide-to-python-3-porting/
* http://dabeaz.blogspot.com/2011/01/porting-py65-and-my-superboard-to.html
* http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/
* http://lucumr.pocoo.org/2010/2/11/porting-to-python-3-a-guide/
* http://wiki.python.org/moin/PortingPythonToPy3k
* https://wiki.ubuntu.com/Python/3

If you feel there is something missing from this document that should be added,
please email the python-porting_ mailing list.

.. _python-porting: http://mail.python.org/mailman/listinfo/python-porting