.. _pyporting-howto:

*********************************
Porting Python 2 Code to Python 3
*********************************

:author: Brett Cannon

.. topic:: Abstract

   With Python 3 being the future of Python while Python 2 is still in active
   use, it is good to have your project available for both major releases of
   Python. This guide is meant to help you choose which strategy works best
   for your project to support both Python 2 & 3 along with how to execute
   that strategy.

   If you are looking to port an extension module instead of pure Python code,
   please see :ref:`cporting-howto`.


Choosing a Strategy
===================

When a project chooses to support both Python 2 & 3,
a decision needs to be made as to how to go about accomplishing that goal.
The chosen strategy will depend on how large the project's existing
codebase is and how much divergence you want from your current Python 2 codebase
(e.g., changing your code to work simultaneously with Python 2 and 3).

If you would prefer to maintain a codebase which is semantically **and**
syntactically compatible with Python 2 & 3 simultaneously, you can write
:ref:`use_same_source`. While this tends to lead to somewhat non-idiomatic
code, it does mean that you, the developer, keep a rapid development process.

If your project is brand-new or does not have a large codebase, then you may
want to consider writing/porting :ref:`all of your code for Python 3
and use 3to2 <use_3to2>` to port your code for Python 2.

Finally, you do have the option of :ref:`using 2to3 <use_2to3>` to translate
Python 2 code into Python 3 code (with some manual help). This can take the
form of branching your code and using 2to3 to start a Python 3 branch. You can
also have users perform the translation at installation time automatically so
that you only have to maintain a Python 2 codebase.

Regardless of which approach you choose, porting is not as hard or
time-consuming as you might initially think. You can also tackle the problem
piecemeal, as a good portion of porting is simply updating your code to follow
current best practices in a Python 2/3 compatible way.


Universal Bits of Advice
------------------------

Regardless of what strategy you pick, there are a few things you should
consider.

One, make sure you have a robust test suite. You need to make sure everything
continues to work, just like when you support a new minor/feature release of
Python. This means making sure your test suite is thorough and is ported
properly between Python 2 & 3. You will also most likely want to use something
like tox_ to automate testing between both a Python 2 and Python 3 interpreter.

Two, once your project has Python 3 support, make sure to add the proper
classifier on the Cheeseshop_ (PyPI_). To have your project listed as Python 3
compatible it must have the
`Python 3 classifier <http://pypi.python.org/pypi?:action=browse&c=533>`_
(from
http://techspot.zzzeek.org/2011/01/24/zzzeek-s-guide-to-python-3-porting/)::

   setup(
       name='Your Library',
       version='1.0',
       classifiers=[
           # make sure to use :: Python *and* :: Python :: 3 so
           # that pypi can list the package on the python 3 page
           'Programming Language :: Python',
           'Programming Language :: Python :: 3'
       ],
       packages=['yourlibrary'],
       # make sure to add custom_fixers to the MANIFEST.in
       include_package_data=True,
       # ...
   )


Doing so will cause your project to show up in the
`Python 3 packages list
<http://pypi.python.org/pypi?:action=browse&c=533&show=all>`_. You will know
you set the classifier properly because your project page on the Cheeseshop
will show a Python 3 logo in the upper-left corner of the page.

Three, the six_ project provides a library which helps iron out differences
between Python 2 & 3. If you find there is a sticky point that is a continual
point of contention in your translation or maintenance of code, consider using
a source-compatible solution relying on six. If you have to create your own
Python 2/3 compatible solution, you can use ``sys.version_info[0] >= 3`` as a
guard.
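
As a minimal sketch of such a guard, the snippet below picks the text and
bytes types once and reuses them. The names ``PY3``, ``text_type``,
``binary_type`` and ``is_text`` are illustrative choices for this example, not
something this guide prescribes.

```python
import sys

# True when running under Python 3 or newer.
PY3 = sys.version_info[0] >= 3

if PY3:
    text_type = str          # Python 3: str holds text
    binary_type = bytes
else:
    text_type = unicode      # Python 2: unicode holds text  # noqa: F821
    binary_type = str        # Python 2: str holds bytes


def is_text(value):
    """Return True if value is a text string on the running interpreter."""
    return isinstance(value, text_type)
```

Keeping the guard in one small compatibility module means the rest of the
codebase can import the chosen names instead of repeating the version check.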

Four, read all the approaches. Just because a bit of advice applies to one
approach more than another doesn't mean that it doesn't apply to the other
strategies. This is especially true of whether you decide to use 2to3 or be
source-compatible; tips for one approach almost always apply to the other.

Five, drop support for older Python versions if possible. `Python 2.5`_
introduced a lot of useful syntax and libraries which have become idiomatic
in Python 3. `Python 2.6`_ introduced future statements which make
compatibility much easier if you are going from Python 2 to 3.
`Python 2.7`_ continues the trend in the stdlib. So choose the newest version
of Python which you believe can be your minimum supported version
and work from there.

Six, target the newest version of Python 3 that you can. Beyond just the usual
bugfixes, compatibility has continued to improve between Python 2 and 3 as time
has passed. This is especially true for Python 3.3 where the ``u`` prefix for
strings is allowed, making source-compatible Python code easier.

Seven, make sure to look at the `Other Resources`_ for tips from other people
which may help you out.


.. _tox: http://codespeak.net/tox/
.. _Cheeseshop:
.. _PyPI: http://pypi.python.org/
.. _six: http://packages.python.org/six
.. _Python 2.7: http://www.python.org/2.7.x
.. _Python 2.6: http://www.python.org/2.6.x
.. _Python 2.5: http://www.python.org/2.5.x
.. _Python 2.4: http://www.python.org/2.4.x
.. _Python 2.3: http://www.python.org/2.3.x
.. _Python 2.2: http://www.python.org/2.2.x


.. _use_3to2:

Python 3 and 3to2
=================

If you are starting a new project or your codebase is small enough, you may
want to consider writing your code for Python 3 and backporting to Python 2
using 3to2_. Thanks to Python 3 being more strict about things than Python 2
(e.g., bytes vs. strings), the source translation can be easier and more
straightforward than from Python 2 to 3. Plus it gives you more direct
experience developing in Python 3 which, since it is the future of Python, is a
good thing long-term.

A drawback of this approach is that 3to2 is a third-party project. This means
that the Python core developers (and thus this guide) can make no promises
about how well 3to2 works at any time. There is nothing to suggest, though,
that 3to2 is not a high-quality project.


.. _3to2: https://bitbucket.org/amentajo/lib3to2/overview


.. _use_2to3:

Python 2 and 2to3
=================

Included with Python since 2.6, the 2to3_ tool (and :mod:`lib2to3` module)
helps with porting Python 2 to Python 3 by performing various source
translations. This is a perfect solution for projects which wish to branch
their Python 3 code from their Python 2 codebase and maintain them as
independent codebases. You can even begin preparing to use this approach
today by writing future-compatible Python code which works cleanly in
Python 2 in conjunction with 2to3; all steps outlined below will work
with Python 2 code up to the point when the actual use of 2to3 occurs.

Use of 2to3 as an on-demand translation step at install time is also possible,
preventing the need to maintain a separate Python 3 codebase, but this approach
does come with some drawbacks. While users will only have to pay the
translation cost once at installation, you as a developer will need to pay the
cost regularly during development. If your codebase is sufficiently large
then the translation step ends up acting like a compilation step,
robbing you of the rapid development process you are used to with Python.
Obviously the time required to translate a project will vary, so do an
experimental translation just to see how long it takes to evaluate whether you
prefer this approach compared to using :ref:`use_same_source` or simply keeping
a separate Python 3 codebase.

Below are the typical steps taken by a project which tries to support
Python 2 & 3 while keeping the code directly executable by Python 2.


Support Python 2.7
------------------

As a first step, make sure that your project is compatible with `Python 2.7`_.
This is just good to do as Python 2.7 is the last release of Python 2 and thus
will be used for a rather long time. It also allows for use of the ``-3`` flag
to Python to help discover places in your code which 2to3 cannot handle but are
known to cause issues.

Try to Support `Python 2.6`_ and Newer Only
-------------------------------------------

While not possible for all projects, if you can support `Python 2.6`_ and newer
**only**, your life will be much easier. Various future statements, stdlib
additions, etc. exist only in Python 2.6 and later which greatly assist in
porting to Python 3. But if your project must keep support for `Python 2.5`_ (or
even `Python 2.4`_) then it is still possible to port to Python 3.

Below are the benefits you gain if you only have to support Python 2.6 and
newer. Some of these options are personal choice while others are
**strongly** recommended (the ones that are more for personal choice are
labeled as such). If you continue to support older versions of Python then you
at least need to watch out for situations that these solutions fix.


``from __future__ import print_function``
'''''''''''''''''''''''''''''''''''''''''

This is a personal choice. 2to3 handles the translation from the print
statement to the print function rather well so this is an optional step. This
future statement does help, though, with getting used to typing
``print('Hello, World')`` instead of ``print 'Hello, World'``.
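
A small sketch of what the future statement buys you is given below. The
function ``greet`` and its ``stream`` parameter are hypothetical names used
only for illustration; the point is that keyword arguments such as ``file``
behave the same in Python 2.6+ and Python 3.

```python
from __future__ import print_function


def greet(name, stream=None):
    """Print a greeting to stream (sys.stdout when stream is None)."""
    # With the future statement, print is a function in Python 2.6+ too,
    # so keyword arguments such as file= are accepted.
    print('Hello,', name, file=stream)


greet('World')
```

Without the future statement, Python 2 would parse the ``print(...)`` line as
a print statement and reject the ``file=`` keyword as a syntax error.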


``from __future__ import unicode_literals``
'''''''''''''''''''''''''''''''''''''''''''

Another personal choice. You can always mark what you want to be a (unicode)
string with a ``u`` prefix to get the same effect. But regardless of whether
you use this future statement or not, you **must** make sure you know exactly
which Python 2 strings you want to be bytes, and which are to be strings. This
means you should, **at minimum**, mark all strings that are meant to be text
strings with a ``u`` prefix if you do not use this future statement. Python 3.3
allows strings to continue to have the ``u`` prefix (it's a no-op in that case)
to make it easier for code to be source-compatible between Python 2 & 3.


Bytes literals
''''''''''''''

This is a **very** important one. The ability to prefix Python 2 strings that
are meant to contain bytes with a ``b`` prefix helps to very clearly delineate
what is and is not a Python 3 string. When you run 2to3 on code, all Python 2
strings become Python 3 strings **unless** they are prefixed with ``b``.

This point cannot be stressed enough: make sure you know what all of your string
literals in Python 2 are meant to become in Python 3. Any string literal that
should be treated as bytes should have the ``b`` prefix. Any string literal
that should be Unicode/text in Python 2 should either have the ``u`` prefix
(supported, but ignored, in Python 3.3 and later) or you should have
``from __future__ import unicode_literals`` at the top of the file. But the key
point is you should know how Python 3 will treat every one of your string
literals and you should mark them as appropriate.

There are some differences between byte literals in Python 2 and those in
Python 3 thanks to the bytes type just being an alias to ``str`` in Python 2.
Probably the biggest "gotcha" is that indexing results in different values. In
Python 2, the value of ``b'py'[1]`` is ``'y'``, while in Python 3 it's ``121``.
You can avoid this disparity by always slicing at the size of a single element:
``b'py'[1:2]`` is ``'y'`` in Python 2 and ``b'y'`` in Python 3 (i.e., close
enough).

You cannot concatenate bytes and strings in Python 3. But since Python
2 has bytes aliased to ``str``, it will succeed: ``b'a' + u'b'`` works in
Python 2, but ``b'a' + 'b'`` in Python 3 is a :exc:`TypeError`. A similar issue
also comes about when doing comparisons between bytes and strings.
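
One way to sidestep the mixing problem is to decode explicitly before
combining values. The following is a rough sketch, intended to run on Python
2.6+ and Python 3.3+; ``join_text`` is a hypothetical helper and the UTF-8
default is an assumption made for this example.

```python
def join_text(prefix, suffix, encoding='utf-8'):
    """Concatenate two values as text, decoding any bytes input first.

    The UTF-8 default is only an assumption for this example; use the
    encoding your data is actually in.
    """
    if isinstance(prefix, bytes):
        prefix = prefix.decode(encoding)
    if isinstance(suffix, bytes):
        suffix = suffix.decode(encoding)
    return prefix + suffix


# Implicit mixing only works on Python 2.
try:
    b'a' + u'b'
    mixed_ok = True       # reached on Python 2 only
except TypeError:
    mixed_ok = False      # Python 3

# Explicit decoding behaves the same on both versions.
result = join_text(b'a', u'b')
```

Doing the conversion in one place also documents which encoding the code
expects, something the implicit Python 2 coercion never made visible.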


Supporting `Python 2.5`_ and Newer Only
---------------------------------------

If you are supporting `Python 2.5`_ and newer there are still some features of
Python that you can utilize.


``from __future__ import absolute_import``
''''''''''''''''''''''''''''''''''''''''''

Implicit relative imports (e.g., importing ``spam.bacon`` from within
``spam.eggs`` with the statement ``import bacon``) do not work in Python 3.
This future statement moves away from that and allows the use of explicit
relative imports (e.g., ``from . import bacon``).

In `Python 2.5`_ you must use
the ``__future__`` statement to get to use explicit relative imports and prevent
implicit ones. In `Python 2.6`_ explicit relative imports are available without
the statement, but you still want the ``__future__`` statement to prevent implicit
relative imports. In `Python 2.7`_ the ``__future__`` statement is not needed. In
other words, unless you are only supporting Python 2.7 or a version earlier
than Python 2.5, use the ``__future__`` statement.


Mark all Unicode strings with a ``u`` prefix
'''''''''''''''''''''''''''''''''''''''''''''

While Python 2.6 has a ``__future__`` statement to automatically cause Python 2
to treat all string literals as Unicode, Python 2.5 does not have that shortcut.
This means you should go through and mark all string literals with a ``u``
prefix to turn them explicitly into Unicode strings where appropriate. That
leaves all unmarked string literals to be considered byte literals in Python 3.



Handle Common "Gotchas"
-----------------------

A few things consistently come up as sticking points for people: things which
2to3 cannot handle automatically, or which can easily be done in Python 2
to help modernize your code.


``from __future__ import division``
'''''''''''''''''''''''''''''''''''

While the exact same outcome can be had by using the ``-Qnew`` argument to
Python, using this future statement lifts the requirement that your users use
the flag to get the expected behavior of division in Python 3
(e.g., ``1/2 == 0.5; 1//2 == 0``).
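
To illustrate, here is a short sketch using two hypothetical helper functions
(``mean`` and ``midpoint_index``). With the future statement in place, ``/``
always performs true division and ``//`` is used wherever an integer result is
actually wanted.

```python
from __future__ import division


def mean(values):
    """Arithmetic mean; true division is used even for integer inputs."""
    return sum(values) / len(values)


def midpoint_index(length):
    """Index of the middle element; floor division keeps the result an int."""
    return length // 2
```

Auditing each ``/`` in existing code and deciding which of the two operators
was intended is usually the bulk of the work for this step.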



Specify when opening a file as binary
'''''''''''''''''''''''''''''''''''''

Unless you have been working on Windows, there is a chance you have not always
bothered to add the ``b`` mode when opening a binary file (e.g., ``rb`` for
binary reading). Under Python 3, binary files and text files are clearly
distinct and mutually incompatible; see the :mod:`io` module for details.
Therefore, you **must** make a decision of whether a file will be used for
binary access (allowing you to read and/or write bytes data) or text access
(allowing you to read and/or write unicode data).

Text files
''''''''''

Text files opened using ``open()`` under Python 2 return byte strings,
while under Python 3 they return unicode strings. Depending on your porting
strategy, this can be an issue.

If you want text files to return unicode strings in Python 2, you have two
possibilities:

* Under Python 2.6 and higher, use :func:`io.open`. Since :func:`io.open`
  is essentially the same function in both Python 2 and Python 3, it will
  help iron out any issues that might arise.

* If pre-2.6 compatibility is needed, then you should use :func:`codecs.open`
  instead. This will make sure that you get back unicode strings in Python 2.
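
The first option can be sketched as follows. The temporary file and the UTF-8
encoding are assumptions made only to keep the example self-contained; the
code is intended to run on Python 2.6+ and Python 3.3+.

```python
import io
import os
import tempfile

# A temporary path is used only so the example cleans up after itself.
fd, path = tempfile.mkstemp(suffix='.txt')
os.close(fd)

try:
    # Text mode: pass an explicit encoding and write unicode data.
    with io.open(path, 'w', encoding='utf-8') as f:
        f.write(u'caf\xe9')

    # Text mode hands back unicode strings on both Python 2 and 3.
    with io.open(path, 'r', encoding='utf-8') as f:
        text = f.read()

    # Binary mode hands back the raw, still-encoded bytes.
    with io.open(path, 'rb') as f:
        raw = f.read()
finally:
    os.remove(path)
```

Here ``text`` is the four-character unicode string that was written, while
``raw`` holds its five-byte UTF-8 encoding.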

Subclass ``object``
'''''''''''''''''''

New-style classes have been around since `Python 2.2`_. You need to make sure
you are subclassing from ``object`` to avoid odd edge cases involving method
resolution order, etc. This continues to be totally valid in Python 3 (although
unneeded as all classes implicitly inherit from ``object``).


Deal With the Bytes/String Dichotomy
''''''''''''''''''''''''''''''''''''

One of the biggest issues people have when porting code to Python 3 is handling
the bytes/string dichotomy. Because Python 2 allowed the ``str`` type to hold
textual data, people have over the years been rather loose in their delineation
of what ``str`` instances held text compared to bytes. In Python 3 you cannot
be so care-free anymore and need to properly handle the difference. The key to
handling this issue is to make sure that **every** string literal in your
Python 2 code is either syntactically or functionally marked as either bytes or
text data. After this is done you then need to make sure your APIs are designed
to either handle a specific type or made to be properly polymorphic.


Mark Up Python 2 String Literals
********************************

The first thing you must do is designate every single string literal in Python 2
as either textual or bytes data. If you are only supporting Python 2.6 or
newer, this can be accomplished by marking bytes literals with a ``b`` prefix
and then designating textual data with a ``u`` prefix or using the
``unicode_literals`` future statement.

If your project supports versions of Python predating 2.6, then you should use
the six_ project and its ``b()`` function to denote bytes literals. For text
literals you can either use six's ``u()`` function or use a ``u`` prefix.
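
To show the idea of functionally marking a literal, the sketch below defines a
simplified stand-in for a ``b()`` helper. It is not six itself, and it assumes
the literal only contains characters in the Latin-1 range; in real code you
would import the helper from six rather than define your own.

```python
import sys

if sys.version_info[0] >= 3:
    def b(literal):
        """Turn a native string literal into bytes (Python 3)."""
        # latin-1 maps code points 0-255 straight onto byte values.
        return literal.encode('latin-1')
else:
    def b(literal):
        """Native string literals already are bytes in Python 2."""
        return literal


MAGIC = b('\x89PNG')      # a bytes object on both Python 2 and Python 3
```

Because the call is plain function syntax, it also parses on Python versions
that predate the ``b`` prefix.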


Decide what APIs Will Accept
****************************

In Python 2 it was very easy to accidentally create an API that accepted both
bytes and textual data. But in Python 3, thanks to the more strict handling of
disparate types, this loose usage of bytes and text together tends to fail.

Take the dict ``{b'a': 'bytes', u'a': 'text'}`` in Python 2.6. Because
``b'a' == u'a'``, the two keys collide and only one entry survives,
``{'a': 'text'}``: the bytes value has been silently lost. In Python 3 the
equivalent dict creates ``{b'a': 'bytes', 'a': 'text'}``, i.e., no lost data.
Similar issues can crop up when transitioning Python 2 code to Python 3.

This means you need to choose what an API is going to accept and produce, and
consistently stick to that choice in both Python 2 and 3.
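
One possible way to pin such a decision down is to normalize input at the API
boundary. In this sketch the ``Registry`` class and the ``normalize_key``
helper are hypothetical, and decoding with UTF-8 is an assumption; the design
choice shown is "keys are always stored as text".

```python
def normalize_key(key, encoding='utf-8'):
    """Return key as text, decoding bytes input at the API boundary.

    Hypothetical helper for this example; UTF-8 is an assumed default.
    """
    if isinstance(key, bytes):
        return key.decode(encoding)
    return key


class Registry(object):
    """Toy mapping whose keys are always stored as text."""

    def __init__(self):
        self._items = {}

    def set(self, key, value):
        self._items[normalize_key(key)] = value

    def get(self, key):
        return self._items[normalize_key(key)]


registry = Registry()
registry.set(b'a', 'from bytes')
registry.set(u'a', 'from text')      # same logical key on both versions
```

With the normalization in place the registry holds a single entry under both
Python 2 and Python 3, so the behaviour no longer depends on how each version
compares bytes with text.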


Bytes / Unicode Comparison
**************************

In Python 3, mixing bytes and unicode is forbidden in most situations; it
will raise a :class:`TypeError` where Python 2 would have attempted an implicit
coercion between types. However, there is one case where it doesn't and
it can be very misleading::

   >>> b"" == ""
   False

This is because an equality comparison is required by the language to always
succeed (and return ``False`` for incompatible types). However, this also
means that code incorrectly ported to Python 3 can display buggy behaviour
if such comparisons are silently executed. To detect such situations,
Python 3 has a ``-b`` flag that will display a warning::

   $ python3 -b
   >>> b"" == ""
   __main__:1: BytesWarning: Comparison between bytes and string
   False

To turn the warning into an exception, use the ``-bb`` flag instead::

   $ python3 -bb
   >>> b"" == ""
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   BytesWarning: Comparison between bytes and string


Indexing bytes objects
''''''''''''''''''''''

Another potentially surprising change is the indexing behaviour of bytes
objects in Python 3::

   >>> b"xyz"[0]
   120

Indeed, Python 3 bytes objects (as well as :class:`bytearray` objects)
are sequences of integers. But code converted from Python 2 will often
assume that indexing a bytestring produces another bytestring, not an
integer. To reconcile both behaviours, use slicing::

   >>> b"xyz"[0:1]
   b'x'
   >>> n = 1
   >>> b"xyz"[n:n+1]
   b'y'

The only remaining gotcha is that an out-of-bounds slice returns an empty
bytes object instead of raising ``IndexError``::

   >>> b"xyz"[3]
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   IndexError: index out of range
   >>> b"xyz"[3:4]
   b''
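
If the ``IndexError`` matters to your code, the slicing trick can be wrapped
in a small helper that restores it. The function name ``byte_at`` is
hypothetical; this is a sketch of one possible approach rather than an
established idiom.

```python
def byte_at(data, index):
    """Return the single byte at index as a length-1 bytes object.

    Slicing yields bytes on both Python 2 and 3 but never raises for an
    out-of-range index, so that case is checked explicitly.
    """
    if index < 0:
        index += len(data)           # mirror normal negative indexing
    chunk = data[index:index + 1]
    if index < 0 or len(chunk) != 1:
        raise IndexError('index out of range')
    return chunk
```

For example, ``byte_at(b"xyz", 1)`` gives ``b'y'`` on either version, while
``byte_at(b"xyz", 3)`` raises ``IndexError`` just as plain indexing would.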


``__str__()``/``__unicode__()``
'''''''''''''''''''''''''''''''

In Python 2, objects can specify both a string and unicode representation of
themselves. In Python 3, though, there is only a string representation. This
becomes an issue as people can inadvertently do things in their ``__str__()``
methods which have unpredictable results (e.g., infinite recursion if you
happen to use the ``unicode(self).encode('utf8')`` idiom as the body of your
``__str__()`` method).

There are two ways to solve this issue. One is to use a custom 2to3 fixer. The
blog post at http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/
specifies how to do this. That will allow 2to3 to change all instances of
``def __unicode__(self): ...`` to ``def __str__(self): ...``. This does require
that you define your ``__str__()`` method in Python 2 before your
``__unicode__()`` method.

The other option is to use a mixin class. This allows you to only define a
``__unicode__()`` method for your class and let the mixin derive
``__str__()`` for you (code from
http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/)::

   import sys

   class UnicodeMixin(object):

       """Mixin class to handle defining the proper __str__/__unicode__
       methods in Python 2 or 3."""

       if sys.version_info[0] >= 3:  # Python 3
           def __str__(self):
               return self.__unicode__()
       else:  # Python 2
           def __str__(self):
               return self.__unicode__().encode('utf8')


   class Spam(UnicodeMixin):

       def __unicode__(self):
           return u'spam-spam-bacon-spam'  # 2to3 will remove the 'u' prefix


Don't Index on Exceptions
'''''''''''''''''''''''''

In Python 2, the following worked::

   >>> exc = Exception(1, 2, 3)
   >>> exc.args[1]
   2
   >>> exc[1]  # Python 2 only!
   2

But in Python 3, indexing directly on an exception is an error. You need to
make sure to only index on the :attr:`BaseException.args` attribute which is a
sequence containing all arguments passed to the :meth:`__init__` method.

Even better is to use the documented attributes the exception provides.
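
Both styles are sketched below. The scratch directory exists only to provoke
an :exc:`OSError` safely, and the variable names are illustrative; the
``except ... as`` syntax assumes Python 2.6 or newer.

```python
import errno
import os
import tempfile

exc = ValueError('bad value', 42)

# Portable: go through .args rather than indexing the exception itself.
message, code = exc.args[0], exc.args[1]

# Better still: use the attributes the exception type documents.
scratch = tempfile.mkdtemp()
try:
    os.stat(os.path.join(scratch, 'missing'))
except OSError as err:
    missing = (err.errno == errno.ENOENT)
finally:
    os.rmdir(scratch)
```

Reading ``err.errno`` states the intent directly, whereas a positional lookup
such as ``err.args[0]`` relies on the order of the constructor arguments.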

Don't use ``__getslice__`` & Friends
''''''''''''''''''''''''''''''''''''

``__getslice__()`` and its relatives have been deprecated for a while, and
Python 3 finally drops support for them. Move completely over to
:meth:`__getitem__` and friends.
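
A minimal sketch of the replacement pattern is shown below, using a
hypothetical ``Deck`` class. Slices reach :meth:`__getitem__` as ``slice``
objects, so a single method can serve both indexing and slicing.

```python
class Deck(object):
    """Sequence wrapper that supports indexing and slicing via __getitem__."""

    def __init__(self, cards):
        self._cards = list(cards)

    def __len__(self):
        return len(self._cards)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slices arrive here as slice objects instead of going
            # through __getslice__().
            return Deck(self._cards[index])
        return self._cards[index]


deck = Deck(['ace', 'king', 'queen', 'jack'])
top_two = deck[:2]           # a new Deck holding 'ace' and 'king'
```

The same approach extends to :meth:`__setitem__` and :meth:`__delitem__`,
which receive ``slice`` objects in the same way.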


Updating doctests
'''''''''''''''''

2to3_ will attempt to generate fixes for doctests that it comes across. It's
not perfect, though. If you wrote a monolithic set of doctests (e.g., a single
docstring containing all of your doctests), you should at least consider
breaking the doctests up into smaller pieces to make it more manageable to fix.
Otherwise it might very well be worth your time and effort to port your tests
to :mod:`unittest`.


Update ``map`` for imbalanced input sequences
'''''''''''''''''''''''''''''''''''''''''''''

With Python 2, ``map`` would pad input sequences of unequal length with
``None`` values, returning a sequence as long as the longest input sequence.

With Python 3, if the input sequences to ``map`` are of unequal length, ``map``
will stop at the termination of the shortest of the sequences. For full
compatibility with ``map`` from Python 2.x, wrap the sequences in
:func:`itertools.zip_longest` and unpack each resulting tuple with
:func:`itertools.starmap`, e.g. ``map(func, *sequences)`` becomes
``list(itertools.starmap(func, itertools.zip_longest(*sequences)))``.
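
The padding behaviour can be reproduced with a small helper. In this sketch
``padded_map`` and ``describe`` are hypothetical names; the ``try``/``except``
covers the fact that Python 2.6+ spells the function
``itertools.izip_longest``.

```python
import itertools

try:                                    # Python 3
    zip_longest = itertools.zip_longest
except AttributeError:                  # Python 2.6+ calls it izip_longest
    zip_longest = itertools.izip_longest


def padded_map(func, *sequences):
    """Apply func across sequences, padding the shorter ones with None."""
    # zip_longest yields one tuple per position; starmap unpacks each
    # tuple into separate arguments, as Python 2's map() would.
    return list(itertools.starmap(func, zip_longest(*sequences)))


def describe(a, b):
    return (a, b)


pairs = padded_map(describe, [1, 2, 3], ['x'])
```

Here ``pairs`` has three items, with ``None`` standing in for the missing
values of the shorter sequence, whereas a plain Python 3 ``map`` call would
stop after the first item.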
549
Brett Cannon8045d972011-02-03 22:01:54 +0000550Eliminate ``-3`` Warnings
551-------------------------
Georg Brandl2cb2fa92011-02-07 15:30:45 +0000552
Brett Cannon8045d972011-02-03 22:01:54 +0000553When you run your application's test suite, run it using the ``-3`` flag passed
554to Python. This will cause various warnings to be raised during execution about
555things that 2to3 cannot handle automatically (e.g., modules that have been
556removed). Try to eliminate those warnings to make your code even more portable
557to Python 3.
558
559
560Run 2to3
561--------
Georg Brandl2cb2fa92011-02-07 15:30:45 +0000562
Brett Cannon8045d972011-02-03 22:01:54 +0000563Once you have made your Python 2 code future-compatible with Python 3, it's
564time to use 2to3_ to actually port your code.
565
566
567Manually
568''''''''
Georg Brandl2cb2fa92011-02-07 15:30:45 +0000569
Brett Cannon8045d972011-02-03 22:01:54 +0000570To manually convert source code using 2to3_, you use the ``2to3`` script that
571is installed with Python 2.6 and later.::
572
Brett Cannon4b0c24a2011-02-03 22:14:58 +0000573 2to3 <directory or file to convert>
Brett Cannon8045d972011-02-03 22:01:54 +0000574
575This will cause 2to3 to write out a diff with all of the fixers applied for the
576converted source code. If you would like 2to3 to go ahead and apply the changes
577you can pass it the ``-w`` flag::
578
Brett Cannon4b0c24a2011-02-03 22:14:58 +0000579 2to3 -w <stuff to convert>
Brett Cannon8045d972011-02-03 22:01:54 +0000580
581There are other flags available to control exactly which fixers are applied,
582etc.
583
584
585During Installation
'''''''''''''''''''

When a user installs your project for Python 3, you can have either
:mod:`distutils` or Distribute_ run 2to3_ on your behalf.
For distutils, use the following idiom::

    from distutils.core import setup

    try:  # Python 3
        from distutils.command.build_py import build_py_2to3 as build_py
    except ImportError:  # Python 2
        from distutils.command.build_py import build_py

    setup(cmdclass={'build_py': build_py},
          # ...
          )

For Distribute::

    from setuptools import setup  # Distribute provides the setuptools package

    setup(use_2to3=True,
          # ...
          )

This spares you from distributing a separate Python 3 version of your project.
It does require, though, that during development you at least build your
project and test against the built Python 3 source.


Verify & Test
-------------

At this point you should (hopefully) have your project converted in such a way
that it works in Python 3. Verify it by running your unit tests and making sure
nothing has gone awry. If you missed something, figure out how to fix it in
Python 3, backport the fix to your Python 2 code, and run your code through
2to3 again to verify that the fix transforms properly.


.. _2to3: http://docs.python.org/py3k/library/2to3.html
.. _Distribute: http://packages.python.org/distribute/


.. _use_same_source:

Python 2/3 Compatible Source
============================

While it may seem counter-intuitive, you can write Python code which is
source-compatible between Python 2 & 3. It does lead to code that is not
entirely idiomatic Python (e.g., having to extract the currently raised
exception from ``sys.exc_info()[1]``), but it can be run under Python 2
**and** Python 3 without using 2to3_ as a translation step (although the tool
should be used to help find potential portability problems). This allows you to
continue to have a rapid development process regardless of whether you are
developing under Python 2 or Python 3. Whether this approach or using
:ref:`use_2to3` works best for you will be a per-project decision.

To get a complete idea of what issues you will need to deal with, see the
`What's New in Python 3.0`_. Others have reorganized the data in other formats
such as http://docs.pythonsprints.com/python3_porting/py-porting.html .

The following are some steps to take to try to support both Python 2 & 3 from
the same source code.


.. _What's New in Python 3.0: http://docs.python.org/release/3.0/whatsnew/3.0.html


Follow The Steps for Using 2to3_
--------------------------------

All of the steps outlined in how to
:ref:`port Python 2 code with 2to3 <use_2to3>` apply
to creating a Python 2/3 codebase. This includes trying to support only
Python 2.6 or newer (the :mod:`__future__` statements work in Python 3 without
issue), eliminating warnings that are triggered by ``-3``, etc.

You should even consider running 2to3_ over your code (without committing the
changes). This will let you know where potential pain points are within your
code so that you can fix them properly before they become an issue.


Use six_
--------

The six_ project contains many things to help you write portable Python code.
You should make sure to read its documentation from beginning to end and use
any and all features it provides. That way you will minimize any mistakes you
might make in writing cross-version code.


Capturing the Currently Raised Exception
----------------------------------------

One change between Python 2 and 3 that will require changing how you code (if
you support `Python 2.5`_ and earlier) is
accessing the currently raised exception. In Python 2.5 and earlier the syntax
to access the current exception is::

    try:
        raise Exception()
    except Exception, exc:
        # Current exception is 'exc'
        pass

This syntax changed in Python 3 (and was backported to `Python 2.6`_ and
later) to::

    try:
        raise Exception()
    except Exception as exc:
        # Current exception is 'exc'
        # In Python 3, 'exc' is restricted to the block; Python 2.6 will "leak" it
        pass

Because of this syntax change, code that must also run on Python 2.5 or
earlier has to capture the current exception like this::

    import sys

    try:
        raise Exception()
    except Exception:
        exc = sys.exc_info()[1]
        # Current exception is 'exc'
        pass

You can get more information about the raised exception from
:func:`sys.exc_info` than simply the current exception instance, but you most
likely don't need it.
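
As a small, self-contained illustration of the ``sys.exc_info()[1]`` idiom, the
sketch below wraps it in a function. ``parse_port`` is a hypothetical helper
invented for this example; the point is only that the ``except`` clause uses
neither the comma nor the ``as`` form, so the same source parses on every
version discussed here.

```python
import sys


def parse_port(text):
    """Return (port, error); exactly one of the two is None."""
    try:
        return int(text), None
    except ValueError:
        # No comma or 'as' syntax, so this parses on Python 2.5 and Python 3.
        exc = sys.exc_info()[1]
        return None, exc


port, error = parse_port("8080")
bad_port, bad_error = parse_port("http")
```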

.. note::
   In Python 3, the traceback is attached to the exception instance
   through the ``__traceback__`` attribute. If the instance is saved in
   a local variable that persists outside of the ``except`` block, the
   traceback will create a reference cycle with the current frame and its
   dictionary of local variables. This will delay reclaiming dead
   resources until the next cyclic :term:`garbage collection` pass.

   In Python 2, this problem only occurs if you save the traceback itself
   (e.g. the third element of the tuple returned by :func:`sys.exc_info`)
   in a variable.
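
The scoping difference and the traceback reference can be observed with a
sketch like the following. It uses the ``as`` form, so it assumes Python 2.6 or
newer; ``capture`` is a hypothetical function written for this illustration.
Clearing ``__traceback__`` at the end is one way to break the cycle on
Python 3 when the exception has to be kept around.

```python
def capture():
    saved = None
    try:
        raise ValueError("boom")
    except ValueError as exc:
        saved = exc  # explicit reference that outlives the except block
    try:
        exc  # unbound here on Python 3, still bound on Python 2.6/2.7
    except NameError:
        leaked = False
    else:
        leaked = True
    return saved, leaked


saved, leaked = capture()
# Python 3 attaches the traceback to the instance; Python 2 does not.
has_traceback = getattr(saved, "__traceback__", None) is not None
# Dropping the traceback removes the reference to the frame.
saved.__traceback__ = None
```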


Other Resources
===============

The authors of the following blog posts, wiki pages, and books deserve special
thanks for making public their tips for porting Python 2 code to Python 3 (and
thus helping provide information for this document):

* http://python3porting.com/
* http://docs.pythonsprints.com/python3_porting/py-porting.html
* http://techspot.zzzeek.org/2011/01/24/zzzeek-s-guide-to-python-3-porting/
* http://dabeaz.blogspot.com/2011/01/porting-py65-and-my-superboard-to.html
* http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/
* http://lucumr.pocoo.org/2010/2/11/porting-to-python-3-a-guide/
* http://wiki.python.org/moin/PortingPythonToPy3k
* https://wiki.ubuntu.com/Python/3

If you feel there is something missing from this document that should be added,
please email the python-porting_ mailing list.

.. _python-porting: http://mail.python.org/mailman/listinfo/python-porting