.. _pyporting-howto:

*********************************
Porting Python 2 Code to Python 3
*********************************

:author: Brett Cannon

.. topic:: Abstract

   With Python 3 being the future of Python while Python 2 is still in active
   use, it is good to have your project available for both major releases of
   Python. This guide is meant to help you figure out how best to support both
   Python 2 & 3 simultaneously.

   If you are looking to port an extension module instead of pure Python code,
   please see :ref:`cporting-howto`.

   If you would like to read one core Python developer's take on why Python 3
   came into existence, you can read Nick Coghlan's `Python 3 Q & A`_.

   If you prefer to read a (free) book on porting a project to Python 3,
   consider reading `Porting to Python 3`_ by Lennart Regebro which should cover
   much of what is discussed in this HOWTO.

   For help with porting, you can email the python-porting_ mailing list with
   questions.

The Short Version
=================

* Decide which is the oldest version of Python 2 you want to support (if any)
* Make sure you have a thorough test suite and use continuous integration
  testing to make sure you stay compatible with the versions of Python you care
  about
* If you have dependencies, check their Python 3 status using caniusepython3
  (`command-line tool <https://pypi.python.org/pypi/caniusepython3>`__,
  `web app <https://caniusepython3.com/>`__)

With that done, your options are:

* If you are dropping Python 2 support, use 2to3_ to port to Python 3
* If you are keeping Python 2 support, then start writing Python 2/3-compatible
  code starting **TODAY**

  + If you have dependencies that have not been ported, reach out to their
    maintainers about porting while working to make your own code compatible
    with Python 3, so you're ready when your dependencies are all ported
  + If all your dependencies have been ported (or you have none), go ahead and
    port to Python 3

* If you are creating a new project that wants to have 2/3 compatibility,
  code in Python 3 and then backport to Python 2


Before You Begin
================

If your project is on the Cheeseshop_/PyPI_, make sure it has the proper
`trove classifiers`_ to signify what versions of Python it **currently**
supports. At minimum you should specify the major version(s), e.g.
``Programming Language :: Python :: 2`` if your project currently only supports
Python 2. It is preferable that you be as specific as possible by listing every
major/minor version of Python that you support, e.g. if your project supports
Python 2.6 and 2.7, then you want the classifiers of::

   Programming Language :: Python :: 2
   Programming Language :: Python :: 2.6
   Programming Language :: Python :: 2.7

Once your project supports Python 3 you will want to go back and add the
appropriate classifiers for Python 3 as well. This is important as setting the
``Programming Language :: Python :: 3`` classifier will lead to your project
being listed under the `Python 3 Packages`_ section of PyPI.

Make sure you have a robust test suite. You need to
make sure everything continues to work, just like when you support a new
minor/feature release of Python. This means making sure your test suite is
thorough and is ported properly between Python 2 & 3 (consider using coverage_
to measure that you have effective test coverage). You will also most likely
want to use something like tox_ to automate testing between all of your
supported versions of Python. You will also want to **port your tests first** so
that you can make sure that you detect breakage during the transition. Tests also
tend to be simpler than the code they are testing, so porting them first gives
you an idea of how easy it can be to port code.

Drop support for older Python versions if possible. `Python 2.5`_
introduced a lot of useful syntax and libraries which have become idiomatic
in Python 3. `Python 2.6`_ introduced future statements which make
compatibility much easier if you are going from Python 2 to 3.
`Python 2.7`_ continues the trend in the stdlib. Choose the newest version
of Python which you believe can be your minimum supported version
and work from there.

Target the newest version of Python 3 that you can. Beyond just the usual
bugfixes, compatibility has continued to improve between Python 2 and 3 as time
has passed. E.g. Python 3.3 added back the ``u`` prefix for
strings, making source-compatible Python code easier to write.


Writing Source-Compatible Python 2/3 Code
=========================================

Over the years the Python community has discovered that the easiest way to
support both Python 2 and 3 in parallel is to write Python code that works in
either version. While this might sound counter-intuitive at first, it actually
is not difficult and typically only requires following some select
(non-idiomatic) practices and using some key projects to help make bridging
between Python 2 and 3 easier.

Projects to Consider
--------------------

The lowest level library for supporting Python 2 & 3 simultaneously is six_.
Reading through its documentation will give you an idea of where exactly the
Python language changed between versions 2 & 3 and thus what you will want the
library to help you continue to support.

To help automate porting your code over to using six, you can use
modernize_. This project will attempt to rewrite your code to be as modern as
possible while using six to smooth out any differences between Python 2 & 3.

If you want to write your compatible code to feel more like Python 3 there is
the future_ project. It tries to provide backports of objects from Python 3 so
that you can use them from Python 2-compatible code, e.g. replacing the
``bytes`` type from Python 2 with the one from Python 3.
It also provides a translation script like modernize (its translation code is
actually partially based on it) to help start working with a pre-existing code
base. It is unique in that its translation script will port Python 3
code backwards as well as Python 2 code forwards.


Tips & Tricks
-------------

To help with writing source-compatible code using one of the projects mentioned
in `Projects to Consider`_, consider following the suggestions below. Some of
them are handled by the suggested projects, so if you do use one of them then
read their documentation first to see which suggestions below will be taken care
of for you.

Support Python 2.7
//////////////////

As a first step, make sure that your project is compatible with `Python 2.7`_.
This is just good to do as Python 2.7 is the last release of Python 2 and thus
will be used for a rather long time. It also allows for use of the ``-3`` flag
to Python to help discover places in your code where compatibility might be an
issue (the ``-3`` flag is in Python 2.6 but Python 2.7 adds more warnings).

Try to Support `Python 2.6`_ and Newer Only
///////////////////////////////////////////

While not possible for all projects, if you can support `Python 2.6`_ and newer
**only**, your life will be much easier. Various future statements, stdlib
additions, etc. exist only in Python 2.6 and later which greatly assist in
supporting Python 3. But if your project must keep support for `Python 2.5`_ then
it is still possible to simultaneously support Python 3.

Below are the benefits you gain if you only have to support Python 2.6 and
newer. Some of these options are personal choice while others are
**strongly** recommended (the ones that are more for personal choice are
labeled as such). If you continue to support older versions of Python then you
at least need to watch out for situations that these solutions fix and handle
them appropriately (which is where library help from e.g. six_ comes in handy).


``from __future__ import print_function``
'''''''''''''''''''''''''''''''''''''''''

This future statement will not only get you used to typing ``print()`` as a
function instead of a statement, but it will also give you the various benefits
the function has over the Python 2 statement (six_ provides a function if you
support Python 2.5 or older).
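
A minimal sketch of what the function form allows. The ``report`` helper and
its arguments are hypothetical names used only for illustration; the ``sep``
and ``file`` keyword arguments are part of the built-in ``print()`` function:

```python
from __future__ import print_function

import io
import sys


def report(items, stream=None):
    """Write the items on one line, separated by ', ', to the stream."""
    if stream is None:
        stream = sys.stdout
    # sep and file are keyword arguments of the print() function.
    print(*items, sep=u", ", file=stream)


# Capture the output in memory instead of writing to the terminal.
buffer = io.StringIO()
report([u"spam", u"eggs"], stream=buffer)
```

Because ``print()`` is an ordinary function here, it can also be passed around
or swapped out in tests like any other callable.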


``from __future__ import unicode_literals``
'''''''''''''''''''''''''''''''''''''''''''

If you choose to use this future statement then all string literals in
Python 2 will be assumed to be Unicode (as is already the case in Python 3).
If you choose not to use this future statement then you should mark all of your
text strings with a ``u`` prefix and only support Python 3.3 or newer. But you
are **strongly** advised to do one or the other (six_ provides a function in
case you don't want to use the future statement **and** you want to support
Python 3.2 or older).
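
As a small illustration (the variable names are made up for this sketch), the
future statement makes unprefixed literals text on both versions, while bytes
still need an explicit ``b`` prefix:

```python
from __future__ import unicode_literals

# Derive the concrete types from literals so the code runs on 2.6+ and 3.
text_type = type("")      # unicode on Python 2, str on Python 3
binary_type = type(b"")   # str on Python 2, bytes on Python 3

greeting = "caf\u00e9"    # text: four characters
payload = b"caf\xc3\xa9"  # bytes: the UTF-8 encoding of the text above
```

Decoding ``payload`` as UTF-8 yields ``greeting``, which is the kind of
explicit conversion Python 3 expects at the boundary between bytes and text.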


Bytes/string literals
'''''''''''''''''''''

This is a **very** important one. Prefix Python 2 strings that
are meant to contain bytes with a ``b`` prefix to very clearly delineate
what is and is not a Python 3 text string (six_ provides a function to use for
Python 2.5 compatibility).

This point cannot be stressed enough: make sure you know what all of your string
literals in Python 2 are meant to be in Python 3. Any string literal that
should be treated as bytes should have the ``b`` prefix. Any string literal
that should be Unicode/text in Python 2 should either have the ``u`` prefix
(supported, but ignored, in Python 3.3 and later) or you should have
``from __future__ import unicode_literals`` at the top of the file. But the key
point is you should know how Python 3 will treat every one of your string
literals and you should mark them as appropriate.

There are some differences between byte literals in Python 2 and those in
Python 3 thanks to the bytes type just being an alias to ``str`` in Python 2.
See the `Handle Common "Gotchas"`_ section for what to watch out for.

``from __future__ import absolute_import``
''''''''''''''''''''''''''''''''''''''''''

Discussed in more detail below, but you should use this future statement to
prevent yourself from accidentally using implicit relative imports.


Supporting `Python 2.5`_ and Newer Only
///////////////////////////////////////

If you are supporting `Python 2.5`_ and newer there are still some features of
Python that you can utilize.


``from __future__ import absolute_import``
''''''''''''''''''''''''''''''''''''''''''

Implicit relative imports (e.g., importing ``spam.bacon`` from within
``spam.eggs`` with the statement ``import bacon``) do not work in Python 3.
This future statement moves away from that and allows the use of explicit
relative imports (e.g., ``from . import bacon``).

In `Python 2.5`_ you must use
the __future__ statement to get to use explicit relative imports and prevent
implicit ones. In `Python 2.6`_ explicit relative imports are available without
the statement, but you still want the __future__ statement to prevent implicit
relative imports. `Python 2.7`_ behaves like Python 2.6 here: implicit relative
imports stay enabled unless the __future__ statement is used. In other words,
unless you are supporting a version earlier than Python 2.5, use this
__future__ statement.


Mark all Unicode strings with a ``u`` prefix
'''''''''''''''''''''''''''''''''''''''''''''

While Python 2.6 has a ``__future__`` statement to automatically cause Python 2
to treat all string literals as Unicode, Python 2.5 does not have that shortcut.
This means you should go through and mark all string literals with a ``u``
prefix to turn them explicitly into text strings where appropriate and only
support Python 3.3 or newer. Otherwise use a project like six_ which provides a
function to pass all text string literals through.


Capturing the Currently Raised Exception
''''''''''''''''''''''''''''''''''''''''

In Python 2.5 and earlier the syntax to access the current exception is::

   try:
       raise Exception()
   except Exception, exc:
       # Current exception is 'exc'.
       pass

This syntax changed in Python 3 (and was backported to `Python 2.6`_ and later)
to::

   try:
       raise Exception()
   except Exception as exc:
       # Current exception is 'exc'.
       # In Python 3, 'exc' is restricted to the block; in Python 2.6/2.7 it will "leak".
       pass

Because of this syntax change, code that must run on both Python 2.5 (or
earlier) and Python 3 has to capture the current exception like this::

   try:
       raise Exception()
   except Exception:
       import sys
       exc = sys.exc_info()[1]
       # Current exception is 'exc'.
       pass

You can get more information about the raised exception from
:func:`sys.exc_info` than simply the current exception instance, but you most
likely don't need it.

.. note::
   In Python 3, the traceback is attached to the exception instance
   through the ``__traceback__`` attribute. If the instance is saved in
   a local variable that persists outside of the ``except`` block, the
   traceback will create a reference cycle with the current frame and its
   dictionary of local variables. This will delay reclaiming dead
   resources until the next cyclic :term:`garbage collection` pass.

   In Python 2, this problem only occurs if you save the traceback itself
   (e.g. the third element of the tuple returned by :func:`sys.exc_info`)
   in a variable.


Handle Common "Gotchas"
///////////////////////

These are non-syntactic things to watch out for no matter what version of
Python 2 you are supporting.


``from __future__ import division``
'''''''''''''''''''''''''''''''''''

While the exact same outcome can be had by using the ``-Qnew`` argument to
Python, using this future statement lifts the requirement that your users use
the flag to get the expected behavior of division in Python 3
(e.g., ``1/2 == 0.5; 1//2 == 0``).
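
A short sketch of the two operators under the future statement; ``mean`` and
``midpoint_index`` are hypothetical helpers chosen only to show where each
operator fits:

```python
from __future__ import division


def mean(values):
    """Return the arithmetic mean; / is true division on both versions."""
    return sum(values) / len(values)


def midpoint_index(length):
    """Return the middle index; // is floor division on both versions."""
    return length // 2
```

Code that relied on ``/`` truncating integers in Python 2 should switch to
``//`` explicitly, as ``midpoint_index`` does.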



Specify when opening a file as binary
'''''''''''''''''''''''''''''''''''''

Unless you have been working on Windows, there is a chance you have not always
bothered to add the ``b`` mode when opening a binary file (e.g., ``rb`` for
binary reading). Under Python 3, binary files and text files are clearly
distinct and mutually incompatible; see the :mod:`io` module for details.
Therefore, you **must** decide whether a file will be used for
binary access (allowing you to read and/or write bytes data) or text access
(allowing you to read and/or write unicode data).
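
For instance, a binary round trip could look like the sketch below. The
``write_then_read`` helper is hypothetical, and the ``with`` statement assumes
Python 2.6 or newer:

```python
import os
import tempfile


def write_then_read(payload):
    """Write bytes to a temporary file and read them back unchanged."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as handle:  # explicit binary write
            handle.write(payload)
        with open(path, "rb") as handle:  # explicit binary read
            return handle.read()
    finally:
        os.remove(path)


data = write_then_read(b"\x00\x01spam\n")
```

With the ``b`` mode no newline translation or decoding takes place, so the
bytes come back exactly as written on every platform.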

Text files
''''''''''

Text files created using ``open()`` under Python 2 return byte strings,
while under Python 3 they return unicode strings. Depending on your porting
strategy, this can be an issue.

If you want text files to return unicode strings in Python 2, you have two
possibilities:

* Under Python 2.6 and higher, use :func:`io.open`. Since :func:`io.open`
  is essentially the same function in both Python 2 and Python 3, it will
  help iron out any issues that might arise.

* If pre-2.6 compatibility is needed, then you should use :func:`codecs.open`
  instead. This will make sure that you get back unicode strings in Python 2.
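
The first option can be sketched as follows, assuming Python 2.6 or newer.
The ``roundtrip_text`` helper and the choice of UTF-8 are illustrative only:

```python
import io
import os
import tempfile


def roundtrip_text(text, encoding="utf-8"):
    """Write text to a temporary file and read it back as text."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # io.open() encodes on write and decodes on read in both versions.
        with io.open(path, "w", encoding=encoding) as handle:
            handle.write(text)
        with io.open(path, "r", encoding=encoding) as handle:
            return handle.read()
    finally:
        os.remove(path)


result = roundtrip_text(u"caf\u00e9\n")
```

Passing ``encoding`` explicitly avoids depending on the platform default,
which can differ between machines.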

Subclass ``object``
'''''''''''''''''''

New-style classes have been around since `Python 2.2`_. You need to make sure
you are subclassing from ``object`` to avoid odd edge cases involving method
resolution order, etc. This continues to be totally valid in Python 3 (although
unneeded as all classes implicitly inherit from ``object``).


Deal With the Bytes/String Dichotomy
''''''''''''''''''''''''''''''''''''

One of the biggest issues people have when porting code to Python 3 is handling
the bytes/string dichotomy. Because Python 2 allowed the ``str`` type to hold
textual data, people have over the years been rather loose in their delineation
of which ``str`` instances held text and which held bytes. In Python 3 you
cannot be so care-free anymore and need to properly handle the difference. The
key to handling this issue is to make sure that **every** string literal in your
Python 2 code is either syntactically or functionally marked as either bytes or
text data. After this is done you then need to make sure your APIs are designed
to either handle a specific type or made to be properly polymorphic.


Mark Up Python 2 String Literals
********************************

The first thing you must do is designate every single string literal in Python 2
as either textual or bytes data. If you are only supporting Python 2.6 or
newer, this can be accomplished by marking bytes literals with a ``b`` prefix
and then designating textual data with a ``u`` prefix or using the
``unicode_literals`` future statement.

If your project supports versions of Python predating 2.6, then you should use
the six_ project and its ``b()`` function to denote bytes literals. For text
literals you can either use six's ``u()`` function or use a ``u`` prefix.


Decide what APIs Will Accept
****************************

In Python 2 it was very easy to accidentally create an API that accepted both
bytes and textual data. But in Python 3, thanks to the more strict handling of
disparate types, this loose usage of bytes and text together tends to fail.

Take the dict ``{b'a': 'bytes', u'a': 'text'}`` in Python 2.6. It creates the
dict ``{'a': 'text'}`` since ``b'a' == u'a'``: the two keys collide and the
later value wins. But in Python 3 the equivalent dict creates
``{b'a': 'bytes', 'a': 'text'}``, i.e., no lost data. Similar
issues can crop up when transitioning Python 2 code to Python 3.

This means you need to choose what an API is going to accept and return, and
stick to that choice consistently in both Python 2 and 3.
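
One possible way to enforce such a choice is to normalize input at the API
boundary. In this sketch ``ensure_text`` and ``shout`` are hypothetical names,
UTF-8 is an assumed default, and Python 2.6 or newer is assumed for the
``bytes`` name:

```python
def ensure_text(value, encoding="utf-8"):
    """Return value as text, decoding it first if it is bytes."""
    text_type = type(u"")  # unicode on Python 2, str on Python 3
    if isinstance(value, bytes):
        return value.decode(encoding)
    if isinstance(value, text_type):
        return value
    raise TypeError("expected bytes or text, got %r" % (type(value),))


def shout(message):
    """Example API: accepts bytes or text, always returns text."""
    return ensure_text(message).upper()
```

Everything past the boundary can then assume text, which keeps the bytes/text
decision in one place instead of scattering it through the code.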


Bytes / Unicode Comparison
**************************

In Python 3, mixing bytes and unicode is forbidden in most situations; it
will raise a :class:`TypeError` where Python 2 would have attempted an implicit
coercion between types. However, there is one case where it does not raise,
which can be very misleading::

   >>> b"" == ""
   False

This is because an equality comparison is required by the language to always
succeed (and return ``False`` for incompatible types). However, this also
means that code incorrectly ported to Python 3 can display buggy behaviour
if such comparisons are silently executed. To detect such situations,
Python 3 has a ``-b`` flag that will display a warning::

   $ python3 -b
   >>> b"" == ""
   __main__:1: BytesWarning: Comparison between bytes and string
   False

To turn the warning into an exception, use the ``-bb`` flag instead::

   $ python3 -bb
   >>> b"" == ""
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   BytesWarning: Comparison between bytes and string


Indexing bytes objects
''''''''''''''''''''''

Another potentially surprising change is the indexing behaviour of bytes
objects in Python 3::

   >>> b"xyz"[0]
   120

Indeed, Python 3 bytes objects (as well as :class:`bytearray` objects)
are sequences of integers. But code converted from Python 2 will often
assume that indexing a bytestring produces another bytestring, not an
integer. To reconcile both behaviours, use slicing::

   >>> b"xyz"[0:1]
   b'x'
   >>> n = 1
   >>> b"xyz"[n:n+1]
   b'y'

The only remaining gotcha is that an out-of-bounds slice returns an empty
bytes object instead of raising ``IndexError``::

   >>> b"xyz"[3]
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   IndexError: index out of range
   >>> b"xyz"[3:4]
   b''
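
The same slicing idea extends to iteration, since looping directly over a
bytes object yields integers in Python 3. A minimal sketch, with ``iter_bytes``
as a hypothetical helper name:

```python
def iter_bytes(data):
    """Yield each byte of data as a length-1 bytes object."""
    for index in range(len(data)):
        # Slicing returns bytes on both Python 2 and 3; plain indexing
        # would return an int on Python 3.
        yield data[index:index + 1]


pieces = list(iter_bytes(b"xyz"))
```

The index never exceeds ``len(data) - 1`` here, so the empty-slice gotcha
described above does not arise.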


``__str__()``/``__unicode__()``
'''''''''''''''''''''''''''''''

In Python 2, objects can specify both a string and unicode representation of
themselves. In Python 3, though, there is only a string representation. This
becomes an issue as people can inadvertently do things in their ``__str__()``
methods which have unpredictable results (e.g., infinite recursion if you
happen to use the ``unicode(self).encode('utf8')`` idiom as the body of your
``__str__()`` method).

You can use a mixin class to work around this. This allows you to only define a
``__unicode__()`` method for your class and let the mixin derive
``__str__()`` for you (code from
http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/)::

   import sys

   class UnicodeMixin(object):

       """Mixin class to handle defining the proper __str__/__unicode__
       methods in Python 2 or 3."""

       if sys.version_info[0] >= 3:  # Python 3
           def __str__(self):
               return self.__unicode__()
       else:  # Python 2
           def __str__(self):
               return self.__unicode__().encode('utf8')


   class Spam(UnicodeMixin):

       def __unicode__(self):
           return u'spam-spam-bacon-spam'  # 2to3 will remove the 'u' prefix


Don't Index on Exceptions
'''''''''''''''''''''''''

In Python 2, the following worked::

   >>> exc = Exception(1, 2, 3)
   >>> exc.args[1]
   2
   >>> exc[1]  # Python 2 only!
   2

But in Python 3, indexing directly on an exception is an error. You need to
make sure to only index on the :attr:`BaseException.args` attribute which is a
sequence containing all arguments passed to the :meth:`__init__` method.

Even better is to use the documented attributes the exception provides.
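
As one example of a documented attribute, ``OSError`` exposes ``errno``. The
sketch below uses a hypothetical ``is_missing`` helper and assumes Python 2.6
or newer for the ``as`` syntax:

```python
import errno
import os


def is_missing(path):
    """Return True if path does not exist, False if it can be stat'ed."""
    try:
        os.stat(path)
    except OSError as exc:
        # exc.errno is a documented attribute; exc[0] works on Python 2 only.
        if exc.errno == errno.ENOENT:
            return True
        raise
    return False
```

Errors other than "no such file" are re-raised rather than swallowed, so
permission problems and similar failures stay visible to the caller.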


Don't use ``__getslice__`` & Friends
''''''''''''''''''''''''''''''''''''

These methods have been deprecated for a while, and Python 3 finally drops
support for ``__getslice__()``, etc. Move completely over to
:meth:`__getitem__` and friends.
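
A minimal sketch of handling both indexing and slicing in ``__getitem__``;
the ``Playlist`` class is a made-up example, not part of any library:

```python
class Playlist(object):
    """Hypothetical sequence supporting indexing and slicing."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self):
        return len(self._tracks)

    def __getitem__(self, index):
        # Slices arrive as slice objects, so __getslice__ is not needed.
        if isinstance(index, slice):
            return Playlist(self._tracks[index])
        return self._tracks[index]


tracks = Playlist(["spam", "eggs", "bacon"])
```

Here ``tracks[0]`` returns a single item while ``tracks[0:2]`` returns a new
``Playlist``, and the same method serves both cases on Python 2 and 3.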


Updating doctests
'''''''''''''''''

Don't forget to make your doctests Python 2/3 compatible as well. If you wrote a
monolithic set of doctests (e.g., a single docstring containing all of your
doctests), you should at least consider breaking the doctests up into smaller
pieces to make it more manageable to fix. Otherwise it might very well be worth
your time and effort to port your tests to :mod:`unittest`.


Update ``map`` for imbalanced input sequences
'''''''''''''''''''''''''''''''''''''''''''''

With Python 2, when ``map`` was given more than one input sequence it would pad
the shorter sequences with ``None`` values, returning a sequence as long as the
longest input sequence.

With Python 3, if the input sequences to ``map`` are of unequal length, ``map``
will stop at the termination of the shortest of the sequences. For full
compatibility with ``map`` from Python 2.x, pad the inputs with
:func:`itertools.zip_longest` and unpack each resulting tuple with
:func:`itertools.starmap`, e.g. ``map(func, *sequences)`` becomes
``list(itertools.starmap(func, itertools.zip_longest(*sequences)))``.
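
The padding behaviour can be sketched as below. Since ``func`` expects one
argument per sequence, the tuples produced by ``zip_longest`` are unpacked with
``itertools.starmap``. The names ``padded_map`` and ``describe`` are
hypothetical, and the fallback assumes ``itertools.izip_longest`` from
Python 2.6 and 2.7:

```python
import itertools

try:
    zip_longest = itertools.zip_longest   # Python 3
except AttributeError:
    zip_longest = itertools.izip_longest  # Python 2.6 and 2.7


def padded_map(func, *sequences):
    """Apply func across sequences, padding shorter ones with None."""
    return list(itertools.starmap(func, zip_longest(*sequences)))


def describe(left, right):
    """Return both arguments as a pair so the padding is visible."""
    return (left, right)


pairs = padded_map(describe, [1, 2, 3], ["a"])
```

The shorter sequence is padded with ``None``, so ``func`` must be prepared to
receive ``None`` for the missing positions.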

Eliminate ``-3`` Warnings
-------------------------

When you run your application's test suite, run it using the ``-3`` flag passed
to Python. This will cause various warnings to be raised during execution about
things that are semantic changes between Python 2 and 3. Try to eliminate those
warnings to make your code even more portable to Python 3.


Alternative Approaches
======================

While supporting Python 2 & 3 simultaneously is typically the preferred choice,
since it lets people continue to improve their code and have it work for the
largest number of users, your life may be easier if you only have to support
one major version of Python going forward.

Supporting Only Python 3 Going Forward From Python 2 Code
---------------------------------------------------------

If you have Python 2 code but going forward only want to improve it as Python 3
code, then you can use 2to3_ to translate your Python 2 code to Python 3 code.
This is only recommended, though, if the current version of your project is
going into maintenance mode and you want all new features to be exclusive to
Python 3.


Backporting Python 3 code to Python 2
-------------------------------------

If you have Python 3 code and have little interest in supporting Python 2 you
can use 3to2_ to translate from Python 3 code to Python 2 code. This is only
recommended if you don't plan to heavily support Python 2 users. Otherwise
write your code for Python 3 and then backport as far back as you want. This
is typically easier than going from Python 2 to 3 as you will have worked out
any difficulties with e.g. bytes/strings, etc.


Other Resources
===============

The authors of the following blog posts, wiki pages, and books deserve special
thanks for making public their tips for porting Python 2 code to Python 3 (and
thus helping provide information for this document and its various revisions
over the years):

* http://wiki.python.org/moin/PortingPythonToPy3k
* http://python3porting.com/
* http://docs.pythonsprints.com/python3_porting/py-porting.html
* http://techspot.zzzeek.org/2011/01/24/zzzeek-s-guide-to-python-3-porting/
* http://dabeaz.blogspot.com/2011/01/porting-py65-and-my-superboard-to.html
* http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/
* http://lucumr.pocoo.org/2010/2/11/porting-to-python-3-a-guide/
* https://wiki.ubuntu.com/Python/3

If you feel there is something missing from this document that should be added,
please email the python-porting_ mailing list.



.. _2to3: http://docs.python.org/2/library/2to3.html
.. _3to2: https://pypi.python.org/pypi/3to2
.. _Cheeseshop: PyPI_
.. _coverage: https://pypi.python.org/pypi/coverage
.. _future: http://python-future.org/
.. _modernize: https://github.com/mitsuhiko/python-modernize
.. _Porting to Python 3: http://python3porting.com/
.. _PyPI: http://pypi.python.org/
.. _Python 2.2: http://www.python.org/2.2.x
.. _Python 2.5: http://www.python.org/2.5.x
.. _Python 2.6: http://www.python.org/2.6.x
.. _Python 2.7: http://www.python.org/2.7.x
.. _Python 3.3: http://www.python.org/3.3.x
.. _Python 3 Packages: https://pypi.python.org/pypi?:action=browse&c=533&show=all
.. _Python 3 Q & A: http://ncoghlan-devs-python-notes.readthedocs.org/en/latest/python3/questions_and_answers.html
.. _python-porting: http://mail.python.org/mailman/listinfo/python-porting
.. _six: https://pypi.python.org/pypi/six
.. _tox: https://pypi.python.org/pypi/tox
.. _trove classifiers: https://pypi.python.org/pypi?%3Aaction=list_classifiers