.. highlightlang:: c

.. _cporting-howto:

*************************************
Porting Extension Modules to Python 3
*************************************

:author: Benjamin Peterson


.. topic:: Abstract

   Although changing the C-API was not one of Python 3's objectives,
   the many Python-level changes made leaving Python 2's API intact
   impossible.  In fact, some changes, such as :func:`int` and
   :func:`long` unification, are more obvious on the C level.  This
   document describes the incompatibilities and how they can
   be worked around.


Conditional compilation
=======================

The easiest way to compile only some code for Python 3 is to check
if :c:macro:`PY_MAJOR_VERSION` is greater than or equal to 3. ::

   #if PY_MAJOR_VERSION >= 3
   #define IS_PY3K
   #endif

API functions that are not present can be aliased to their equivalents within
conditional blocks.


Changes to Object APIs
======================

Python 3 merged together some types with similar functions while cleanly
separating others.


str/unicode Unification
-----------------------


Python 3's :func:`str` type is equivalent to Python 2's :func:`unicode`; the C
functions are called ``PyUnicode_*`` for both.  The old 8-bit string type has
become :func:`bytes`, with C functions called ``PyBytes_*`` instead of
``PyString_*``.  Python 2.6 and later provide a compatibility header,
:file:`bytesobject.h`, mapping ``PyBytes`` names to ``PyString`` ones.  For best
compatibility with Python 3, :c:type:`PyUnicode` should be used for textual data and
:c:type:`PyBytes` for binary data.  It's also important to remember that
:c:type:`PyBytes` and :c:type:`PyUnicode` in Python 3 are not interchangeable like
:c:type:`PyString` and :c:type:`PyUnicode` are in Python 2.  The following example
shows best practices with regard to :c:type:`PyUnicode`, :c:type:`PyString`,
and :c:type:`PyBytes`. ::

   #include "Python.h"   /* must come before any standard headers */
   #include <stdlib.h>
   #include "bytesobject.h"

   /* text example */
   static PyObject *
   say_hello(PyObject *self, PyObject *args) {
       PyObject *name, *result;

       /* "U" accepts only a unicode (text) object */
       if (!PyArg_ParseTuple(args, "U:say_hello", &name))
           return NULL;

       result = PyUnicode_FromFormat("Hello, %S!", name);
       return result;
   }

   /* forward declaration; assumed to return a malloc()ed,
      NUL-terminated buffer that the caller must free */
   static char * do_encode(PyObject *);

   /* bytes example */
   static PyObject *
   encode_object(PyObject *self, PyObject *args) {
       char *encoded;
       PyObject *result, *myobj;

       if (!PyArg_ParseTuple(args, "O:encode_object", &myobj))
           return NULL;

       encoded = do_encode(myobj);
       if (encoded == NULL)
           return NULL;
       result = PyBytes_FromString(encoded);
       free(encoded);
       return result;
   }


long/int Unification
--------------------

Python 3 has only one integer type, :func:`int`.  But it actually
corresponds to Python 2's :func:`long` type--the :func:`int` type
used in Python 2 was removed.  In the C-API, ``PyInt_*`` functions
are replaced by their ``PyLong_*`` equivalents.

The best course of action here is using the ``PyInt_*`` functions aliased to
``PyLong_*`` found in :file:`intobject.h`.  The abstract ``PyNumber_*`` APIs
can also be used in some cases. ::

   #include "Python.h"
   #include "intobject.h"

   static PyObject *
   add_ints(PyObject *self, PyObject *args) {
       int one, two;

       if (!PyArg_ParseTuple(args, "ii:add_ints", &one, &two))
           return NULL;

       /* add as long so the sum cannot overflow int where long is wider */
       return PyInt_FromLong((long)one + two);
   }



Module initialization and state
===============================

Python 3 has a revamped extension module initialization system.  (See
:pep:`3121`.)  Instead of being stored in globals, module state should
be stored in an interpreter-specific structure.  Creating modules that
act correctly in both Python 2 and Python 3 is tricky.  The following
simple example demonstrates how. ::

   #include "Python.h"

   struct module_state {
       PyObject *error;
   };

   #if PY_MAJOR_VERSION >= 3
   #define GETSTATE(m) ((struct module_state*)PyModule_GetState(m))
   #else
   #define GETSTATE(m) (&_state)
   static struct module_state _state;
   #endif

   static PyObject *
   error_out(PyObject *m) {
       struct module_state *st = GETSTATE(m);
       PyErr_SetString(st->error, "something bad happened");
       return NULL;
   }

   static PyMethodDef myextension_methods[] = {
       {"error_out", (PyCFunction)error_out, METH_NOARGS, NULL},
       {NULL, NULL}
   };

   #if PY_MAJOR_VERSION >= 3

   static int myextension_traverse(PyObject *m, visitproc visit, void *arg) {
       Py_VISIT(GETSTATE(m)->error);
       return 0;
   }

   static int myextension_clear(PyObject *m) {
       Py_CLEAR(GETSTATE(m)->error);
       return 0;
   }


   static struct PyModuleDef moduledef = {
           PyModuleDef_HEAD_INIT,
           "myextension",                  /* m_name */
           NULL,                           /* m_doc */
           sizeof(struct module_state),    /* m_size */
           myextension_methods,            /* m_methods */
           NULL,
           myextension_traverse,           /* m_traverse */
           myextension_clear,              /* m_clear */
           NULL                            /* m_free */
   };

   #define INITERROR return NULL

   PyObject *
   PyInit_myextension(void)

   #else
   #define INITERROR return

   void
   initmyextension(void)
   #endif
   {
       struct module_state *st;   /* declared first for C89 compilers */
   #if PY_MAJOR_VERSION >= 3
       PyObject *module = PyModule_Create(&moduledef);
   #else
       PyObject *module = Py_InitModule("myextension", myextension_methods);
   #endif

       if (module == NULL)
           INITERROR;
       st = GETSTATE(module);

       st->error = PyErr_NewException("myextension.Error", NULL, NULL);
       if (st->error == NULL) {
           Py_DECREF(module);
           INITERROR;
       }

   #if PY_MAJOR_VERSION >= 3
       return module;
   #endif
   }


CObject replaced with Capsule
=============================

The :c:type:`Capsule` object was introduced in Python 3.1 and 2.7 to replace
:c:type:`CObject`.  CObjects were useful, but the :c:type:`CObject` API was
problematic: it offered no way to tell one valid CObject from another, which
allowed mismatched CObjects to crash the interpreter, and some of its APIs
relied on undefined behavior in C.
(For further reading on the rationale behind Capsules, please see :issue:`5630`.)

If you're currently using CObjects, and you want to migrate to 3.1 or newer,
you'll need to switch to Capsules.
:c:type:`CObject` was deprecated in 3.1 and 2.7 and completely removed in
Python 3.2.  If you only support 2.7, or 3.1 and above, you
can simply switch to :c:type:`Capsule`.  If you need to support Python 3.0,
or versions of Python earlier than 2.7,
you'll have to support both CObjects and Capsules.
(Note that Python 3.0 is no longer supported, and it is not recommended
for production use.)

The following example header file :file:`capsulethunk.h` may
solve the problem for you.  Simply write your code against the
:c:type:`Capsule` API and include this header file after
:file:`Python.h`.  Your code will automatically use Capsules
in versions of Python with Capsules, and switch to CObjects
when Capsules are unavailable.

:file:`capsulethunk.h` simulates Capsules using CObjects.  However,
:c:type:`CObject` provides no place to store the capsule's "name".  As a
result the simulated :c:type:`Capsule` objects created by :file:`capsulethunk.h`
behave slightly differently from real Capsules.  Specifically:

  * The name parameter passed in to :c:func:`PyCapsule_New` is ignored.

  * The name parameter passed in to :c:func:`PyCapsule_IsValid` and
    :c:func:`PyCapsule_GetPointer` is ignored, and no error checking
    of the name is performed.

  * :c:func:`PyCapsule_GetName` always returns NULL.

  * :c:func:`PyCapsule_SetName` always raises an exception and
    returns failure.  (Since there's no way to store a name
    in a CObject, noisy failure of :c:func:`PyCapsule_SetName`
    was deemed preferable to silent failure here.  If this is
    inconvenient, feel free to modify your local
    copy as you see fit.)

You can find :file:`capsulethunk.h` in the Python source distribution
as :source:`Doc/includes/capsulethunk.h`.  We also include it here for
your convenience:

.. literalinclude:: ../includes/capsulethunk.h



Other options
=============

If you are writing a new extension module, you might consider `Cython
<http://www.cython.org>`_.  It translates a Python-like language to C.  The
extension modules it creates are compatible with Python 3 and Python 2.