.. highlightlang:: c

.. _cporting-howto:

*************************************
Porting Extension Modules to Python 3
*************************************

:author: Benjamin Peterson


.. topic:: Abstract

   Although changing the C-API was not one of Python 3's objectives,
   the many Python-level changes made leaving Python 2's API intact
   impossible.  In fact, some changes such as :func:`int` and
   :func:`long` unification are more obvious on the C level.  This
   document describes the incompatibilities and how they can
   be worked around.


Conditional compilation
=======================

The easiest way to compile only some code for Python 3 is to check
if :c:macro:`PY_MAJOR_VERSION` is greater than or equal to 3. ::

   #if PY_MAJOR_VERSION >= 3
   #define IS_PY3K
   #endif

API functions that are not present can be aliased to their equivalents within
conditional blocks.
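
A minimal sketch of such aliasing, assuming a module whose sources still use
the Python 2 ``PyInt_*`` names (the particular set of aliases is an
illustration, not a complete list)::

   #include "Python.h"

   #if PY_MAJOR_VERSION >= 3
   /* Python 3 has no PyInt_* functions; point the old names used by
      this module at their PyLong_* equivalents. */
   #define PyInt_FromLong PyLong_FromLong
   #define PyInt_AsLong PyLong_AsLong
   #define PyInt_Check PyLong_Check
   #endif

The rest of the module can then keep calling ``PyInt_FromLong()`` unchanged on
either major version.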


Changes to Object APIs
======================

Python 3 merged together some types with similar functions while cleanly
separating others.


str/unicode Unification
-----------------------

Python 3's :func:`str` type is equivalent to Python 2's :func:`unicode`; the C
functions are called ``PyUnicode_*`` for both.  The old 8-bit string type has
become :func:`bytes`, with C functions called ``PyBytes_*``.  Python 2.6 and
later provide a compatibility header, :file:`bytesobject.h`, mapping ``PyBytes``
names to ``PyString`` ones.  For best compatibility with Python 3,
:c:type:`PyUnicode` should be used for textual data and :c:type:`PyBytes` for
binary data.  It's also important to remember that :c:type:`PyBytes` and
:c:type:`PyUnicode` in Python 3 are not interchangeable like :c:type:`PyString`
and :c:type:`PyUnicode` are in Python 2.  The following example shows best
practices with regard to :c:type:`PyUnicode`, :c:type:`PyString`, and
:c:type:`PyBytes`. ::

   #include "Python.h"
   #include <stdlib.h>
   #include "bytesobject.h"

   /* text example */
   static PyObject *
   say_hello(PyObject *self, PyObject *args) {
       PyObject *name, *result;

       /* "U" accepts only a unicode (text) object */
       if (!PyArg_ParseTuple(args, "U:say_hello", &name))
           return NULL;

       result = PyUnicode_FromFormat("Hello, %S!", name);
       return result;
   }

   /* just a forward; expected to return a malloc()ed string,
      or NULL with an exception set */
   static char * do_encode(PyObject *);

   /* bytes example */
   static PyObject *
   encode_object(PyObject *self, PyObject *args) {
       char *encoded;
       PyObject *result, *myobj;

       if (!PyArg_ParseTuple(args, "O:encode_object", &myobj))
           return NULL;

       encoded = do_encode(myobj);
       if (encoded == NULL)
           return NULL;
       result = PyBytes_FromString(encoded);
       free(encoded);
       return result;
   }


long/int Unification
--------------------

Python 3 has only one integer type, :func:`int`.  It actually
corresponds to Python 2's :func:`long` type; the :func:`int` type
used in Python 2 was removed.  In the C-API, ``PyInt_*`` functions
are replaced by their ``PyLong_*`` equivalents.
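
As a sketch of what this looks like in practice, the hypothetical function
below (``halve`` is not part of any real module) is written purely against
``PyLong_*``.  These functions exist in Python 2 as well, where the result is a
:func:`long` rather than an :func:`int`::

   #include "Python.h"

   /* Return half of the integer argument; meant for a METH_O slot. */
   static PyObject *
   halve(PyObject *self, PyObject *arg) {
       long value = PyLong_AsLong(arg);

       /* -1 is also a legitimate value, so check for a pending error */
       if (value == -1 && PyErr_Occurred())
           return NULL;

       return PyLong_FromLong(value / 2);
   }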


Module initialization and state
===============================

Python 3 has a revamped extension module initialization system.  (See
:pep:`3121`.)  Instead of being kept in globals, module state should
be stored in an interpreter-specific structure.  Creating modules that
act correctly in both Python 2 and Python 3 is tricky.  The following
simple example demonstrates how. ::

   #include "Python.h"

   struct module_state {
       PyObject *error;
   };

   #if PY_MAJOR_VERSION >= 3
   #define GETSTATE(m) ((struct module_state*)PyModule_GetState(m))
   #else
   #define GETSTATE(m) (&_state)
   static struct module_state _state;
   #endif

   static PyObject *
   error_out(PyObject *m) {
       struct module_state *st = GETSTATE(m);
       PyErr_SetString(st->error, "something bad happened");
       return NULL;
   }

   static PyMethodDef myextension_methods[] = {
       {"error_out", (PyCFunction)error_out, METH_NOARGS, NULL},
       {NULL, NULL}
   };

   #if PY_MAJOR_VERSION >= 3

   static int myextension_traverse(PyObject *m, visitproc visit, void *arg) {
       Py_VISIT(GETSTATE(m)->error);
       return 0;
   }

   static int myextension_clear(PyObject *m) {
       Py_CLEAR(GETSTATE(m)->error);
       return 0;
   }


   static struct PyModuleDef moduledef = {
           PyModuleDef_HEAD_INIT,
           "myextension",
           NULL,
           sizeof(struct module_state),
           myextension_methods,
           NULL,
           myextension_traverse,
           myextension_clear,
           NULL
   };

   #define INITERROR return NULL

   PyObject *
   PyInit_myextension(void)

   #else
   #define INITERROR return

   void
   initmyextension(void)
   #endif
   {
   #if PY_MAJOR_VERSION >= 3
       PyObject *module = PyModule_Create(&moduledef);
   #else
       PyObject *module = Py_InitModule("myextension", myextension_methods);
   #endif

       if (module == NULL)
           INITERROR;
       struct module_state *st = GETSTATE(module);

       st->error = PyErr_NewException("myextension.Error", NULL, NULL);
       if (st->error == NULL) {
           Py_DECREF(module);
           INITERROR;
       }

   #if PY_MAJOR_VERSION >= 3
       return module;
   #endif
   }


CObject replaced with Capsule
=============================

The :c:type:`Capsule` object was introduced in Python 3.1 and 2.7 to replace
:c:type:`CObject`.  CObjects were useful,
but the :c:type:`CObject` API was problematic: it offered no way to tell
one valid CObject from another, which allowed mismatched CObjects to crash the
interpreter, and some of its APIs relied on undefined behavior in C.
(For further reading on the rationale behind Capsules, please see :issue:`5630`.)

If you're currently using CObjects, and you want to migrate to 3.1 or newer,
you'll need to switch to Capsules.
:c:type:`CObject` was deprecated in 3.1 and 2.7 and completely removed in
Python 3.2.  If you only support 2.7, or 3.1 and above, you
can simply switch to :c:type:`Capsule`.  If you need to support Python 3.0,
or versions of Python earlier than 2.7,
you'll have to support both CObjects and Capsules.
(Note that Python 3.0 is no longer supported, and it is not recommended
for production use.)

The following example header file :file:`capsulethunk.h` may
solve the problem for you.  Simply write your code against the
:c:type:`Capsule` API and include this header file after
:file:`Python.h`.  Your code will automatically use Capsules
in versions of Python with Capsules, and switch to CObjects
when Capsules are unavailable.

:file:`capsulethunk.h` simulates Capsules using CObjects.  However,
:c:type:`CObject` provides no place to store the capsule's "name".  As a
result the simulated :c:type:`Capsule` objects created by :file:`capsulethunk.h`
behave slightly differently from real Capsules.  Specifically:

  * The name parameter passed in to :c:func:`PyCapsule_New` is ignored.

  * The name parameter passed in to :c:func:`PyCapsule_IsValid` and
    :c:func:`PyCapsule_GetPointer` is ignored, and no error checking
    of the name is performed.

  * :c:func:`PyCapsule_GetName` always returns NULL.

  * :c:func:`PyCapsule_SetName` always raises an exception and
    returns failure.  (Since there's no way to store a name
    in a CObject, noisy failure of :c:func:`PyCapsule_SetName`
    was deemed preferable to silent failure here.  If this is
    inconvenient, feel free to modify your local
    copy as you see fit.)
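
To make the differences above concrete, here is a minimal sketch of code
written against the :c:type:`Capsule` API.  The payload ``shared_counter``, the
capsule name ``"myextension.counter"`` and both helper functions are
hypothetical.  When built for a Python without Capsules, the same code would
additionally include :file:`capsulethunk.h` after :file:`Python.h`, and the
name check described in the comments would silently not happen::

   #include "Python.h"

   /* Hypothetical payload shared with other extension modules. */
   static int shared_counter = 0;

   static PyObject *
   make_counter_capsule(void) {
       /* The name must outlive the capsule; a string literal does. */
       return PyCapsule_New(&shared_counter, "myextension.counter", NULL);
   }

   static int *
   counter_from_capsule(PyObject *capsule) {
       /* Returns NULL with an exception set on failure.  Real Capsules
          reject a mismatched name; the CObject simulation cannot. */
       return (int *)PyCapsule_GetPointer(capsule, "myextension.counter");
   }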

You can find :file:`capsulethunk.h` in the Python source distribution
as :source:`Doc/includes/capsulethunk.h`.  We also include it here for
your convenience:

.. literalinclude:: ../includes/capsulethunk.h



Other options
=============

If you are writing a new extension module, you might consider `Cython
<http://www.cython.org>`_.  It translates a Python-like language to C.  The
extension modules it creates are compatible with Python 3 and Python 2.
