.. highlightlang:: c

.. _cporting-howto:

*************************************
Porting Extension Modules to Python 3
*************************************

:author: Benjamin Peterson


.. topic:: Abstract

   Although changing the C-API was not one of Python 3's objectives,
   the many Python-level changes made it impossible to leave Python 2's
   API intact. In fact, some changes, such as the unification of
   :func:`int` and :func:`long`, are more obvious at the C level. This
   document describes the incompatibilities and how to work around them.


Conditional compilation
=======================

The easiest way to compile only some code for Python 3 is to check
if :c:macro:`PY_MAJOR_VERSION` is greater than or equal to 3. ::

    #if PY_MAJOR_VERSION >= 3
    #define IS_PY3K
    #endif

API functions that are not present can be aliased to their equivalents within
conditional blocks.


Changes to Object APIs
======================

Python 3 merged together some types with similar functions while cleanly
separating others.


str/unicode Unification
-----------------------


Python 3's :func:`str` type is equivalent to Python 2's :func:`unicode`; the
C functions are called ``PyUnicode_*`` for both. The old 8-bit string type
(``PyString_*`` functions in Python 2's C API) has become :func:`bytes`, with
C functions called ``PyBytes_*``. Python 2.6 and later provide a compatibility header,
:file:`bytesobject.h`, mapping ``PyBytes`` names to ``PyString`` ones. For best
compatibility with Python 3, :c:type:`PyUnicode` should be used for textual data and
:c:type:`PyBytes` for binary data. It's also important to remember that
:c:type:`PyBytes` and :c:type:`PyUnicode` in Python 3 are not interchangeable like
:c:type:`PyString` and :c:type:`PyUnicode` are in Python 2. The following example
shows best practices with regard to :c:type:`PyUnicode`, :c:type:`PyString`,
and :c:type:`PyBytes`. ::

    #include "Python.h"        /* include before any standard headers */
    #include "bytesobject.h"
    #include <stdlib.h>

    /* text example */
    static PyObject *
    say_hello(PyObject *self, PyObject *args) {
        PyObject *name, *result;

        if (!PyArg_ParseTuple(args, "U:say_hello", &name))
            return NULL;

        result = PyUnicode_FromFormat("Hello, %S!", name);
        return result;
    }

    /* forward declaration; the result is released with free() below */
    static char * do_encode(PyObject *);

    /* bytes example */
    static PyObject *
    encode_object(PyObject *self, PyObject *args) {
        char *encoded;
        PyObject *result, *myobj;

        if (!PyArg_ParseTuple(args, "O:encode_object", &myobj))
            return NULL;

        encoded = do_encode(myobj);
        if (encoded == NULL)
            return NULL;
        result = PyBytes_FromString(encoded);
        free(encoded);
        return result;
    }


long/int Unification
--------------------

Python 3 has only one integer type, :func:`int`. But it actually
corresponds to Python 2's :func:`long` type--the :func:`int` type
used in Python 2 was removed. In the C-API, ``PyInt_*`` functions
are replaced by their ``PyLong_*`` equivalents.


Module initialization and state
===============================

Python 3 has a revamped extension module initialization system. (See
:pep:`3121`.) Instead of being stored in globals, module state should
be stored in an interpreter-specific structure. Creating modules that
act correctly in both Python 2 and Python 3 is tricky. The following
simple example demonstrates how. ::

    #include "Python.h"

    struct module_state {
        PyObject *error;
    };

    #if PY_MAJOR_VERSION >= 3
    #define GETSTATE(m) ((struct module_state*)PyModule_GetState(m))
    #else
    #define GETSTATE(m) (&_state)
    static struct module_state _state;
    #endif

    /* METH_NOARGS functions still receive a second, unused argument */
    static PyObject *
    error_out(PyObject *m, PyObject *unused) {
        struct module_state *st = GETSTATE(m);
        PyErr_SetString(st->error, "something bad happened");
        return NULL;
    }

    static PyMethodDef myextension_methods[] = {
        {"error_out", (PyCFunction)error_out, METH_NOARGS, NULL},
        {NULL, NULL}
    };

    #if PY_MAJOR_VERSION >= 3

    static int myextension_traverse(PyObject *m, visitproc visit, void *arg) {
        Py_VISIT(GETSTATE(m)->error);
        return 0;
    }

    static int myextension_clear(PyObject *m) {
        Py_CLEAR(GETSTATE(m)->error);
        return 0;
    }


    static struct PyModuleDef moduledef = {
        PyModuleDef_HEAD_INIT,
        "myextension",
        NULL,
        sizeof(struct module_state),
        myextension_methods,
        NULL,
        myextension_traverse,
        myextension_clear,
        NULL
    };

    #define INITERROR return NULL

    PyObject *
    PyInit_myextension(void)

    #else
    #define INITERROR return

    void
    initmyextension(void)
    #endif
    {
    #if PY_MAJOR_VERSION >= 3
        PyObject *module = PyModule_Create(&moduledef);
    #else
        PyObject *module = Py_InitModule("myextension", myextension_methods);
    #endif

        if (module == NULL)
            INITERROR;
        struct module_state *st = GETSTATE(module);

        st->error = PyErr_NewException("myextension.Error", NULL, NULL);
        if (st->error == NULL) {
            Py_DECREF(module);
            INITERROR;
        }

    #if PY_MAJOR_VERSION >= 3
        return module;
    #endif
    }


CObject replaced with Capsule
=============================

The :c:type:`Capsule` object was introduced in Python 3.1 and 2.7 to replace
:c:type:`CObject`. CObjects were useful,
but the :c:type:`CObject` API was problematic: it didn't permit distinguishing
between valid CObjects, which allowed mismatched CObjects to crash the
interpreter, and some of its APIs relied on undefined behavior in C.
(For further reading on the rationale behind Capsules, please see :issue:`5630`.)

If you're currently using CObjects, and you want to migrate to 3.1 or newer,
you'll need to switch to Capsules.
:c:type:`CObject` was deprecated in 3.1 and 2.7 and completely removed in
Python 3.2. If you only support 2.7, or 3.1 and above, you
can simply switch to :c:type:`Capsule`. If you need to support Python 3.0,
or versions of Python earlier than 2.7,
you'll have to support both CObjects and Capsules.
(Note that Python 3.0 is no longer supported, and it is not recommended
for production use.)

The following example header file :file:`capsulethunk.h` may
solve the problem for you. Simply write your code against the
:c:type:`Capsule` API and include this header file after
:file:`Python.h`. Your code will automatically use Capsules
in versions of Python with Capsules, and switch to CObjects
when Capsules are unavailable.

:file:`capsulethunk.h` simulates Capsules using CObjects. However,
:c:type:`CObject` provides no place to store the capsule's "name". As a
result the simulated :c:type:`Capsule` objects created by :file:`capsulethunk.h`
behave slightly differently from real Capsules. Specifically:

  * The name parameter passed in to :c:func:`PyCapsule_New` is ignored.

  * The name parameter passed in to :c:func:`PyCapsule_IsValid` and
    :c:func:`PyCapsule_GetPointer` is ignored, and no error checking
    of the name is performed.

  * :c:func:`PyCapsule_GetName` always returns NULL.

  * :c:func:`PyCapsule_SetName` always raises an exception and
    returns failure. (Since there's no way to store a name
    in a CObject, noisy failure of :c:func:`PyCapsule_SetName`
    was deemed preferable to silent failure here. If this is
    inconvenient, feel free to modify your local
    copy as you see fit.)

You can find :file:`capsulethunk.h` in the Python source distribution
as :source:`Doc/includes/capsulethunk.h`. We also include it here for
your convenience:

.. literalinclude:: ../includes/capsulethunk.h



Other options
=============

If you are writing a new extension module, you might consider `Cython
<http://www.cython.org>`_. It translates a Python-like language to C. The
extension modules it creates are compatible with Python 3 and Python 2.
