.. highlightlang:: c

.. _cporting-howto:

*************************************
Porting Extension Modules to Python 3
*************************************

:author: Benjamin Peterson


.. topic:: Abstract

   Although changing the C-API was not one of Python 3's objectives,
   the many Python-level changes made leaving Python 2's API intact
   impossible.  In fact, some changes such as :func:`int` and
   :func:`long` unification are more obvious on the C level.  This
   document describes the incompatibilities and how they can
   be worked around.


Conditional compilation
=======================

The easiest way to compile only some code for Python 3 is to check
if :c:macro:`PY_MAJOR_VERSION` is greater than or equal to 3. ::

    #if PY_MAJOR_VERSION >= 3
    #define IS_PY3K
    #endif

API functions that are not present in one of the versions can be aliased to
their equivalents within conditional blocks.


Changes to Object APIs
======================

Python 3 merged together some types with similar functions while cleanly
separating others.


str/unicode Unification
-----------------------

Python 3's :func:`str` type is equivalent to Python 2's :func:`unicode`; the C
functions are called ``PyUnicode_*`` for both.  The old 8-bit string type has
become :func:`bytes`, with C functions called ``PyBytes_*``.  Python 2.6 and
later provide a compatibility header, :file:`bytesobject.h`, mapping
``PyBytes`` names to ``PyString`` ones.  For best compatibility with Python 3,
:c:type:`PyUnicode` should be used for textual data and :c:type:`PyBytes` for
binary data.  It's also important to remember that :c:type:`PyBytes` and
:c:type:`PyUnicode` in Python 3 are not interchangeable like
:c:type:`PyString` and :c:type:`PyUnicode` are in Python 2.  The following
example shows best practices with regard to :c:type:`PyUnicode`,
:c:type:`PyString`, and :c:type:`PyBytes`. ::

    #include "Python.h"
    #include "bytesobject.h"
    #include <stdlib.h>

    /* text example: takes a unicode object and returns a unicode object */
    static PyObject *
    say_hello(PyObject *self, PyObject *args) {
        PyObject *name, *result;

        if (!PyArg_ParseTuple(args, "U:say_hello", &name))
            return NULL;

        result = PyUnicode_FromFormat("Hello, %S!", name);
        return result;
    }

    /* just a forward declaration; do_encode() is assumed to return a
       malloc()-allocated, NUL-terminated buffer, or NULL with an
       exception set */
    static char * do_encode(PyObject *);

    /* bytes example: returns binary data as a bytes object */
    static PyObject *
    encode_object(PyObject *self, PyObject *args) {
        char *encoded;
        PyObject *result, *myobj;

        if (!PyArg_ParseTuple(args, "O:encode_object", &myobj))
            return NULL;

        encoded = do_encode(myobj);
        if (encoded == NULL)
            return NULL;
        result = PyBytes_FromString(encoded);
        free(encoded);
        return result;
    }


long/int Unification
--------------------

Python 3 has only one integer type, :func:`int`.  But it actually
corresponds to Python 2's :func:`long` type; the :func:`int` type
used in Python 2 was removed.  In the C-API, ``PyInt_*`` functions
are replaced by their ``PyLong_*`` equivalents.


Module initialization and state
===============================

Python 3 has a revamped extension module initialization system.  (See
:pep:`3121`.)  Instead of storing module state in globals, it should
be stored in an interpreter-specific structure.  Creating modules that
act correctly in both Python 2 and Python 3 is tricky.  The following
simple example demonstrates how. ::

    #include "Python.h"

    struct module_state {
        PyObject *error;
    };

    /* Python 3 keeps the state in the module object; Python 2 falls
       back to a single static structure. */
    #if PY_MAJOR_VERSION >= 3
    #define GETSTATE(m) ((struct module_state*)PyModule_GetState(m))
    #else
    #define GETSTATE(m) (&_state)
    static struct module_state _state;
    #endif

    /* METH_NOARGS functions still receive a second, unused argument. */
    static PyObject *
    error_out(PyObject *m, PyObject *unused) {
        struct module_state *st = GETSTATE(m);
        PyErr_SetString(st->error, "something bad happened");
        return NULL;
    }

    static PyMethodDef myextension_methods[] = {
        {"error_out", (PyCFunction)error_out, METH_NOARGS, NULL},
        {NULL, NULL}
    };

    #if PY_MAJOR_VERSION >= 3

    static int myextension_traverse(PyObject *m, visitproc visit, void *arg) {
        Py_VISIT(GETSTATE(m)->error);
        return 0;
    }

    static int myextension_clear(PyObject *m) {
        Py_CLEAR(GETSTATE(m)->error);
        return 0;
    }


    static struct PyModuleDef moduledef = {
            PyModuleDef_HEAD_INIT,
            "myextension",
            NULL,
            sizeof(struct module_state),
            myextension_methods,
            NULL,
            myextension_traverse,
            myextension_clear,
            NULL
    };

    #define INITERROR return NULL

    PyObject *
    PyInit_myextension(void)

    #else
    #define INITERROR return

    void
    initmyextension(void)
    #endif
    {
        struct module_state *st;
    #if PY_MAJOR_VERSION >= 3
        PyObject *module = PyModule_Create(&moduledef);
    #else
        PyObject *module = Py_InitModule("myextension", myextension_methods);
    #endif

        if (module == NULL)
            INITERROR;
        st = GETSTATE(module);

        st->error = PyErr_NewException("myextension.Error", NULL, NULL);
        if (st->error == NULL) {
    #if PY_MAJOR_VERSION >= 3
            /* Py_InitModule() returns a borrowed reference in Python 2,
               so only the Python 3 module object is released here. */
            Py_DECREF(module);
    #endif
            INITERROR;
        }

    #if PY_MAJOR_VERSION >= 3
        return module;
    #endif
    }


CObject replaced with Capsule
=============================

The :c:type:`Capsule` object was introduced in Python 3.1 and 2.7 to replace
:c:type:`CObject`.  CObjects were useful,
but the :c:type:`CObject` API was problematic: it didn't permit distinguishing
between valid CObjects, which allowed mismatched CObjects to crash the
interpreter, and some of its APIs relied on undefined behavior in C.
(For further reading on the rationale behind Capsules, please see :issue:`5630`.)

If you're currently using CObjects, and you want to migrate to 3.1 or newer,
you'll need to switch to Capsules.
:c:type:`CObject` was deprecated in 3.1 and 2.7 and completely removed in
Python 3.2.  If you only support 2.7, or 3.1 and above, you
can simply switch to :c:type:`Capsule`.  If you need to support Python 3.0,
or versions of Python earlier than 2.7,
you'll have to support both CObjects and Capsules.
(Note that Python 3.0 is no longer supported, and it is not recommended
for production use.)

The following example header file :file:`capsulethunk.h` may
solve the problem for you.  Simply write your code against the
:c:type:`Capsule` API and include this header file after
:file:`Python.h`.  Your code will automatically use Capsules
in versions of Python with Capsules, and switch to CObjects
when Capsules are unavailable.

:file:`capsulethunk.h` simulates Capsules using CObjects.  However,
:c:type:`CObject` provides no place to store the capsule's "name".  As a
result the simulated :c:type:`Capsule` objects created by :file:`capsulethunk.h`
behave slightly differently from real Capsules.  Specifically:

  * The name parameter passed in to :c:func:`PyCapsule_New` is ignored.

  * The name parameter passed in to :c:func:`PyCapsule_IsValid` and
    :c:func:`PyCapsule_GetPointer` is ignored, and no error checking
    of the name is performed.

  * :c:func:`PyCapsule_GetName` always returns NULL.

  * :c:func:`PyCapsule_SetName` always raises an exception and
    returns failure.  (Since there's no way to store a name
    in a CObject, noisy failure of :c:func:`PyCapsule_SetName`
    was deemed preferable to silent failure here.  If this is
    inconvenient, feel free to modify your local
    copy as you see fit.)

You can find :file:`capsulethunk.h` in the Python source distribution
as :source:`Doc/includes/capsulethunk.h`.  We also include it here for
your convenience:

.. literalinclude:: ../includes/capsulethunk.h



Other options
=============

If you are writing a new extension module, you might consider `Cython
<http://www.cython.org>`_.  It translates a Python-like language to C.  The
extension modules it creates are compatible with Python 3 and Python 2.
Benjamin Petersone9bbc8b2008-09-28 02:06:32 +0000257