.. highlightlang:: c

.. _cporting-howto:

*************************************
Porting Extension Modules to Python 3
*************************************

:author: Benjamin Peterson


.. topic:: Abstract

   Although changing the C-API was not one of Python 3's objectives,
   the many Python-level changes made leaving Python 2's API intact
   impossible.  In fact, some changes, such as the unification of
   :func:`int` and :func:`long`, are more obvious on the C level.  This
   document describes the incompatibilities and how they can
   be worked around.


Conditional compilation
=======================

The easiest way to compile some code only for Python 3 is to check
whether :c:macro:`PY_MAJOR_VERSION` is greater than or equal to 3. ::

    #if PY_MAJOR_VERSION >= 3
    #define IS_PY3K
    #endif

API functions that are missing from one version can be aliased to their
equivalents inside such conditional blocks.


Changes to Object APIs
======================

Python 3 merged some types with similar functions while cleanly
separating others.


str/unicode Unification
-----------------------

Python 3's :func:`str` type is equivalent to Python 2's :func:`unicode`; the C
functions are called ``PyUnicode_*`` for both.  The old 8-bit string type has
become :func:`bytes`, with C functions called ``PyBytes_*``.  Python 2.6 and
later provide a compatibility header, :file:`bytesobject.h`, mapping
``PyBytes`` names to ``PyString`` ones.  For best compatibility with Python 3,
:c:type:`PyUnicode` should be used for textual data and :c:type:`PyBytes` for
binary data.  It's also important to remember that :c:type:`PyBytes` and
:c:type:`PyUnicode` in Python 3 are not interchangeable like
:c:type:`PyString` and :c:type:`PyUnicode` are in Python 2.  The following
example shows best practices with regard to :c:type:`PyUnicode`,
:c:type:`PyString`, and :c:type:`PyBytes`. ::

    /* Python.h goes first: it may define macros that affect standard headers. */
    #include "Python.h"
    #include "bytesobject.h"
    #include <stdlib.h>

    /* text example */
    static PyObject *
    say_hello(PyObject *self, PyObject *args) {
        PyObject *name, *result;

        /* "U" accepts only a unicode (text) object. */
        if (!PyArg_ParseTuple(args, "U:say_hello", &name))
            return NULL;

        result = PyUnicode_FromFormat("Hello, %S!", name);
        return result;
    }

    /* just a forward */
    static char * do_encode(PyObject *);

    /* bytes example */
    static PyObject *
    encode_object(PyObject *self, PyObject *args) {
        char *encoded;
        PyObject *result, *myobj;

        if (!PyArg_ParseTuple(args, "O:encode_object", &myobj))
            return NULL;

        encoded = do_encode(myobj);
        if (encoded == NULL)
            return NULL;
        /* Binary data is returned as bytes, never as text. */
        result = PyBytes_FromString(encoded);
        free(encoded);
        return result;
    }


long/int Unification
--------------------

Python 3 has only one integer type, :func:`int`.  It actually
corresponds to Python 2's :func:`long` type; the :func:`int` type
used in Python 2 was removed.  In the C-API, ``PyInt_*`` functions
are replaced by their ``PyLong_*`` equivalents.


Module initialization and state
===============================

Python 3 has a revamped extension module initialization system.  (See
:pep:`3121`.)  Instead of storing module state in globals, it should
be stored in an interpreter-specific structure.  Creating modules that
act correctly in both Python 2 and Python 3 is tricky.  The following
simple example demonstrates how. ::

    #include "Python.h"

    struct module_state {
        PyObject *error;
    };

    #if PY_MAJOR_VERSION >= 3
    /* Python 3: state lives in per-module, per-interpreter storage. */
    #define GETSTATE(m) ((struct module_state*)PyModule_GetState(m))
    #else
    /* Python 2: fall back to a single static instance. */
    #define GETSTATE(m) (&_state)
    static struct module_state _state;
    #endif

    /* METH_NOARGS functions still receive a second, unused argument. */
    static PyObject *
    error_out(PyObject *m, PyObject *unused) {
        struct module_state *st = GETSTATE(m);
        PyErr_SetString(st->error, "something bad happened");
        return NULL;
    }

    static PyMethodDef myextension_methods[] = {
        {"error_out", (PyCFunction)error_out, METH_NOARGS, NULL},
        {NULL, NULL}
    };

    #if PY_MAJOR_VERSION >= 3

    static int myextension_traverse(PyObject *m, visitproc visit, void *arg) {
        Py_VISIT(GETSTATE(m)->error);
        return 0;
    }

    static int myextension_clear(PyObject *m) {
        Py_CLEAR(GETSTATE(m)->error);
        return 0;
    }


    static struct PyModuleDef moduledef = {
            PyModuleDef_HEAD_INIT,
            "myextension",                  /* module name */
            NULL,                           /* docstring */
            sizeof(struct module_state),    /* size of per-module state */
            myextension_methods,
            NULL,
            myextension_traverse,
            myextension_clear,
            NULL
    };

    #define INITERROR return NULL

    PyObject *
    PyInit_myextension(void)

    #else
    #define INITERROR return

    void
    initmyextension(void)
    #endif
    {
        /* Declared up front so the body also compiles as C89. */
        struct module_state *st;
    #if PY_MAJOR_VERSION >= 3
        PyObject *module = PyModule_Create(&moduledef);
    #else
        PyObject *module = Py_InitModule("myextension", myextension_methods);
    #endif

        if (module == NULL)
            INITERROR;
        st = GETSTATE(module);

        st->error = PyErr_NewException("myextension.Error", NULL, NULL);
        if (st->error == NULL) {
            Py_DECREF(module);
            INITERROR;
        }

    #if PY_MAJOR_VERSION >= 3
        return module;
    #endif
    }


CObject replaced with Capsule
=============================

The :c:type:`Capsule` object was introduced in Python 3.1 and 2.7 to replace
:c:type:`CObject`.  CObjects were useful,
but the :c:type:`CObject` API was problematic: it didn't permit distinguishing
between valid CObjects, which allowed mismatched CObjects to crash the
interpreter, and some of its APIs relied on undefined behavior in C.
(For further reading on the rationale behind Capsules, please see :issue:`5630`.)

If you're currently using CObjects, and you want to migrate to 3.1 or newer,
you'll need to switch to Capsules.
:c:type:`CObject` was deprecated in 3.1 and 2.7 and completely removed in
Python 3.2.  If you only support 2.7, or 3.1 and above, you
can simply switch to :c:type:`Capsule`.  If you need to support Python 3.0,
or versions of Python earlier than 2.7,
you'll have to support both CObjects and Capsules.
(Note that Python 3.0 is no longer supported, and it is not recommended
for production use.)

The following example header file :file:`capsulethunk.h` may
solve the problem for you.  Simply write your code against the
:c:type:`Capsule` API and include this header file after
:file:`Python.h`.  Your code will automatically use Capsules
in versions of Python with Capsules, and switch to CObjects
when Capsules are unavailable.

:file:`capsulethunk.h` simulates Capsules using CObjects.  However,
:c:type:`CObject` provides no place to store the capsule's "name".  As a
result the simulated :c:type:`Capsule` objects created by :file:`capsulethunk.h`
behave slightly differently from real Capsules.  Specifically:

  * The name parameter passed in to :c:func:`PyCapsule_New` is ignored.

  * The name parameter passed in to :c:func:`PyCapsule_IsValid` and
    :c:func:`PyCapsule_GetPointer` is ignored, and no error checking
    of the name is performed.

  * :c:func:`PyCapsule_GetName` always returns NULL.

  * :c:func:`PyCapsule_SetName` always raises an exception and
    returns failure.  (Since there's no way to store a name
    in a CObject, noisy failure of :c:func:`PyCapsule_SetName`
    was deemed preferable to silent failure here.  If this is
    inconvenient, feel free to modify your local
    copy as you see fit.)

You can find :file:`capsulethunk.h` in the Python source distribution
as :source:`Doc/includes/capsulethunk.h`.  We also include it here for
your convenience:

.. literalinclude:: ../includes/capsulethunk.h



Other options
=============

If you are writing a new extension module, you might consider `Cython
<http://cython.org/>`_.  It translates a Python-like language to C.  The
extension modules it creates are compatible with Python 3 and Python 2.
