Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1	.. highlightlang:: c
				2
				3
				4	.. _defining-new-types:
				5
				6	******************
				7	Defining New Types
				8	******************
				9
				10	.. sectionauthor:: Michael Hudson <mwh@python.net>
				11	.. sectionauthor:: Dave Kuhlman <dkuhlman@rexx.com>
				12	.. sectionauthor:: Jim Fulton <jim@zope.com>
				13
				14
				15	As mentioned in the last chapter, Python allows the writer of an extension
				16	module to define new types that can be manipulated from Python code, much like
				17	strings and lists in core Python.
				18
				19	This is not hard; the code for all extension types follows a pattern, but there
				20	are some details that you need to understand before you can get started.
				21
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	22
				23	.. _dnt-basics:
				24
				25	The Basics
				26	==========
				27
				28	The Python runtime sees all Python objects as variables of type
				29	:ctype:`PyObject\*`. A :ctype:`PyObject` is not a very magnificent object - it
				30	just contains the refcount and a pointer to the object's "type object". This is
				31	where the action is; the type object determines which (C) functions get called
				32	when, for instance, an attribute gets looked up on an object or it is multiplied
				33	by another object. These C functions are called "type methods" to distinguish
				34	them from things like ``[].append`` (which we call "object methods").
				35
				36	So, if you want to define a new object type, you need to create a new type
				37	object.
				38
				39	This sort of thing can only be explained by example, so here's a minimal, but
				40	complete, module that defines a new type:
				41
				42	.. literalinclude:: ../includes/noddy.c
				43
				44
				45	Now that's quite a bit to take in at once, but hopefully bits will seem familiar
				46	from the last chapter.
				47
				48	The first bit that will be new is::
				49
   typedef struct {
       PyObject_HEAD
   } noddy_NoddyObject;
				53
				54	This is what a Noddy object will contain---in this case, nothing more than every
				55	Python object contains, namely a refcount and a pointer to a type object. These
				56	are the fields the ``PyObject_HEAD`` macro brings in. The reason for the macro
				57	is to standardize the layout and to enable special debugging fields in debug
				58	builds. Note that there is no semicolon after the ``PyObject_HEAD`` macro; one
				59	is included in the macro definition. Be wary of adding one by accident; it's
				60	easy to do from habit, and your compiler might not complain, but someone else's
				61	probably will! (On Windows, MSVC is known to call this an error and refuse to
				62	compile the code.)
				63
For contrast, let's take a look at the corresponding definition for standard
Python floats::

   typedef struct {
       PyObject_HEAD
       double ob_fval;
   } PyFloatObject;
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	71
				72	Moving on, we come to the crunch --- the type object. ::
				73
   static PyTypeObject noddy_NoddyType = {
       PyVarObject_HEAD_INIT(NULL, 0)
       "noddy.Noddy",             /* tp_name */
       sizeof(noddy_NoddyObject), /* tp_basicsize */
       0,                         /* tp_itemsize */
       0,                         /* tp_dealloc */
       0,                         /* tp_print */
       0,                         /* tp_getattr */
       0,                         /* tp_setattr */
       0,                         /* tp_reserved */
       0,                         /* tp_repr */
       0,                         /* tp_as_number */
       0,                         /* tp_as_sequence */
       0,                         /* tp_as_mapping */
       0,                         /* tp_hash */
       0,                         /* tp_call */
       0,                         /* tp_str */
       0,                         /* tp_getattro */
       0,                         /* tp_setattro */
       0,                         /* tp_as_buffer */
       Py_TPFLAGS_DEFAULT,        /* tp_flags */
       "Noddy objects",           /* tp_doc */
   };
				97
Now if you go and look up the definition of :ctype:`PyTypeObject` in
:file:`object.h` you'll see that it has many more fields than the definition
above. The remaining fields will be filled with zeros by the C compiler, and
it's common practice to not specify them explicitly unless you need them.
				102
				103	This is so important that we're going to pick the top of it apart still
				104	further::
				105
Georg Brandl	ec12e82	2009-02-27 17:11:23 +0000	[diff] [blame]	106	PyVarObject_HEAD_INIT(NULL, 0)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	107
				108	This line is a bit of a wart; what we'd like to write is::
				109
Georg Brandl	ec12e82	2009-02-27 17:11:23 +0000	[diff] [blame]	110	PyVarObject_HEAD_INIT(&PyType_Type, 0)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	111
				112	as the type of a type object is "type", but this isn't strictly conforming C and
				113	some compilers complain. Fortunately, this member will be filled in for us by
				114	:cfunc:`PyType_Ready`. ::
				115
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	116	"noddy.Noddy", /* tp_name */
				117
				118	The name of our type. This will appear in the default textual representation of
				119	our objects and in some error messages, for example::
				120
				121	>>> "" + noddy.new_noddy()
				122	Traceback (most recent call last):
				123	File "<stdin>", line 1, in ?
				124	TypeError: cannot add type "noddy.Noddy" to string
				125
				126	Note that the name is a dotted name that includes both the module name and the
				127	name of the type within the module. The module in this case is :mod:`noddy` and
				128	the type is :class:`Noddy`, so we set the type name to :class:`noddy.Noddy`. ::
				129
				130	sizeof(noddy_NoddyObject), /* tp_basicsize */
				131
				132	This is so that Python knows how much memory to allocate when you call
				133	:cfunc:`PyObject_New`.
				134
				135	.. note::
				136
				137	If you want your type to be subclassable from Python, and your type has the same
				138	:attr:`tp_basicsize` as its base type, you may have problems with multiple
				139	inheritance. A Python subclass of your type will have to list your type first
				140	in its :attr:`__bases__`, or else it will not be able to call your type's
				141	:meth:`__new__` method without getting an error. You can avoid this problem by
				142	ensuring that your type has a larger value for :attr:`tp_basicsize` than its
				143	base type does. Most of the time, this will be true anyway, because either your
				144	base type will be :class:`object`, or else you will be adding data members to
				145	your base type, and therefore increasing its size.
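Both fields are visible from Python, which gives a quick way to check a type
object once an extension is built. The sketch below uses standard-library types
instead of :class:`Noddy`, since the latter needs the compiled module; the
helper ``describe`` is a hypothetical name introduced only for this
illustration.

```python
from collections import deque


def describe(tp):
    """Return (module, name, basicsize) as Python reports them for *tp*."""
    # For a statically defined type, __module__ and __name__ come from the
    # dotted tp_name, and __basicsize__ mirrors tp_basicsize.
    return tp.__module__, tp.__name__, tp.__basicsize__


deque_info = describe(deque)    # tp_name is "collections.deque"
float_info = describe(float)    # one C double on top of the object header
object_info = describe(object)  # the bare object header
```

The float type reports a larger size than :class:`object` because its structure
adds ``ob_fval`` after ``PyObject_HEAD``.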
				146
				147	::
				148
				149	0, /* tp_itemsize */
				150
				151	This has to do with variable length objects like lists and strings. Ignore this
				152	for now.
				153
				154	Skipping a number of type methods that we don't provide, we set the class flags
				155	to :const:`Py_TPFLAGS_DEFAULT`. ::
				156
Georg Brandl	913b2a3	2008-12-05 15:12:15 +0000	[diff] [blame]	157	Py_TPFLAGS_DEFAULT, /* tp_flags */
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	158
				159	All types should include this constant in their flags. It enables all of the
				160	members defined by the current version of Python.
				161
				162	We provide a doc string for the type in :attr:`tp_doc`. ::
				163
				164	"Noddy objects", /* tp_doc */
				165
				166	Now we get into the type methods, the things that make your objects different
				167	from the others. We aren't going to implement any of these in this version of
				168	the module. We'll expand this example later to have more interesting behavior.
				169
For now, all we want to be able to do is to create new :class:`Noddy` objects.
To enable object creation, we have to provide a :attr:`tp_new` implementation.
In this case, we can just use the default implementation provided by the API
function :cfunc:`PyType_GenericNew`. We'd like to just assign this to the
:attr:`tp_new` slot, but we can't, for portability's sake. On some platforms or
compilers, we can't statically initialize a structure member with a function
defined in another C module, so, instead, we'll assign the :attr:`tp_new` slot
in the module initialization function just before calling
:cfunc:`PyType_Ready`::
				179
   noddy_NoddyType.tp_new = PyType_GenericNew;
   if (PyType_Ready(&noddy_NoddyType) < 0)
       return NULL;
				183
All the other type methods are NULL; we'll go over them in a later section.
				186
				187	Everything else in the file should be familiar, except for some code in
Georg Brandl	913b2a3	2008-12-05 15:12:15 +0000	[diff] [blame]	188	:cfunc:`PyInit_noddy`::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	189
   if (PyType_Ready(&noddy_NoddyType) < 0)
       return NULL;
				192
This initializes the :class:`Noddy` type, filling in a number of members,
including :attr:`ob_type`, which we initially set to NULL. ::
				195
				196	PyModule_AddObject(m, "Noddy", (PyObject *)&noddy_NoddyType);
				197
				198	This adds the type to the module dictionary. This allows us to create
				199	:class:`Noddy` instances by calling the :class:`Noddy` class::
				200
				201	>>> import noddy
				202	>>> mynoddy = noddy.Noddy()
				203
				204	That's it! All that remains is to build it; put the above code in a file called
				205	:file:`noddy.c` and ::
				206
				207	from distutils.core import setup, Extension
				208	setup(name="noddy", version="1.0",
				209	ext_modules=[Extension("noddy", ["noddy.c"])])
				210
				211	in a file called :file:`setup.py`; then typing ::
				212
				213	$ python setup.py build
				214
				215	at a shell should produce a file :file:`noddy.so` in a subdirectory; move to
				216	that directory and fire up Python --- you should be able to ``import noddy`` and
				217	play around with Noddy objects.
				218
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	219	That wasn't so hard, was it?
				220
				221	Of course, the current Noddy type is pretty uninteresting. It has no data and
				222	doesn't do anything. It can't even be subclassed.
				223
				224
				225	Adding data and methods to the Basic example
				226	--------------------------------------------
				227
Let's extend the basic example to add some data and methods. Let's also make
the type usable as a base class. We'll create a new module, :mod:`noddy2`, that
adds these capabilities:
				231
				232	.. literalinclude:: ../includes/noddy2.c
				233
				234
				235	This version of the module has a number of changes.
				236
				237	We've added an extra include::
				238
				239	#include "structmember.h"
				240
				241	This include provides declarations that we use to handle attributes, as
				242	described a bit later.
				243
				244	The name of the :class:`Noddy` object structure has been shortened to
				245	:class:`Noddy`. The type object name has been shortened to :class:`NoddyType`.
				246
				247	The :class:`Noddy` type now has three data attributes, first, last, and
				248	number. The first and last variables are Python strings containing first
				249	and last names. The number attribute is an integer.
				250
				251	The object structure is updated accordingly::
				252
				253	typedef struct {
				254	PyObject_HEAD
				255	PyObject *first;
				256	PyObject *last;
				257	int number;
				258	} Noddy;
				259
				260	Because we now have data to manage, we have to be more careful about object
				261	allocation and deallocation. At a minimum, we need a deallocation method::
				262
   static void
   Noddy_dealloc(Noddy* self)
   {
       Py_XDECREF(self->first);
       Py_XDECREF(self->last);
       Py_TYPE(self)->tp_free((PyObject*)self);
   }
				270
				271	which is assigned to the :attr:`tp_dealloc` member::
				272
   (destructor)Noddy_dealloc, /*tp_dealloc*/
				274
				275	This method decrements the reference counts of the two Python attributes. We use
				276	:cfunc:`Py_XDECREF` here because the :attr:`first` and :attr:`last` members
				277	could be NULL. It then calls the :attr:`tp_free` member of the object's type
				278	to free the object's memory. Note that the object's type might not be
				279	:class:`NoddyType`, because the object may be an instance of a subclass.
				280
				281	We want to make sure that the first and last names are initialized to empty
				282	strings, so we provide a new method::
				283
   static PyObject *
   Noddy_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
   {
       Noddy *self;

       self = (Noddy *)type->tp_alloc(type, 0);
       if (self != NULL) {
           self->first = PyUnicode_FromString("");
           if (self->first == NULL) {
               Py_DECREF(self);
               return NULL;
           }

           self->last = PyUnicode_FromString("");
           if (self->last == NULL) {
               Py_DECREF(self);
               return NULL;
           }

           self->number = 0;
       }

       return (PyObject *)self;
   }
				310
				311	and install it in the :attr:`tp_new` member::
				312
				313	Noddy_new, /* tp_new */
				314
				315	The new member is responsible for creating (as opposed to initializing) objects
				316	of the type. It is exposed in Python as the :meth:`__new__` method. See the
				317	paper titled "Unifying types and classes in Python" for a detailed discussion of
				318	the :meth:`__new__` method. One reason to implement a new method is to assure
				319	the initial values of instance variables. In this case, we use the new method
				320	to make sure that the initial values of the members :attr:`first` and
				321	:attr:`last` are not NULL. If we didn't care whether the initial values were
				322	NULL, we could have used :cfunc:`PyType_GenericNew` as our new method, as we
				323	did before. :cfunc:`PyType_GenericNew` initializes all of the instance variable
				324	members to NULL.
				325
The new method is a static method that is passed the type being instantiated and
any arguments passed when the type was called, and that returns the new object
created. New methods always accept positional and keyword arguments, but they
often ignore the arguments, leaving the argument handling to initializer
methods. Note that if the type supports subclassing, the type passed may not be
the type being defined. The new method calls the :attr:`tp_alloc` slot to
allocate memory. We don't fill the :attr:`tp_alloc` slot ourselves. Rather,
:cfunc:`PyType_Ready` fills it for us by inheriting it from our base class,
which is :class:`object` by default. Most types use the default allocation.
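In Python terms, the new method plays the role of :meth:`__new__`. The
following pure-Python analogue is only an illustration of the division of
labour, not a description of how the C type is implemented; the class ``Sub``
and the sample values are assumptions made for the example.

```python
class Noddy:
    """Pure-Python stand-in for the C type, for illustration only."""

    def __new__(cls, *args, **kwds):
        # Arguments are accepted but ignored, as in the C new method.
        self = super().__new__(cls)
        self.first = ""
        self.last = ""
        self.number = 0
        return self

    def __init__(self, first=None, last=None, number=None):
        # Only overwrite the defaults that the caller actually supplied.
        if first is not None:
            self.first = first
        if last is not None:
            self.last = last
        if number is not None:
            self.number = number


class Sub(Noddy):
    pass


bare = Noddy.__new__(Noddy)       # creation without initialization
sub = Sub("Ada", "Lovelace", 7)   # __new__ receives Sub, not Noddy
```

The object in ``bare`` never went through the initializer, yet its attributes
already hold usable defaults, which is the guarantee the C new method provides.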
				335
				336	.. note::
				337
				338	If you are creating a co-operative :attr:`tp_new` (one that calls a base type's
				339	:attr:`tp_new` or :meth:`__new__`), you must not try to determine what method
				340	to call using method resolution order at runtime. Always statically determine
				341	what type you are going to call, and call its :attr:`tp_new` directly, or via
				342	``type->tp_base->tp_new``. If you do not do this, Python subclasses of your
				343	type that also inherit from other Python-defined classes may not work correctly.
				344	(Specifically, you may not be able to create instances of such subclasses
				345	without getting a :exc:`TypeError`.)
				346
				347	We provide an initialization function::
				348
   static int
   Noddy_init(Noddy *self, PyObject *args, PyObject *kwds)
   {
       PyObject *first=NULL, *last=NULL, *tmp;

       static char *kwlist[] = {"first", "last", "number", NULL};

       if (! PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
                                         &first, &last,
                                         &self->number))
           return -1;

       if (first) {
           tmp = self->first;
           Py_INCREF(first);
           self->first = first;
           Py_XDECREF(tmp);
       }

       if (last) {
           tmp = self->last;
           Py_INCREF(last);
           self->last = last;
           Py_XDECREF(tmp);
       }

       return 0;
   }
				377
				378	by filling the :attr:`tp_init` slot. ::
				379
				380	(initproc)Noddy_init, /* tp_init */
				381
				382	The :attr:`tp_init` slot is exposed in Python as the :meth:`__init__` method. It
				383	is used to initialize an object after it's created. Unlike the new method, we
				384	can't guarantee that the initializer is called. The initializer isn't called
				385	when unpickling objects and it can be overridden. Our initializer accepts
				386	arguments to provide initial values for our instance. Initializers always accept
				387	positional and keyword arguments.
				388
Initializers can be called multiple times. Anyone can call the :meth:`__init__`
method on our objects. For this reason, we have to be extra careful when
assigning the new values. We might be tempted, for example, to assign the
:attr:`first` member like this::
				393
				394	if (first) {
				395	Py_XDECREF(self->first);
				396	Py_INCREF(first);
				397	self->first = first;
				398	}
				399
				400	But this would be risky. Our type doesn't restrict the type of the
				401	:attr:`first` member, so it could be any kind of object. It could have a
				402	destructor that causes code to be executed that tries to access the
				403	:attr:`first` member. To be paranoid and protect ourselves against this
				404	possibility, we almost always reassign members before decrementing their
				405	reference counts. When don't we have to do this?
				406
* when we absolutely know that the reference count is greater than 1

* when we know that deallocation of the object [#]_ will not cause any calls
  back into our type's code

* when decrementing a reference count in a :attr:`tp_dealloc` handler when
  garbage collection is not supported [#]_
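The hazard is easier to picture from the Python side. In the following
CPython-specific sketch (the classes ``Holder`` and ``LooksBack`` are
hypothetical), a finalizer looks back at the object that has just released it.
CPython's own attribute store installs the new value before dropping the old
reference, so the finalizer sees a consistent object. A C setter written in the
risky order would instead expose a pointer to an object that is being destroyed.

```python
class Holder:
    pass


class LooksBack:
    """Object whose finalizer reads the attribute it used to be stored in."""

    def __init__(self, holder, log):
        self.holder = holder
        self.log = log

    def __del__(self):
        # Runs while the holder is releasing its old reference.
        self.log.append(self.holder.first)


def replace_first():
    """Overwrite holder.first and report what the finalizer observed."""
    log = []
    holder = Holder()
    holder.first = LooksBack(holder, log)
    holder.first = "new value"   # the old value is finalized during this store
    return log
```

On CPython, ``replace_first()`` reports that the finalizer saw the replacement
value, which is the ordering the temporary-variable idiom reproduces in C.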
				414
Christian Heimes	f75b290	2008-03-16 17:29:44 +0000	[diff] [blame]	415	We want to expose our instance variables as attributes. There are a
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	416	number of ways to do that. The simplest way is to define member definitions::
				417
				418	static PyMemberDef Noddy_members[] = {
				419	{"first", T_OBJECT_EX, offsetof(Noddy, first), 0,
				420	"first name"},
				421	{"last", T_OBJECT_EX, offsetof(Noddy, last), 0,
				422	"last name"},
				423	{"number", T_INT, offsetof(Noddy, number), 0,
				424	"noddy number"},
				425	{NULL} /* Sentinel */
				426	};
				427
				428	and put the definitions in the :attr:`tp_members` slot::
				429
				430	Noddy_members, /* tp_members */
				431
				432	Each member definition has a member name, type, offset, access flags and
				433	documentation string. See the "Generic Attribute Management" section below for
				434	details.
				435
				436	A disadvantage of this approach is that it doesn't provide a way to restrict the
				437	types of objects that can be assigned to the Python attributes. We expect the
				438	first and last names to be strings, but any Python objects can be assigned.
				439	Further, the attributes can be deleted, setting the C pointers to NULL. Even
				440	though we can make sure the members are initialized to non-NULL values, the
				441	members can be set to NULL if the attributes are deleted.
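The same behaviour can be observed without compiling anything, because
attributes declared through ``__slots__`` in a Python class are backed by the
same kind of member descriptor. The class ``SlotNoddy`` below is a hypothetical
stand-in for the C type.

```python
class SlotNoddy:
    __slots__ = ("first", "last")

    def __init__(self):
        self.first = ""
        self.last = ""


n = SlotNoddy()
n.first = 42        # accepted: the member places no restriction on the type
del n.last          # accepted: the underlying slot is now empty

try:
    n.last
except AttributeError:
    last_is_missing = True
else:
    last_is_missing = False

descriptor_kind = type(SlotNoddy.__dict__["first"]).__name__
```

Reading the deleted attribute raises :exc:`AttributeError`, which is the case
the C code has to guard against when it finds a NULL member.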
				442
We define a single method, :meth:`name`, that outputs the object's name as the
concatenation of the first and last names. ::
				445
   static PyObject *
   Noddy_name(Noddy* self)
   {
       static PyObject *format = NULL;
       PyObject *args, *result;

       if (format == NULL) {
           format = PyUnicode_FromString("%s %s");
           if (format == NULL)
               return NULL;
       }

       if (self->first == NULL) {
           PyErr_SetString(PyExc_AttributeError, "first");
           return NULL;
       }

       if (self->last == NULL) {
           PyErr_SetString(PyExc_AttributeError, "last");
           return NULL;
       }

       args = Py_BuildValue("OO", self->first, self->last);
       if (args == NULL)
           return NULL;

       result = PyUnicode_Format(format, args);
       Py_DECREF(args);

       return result;
   }
				477
The method is implemented as a C function that takes a :class:`Noddy` (or
:class:`Noddy` subclass) instance as the first argument. Methods always take an
instance as the first argument. Methods often take positional and keyword
arguments as well, but in this case we don't take any and don't need to accept
a positional argument tuple or keyword argument dictionary. This method is
equivalent to the Python method::
				484
   def name(self):
       return "%s %s" % (self.first, self.last)
				487
				488	Note that we have to check for the possibility that our :attr:`first` and
				489	:attr:`last` members are NULL. This is because they can be deleted, in which
				490	case they are set to NULL. It would be better to prevent deletion of these
				491	attributes and to restrict the attribute values to be strings. We'll see how to
				492	do that in the next section.
				493
				494	Now that we've defined the method, we need to create an array of method
				495	definitions::
				496
				497	static PyMethodDef Noddy_methods[] = {
				498	{"name", (PyCFunction)Noddy_name, METH_NOARGS,
				499	"Return the name, combining the first and last name"
				500	},
				501	{NULL} /* Sentinel */
				502	};
				503
				504	and assign them to the :attr:`tp_methods` slot::
				505
				506	Noddy_methods, /* tp_methods */
				507
				508	Note that we used the :const:`METH_NOARGS` flag to indicate that the method is
				509	passed no arguments.
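Methods declared this way reject any argument at call time. A quick check from
Python, using ``str.upper`` as an assumed example of a built-in method that
takes no arguments (it stands in for ``Noddy.name``, which would need the
compiled module):

```python
def call_with_extra_argument():
    """Call a no-argument built-in method with one argument too many."""
    try:
        "noddy".upper("unexpected")
    except TypeError as exc:
        return str(exc)      # the interpreter's complaint about the argument
    return None


error_message = call_with_extra_argument()
```

The call never reaches the C function body; the argument check happens in the
interpreter before the method is invoked.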
				510
				511	Finally, we'll make our type usable as a base class. We've written our methods
				512	carefully so far so that they don't make any assumptions about the type of the
				513	object being created or used, so all we need to do is to add the
				514	:const:`Py_TPFLAGS_BASETYPE` to our class flag definition::
				515
   Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /*tp_flags*/
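The effect of leaving this flag out can be seen with a built-in type that does
not allow subclassing, such as :class:`bool`. The helper ``accepts_subclasses``
is a hypothetical name used only in this sketch.

```python
def accepts_subclasses(base):
    """Return True if a Python class can inherit from *base*."""
    try:
        type("Sub", (base,), {})   # same as: class Sub(base): pass
    except TypeError:
        return False
    return True


int_ok = accepts_subclasses(int)
bool_ok = accepts_subclasses(bool)
```

The first version of :class:`Noddy` behaves like :class:`bool` here, and the
version with :const:`Py_TPFLAGS_BASETYPE` behaves like :class:`int`.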
				517
Georg Brandl	913b2a3	2008-12-05 15:12:15 +0000	[diff] [blame]	518	We rename :cfunc:`PyInit_noddy` to :cfunc:`PyInit_noddy2` and update the module
				519	name in the :ctype:`PyModuleDef` struct.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	520
				521	Finally, we update our :file:`setup.py` file to build the new module::
				522
				523	from distutils.core import setup, Extension
				524	setup(name="noddy", version="1.0",
				525	ext_modules=[
				526	Extension("noddy", ["noddy.c"]),
				527	Extension("noddy2", ["noddy2.c"]),
				528	])
				529
				530
				531	Providing finer control over data attributes
				532	--------------------------------------------
				533
				534	In this section, we'll provide finer control over how the :attr:`first` and
				535	:attr:`last` attributes are set in the :class:`Noddy` example. In the previous
				536	version of our module, the instance variables :attr:`first` and :attr:`last`
				537	could be set to non-string values or even deleted. We want to make sure that
				538	these attributes always contain strings.
				539
				540	.. literalinclude:: ../includes/noddy3.c
				541
				542
To provide greater control over the :attr:`first` and :attr:`last` attributes,
we'll use custom getter and setter functions. Here are the functions for
getting and setting the :attr:`first` attribute::
				546
   static PyObject *
   Noddy_getfirst(Noddy *self, void *closure)
   {
       Py_INCREF(self->first);
       return self->first;
   }

   static int
   Noddy_setfirst(Noddy *self, PyObject *value, void *closure)
   {
       if (value == NULL) {
           PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute");
           return -1;
       }

       if (! PyUnicode_Check(value)) {
           PyErr_SetString(PyExc_TypeError,
                           "The first attribute value must be a string");
           return -1;
       }

       Py_DECREF(self->first);
       Py_INCREF(value);
       self->first = value;

       return 0;
   }
				573
The getter function is passed a :class:`Noddy` object and a "closure", which is
a void pointer. In this case, the closure is ignored. (The closure supports an
advanced usage in which definition data is passed to the getter and setter. This
could, for example, be used to allow a single set of getter and setter functions
that decide the attribute to get or set based on data in the closure.)
				579
				580	The setter function is passed the :class:`Noddy` object, the new value, and the
				581	closure. The new value may be NULL, in which case the attribute is being
				582	deleted. In our setter, we raise an error if the attribute is deleted or if the
				583	attribute value is not a string.
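A rough Python counterpart of these getter and setter functions can be written
with :class:`property`. In this sketch, the factory ``string_attribute`` and its
*name* argument are assumptions chosen to mirror the closure: one pair of
functions serves both attributes.

```python
def string_attribute(name):
    """Build a property that stores only str values and refuses deletion."""
    slot = "_" + name   # private storage, decided by the closure-like *name*

    def getter(self):
        return getattr(self, slot)

    def setter(self, value):
        if not isinstance(value, str):
            raise TypeError(f"The {name} attribute value must be a string")
        setattr(self, slot, value)

    def deleter(self):
        raise TypeError(f"Cannot delete the {name} attribute")

    return property(getter, setter, deleter)


class Noddy:
    first = string_attribute("first")
    last = string_attribute("last")

    def __init__(self, first="", last=""):
        self.first = first
        self.last = last
```

Assigning a non-string or deleting either attribute raises :exc:`TypeError`,
matching the two error branches of the C setter.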
				584
				585	We create an array of :ctype:`PyGetSetDef` structures::
				586
				587	static PyGetSetDef Noddy_getseters[] = {
				588	{"first",
				589	(getter)Noddy_getfirst, (setter)Noddy_setfirst,
				590	"first name",
				591	NULL},
				592	{"last",
				593	(getter)Noddy_getlast, (setter)Noddy_setlast,
				594	"last name",
				595	NULL},
				596	{NULL} /* Sentinel */
				597	};
				598
and register it in the :attr:`tp_getset` slot::

   Noddy_getseters, /* tp_getset */
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	604
				605	The last item in a :ctype:`PyGetSetDef` structure is the closure mentioned
				606	above. In this case, we aren't using the closure, so we just pass NULL.
				607
				608	We also remove the member definitions for these attributes::
				609
				610	static PyMemberDef Noddy_members[] = {
				611	{"number", T_INT, offsetof(Noddy, number), 0,
				612	"noddy number"},
				613	{NULL} /* Sentinel */
				614	};
				615
				616	We also need to update the :attr:`tp_init` handler to only allow strings [#]_ to
				617	be passed::
				618
   static int
   Noddy_init(Noddy *self, PyObject *args, PyObject *kwds)
   {
       PyObject *first=NULL, *last=NULL, *tmp;

       static char *kwlist[] = {"first", "last", "number", NULL};

       if (! PyArg_ParseTupleAndKeywords(args, kwds, "|UUi", kwlist,
                                         &first, &last,
                                         &self->number))
           return -1;

       if (first) {
           tmp = self->first;
           Py_INCREF(first);
           self->first = first;
           Py_DECREF(tmp);
       }

       if (last) {
           tmp = self->last;
           Py_INCREF(last);
           self->last = last;
           Py_DECREF(tmp);
       }

       return 0;
   }
				647
With these changes, we can ensure that the :attr:`first` and :attr:`last`
members are never NULL, so we can remove checks for NULL values in almost all
cases. This means that most of the :cfunc:`Py_XDECREF` calls can be converted to
:cfunc:`Py_DECREF` calls. The only place we can't change these calls is in the
deallocator, where there is the possibility that the initialization of these
members failed in the constructor.
				654
				655	We also rename the module initialization function and module name in the
				656	initialization function, as we did before, and we add an extra definition to the
				657	:file:`setup.py` file.
				658
				659
				660	Supporting cyclic garbage collection
				661	------------------------------------
				662
				663	Python has a cyclic-garbage collector that can identify unneeded objects even
				664	when their reference counts are not zero. This can happen when objects are
				665	involved in cycles. For example, consider::
				666
				667	>>> l = []
				668	>>> l.append(l)
				669	>>> del l
				670
				671	In this example, we create a list that contains itself. When we delete it, it
				672	still has a reference from itself. Its reference count doesn't drop to zero.
				673	Fortunately, Python's cyclic-garbage collector will eventually figure out that
				674	the list is garbage and free it.
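The lifetime of such a cycle can be observed directly. Lists cannot be weakly
referenced, so this sketch uses a small hypothetical class ``Node`` and a weak
reference as a probe; automatic collection is switched off for the duration so
that the result does not depend on timing.

```python
import gc
import weakref


class Node:
    pass


def cycle_lifetime():
    """Return (alive_before, alive_after) around an explicit collection."""
    was_enabled = gc.isenabled()
    gc.disable()                 # keep the collector from running early
    try:
        node = Node()
        node.me = node           # the object now refers to itself
        probe = weakref.ref(node)
        del node                 # the reference count stays above zero
        alive_before = probe() is not None
        gc.collect()             # the cycle is found and freed here
        alive_after = probe() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_before, alive_after
```

The object survives the ``del`` statement and disappears only once the
collector has run.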
				675
				676	In the second version of the :class:`Noddy` example, we allowed any kind of
				677	object to be stored in the :attr:`first` or :attr:`last` attributes. [#]_ This
				678	means that :class:`Noddy` objects can participate in cycles::
				679
				680	>>> import noddy2
				681	>>> n = noddy2.Noddy()
				682	>>> l = [n]
				683	>>> n.first = l
				684
				685	This is pretty silly, but it gives us an excuse to add support for the
				686	cyclic-garbage collector to the :class:`Noddy` example. To support cyclic
				687	garbage collection, types need to fill two slots and set a class flag that
				688	enables these slots:
				689
				690	.. literalinclude:: ../includes/noddy4.c
				691
				692
				693	The traversal method provides access to subobjects that could participate in
				694	cycles::
				695
   static int
   Noddy_traverse(Noddy *self, visitproc visit, void *arg)
   {
       int vret;

       if (self->first) {
           vret = visit(self->first, arg);
           if (vret != 0)
               return vret;
       }
       if (self->last) {
           vret = visit(self->last, arg);
           if (vret != 0)
               return vret;
       }

       return 0;
   }
				714
				715	For each subobject that can participate in cycles, we need to call the
				716	:cfunc:`visit` function, which is passed to the traversal method. The
				717	:cfunc:`visit` function takes as arguments the subobject and the extra argument
				718	arg passed to the traversal method. It returns an integer value that must be
				719	returned if it is non-zero.
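The result of a traversal function can be inspected from Python with
:func:`gc.get_referents`, which returns the objects that a container's
C-level traversal reports. The sketch uses a list as the container, since
checking :class:`Noddy` itself would need the compiled module; the wrapper
``referents`` is a hypothetical name.

```python
import gc


def referents(container):
    """Objects that the traversal function of *container* hands to the visitor."""
    return gc.get_referents(container)


first, last = "Ada", "Lovelace"
names = [first, last]

visited = referents(names)       # the list reports both of its items
visited_for_int = referents(42)  # an int has no traversal function
```

Once :class:`Noddy` supports traversal, calling :func:`gc.get_referents` on an
instance should report its ``first`` and ``last`` values in the same way.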
				720
Georg Brandl	e6bcc91	2008-05-12 18:05:20 +0000	[diff] [blame]	721	Python provides a :cfunc:`Py_VISIT` macro that automates calling visit
				722	functions. With :cfunc:`Py_VISIT`, :cfunc:`Noddy_traverse` can be simplified::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	723
   static int
   Noddy_traverse(Noddy *self, visitproc visit, void *arg)
   {
       Py_VISIT(self->first);
       Py_VISIT(self->last);
       return 0;
   }
				731
				732	.. note::
				733
   The :attr:`tp_traverse` implementation must name its arguments exactly
   *visit* and *arg* in order to use :cfunc:`Py_VISIT`. This is to encourage
   uniformity across these boring implementations.
				737
				738	We also need to provide a method for clearing any subobjects that can
				739	participate in cycles. We implement the method and reimplement the deallocator
				740	to use it::
				741
   static int
   Noddy_clear(Noddy *self)
   {
       PyObject *tmp;

       tmp = self->first;
       self->first = NULL;
       Py_XDECREF(tmp);

       tmp = self->last;
       self->last = NULL;
       Py_XDECREF(tmp);

       return 0;
   }

   static void
   Noddy_dealloc(Noddy* self)
   {
       Noddy_clear(self);
       Py_TYPE(self)->tp_free((PyObject*)self);
   }
				764
				765	Notice the use of a temporary variable in :cfunc:`Noddy_clear`. We use the
				766	temporary variable so that we can set each member to NULL before decrementing
				767	its reference count. We do this because, as was discussed earlier, if the
				768	reference count drops to zero, we might cause code to run that calls back into
				769	the object. In addition, because we now support garbage collection, we also
				770	have to worry about code being run that triggers garbage collection. If garbage
				771	collection is run, our :attr:`tp_traverse` handler could get called. We can't
				772	take a chance of having :cfunc:`Noddy_traverse` called when a member's reference
				773	count has dropped to zero and its value hasn't been set to NULL.
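
The ordering argument above can be tried out in a standalone model. The sketch below is plain C and does not use the Python API: ``Item``, ``Owner``, ``item_release`` and the other names are hypothetical stand-ins, and the "destructor callback" plays the role of arbitrary code that runs when a reference count reaches zero and looks back at the owner.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy reference-counted object; a stand-in, NOT the Python API. */
typedef struct Item {
    int refcnt;
    void (*on_free)(struct Item *, void *); /* runs when refcnt hits 0 */
    void *ctx;
} Item;

typedef struct {
    Item *first;        /* owned reference, may be NULL */
    int saw_dangling;   /* set if a callback saw a stale pointer */
} Owner;

static void item_release(Item *it)
{
    if (it != NULL && --it->refcnt == 0) {
        if (it->on_free != NULL)
            it->on_free(it, it->ctx);   /* may look back at the owner */
        free(it);
    }
}

/* Stands in for code that re-enters the owner during destruction. */
static void inspect_owner(Item *dying, void *ctx)
{
    Owner *o = ctx;
    if (o->first == dying)
        o->saw_dangling = 1;    /* owner still points at the dying item */
}

/* Unsafe order: release first, reset the member afterwards. */
static void owner_clear_unsafe(Owner *o)
{
    item_release(o->first);
    o->first = NULL;
}

/* Safe order, as in Noddy_clear: detach the member, then release it. */
static void owner_clear_safe(Owner *o)
{
    Item *tmp = o->first;
    o->first = NULL;
    item_release(tmp);
}

/* Returns 1 if the callback observed a stale pointer, else 0. */
int stale_pointer_seen(int safe)
{
    Owner o = { NULL, 0 };
    Item *it = malloc(sizeof *it);

    if (it == NULL)
        abort();
    it->refcnt = 1;
    it->on_free = inspect_owner;
    it->ctx = &o;
    o.first = it;

    if (safe)
        owner_clear_safe(&o);
    else
        owner_clear_unsafe(&o);
    return o.saw_dangling;
}
```

In this model only the unsafe ordering lets the callback observe a member that still points at an object being destroyed; with the temporary variable the member is already NULL by the time any other code can run.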
				774
Georg Brandl	e6bcc91	2008-05-12 18:05:20 +0000	[diff] [blame]	775	Python provides a :cfunc:`Py_CLEAR` that automates the careful decrementing of
				776	reference counts. With :cfunc:`Py_CLEAR`, the :cfunc:`Noddy_clear` function can
				777	be simplified::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	778
				779	static int
				780	Noddy_clear(Noddy *self)
				781	{
				782	Py_CLEAR(self->first);
				783	Py_CLEAR(self->last);
				784	return 0;
				785	}
				786
				787	Finally, we add the :const:`Py_TPFLAGS_HAVE_GC` flag to the class flags::
				788
   Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC, /* tp_flags */
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	790
				791	That's pretty much it. If we had written custom :attr:`tp_alloc` or
				792	:attr:`tp_free` slots, we'd need to modify them for cyclic-garbage collection.
				793	Most extensions will use the versions automatically provided.
				794
				795
				796	Subclassing other types
				797	-----------------------
				798
				799	It is possible to create new extension types that are derived from existing
types. It is easiest to inherit from the built-in types, since an extension can
				801	easily use the :class:`PyTypeObject` it needs. It can be difficult to share
				802	these :class:`PyTypeObject` structures between extension modules.
				803
				804	In this example we will create a :class:`Shoddy` type that inherits from the
Georg Brandl	22b3431	2009-07-26 14:54:51 +0000	[diff] [blame]	805	built-in :class:`list` type. The new type will be completely compatible with
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	806	regular lists, but will have an additional :meth:`increment` method that
				807	increases an internal counter. ::
				808
				809	>>> import shoddy
				810	>>> s = shoddy.Shoddy(range(3))
				811	>>> s.extend(s)
Georg Brandl	6911e3c	2007-09-04 07:15:32 +0000	[diff] [blame]	812	>>> print(len(s))
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	813	6
Georg Brandl	6911e3c	2007-09-04 07:15:32 +0000	[diff] [blame]	814	>>> print(s.increment())
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	815	1
Georg Brandl	6911e3c	2007-09-04 07:15:32 +0000	[diff] [blame]	816	>>> print(s.increment())
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	817	2
				818
				819	.. literalinclude:: ../includes/shoddy.c
				820
				821
				822	As you can see, the source code closely resembles the :class:`Noddy` examples in
				823	previous sections. We will break down the main differences between them. ::
				824
				825	typedef struct {
Georg Brandl	a1c6a1c	2009-01-03 21:26:05 +0000	[diff] [blame]	826	PyListObject list;
				827	int state;
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	828	} Shoddy;
				829
The primary difference for derived type objects is that the base type's object
structure must be the first member of the derived structure. The base type will
already include the :cfunc:`PyObject_HEAD` at the beginning of its structure.
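
The layout rule can be illustrated without any Python headers. The following is a minimal sketch in plain C; ``ToyObject``, ``ToyList`` and ``ToyShoddy`` are hypothetical stand-ins for :ctype:`PyObject`, :ctype:`PyListObject` and :ctype:`Shoddy`, not the real structures. Because each structure begins with its base structure, a pointer to the derived instance is also a valid pointer to each base.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real structures have more fields. */
typedef struct { long refcnt; const char *type_name; } ToyObject;
typedef struct { ToyObject head; int length; } ToyList;   /* base type */
typedef struct { ToyList list; int state; } ToyShoddy;    /* derived type */

/* Code written for the base type works on a derived instance,
   because the base structure sits at offset 0. */
int toy_list_length(ToyObject *ob)
{
    return ((ToyList *)ob)->length;
}

/* Code written for the derived type reaches the extra member
   through the very same pointer. */
int toy_shoddy_increment(ToyObject *ob)
{
    ToyShoddy *self = (ToyShoddy *)ob;
    return ++self->state;
}
```

If the base structure were placed anywhere other than first, the cast in ``toy_list_length`` would read the wrong bytes, which is why the ordering is mandatory rather than a convention.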
				833
				834	When a Python object is a :class:`Shoddy` instance, its PyObject\* pointer can
				835	be safely cast to both PyListObject\* and Shoddy\*. ::
				836
   static int
   Shoddy_init(Shoddy *self, PyObject *args, PyObject *kwds)
   {
       if (PyList_Type.tp_init((PyObject *)self, args, kwds) < 0)
           return -1;
       self->state = 0;
       return 0;
   }
				845
				846	In the :attr:`__init__` method for our type, we can see how to call through to
				847	the :attr:`__init__` method of the base type.
				848
This pattern is important when writing a type with custom :attr:`tp_new` and
:attr:`tp_dealloc` methods. The :attr:`tp_new` method should not allocate the
memory for the object with :attr:`tp_alloc` itself; that is handled by the base
class when its :attr:`tp_new` is called.
				853
When filling out the :ctype:`PyTypeObject` for the :class:`Shoddy` type, you see
a slot for :attr:`tp_base`. Due to cross-platform compiler issues, you can't
fill that field directly with the address of :cdata:`PyList_Type`; it can be
done later in the module's initialization function. ::
				858
				859	PyMODINIT_FUNC
Georg Brandl	913b2a3	2008-12-05 15:12:15 +0000	[diff] [blame]	860	PyInit_shoddy(void)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	861	{
Georg Brandl	a1c6a1c	2009-01-03 21:26:05 +0000	[diff] [blame]	862	PyObject *m;
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	863
Georg Brandl	a1c6a1c	2009-01-03 21:26:05 +0000	[diff] [blame]	864	ShoddyType.tp_base = &PyList_Type;
				865	if (PyType_Ready(&ShoddyType) < 0)
				866	return NULL;
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	867
Georg Brandl	a1c6a1c	2009-01-03 21:26:05 +0000	[diff] [blame]	868	m = PyModule_Create(&shoddymodule);
				869	if (m == NULL)
				870	return NULL;
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	871
Georg Brandl	a1c6a1c	2009-01-03 21:26:05 +0000	[diff] [blame]	872	Py_INCREF(&ShoddyType);
				873	PyModule_AddObject(m, "Shoddy", (PyObject *) &ShoddyType);
Georg Brandl	2115176	2009-03-31 15:52:41 +0000	[diff] [blame]	874	return m;
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	875	}
				876
Before calling :cfunc:`PyType_Ready`, the type structure must have the
:attr:`tp_base` slot filled in. When we are deriving a new type, it is not
necessary to fill in the :attr:`tp_new` slot with :cfunc:`PyType_GenericNew`;
the base type's :attr:`tp_new` and :attr:`tp_alloc` handlers will be inherited.
				881
				882	After that, calling :cfunc:`PyType_Ready` and adding the type object to the
				883	module is the same as with the basic :class:`Noddy` examples.
				884
				885
				886	.. _dnt-type-methods:
				887
				888	Type Methods
				889	============
				890
				891	This section aims to give a quick fly-by on the various type methods you can
				892	implement and what they do.
				893
				894	Here is the definition of :ctype:`PyTypeObject`, with some fields only used in
				895	debug builds omitted:
				896
				897	.. literalinclude:: ../includes/typestruct.h
				898
				899
				900	Now that's a lot of methods. Don't worry too much though - if you have a type
				901	you want to define, the chances are very good that you will only implement a
				902	handful of these.
				903
				904	As you probably expect by now, we're going to go over this and give more
				905	information about the various handlers. We won't go in the order they are
				906	defined in the structure, because there is a lot of historical baggage that
				907	impacts the ordering of the fields; be sure your type initialization keeps the
				908	fields in the right order! It's often easiest to find an example that includes
				909	all the fields you need (even if they're initialized to ``0``) and then change
				910	the values to suit your new type. ::
				911
   char *tp_name; /* For printing */
				913
				914	The name of the type - as mentioned in the last section, this will appear in
				915	various places, almost entirely for diagnostic purposes. Try to choose something
				916	that will be helpful in such a situation! ::
				917
				918	int tp_basicsize, tp_itemsize; /* For allocation */
				919
				920	These fields tell the runtime how much memory to allocate when new objects of
				921	this type are created. Python has some built-in support for variable length
				922	structures (think: strings, lists) which is where the :attr:`tp_itemsize` field
				923	comes in. This will be dealt with later. ::
				924
				925	char *tp_doc;
				926
				927	Here you can put a string (or its address) that you want returned when the
				928	Python script references ``obj.__doc__`` to retrieve the doc string.
				929
				930	Now we come to the basic type methods---the ones most extension types will
				931	implement.
				932
				933
				934	Finalization and De-allocation
				935	------------------------------
				936
				937	.. index::
				938	single: object; deallocation
				939	single: deallocation, object
				940	single: object; finalization
				941	single: finalization, of objects
				942
				943	::
				944
				945	destructor tp_dealloc;
				946
				947	This function is called when the reference count of the instance of your type is
				948	reduced to zero and the Python interpreter wants to reclaim it. If your type
				949	has memory to free or other clean-up to perform, put it here. The object itself
				950	needs to be freed here as well. Here is an example of this function::
				951
				952	static void
				953	newdatatype_dealloc(newdatatypeobject * obj)
				954	{
				955	free(obj->obj_UnderlyingDatatypePtr);
Georg Brandl	2ed237b	2008-12-07 14:09:20 +0000	[diff] [blame]	956	Py_TYPE(obj)->tp_free(obj);
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	957	}
				958
				959	.. index::
				960	single: PyErr_Fetch()
				961	single: PyErr_Restore()
				962
				963	One important requirement of the deallocator function is that it leaves any
				964	pending exceptions alone. This is important since deallocators are frequently
				965	called as the interpreter unwinds the Python stack; when the stack is unwound
				966	due to an exception (rather than normal returns), nothing is done to protect the
				967	deallocators from seeing that an exception has already been set. Any actions
				968	which a deallocator performs which may cause additional Python code to be
				969	executed may detect that an exception has been set. This can lead to misleading
				970	errors from the interpreter. The proper way to protect against this is to save
a pending exception before performing the unsafe action, and to restore it when
				972	done. This can be done using the :cfunc:`PyErr_Fetch` and
				973	:cfunc:`PyErr_Restore` functions::
				974
   static void
   my_dealloc(PyObject *obj)
   {
       MyObject *self = (MyObject *) obj;
       PyObject *cbresult;

       if (self->my_callback != NULL) {
           PyObject *err_type, *err_value, *err_traceback;
           int have_error = PyErr_Occurred() ? 1 : 0;

           if (have_error)
               PyErr_Fetch(&err_type, &err_value, &err_traceback);

           cbresult = PyObject_CallObject(self->my_callback, NULL);
           if (cbresult == NULL)
               PyErr_WriteUnraisable(self->my_callback);
           else
               Py_DECREF(cbresult);

           if (have_error)
               PyErr_Restore(err_type, err_value, err_traceback);

           Py_DECREF(self->my_callback);
       }
       Py_TYPE(obj)->tp_free((PyObject*)self);
   }
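
The save-and-restore discipline can also be seen in isolation. The sketch below is a plain C model under loud assumptions: it keeps a single global "pending error" slot and uses hypothetical ``toy_*`` helpers, none of which belong to the Python API. It only mimics the shape of the example above, where the pending exception is stashed, the callback runs, the callback's own failure is dealt with, and the original exception is put back.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One global slot standing in for the interpreter's pending exception. */
static const char *toy_pending = NULL;

void toy_set_error(const char *msg) { toy_pending = msg; }
const char *toy_pending_error(void) { return toy_pending; }

/* Take the pending error out of the slot, leaving the slot empty. */
static const char *toy_fetch(void)
{
    const char *e = toy_pending;
    toy_pending = NULL;
    return e;
}

/* Put a previously fetched error back into the slot. */
static void toy_restore(const char *e) { toy_pending = e; }

/* A callback that fails and leaves its own error pending. */
int toy_failing_callback(void)
{
    toy_set_error("callback failed");
    return -1;
}

/* Unprotected cleanup: the callback's error replaces whatever was
   pending when the deallocator started. */
void toy_dealloc_unprotected(int (*cb)(void))
{
    (void)cb();
}

/* Protected cleanup: stash the pending error, run the callback,
   discard the callback's own error (the real example reports it
   instead), then restore the original error. */
void toy_dealloc_protected(int (*cb)(void))
{
    const char *saved = toy_fetch();

    if (cb() < 0)
        (void)toy_fetch();
    toy_restore(saved);
}
```

With the protected variant the caller that was unwinding still sees its own error afterwards; with the unprotected variant it would see the callback's error instead, which is the kind of misleading report described above.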
				1001
				1002
				1003	Object Presentation
				1004	-------------------
				1005
				1006	.. index::
				1007	builtin: repr
				1008	builtin: str
				1009
				1010	In Python, there are two ways to generate a textual representation of an object:
				1011	the :func:`repr` function, and the :func:`str` function. (The :func:`print`
				1012	function just calls :func:`str`.) These handlers are both optional.
				1013
				1014	::
				1015
				1016	reprfunc tp_repr;
				1017	reprfunc tp_str;
				1018
				1019	The :attr:`tp_repr` handler should return a string object containing a
				1020	representation of the instance for which it is called. Here is a simple
				1021	example::
				1022
   static PyObject *
   newdatatype_repr(newdatatypeobject * obj)
   {
       return PyUnicode_FromFormat("Repr-ified_newdatatype{{size:%d}}",
                                   obj->obj_UnderlyingDatatypePtr->size);
   }
				1029
				1030	If no :attr:`tp_repr` handler is specified, the interpreter will supply a
				1031	representation that uses the type's :attr:`tp_name` and a uniquely-identifying
				1032	value for the object.
				1033
				1034	The :attr:`tp_str` handler is to :func:`str` what the :attr:`tp_repr` handler
				1035	described above is to :func:`repr`; that is, it is called when Python code calls
				1036	:func:`str` on an instance of your object. Its implementation is very similar
				1037	to the :attr:`tp_repr` function, but the resulting string is intended for human
				1038	consumption. If :attr:`tp_str` is not specified, the :attr:`tp_repr` handler is
				1039	used instead.
				1040
				1041	Here is a simple example::
				1042
   static PyObject *
   newdatatype_str(newdatatypeobject * obj)
   {
       return PyUnicode_FromFormat("Stringified_newdatatype{{size:%d}}",
                                   obj->obj_UnderlyingDatatypePtr->size);
   }
				1049
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1050
				1051
				1052	Attribute Management
				1053	--------------------
				1054
				1055	For every object which can support attributes, the corresponding type must
				1056	provide the functions that control how the attributes are resolved. There needs
				1057	to be a function which can retrieve attributes (if any are defined), and another
				1058	to set attributes (if setting attributes is allowed). Removing an attribute is
				1059	a special case, for which the new value passed to the handler is NULL.
				1060
				1061	Python supports two pairs of attribute handlers; a type that supports attributes
				1062	only needs to implement the functions for one pair. The difference is that one
				1063	pair takes the name of the attribute as a :ctype:`char\*`, while the other
				1064	accepts a :ctype:`PyObject\*`. Each type can use whichever pair makes more
				1065	sense for the implementation's convenience. ::
				1066
				1067	getattrfunc tp_getattr; /* char * version */
				1068	setattrfunc tp_setattr;
				1069	/* ... */
Amaury Forgeot d'Arc	87ce6d7	2008-07-02 22:59:48 +0000	[diff] [blame]	1070	getattrofunc tp_getattro; /* PyObject * version */
				1071	setattrofunc tp_setattro;
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1072
				1073	If accessing attributes of an object is always a simple operation (this will be
				1074	explained shortly), there are generic implementations which can be used to
				1075	provide the :ctype:`PyObject\*` version of the attribute management functions.
The actual need for type-specific attribute handlers almost completely
disappeared starting with Python 2.2, though many existing examples have not
been updated to use the generic mechanisms that are now available.
				1079
				1080
				1081	Generic Attribute Management
				1082	^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				1083
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1084	Most extension types only use simple attributes. So, what makes the
				1085	attributes simple? There are only a couple of conditions that must be met:
				1086
#. The names of the attributes must be known when :cfunc:`PyType_Ready` is
   called.
				1089
				1090	#. No special processing is needed to record that an attribute was looked up or
				1091	set, nor do actions need to be taken based on the value.
				1092
				1093	Note that this list does not place any restrictions on the values of the
				1094	attributes, when the values are computed, or how relevant data is stored.
				1095
				1096	When :cfunc:`PyType_Ready` is called, it uses three tables referenced by the
Georg Brandl	9afde1c	2007-11-01 20:32:30 +0000	[diff] [blame]	1097	type object to create :term:`descriptor`\s which are placed in the dictionary of the
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1098	type object. Each descriptor controls access to one attribute of the instance
object. Each of the tables is optional; if all three are NULL, instances of
the type will only have attributes that are inherited from their base type, and
the type should leave the :attr:`tp_getattro` and :attr:`tp_setattro` fields
NULL as well, allowing the base type to handle attributes.
				1103
				1104	The tables are declared as three fields of the type object::
				1105
				1106	struct PyMethodDef *tp_methods;
				1107	struct PyMemberDef *tp_members;
				1108	struct PyGetSetDef *tp_getset;
				1109
				1110	If :attr:`tp_methods` is not NULL, it must refer to an array of
				1111	:ctype:`PyMethodDef` structures. Each entry in the table is an instance of this
				1112	structure::
				1113
   typedef struct PyMethodDef {
       char        *ml_name;       /* method name */
       PyCFunction  ml_meth;       /* implementation function */
       int          ml_flags;      /* flags */
       char        *ml_doc;        /* docstring */
   } PyMethodDef;
				1120
				1121	One entry should be defined for each method provided by the type; no entries are
				1122	needed for methods inherited from a base type. One additional entry is needed
				1123	at the end; it is a sentinel that marks the end of the array. The
				1124	:attr:`ml_name` field of the sentinel must be NULL.
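
To make the role of the sentinel concrete, here is a small standalone model in plain C. It is only a sketch: ``ToyMethodDef``, ``toy_methods`` and ``toy_find_method`` are hypothetical names, and the entries hold ordinary ``int (*)(int)`` functions rather than :ctype:`PyCFunction` pointers. The point is that code walking the table has no length to consult; it stops when it meets the entry whose name is NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*toy_func)(int);

/* Simplified table entry; the real PyMethodDef also carries flags. */
typedef struct {
    const char *name;   /* NULL marks the end of the table */
    toy_func    meth;   /* implementation function */
    const char *doc;    /* docstring */
} ToyMethodDef;

static int toy_double(int x) { return 2 * x; }
static int toy_negate(int x) { return -x; }

const ToyMethodDef toy_methods[] = {
    {"double", toy_double, "Return twice the argument."},
    {"negate", toy_negate, "Return the negated argument."},
    {NULL, NULL, NULL}  /* sentinel */
};

/* Walk the table until the sentinel; return the matching entry
   or NULL if no entry has the requested name. */
const ToyMethodDef *toy_find_method(const ToyMethodDef *table,
                                    const char *name)
{
    for (; table->name != NULL; table++) {
        if (strcmp(table->name, name) == 0)
            return table;
    }
    return NULL;
}
```

Leaving out the sentinel in such a table would make the loop run past the end of the array, so the final all-NULL entry is not optional.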
				1125
				1126	XXX Need to refer to some unified discussion of the structure fields, shared
				1127	with the next section.
				1128
				1129	The second table is used to define attributes which map directly to data stored
				1130	in the instance. A variety of primitive C types are supported, and access may
				1131	be read-only or read-write. The structures in the table are defined as::
				1132
				1133	typedef struct PyMemberDef {
				1134	char *name;
				1135	int type;
				1136	int offset;
				1137	int flags;
				1138	char *doc;
				1139	} PyMemberDef;
				1140
Georg Brandl	9afde1c	2007-11-01 20:32:30 +0000	[diff] [blame]	1141	For each entry in the table, a :term:`descriptor` will be constructed and added to the
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1142	type which will be able to extract a value from the instance structure. The
				1143	:attr:`type` field should contain one of the type codes defined in the
				1144	:file:`structmember.h` header; the value will be used to determine how to
				1145	convert Python values to and from C values. The :attr:`flags` field is used to
				1146	store flags which control how the attribute can be accessed.
				1147
				1148	XXX Need to move some of this to a shared section!
				1149
				1150	The following flag constants are defined in :file:`structmember.h`; they may be
				1151	combined using bitwise-OR.
				1152
+---------------------------+----------------------------------------------+
| Constant                  | Meaning                                      |
+===========================+==============================================+
| :const:`READONLY`         | Never writable.                              |
+---------------------------+----------------------------------------------+
| :const:`READ_RESTRICTED`  | Not readable in restricted mode.             |
+---------------------------+----------------------------------------------+
| :const:`WRITE_RESTRICTED` | Not writable in restricted mode.             |
+---------------------------+----------------------------------------------+
| :const:`RESTRICTED`       | Not readable or writable in restricted mode. |
+---------------------------+----------------------------------------------+
				1164
				1165	.. index::
				1166	single: READONLY
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1167	single: READ_RESTRICTED
				1168	single: WRITE_RESTRICTED
				1169	single: RESTRICTED
				1170
				1171	An interesting advantage of using the :attr:`tp_members` table to build
				1172	descriptors that are used at runtime is that any attribute defined this way can
				1173	have an associated doc string simply by providing the text in the table. An
				1174	application can use the introspection API to retrieve the descriptor from the
				1175	class object, and get the doc string using its :attr:`__doc__` attribute.
				1176
				1177	As with the :attr:`tp_methods` table, a sentinel entry with a :attr:`name` value
				1178	of NULL is required.
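
How a table entry can "map directly to data stored in the instance" is easiest to see in a model. The sketch below is plain C with hypothetical names (``ToyMemberDef``, ``TOY_INT``, ``TOY_READONLY`` and so on); it is not how the interpreter implements member descriptors, only an illustration of the idea that a name, a type code, a byte offset and a flag are enough to read or write a field of an arbitrary instance.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TOY_INT, TOY_DOUBLE };   /* hypothetical type codes */
enum { TOY_READONLY = 1 };      /* hypothetical access flag */

typedef struct {
    const char *name;   /* NULL marks the end of the table */
    int         type;
    size_t      offset; /* byte offset of the field in the instance */
    int         flags;
    const char *doc;
} ToyMemberDef;

typedef struct { int number; int serial; double ratio; } ToyInstance;

const ToyMemberDef toy_members[] = {
    {"number", TOY_INT, offsetof(ToyInstance, number), 0, "a counter"},
    {"serial", TOY_INT, offsetof(ToyInstance, serial), TOY_READONLY,
     "a fixed id"},
    {"ratio", TOY_DOUBLE, offsetof(ToyInstance, ratio), 0, "a ratio"},
    {NULL, 0, 0, 0, NULL}   /* sentinel */
};

/* Read a member as a double: 0 on success, -1 for an unknown name. */
int toy_get_member(const void *inst, const char *name, double *out)
{
    const ToyMemberDef *m;

    for (m = toy_members; m->name != NULL; m++) {
        const char *addr = (const char *)inst + m->offset;

        if (strcmp(m->name, name) != 0)
            continue;
        if (m->type == TOY_INT) {
            int v;
            memcpy(&v, addr, sizeof v);
            *out = (double)v;
        } else {
            memcpy(out, addr, sizeof *out);
        }
        return 0;
    }
    return -1;
}

/* Store an int member: -1 if unknown, not an int, or read-only. */
int toy_set_int_member(void *inst, const char *name, int value)
{
    const ToyMemberDef *m;

    for (m = toy_members; m->name != NULL; m++) {
        if (strcmp(m->name, name) != 0)
            continue;
        if (m->type != TOY_INT || (m->flags & TOY_READONLY))
            return -1;
        memcpy((char *)inst + m->offset, &value, sizeof value);
        return 0;
    }
    return -1;
}
```

The type code decides how the bytes at the offset are interpreted, and the flag decides whether a store is allowed, which mirrors the roles the text assigns to the ``type`` and ``flags`` fields.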
				1179
.. XXX Descriptors need to be explained in more detail somewhere, but not here.

   Descriptor objects have two handler functions which correspond to the
   \member{tp_getattro} and \member{tp_setattro} handlers. The
   \method{__get__()} handler is a function which is passed the descriptor,
   instance, and type objects, and returns the value of the attribute, or it
   returns \NULL{} and sets an exception. The \method{__set__()} handler is
   passed the descriptor, instance, type, and new value;
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1188
				1189
				1190	Type-specific Attribute Management
				1191	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				1192
				1193	For simplicity, only the :ctype:`char\*` version will be demonstrated here; the
				1194	type of the name parameter is the only difference between the :ctype:`char\*`
				1195	and :ctype:`PyObject\*` flavors of the interface. This example effectively does
				1196	the same thing as the generic example above, but does not use the generic
Amaury Forgeot d'Arc	87ce6d7	2008-07-02 22:59:48 +0000	[diff] [blame]	1197	support added in Python 2.2. It explains how the handler functions are
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1198	called, so that if you do need to extend their functionality, you'll understand
				1199	what needs to be done.
				1200
				1201	The :attr:`tp_getattr` handler is called when the object requires an attribute
				1202	look-up. It is called in the same situations where the :meth:`__getattr__`
				1203	method of a class would be called.
				1204
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1205	Here is an example::
				1206
   static PyObject *
   newdatatype_getattr(newdatatypeobject *obj, char *name)
   {
       if (strcmp(name, "data") == 0)
       {
           return PyLong_FromLong(obj->data);
       }

       PyErr_Format(PyExc_AttributeError,
                    "'%.50s' object has no attribute '%.400s'",
                    Py_TYPE(obj)->tp_name, name);
       return NULL;
   }
				1220
				1221	The :attr:`tp_setattr` handler is called when the :meth:`__setattr__` or
				1222	:meth:`__delattr__` method of a class instance would be called. When an
				1223	attribute should be deleted, the third parameter will be NULL. Here is an
				1224	example that simply raises an exception; if this were really all you wanted, the
				1225	:attr:`tp_setattr` handler should be set to NULL. ::
				1226
   static int
   newdatatype_setattr(newdatatypeobject *obj, char *name, PyObject *v)
   {
       (void)PyErr_Format(PyExc_RuntimeError, "Read-only attribute: %s", name);
       return -1;
   }
				1233
Georg Brandl	890a49a	2009-03-31 18:56:38 +0000	[diff] [blame]	1234	Object Comparison
				1235	-----------------
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1236
Georg Brandl	890a49a	2009-03-31 18:56:38 +0000	[diff] [blame]	1237	::
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	1238
Georg Brandl	890a49a	2009-03-31 18:56:38 +0000	[diff] [blame]	1239	richcmpfunc tp_richcompare;
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	1240
Georg Brandl	890a49a	2009-03-31 18:56:38 +0000	[diff] [blame]	1241	The :attr:`tp_richcompare` handler is called when comparisons are needed. It is
				1242	analogous to the :ref:`rich comparison methods <richcmpfuncs>`, like
				1243	:meth:`__lt__`, and also called by :cfunc:`PyObject_RichCompare` and
				1244	:cfunc:`PyObject_RichCompareBool`.
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	1245
Georg Brandl	890a49a	2009-03-31 18:56:38 +0000	[diff] [blame]	1246	This function is called with two Python objects and the operator as arguments,
where the operator is one of ``Py_EQ``, ``Py_NE``, ``Py_LE``, ``Py_GE``,
``Py_LT`` or ``Py_GT``. It should compare the two objects with respect to the
specified operator and return ``Py_True`` or ``Py_False`` if the comparison is
successful, ``Py_NotImplemented`` to indicate that comparison is not
				1251	implemented and the other object's comparison method should be tried, or NULL
				1252	if an exception was set.
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	1253
Here is a sample implementation, for a datatype whose instances are considered
equal if the sizes stored in their underlying data structures are equal::

   static PyObject *
   newdatatype_richcmp(PyObject *obj1, PyObject *obj2, int op)
   {
       PyObject *result;
       int c, size1, size2;

       /* code to make sure that both arguments are of type
          newdatatype omitted */

       size1 = ((newdatatypeobject *)obj1)->obj_UnderlyingDatatypePtr->size;
       size2 = ((newdatatypeobject *)obj2)->obj_UnderlyingDatatypePtr->size;

       switch (op) {
       case Py_LT: c = size1 <  size2; break;
       case Py_LE: c = size1 <= size2; break;
       case Py_EQ: c = size1 == size2; break;
       case Py_NE: c = size1 != size2; break;
       case Py_GT: c = size1 >  size2; break;
       case Py_GE: c = size1 >= size2; break;
       default:
           /* unknown operator: let the other object try */
           Py_INCREF(Py_NotImplemented);
           return Py_NotImplemented;
       }
       result = c ? Py_True : Py_False;
       Py_INCREF(result);
       return result;
   }
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1281
				1282
				1283	Abstract Protocol Support
				1284	-------------------------
				1285
Python supports a variety of abstract "protocols"; the specific interfaces
provided to use these protocols are documented in :ref:`abstract`.
				1288
				1289
				1290	A number of these abstract interfaces were defined early in the development of
				1291	the Python implementation. In particular, the number, mapping, and sequence
				1292	protocols have been part of Python since the beginning. Other protocols have
				1293	been added over time. For protocols which depend on several handler routines
				1294	from the type implementation, the older protocols have been defined as optional
				1295	blocks of handlers referenced by the type object. For newer protocols there are
				1296	additional slots in the main type object, with a flag bit being set to indicate
				1297	that the slots are present and should be checked by the interpreter. (The flag
				1298	bit does not indicate that the slot values are non-NULL. The flag may be set
				1299	to indicate the presence of a slot, but a slot may still be unfilled.) ::
				1300
   PyNumberMethods   *tp_as_number;
   PySequenceMethods *tp_as_sequence;
   PyMappingMethods  *tp_as_mapping;
				1304
				1305	If you wish your object to be able to act like a number, a sequence, or a
				1306	mapping object, then you place the address of a structure that implements the C
				1307	type :ctype:`PyNumberMethods`, :ctype:`PySequenceMethods`, or
				1308	:ctype:`PyMappingMethods`, respectively. It is up to you to fill in this
				1309	structure with appropriate values. You can find examples of the use of each of
				1310	these in the :file:`Objects` directory of the Python source distribution. ::
				1311
				1312	hashfunc tp_hash;
				1313
				1314	This function, if you choose to provide it, should return a hash number for an
				1315	instance of your data type. Here is a moderately pointless example::
				1316
				1317	static long
				1318	newdatatype_hash(newdatatypeobject *obj)
				1319	{
				1320	long result;
				1321	result = obj->obj_UnderlyingDatatypePtr->size;
				1322	result = result * 3;
				1323	return result;
				1324	}
				1325
				1326	::
				1327
				1328	ternaryfunc tp_call;
				1329
This function is called when an instance of your data type is "called". For
example, if ``obj1`` is an instance of your data type and the Python script
contains ``obj1('hello')``, the :attr:`tp_call` handler is invoked.
				1333
				1334	This function takes three arguments:
				1335
				1336	#. arg1 is the instance of the data type which is the subject of the call. If
				1337	the call is ``obj1('hello')``, then arg1 is ``obj1``.
				1338
				1339	#. arg2 is a tuple containing the arguments to the call. You can use
				1340	:cfunc:`PyArg_ParseTuple` to extract the arguments.
				1341
				1342	#. arg3 is a dictionary of keyword arguments that were passed. If this is
				1343	non-NULL and you support keyword arguments, use
				1344	:cfunc:`PyArg_ParseTupleAndKeywords` to extract the arguments. If you do not
				1345	want to support keyword arguments and this is non-NULL, raise a
				1346	:exc:`TypeError` with a message saying that keyword arguments are not supported.
				1347
				1348	Here is a desultory example of the implementation of the call function. ::
				1349
   /* Implement the call function.
    *    obj is the instance receiving the call.
    *    args is a tuple containing the arguments to the call, in this
    *         case 3 strings.
    *    other is the dictionary of keyword arguments; it is ignored here.
    */
   static PyObject *
   newdatatype_call(newdatatypeobject *obj, PyObject *args, PyObject *other)
   {
       PyObject *result;
       char *arg1;
       char *arg2;
       char *arg3;

       if (!PyArg_ParseTuple(args, "sss:call", &arg1, &arg2, &arg3)) {
           return NULL;
       }
       result = PyUnicode_FromFormat(
           "Returning -- value: [%d] arg1: [%s] arg2: [%s] arg3: [%s]\n",
           obj->obj_UnderlyingDatatypePtr->size,
           arg1, arg2, arg3);
       return result;
   }
				1373
The type object also provides fields for the iterator protocol::

   /* Iterators */
   getiterfunc tp_iter;
   iternextfunc tp_iternext;
				1379
				1380	These functions provide support for the iterator protocol. Any object which
				1381	wishes to support iteration over its contents (which may be generated during
				1382	iteration) must implement the ``tp_iter`` handler. Objects which are returned
				1383	by a ``tp_iter`` handler must implement both the ``tp_iter`` and ``tp_iternext``
				1384	handlers. Both handlers take exactly one parameter, the instance for which they
				1385	are being called, and return a new reference. In the case of an error, they
				1386	should set an exception and return NULL.
				1387
				1388	For an object which represents an iterable collection, the ``tp_iter`` handler
				1389	must return an iterator object. The iterator object is responsible for
				1390	maintaining the state of the iteration. For collections which can support
				1391	multiple iterators which do not interfere with each other (as lists and tuples
				1392	do), a new iterator should be created and returned. Objects which can only be
				1393	iterated over once (usually due to side effects of iteration) should implement
				1394	this handler by returning a new reference to themselves, and should also
				1395	implement the ``tp_iternext`` handler. File objects are an example of such an
				1396	iterator.
				1397
				1398	Iterator objects should implement both handlers. The ``tp_iter`` handler should
				1399	return a new reference to the iterator (this is the same as the ``tp_iter``
				1400	handler for objects which can only be iterated over destructively). The
				1401	``tp_iternext`` handler should return a new reference to the next object in the
				1402	iteration if there is one. If the iteration has reached the end, it may return
				1403	NULL without setting an exception or it may set :exc:`StopIteration`; avoiding
				1404	the exception can yield slightly better performance. If an actual error occurs,
				1405	it should set an exception and return NULL.
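
The division of labour between the two handlers can be illustrated at the
Python level, where ``__iter__`` and ``__next__`` play the roles of ``tp_iter``
and ``tp_iternext``. This is a sketch with hypothetical class names, not C API
code. Note one difference: a Python-level ``__next__`` has to raise
:exc:`StopIteration`, whereas a C ``tp_iternext`` handler may simply return NULL
without setting an exception.

```python
class Countdown:
    """Hypothetical re-iterable collection: each __iter__ call returns a fresh iterator."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # The iterator, not the collection, holds the iteration state.
        return CountdownIterator(self.start)


class CountdownIterator:
    """Hypothetical iterator: implements both handlers and returns itself from __iter__."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


collection = Countdown(3)
first = list(collection)   # independent pass over the collection
second = list(collection)  # a second pass starts from the beginning again

iterator = iter(collection)
once = list(iterator)      # consumes the iterator
again = list(iterator)     # the same iterator is now exhausted
```

The collection can be traversed any number of times because every traversal gets
its own iterator, while the iterator itself can only be consumed once.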
				1406
				1407
				1408	.. _weakref-support:
				1409
				1410	Weak Reference Support
				1411	----------------------
				1412
One of the goals of Python's weak-reference implementation is to allow any type
to participate in the weak reference mechanism without imposing its overhead on
objects which do not benefit from weak referencing (such as numbers).

For an object to be weakly referenceable, the extension must include a
:ctype:`PyObject\*` field in the instance structure for the use of the weak
reference mechanism; it must be initialized to NULL by the object's
constructor. It must also set the :attr:`tp_weaklistoffset` field of the
corresponding type object to the offset of the field. For example, the instance
type is defined with the following structure::
				1423
   typedef struct {
       PyObject_HEAD
       PyClassObject *in_class;       /* The class object */
       PyObject      *in_dict;        /* A dictionary */
       PyObject      *in_weakreflist; /* List of weak references */
   } PyInstanceObject;
				1430
				1431	The statically-declared type object for instances is defined this way::
				1432
   PyTypeObject PyInstance_Type = {
       PyVarObject_HEAD_INIT(&PyType_Type, 0)
       "module.instance",                          /* tp_name */

       /* Lots of stuff omitted for brevity... */

       Py_TPFLAGS_DEFAULT,                         /* tp_flags */
       0,                                          /* tp_doc */
       0,                                          /* tp_traverse */
       0,                                          /* tp_clear */
       0,                                          /* tp_richcompare */
       offsetof(PyInstanceObject, in_weakreflist), /* tp_weaklistoffset */
   };
				1447
				1448	The type constructor is responsible for initializing the weak reference list to
				1449	NULL::
				1450
   static PyObject *
   instance_new() {
       /* Other initialization stuff omitted for brevity */

       self->in_weakreflist = NULL;

       return (PyObject *) self;
   }
				1459
				1460	The only further addition is that the destructor needs to call the weak
				1461	reference manager to clear any weak references. This should be done before any
				1462	other parts of the destruction have occurred, but is only required if the weak
				1463	reference list is non-NULL::
				1464
   static void
   instance_dealloc(PyInstanceObject *inst)
   {
       /* Allocate temporaries if needed, but do not begin
          destruction just yet.
        */

       if (inst->in_weakreflist != NULL)
           PyObject_ClearWeakRefs((PyObject *) inst);

       /* Proceed with object destruction normally. */
   }
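
The effect of reserving (or not reserving) the weak reference list can be
observed from Python. In the sketch below, which uses hypothetical classes
rather than C API code, listing ``__weakref__`` in ``__slots__`` is the
Python-level counterpart of adding the field and setting
:attr:`tp_weaklistoffset`. Leaving it out gives a type that, like numbers,
cannot be weakly referenced.

```python
import gc
import weakref


class Plain:
    # No __weakref__ slot: instances have no room for a weak reference list.
    __slots__ = ("value",)


class Tracked:
    # Listing __weakref__ reserves the per-instance weak reference list.
    __slots__ = ("value", "__weakref__")


def is_weakly_referenceable(obj):
    """Return True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


tracked = Tracked()
ref = weakref.ref(tracked)
alive_before = ref() is tracked  # the referent is still alive
del tracked
gc.collect()                     # not needed on CPython, harmless elsewhere
cleared_after = ref() is None    # the weak reference was cleared on destruction
```

Once the last strong reference is gone, the weak reference reports ``None``.
Clearing it is the job that the call to :cfunc:`PyObject_ClearWeakRefs` performs
in the destructor of a C type.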
				1477
				1478
				1479	More Suggestions
				1480	----------------
				1481
				1482	Remember that you can omit most of these functions, in which case you provide
				1483	``0`` as a value. There are type definitions for each of the functions you must
				1484	provide. They are in :file:`object.h` in the Python include directory that
				1485	comes with the source distribution of Python.
				1486
To learn how to implement any specific method for your new data type, download
and unpack the Python source distribution. Go to the :file:`Objects` directory,
then search the C source files for ``tp_`` plus the function you want (for
example, ``tp_richcompare``). You will find examples of the function you want
to implement.

When you need to verify that an object is an instance of the type you are
implementing, use the :cfunc:`PyObject_TypeCheck` function. A sample of its use
might be something like the following::
				1496
   if (! PyObject_TypeCheck(some_object, &MyType)) {
       PyErr_SetString(PyExc_TypeError, "arg #1 not a mything");
       return NULL;
   }
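
:cfunc:`PyObject_TypeCheck` accepts instances of the type itself and of any of
its subtypes. A rough Python-level equivalent of the check above is sketched
below; ``MyThing``, ``SpecialThing`` and ``require_mything`` are hypothetical
names used only for illustration.

```python
class MyThing:
    """Hypothetical stand-in for the extension type MyType."""


class SpecialThing(MyThing):
    """Hypothetical subtype; it passes the same check."""


def require_mything(some_object):
    # Look only at the object's actual type, as the C check does.
    if not issubclass(type(some_object), MyThing):
        raise TypeError("arg #1 not a mything")
    return some_object


plain = require_mything(MyThing())
special = require_mything(SpecialThing())
```

Passing an unrelated object, such as a string, raises :exc:`TypeError` with the
same message that the C fragment sets.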
				1501
				1502	.. rubric:: Footnotes
				1503
				1504	.. [#] This is true when we know that the object is a basic type, like a string or a
				1505	float.
				1506
.. [#] We relied on this in the :attr:`tp_dealloc` handler in this example, because our
   type doesn't support garbage collection. Even if a type supports garbage
   collection, there are calls that can be made to "untrack" the object from
   garbage collection; however, these calls are advanced and not covered here.

.. [#] We now know that the first and last members are strings, so perhaps we could be
   less careful about decrementing their reference counts; however, we accept
   instances of string subclasses. Even though deallocating normal strings won't
   call back into our objects, we can't guarantee that deallocating an instance of
   a string subclass won't call back into our objects.

.. [#] Even in the third version, we aren't guaranteed to avoid cycles. Instances of
   string subclasses are allowed and string subclasses could allow cycles even if
   normal strings don't.
				1521