| ====================== |
| Descriptor HowTo Guide |
| ====================== |
| |
| :Author: Raymond Hettinger |
| :Contact: <python at rcn dot com> |
| |
| .. Contents:: |
| |
| Abstract |
| -------- |
| |
| Defines descriptors, summarizes the protocol, and shows how descriptors are |
| called. Examines a custom descriptor and several built-in Python descriptors |
| including functions, properties, static methods, and class methods. Shows how |
| each works by giving a pure Python equivalent and a sample application. |
| |
| Learning about descriptors not only provides access to a larger toolset, it |
| creates a deeper understanding of how Python works and an appreciation for the |
| elegance of its design. |
| |
| |
| Definition and Introduction |
| --------------------------- |
| |
| In general, a descriptor is an object attribute with "binding behavior", one |
| whose attribute access has been overridden by methods in the descriptor |
| protocol. Those methods are :meth:`__get__`, :meth:`__set__`, and |
| :meth:`__delete__`. If any of those methods are defined for an object, it is |
| said to be a descriptor. |
| |
| The default behavior for attribute access is to get, set, or delete the |
| attribute from an object's dictionary. For instance, ``a.x`` has a lookup chain |
| starting with ``a.__dict__['x']``, then ``type(a).__dict__['x']``, and |
| continuing through the base classes of ``type(a)`` excluding metaclasses. If the |
| looked-up value is an object defining one of the descriptor methods, then Python |
| may override the default behavior and invoke the descriptor method instead. |
| Where this occurs in the precedence chain depends on which descriptor methods |
| were defined. Note that descriptors are only invoked for new style objects or |
| classes (a class is new style if it inherits from :class:`object` or |
| :class:`type`). |
| |
| Descriptors are a powerful, general purpose protocol. They are the mechanism |
| behind properties, methods, static methods, class methods, and :func:`super()`. |
| They are used throughout Python itself to implement the new style classes |
| introduced in version 2.2. Descriptors simplify the underlying C-code and offer |
| a flexible set of new tools for everyday Python programs. |
| |
| |
| Descriptor Protocol |
| ------------------- |
| |
| ``descr.__get__(self, obj, type=None) --> value`` |
| |
| ``descr.__set__(self, obj, value) --> None`` |
| |
| ``descr.__delete__(self, obj) --> None`` |
| |
| That is all there is to it. Define any of these methods and an object is |
| considered a descriptor and can override default behavior upon being looked up |
| as an attribute. |
| |
| If an object defines both :meth:`__get__` and :meth:`__set__`, it is considered |
| a data descriptor. Descriptors that only define :meth:`__get__` are called |
| non-data descriptors (they are typically used for methods but other uses are |
| possible). |
| |
| Data and non-data descriptors differ in how overrides are calculated with |
| respect to entries in an instance's dictionary. If an instance's dictionary |
| has an entry with the same name as a data descriptor, the data descriptor |
| takes precedence. If an instance's dictionary has an entry with the same |
| name as a non-data descriptor, the dictionary entry takes precedence. |
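| |
| The precedence rule can be checked with a minimal sketch. The names |
| ``DataDesc``, ``NonDataDesc``, and ``Demo`` are hypothetical and chosen only |
| for illustration. Entries are written straight into the instance dictionary so |
| that :meth:`__set__` is bypassed: |
| |

```python
class DataDesc(object):
    """Data descriptor: defines both __get__ and __set__."""

    def __get__(self, obj, objtype=None):
        return 'from data descriptor'

    def __set__(self, obj, value):
        # Writes are ignored in this sketch; defining the method is what matters.
        pass


class NonDataDesc(object):
    """Non-data descriptor: defines only __get__."""

    def __get__(self, obj, objtype=None):
        return 'from non-data descriptor'


class Demo(object):
    d = DataDesc()
    n = NonDataDesc()


demo = Demo()
# Put same-named entries directly into the instance dictionary.
demo.__dict__['d'] = 'from instance dict'
demo.__dict__['n'] = 'from instance dict'

data_result = demo.d        # the data descriptor takes precedence
non_data_result = demo.n    # the instance dictionary entry takes precedence
```

| |
| Here ``data_result`` comes from the descriptor while ``non_data_result`` comes |
| from the instance dictionary. |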
| |
| To make a read-only data descriptor, define both :meth:`__get__` and |
| :meth:`__set__` with the :meth:`__set__` raising an :exc:`AttributeError` when |
| called. Defining the :meth:`__set__` method with an exception raising |
| placeholder is enough to make it a data descriptor. |
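| |
| As an illustration, a read-only data descriptor might be sketched as follows. |
| ``ReadOnly`` and ``Config`` are hypothetical names and the stored value is |
| arbitrary: |
| |

```python
class ReadOnly(object):
    """Hypothetical read-only data descriptor holding a fixed value."""

    def __init__(self, value):
        self.value = value

    def __get__(self, obj, objtype=None):
        return self.value

    def __set__(self, obj, value):
        # The raising placeholder is what makes this a data descriptor.
        raise AttributeError("can't set attribute")


class Config(object):
    version = ReadOnly('1.0')


c = Config()
try:
    c.version = '2.0'
except AttributeError as exc:
    outcome = 'rejected: %s' % exc
else:
    outcome = 'accepted'
```

| |
| The assignment is rejected and nothing is stored in the instance dictionary, so |
| ``c.version`` continues to return the original value. |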
| |
| |
| Invoking Descriptors |
| -------------------- |
| |
| A descriptor can be called directly by its method name. For example, |
| ``d.__get__(obj)``. |
| |
| Alternatively, it is more common for a descriptor to be invoked automatically |
| upon attribute access. For example, ``obj.d`` looks up ``d`` in the dictionary |
| of ``obj``. If ``d`` defines the method :meth:`__get__`, then ``d.__get__(obj)`` |
| is invoked according to the precedence rules listed below. |
| |
| The details of invocation depend on whether ``obj`` is an object or a class. |
| Either way, descriptors only work for new style objects and classes. A class is |
| new style if it is a subclass of :class:`object`. |
| |
| For objects, the machinery is in :meth:`object.__getattribute__` which |
| transforms ``b.x`` into ``type(b).__dict__['x'].__get__(b, type(b))``. The |
| implementation works through a precedence chain that gives data descriptors |
| priority over instance variables, instance variables priority over non-data |
| descriptors, and assigns lowest priority to :meth:`__getattr__` if provided. The |
| full C implementation can be found in :c:func:`PyObject_GenericGetAttr()` in |
| `Objects/object.c <http://svn.python.org/view/python/trunk/Objects/object.c?view=markup>`_\. |
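| |
| The precedence chain can be approximated in pure Python. This is a simplified |
| sketch, not a translation of the C code: the helper names ``class_lookup``, |
| ``generic_getattr``, ``_MISSING``, and the ``Sample`` class are hypothetical, |
| and a data descriptor is detected only by the presence of :meth:`__set__`: |
| |

```python
_MISSING = object()  # sentinel meaning "no entry found"


def class_lookup(cls, name):
    """Return the first entry for name in the class dictionaries along the MRO."""
    for base in cls.__mro__:
        if name in base.__dict__:
            return base.__dict__[name]
    return _MISSING


def generic_getattr(obj, name):
    """Simplified emulation of the instance lookup order described above."""
    cls = type(obj)
    entry = class_lookup(cls, name)
    getter = getattr(type(entry), '__get__', None)
    is_data = getter is not None and hasattr(type(entry), '__set__')

    # 1. Data descriptors found on the class come first.
    if entry is not _MISSING and is_data:
        return getter(entry, obj, cls)
    # 2. Then the instance dictionary.
    inst_dict = getattr(obj, '__dict__', {})
    if name in inst_dict:
        return inst_dict[name]
    # 3. Then non-data descriptors or plain class attributes.
    if entry is not _MISSING:
        if getter is not None:
            return getter(entry, obj, cls)
        return entry
    # 4. __getattr__ is the last resort, if the class provides one.
    fallback = getattr(cls, '__getattr__', None)
    if fallback is not None:
        return fallback(obj, name)
    raise AttributeError(name)


class Sample(object):
    label = 'class attribute'

    @property
    def managed(self):            # property is a data descriptor
        return 'from property'

    def method(self):             # functions are non-data descriptors
        return 'called'

    def __getattr__(self, name):
        return 'fallback for ' + name


s = Sample()
s.__dict__['managed'] = 'shadow attempt'   # loses to the data descriptor
s.__dict__['method'] = 'instance value'    # wins over the non-data descriptor
```

| |
| For each of the names ``managed``, ``method``, ``label``, and a missing name, |
| ``generic_getattr(s, name)`` agrees with ordinary attribute access on ``s``. |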
| |
| For classes, the machinery is in :meth:`type.__getattribute__` which transforms |
| ``B.x`` into ``B.__dict__['x'].__get__(None, B)``. In pure Python, it looks |
| like:: |
| |
|     def __getattribute__(self, key): |
|         "Emulate type_getattro() in Objects/typeobject.c" |
|         v = object.__getattribute__(self, key) |
|         if hasattr(v, '__get__'): |
|             return v.__get__(None, self) |
|         return v |
| |
| The important points to remember are: |
| |
| * descriptors are invoked by the :meth:`__getattribute__` method |
| * overriding :meth:`__getattribute__` prevents automatic descriptor calls |
| * :meth:`__getattribute__` is only available with new style classes and objects |
| * :meth:`object.__getattribute__` and :meth:`type.__getattribute__` make |
|   different calls to :meth:`__get__` |
| * data descriptors always override instance dictionaries |
| * non-data descriptors may be overridden by instance dictionaries |
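| |
| Two of these points can be observed directly. In the sketch below, the |
| hypothetical descriptor ``ShowArgs`` returns the arguments it receives, and the |
| hypothetical class ``Raw`` overrides :meth:`__getattribute__` so that |
| :meth:`__get__` is never called: |
| |

```python
class ShowArgs(object):
    """Descriptor that reports the arguments passed to __get__."""

    def __get__(self, obj, objtype=None):
        return (obj, objtype)


class Normal(object):
    x = ShowArgs()


class Raw(object):
    x = ShowArgs()

    def __getattribute__(self, name):
        # Look only in the class dictionary and never invoke __get__.
        return type(self).__dict__[name]


n = Normal()
from_instance = n.x       # object.__getattribute__ passes (n, Normal)
from_class = Normal.x     # type.__getattribute__ passes (None, Normal)
raw_value = Raw().x       # the descriptor object itself, not its result
```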
| |
| The object returned by ``super()`` also has a custom :meth:`__getattribute__` |
| method for invoking descriptors. The call ``super(B, obj).m()`` searches |
| ``obj.__class__.__mro__`` for the base class ``A`` immediately following ``B`` |
| and then returns ``A.__dict__['m'].__get__(obj, A)``. If not a descriptor, |
| ``m`` is returned unchanged. If not in the dictionary, ``m`` reverts to a |
| search using :meth:`object.__getattribute__`. |
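| |
| The lookup can be compared with a manual call. This sketch reuses the names |
| ``A``, ``B``, and ``m`` from the paragraph above; the method bodies are made up |
| for illustration: |
| |

```python
class A(object):
    def m(self):
        return 'A.m'


class B(A):
    def m(self):
        return 'B.m then ' + super(B, self).m()


obj = B()
via_super = super(B, obj).m                  # found on A, bound to obj
by_hand = A.__dict__['m'].__get__(obj, A)    # the same binding done manually
```

| |
| Both expressions produce a method bound to ``obj`` that wraps the function |
| stored in ``A.__dict__``, so they compare equal. |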
| |
| Note, in Python 2.2, ``super(B, obj).m()`` would only invoke :meth:`__get__` if |
| ``m`` was a data descriptor. In Python 2.3, non-data descriptors also get |
| invoked unless an old-style class is involved. The implementation details are |
| in :c:func:`super_getattro()` in |
| `Objects/typeobject.c <http://svn.python.org/view/python/trunk/Objects/typeobject.c?view=markup>`_ |
| and a pure Python equivalent can be found in `Guido's Tutorial`_. |
| |
| .. _`Guido's Tutorial`: http://www.python.org/2.2.3/descrintro.html#cooperation |
| |
| The details above show that the mechanism for descriptors is embedded in the |
| :meth:`__getattribute__()` methods for :class:`object`, :class:`type`, and |
| :func:`super`. Classes inherit this machinery when they derive from |
| :class:`object` or if they have a metaclass providing similar functionality. |
| Likewise, classes can turn off descriptor invocation by overriding |
| :meth:`__getattribute__()`. |
| |
| |
| Descriptor Example |
| ------------------ |
| |
| The following code creates a class whose objects are data descriptors which |
| print a message for each get or set. Overriding :meth:`__getattribute__` is an |
| alternate approach that could do this for every attribute. However, this |
| descriptor is useful for monitoring just a few chosen attributes:: |
| |
|     class RevealAccess(object): |
|         """A data descriptor that sets and returns values |
|            normally and prints a message logging their access. |
|         """ |
| |
|         def __init__(self, initval=None, name='var'): |
|             self.val = initval |
|             self.name = name |
| |
|         def __get__(self, obj, objtype): |
|             print('Retrieving', self.name) |
|             return self.val |
| |
|         def __set__(self, obj, val): |
|             print('Updating', self.name) |
|             self.val = val |
| |
|     >>> class MyClass(object): |
|     ...     x = RevealAccess(10, 'var "x"') |
|     ...     y = 5 |
|     ... |
|     >>> m = MyClass() |
|     >>> m.x |
|     Retrieving var "x" |
|     10 |
|     >>> m.x = 20 |
|     Updating var "x" |
|     >>> m.x |
|     Retrieving var "x" |
|     20 |
|     >>> m.y |
|     5 |
| |
| The protocol is simple and offers exciting possibilities. Several use cases are |
| so common that they have been packaged into individual function calls. |
| Properties, bound and unbound methods, static methods, and class methods are all |
| based on the descriptor protocol. |
| |
| |
| Properties |
| ---------- |
| |
| Calling :func:`property` is a succinct way of building a data descriptor that |
| triggers function calls upon access to an attribute. Its signature is:: |
| |
|     property(fget=None, fset=None, fdel=None, doc=None) -> property attribute |
| |
| The documentation shows a typical use to define a managed attribute ``x``:: |
| |
|     class C(object): |
|         def getx(self): return self.__x |
|         def setx(self, value): self.__x = value |
|         def delx(self): del self.__x |
|         x = property(getx, setx, delx, "I'm the 'x' property.") |
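| |
| A short session with this class, written as a plain script, might look like |
| the sketch below. The variable names ``stored``, ``fetched``, and ``by_hand`` |
| are hypothetical; the last lookup calls the descriptor directly instead of |
| going through attribute access: |
| |

```python
class C(object):
    def getx(self):
        return self.__x

    def setx(self, value):
        self.__x = value

    def delx(self):
        del self.__x

    x = property(getx, setx, delx, "I'm the 'x' property.")


c = C()
c.x = 5                                    # routed through setx
stored = dict(c.__dict__)                  # the name-mangled attribute
fetched = c.x                              # routed through getx
by_hand = C.__dict__['x'].__get__(c, C)    # the same lookup done manually
del c.x                                    # routed through delx
```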
| |
| To see how :func:`property` is implemented in terms of the descriptor protocol, |
| here is a pure Python equivalent:: |
| |
|     class Property(object): |
|         "Emulate PyProperty_Type() in Objects/descrobject.c" |
| |
|         def __init__(self, fget=None, fset=None, fdel=None, doc=None): |
|             self.fget = fget |
|             self.fset = fset |
|             self.fdel = fdel |
|             self.__doc__ = doc |
| |
|         def __get__(self, obj, objtype=None): |
|             if obj is None: |
|                 return self |
|             if self.fget is None: |
|                 raise AttributeError("unreadable attribute") |
|             return self.fget(obj) |
| |
|         def __set__(self, obj, value): |
|             if self.fset is None: |
|                 raise AttributeError("can't set attribute") |
|             self.fset(obj, value) |
| |
|         def __delete__(self, obj): |
|             if self.fdel is None: |
|                 raise AttributeError("can't delete attribute") |
|             self.fdel(obj) |
| |
| The :func:`property` builtin helps whenever a user interface has granted |
| attribute access and then subsequent changes require the intervention of a |
| method. |
| |
| For instance, a spreadsheet class may grant access to a cell value through |
| ``Cell('b10').value``. Subsequent improvements to the program require the cell |
| to be recalculated on every access; however, the programmer does not want to |
| affect existing client code accessing the attribute directly. The solution is |
| to wrap access to the value attribute in a property data descriptor:: |
| |
|     class Cell(object): |
|         . . . |
|         def getvalue(self): |
|             "Recalculate cell before returning value" |
|             self.recalc() |
|             return self._value |
|         value = property(getvalue) |
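| |
| A runnable variant of this idea is sketched below. The elided parts of the |
| class are not known, so the constructor, the ``formula`` callable, the |
| ``recalc_count`` counter, and the ``inputs`` dictionary are all assumptions |
| made for illustration: |
| |

```python
class Cell(object):
    """Hypothetical spreadsheet cell whose formula is a plain callable."""

    def __init__(self, formula):
        self._formula = formula
        self._value = None
        self.recalc_count = 0      # assumption: lets the sketch observe recalcs

    def recalc(self):
        self._value = self._formula()
        self.recalc_count += 1

    def getvalue(self):
        "Recalculate cell before returning value"
        self.recalc()
        return self._value

    value = property(getvalue)


inputs = {'a1': 2}
cell = Cell(lambda: inputs['a1'] * 10)

first = cell.value     # triggers a recalculation
inputs['a1'] = 3
second = cell.value    # recalculated again with the new input
```

| |
| Client code keeps reading ``cell.value`` as a plain attribute while every |
| access goes through ``getvalue``. |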
| |
| |
| Functions and Methods |
| --------------------- |
| |
| Python's object oriented features are built upon a function based environment. |
| Using non-data descriptors, the two are merged seamlessly. |
| |
| Class dictionaries store methods as functions. In a class definition, methods |
| are written using :keyword:`def` or :keyword:`lambda`, the usual tools for |
| creating functions. The only difference from regular functions is that the |
| first argument is reserved for the object instance. By Python convention, the |
| instance reference is called *self* but may be called *this* or any other |
| variable name. |
| |
| To support method calls, functions include the :meth:`__get__` method for |
| binding methods during attribute access. This means that all functions are |
| non-data descriptors which return bound or unbound methods depending on whether |
| they are invoked from an object or a class. In pure Python, it works like |
| this:: |
| |
|     class Function(object): |
|         . . . |
|         def __get__(self, obj, objtype=None): |
|             "Simulate func_descr_get() in Objects/funcobject.c" |
|             return types.MethodType(self, obj, objtype) |
| |
| Running the interpreter shows how the function descriptor works in practice:: |
| |
|     >>> class D(object): |
|     ...     def f(self, x): |
|     ...         return x |
|     ... |
|     >>> d = D() |
|     >>> D.__dict__['f']  # Stored internally as a function |
|     <function f at 0x00C45070> |
|     >>> D.f              # Get from a class becomes an unbound method |
|     <unbound method D.f> |
|     >>> d.f              # Get from an instance becomes a bound method |
|     <bound method D.f of <__main__.D object at 0x00B18C90>> |
| |
| The output suggests that bound and unbound methods are two different types. |
| While they could have been implemented that way, the actual C implementation of |
| :c:type:`PyMethod_Type` in |
| `Objects/classobject.c <http://svn.python.org/view/python/trunk/Objects/classobject.c?view=markup>`_ |
| is a single object with two different representations depending on whether the |
| :attr:`im_self` field is set or is *NULL* (the C equivalent of *None*). |
| |
| Likewise, the effects of calling a method object depend on the :attr:`im_self` |
| field. If set (meaning bound), the original function (stored in the |
| :attr:`im_func` field) is called as expected with the first argument set to the |
| instance. If unbound, all of the arguments are passed unchanged to the original |
| function. The actual C implementation of :c:func:`instancemethod_call()` is only |
| slightly more complex in that it includes some type checking. |
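| |
| The binding step for instances can also be performed by hand. The sketch below |
| covers only access from an instance. It reuses the class ``D`` from the session |
| above, and the variable names are hypothetical: |
| |

```python
import types


class D(object):
    def f(self, x):
        return x


d = D()
func = D.__dict__['f']                       # the plain function in the class
bound_by_hand = func.__get__(d, D)           # what attribute access does for d.f
bound_by_type = types.MethodType(func, d)    # building the bound method directly
```

| |
| Both results wrap the same function and the same instance, so they compare |
| equal to ``d.f`` and can be called without passing the instance explicitly. |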
| |
| |
| Static Methods and Class Methods |
| -------------------------------- |
| |
| Non-data descriptors provide a simple mechanism for variations on the usual |
| patterns of binding functions into methods. |
| |
| To recap, functions have a :meth:`__get__` method so that they can be converted |
| to a method when accessed as attributes. The non-data descriptor transforms an |
| ``obj.f(*args)`` call into ``f(obj, *args)``. Calling ``klass.f(*args)`` |
| becomes ``f(*args)``. |
| |
| This chart summarizes the binding and its two most useful variants: |
| |
| +-----------------+----------------------+------------------+ |
| | Transformation | Called from an | Called from a | |
| | | Object | Class | |
| +=================+======================+==================+ |
| | function | f(obj, \*args) | f(\*args) | |
| +-----------------+----------------------+------------------+ |
| | staticmethod | f(\*args) | f(\*args) | |
| +-----------------+----------------------+------------------+ |
| | classmethod | f(type(obj), \*args) | f(klass, \*args) | |
| +-----------------+----------------------+------------------+ |
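| |
| The rows of the chart can be checked with the built-in wrappers. In this |
| sketch the class ``T`` and its method names are hypothetical, and each method |
| simply returns the arguments it actually received: |
| |

```python
class T(object):
    def plain(self, *args):
        return (self,) + args

    def static(*args):
        return args
    static = staticmethod(static)

    def klassy(klass, *args):
        return (klass,) + args
    klassy = classmethod(klassy)


t = T()

plain_from_obj = t.plain(1, 2)        # f(obj, *args)
plain_from_cls = T.plain(t, 1, 2)     # f(*args), instance passed explicitly
static_from_obj = t.static(1, 2)      # f(*args)
static_from_cls = T.static(1, 2)      # f(*args)
class_from_obj = t.klassy(1, 2)       # f(type(obj), *args)
class_from_cls = T.klassy(1, 2)       # f(klass, *args)
```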
| |
| Static methods return the underlying function without changes. Calling either |
| ``c.f`` or ``C.f`` is the equivalent of a direct lookup into |
| ``object.__getattribute__(c, "f")`` or ``object.__getattribute__(C, "f")``. As a |
| result, the function becomes identically accessible from either an object or a |
| class. |
| |
| Good candidates for static methods are methods that do not reference the |
| ``self`` variable. |
| |
| For instance, a statistics package may include a container class for |
| experimental data. The class provides normal methods for computing the average, |
| mean, median, and other descriptive statistics that depend on the data. However, |
| there may be useful functions which are conceptually related but do not depend |
| on the data. For instance, ``erf(x)`` is a handy conversion routine that comes up |
| in statistical work but does not directly depend on a particular dataset. |
| It can be called either from an object or the class: ``s.erf(1.5) --> .9661`` or |
| ``Sample.erf(1.5) --> .9661``. |
| |
| Since staticmethods return the underlying function with no changes, the example |
| calls are unexciting:: |
| |
|     >>> class E(object): |
|     ...     def f(x): |
|     ...         print(x) |
|     ...     f = staticmethod(f) |
|     ... |
|     >>> E.f(3) |
|     3 |
|     >>> E().f(3) |
|     3 |
| |
| Using the non-data descriptor protocol, a pure Python version of |
| :func:`staticmethod` would look like this:: |
| |
|     class StaticMethod(object): |
|         "Emulate PyStaticMethod_Type() in Objects/funcobject.c" |
| |
|         def __init__(self, f): |
|             self.f = f |
| |
|         def __get__(self, obj, objtype=None): |
|             return self.f |
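| |
| To try the emulation, it can stand in for the built-in wrapper. The sketch |
| below repeats the ``StaticMethod`` class so that it runs on its own; the class |
| ``E`` and the doubling function are hypothetical: |
| |

```python
class StaticMethod(object):
    "Pure Python emulation of staticmethod, repeated from the text above"

    def __init__(self, f):
        self.f = f

    def __get__(self, obj, objtype=None):
        # Return the underlying function unchanged, whoever asks.
        return self.f


class E(object):
    def f(x):
        return x * 2
    f = StaticMethod(f)


from_class = E.f(3)
from_instance = E().f(3)
```

| |
| Access from the class and from an instance both hand back the same underlying |
| function object. |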
| |
| Unlike static methods, class methods prepend the class reference to the |
| argument list before calling the function. This format is the same |
| whether the caller is an object or a class:: |
| |
|     >>> class E(object): |
|     ...     def f(klass, x): |
|     ...         return klass.__name__, x |
|     ...     f = classmethod(f) |
|     ... |
|     >>> print(E.f(3)) |
|     ('E', 3) |
|     >>> print(E().f(3)) |
|     ('E', 3) |
| |
| |
| This behavior is useful whenever the function only needs to have a class |
| reference and does not care about any underlying data. One use for classmethods |
| is to create alternate class constructors. In Python 2.3, the classmethod |
| :func:`dict.fromkeys` creates a new dictionary from a list of keys. The pure |
| Python equivalent is:: |
| |
|     class Dict: |
|         . . . |
|         def fromkeys(klass, iterable, value=None): |
|             "Emulate dict_fromkeys() in Objects/dictobject.c" |
|             d = klass() |
|             for key in iterable: |
|                 d[key] = value |
|             return d |
|         fromkeys = classmethod(fromkeys) |
| |
| Now a new dictionary of unique keys can be constructed like this:: |
| |
|     >>> Dict.fromkeys('abracadabra') |
|     {'a': None, 'r': None, 'b': None, 'c': None, 'd': None} |
| |
| Using the non-data descriptor protocol, a pure Python version of |
| :func:`classmethod` would look like this:: |
| |
|     class ClassMethod(object): |
|         "Emulate PyClassMethod_Type() in Objects/funcobject.c" |
| |
|         def __init__(self, f): |
|             self.f = f |
| |
|         def __get__(self, obj, klass=None): |
|             if klass is None: |
|                 klass = type(obj) |
|             def newfunc(*args): |
|                 return self.f(klass, *args) |
|             return newfunc |
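| |
| The emulation can be exercised in the same way as the built-in. The sketch |
| below repeats the ``ClassMethod`` class so that it runs on its own; ``Base``, |
| ``Child``, and ``make`` are hypothetical names: |
| |

```python
class ClassMethod(object):
    "Pure Python emulation of classmethod, repeated from the text above"

    def __init__(self, f):
        self.f = f

    def __get__(self, obj, klass=None):
        if klass is None:
            klass = type(obj)

        def newfunc(*args):
            # Prepend the class reference before calling the function.
            return self.f(klass, *args)
        return newfunc


class Base(object):
    def make(klass, x):
        return klass.__name__, x
    make = ClassMethod(make)


class Child(Base):
    pass


from_base = Base.make(3)
from_child_class = Child.make(3)
from_child_instance = Child().make(3)
```

| |
| The class that is prepended is the one the lookup started from, so a subclass |
| receives itself and not the class where the method was defined. |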
| |