======================
Descriptor HowTo Guide
======================

:Author: Raymond Hettinger
:Contact: <python at rcn dot com>

.. Contents::

Abstract
--------

Defines descriptors, summarizes the protocol, and shows how descriptors are
called. Examines a custom descriptor and several built-in Python descriptors
including functions, properties, static methods, and class methods. Shows how
each works by giving a pure Python equivalent and a sample application.

Learning about descriptors not only provides access to a larger toolset, it
creates a deeper understanding of how Python works and an appreciation for the
elegance of its design.


Definition and Introduction
---------------------------

In general, a descriptor is an object attribute with "binding behavior", one
whose attribute access has been overridden by methods in the descriptor
protocol. Those methods are :meth:`__get__`, :meth:`__set__`, and
:meth:`__delete__`. If any of those methods is defined for an object, it is
said to be a descriptor.

The default behavior for attribute access is to get, set, or delete the
attribute from an object's dictionary. For instance, ``a.x`` has a lookup chain
starting with ``a.__dict__['x']``, then ``type(a).__dict__['x']``, and
continuing through the base classes of ``type(a)`` excluding metaclasses. If the
looked-up value is an object defining one of the descriptor methods, then Python
may override the default behavior and invoke the descriptor method instead.
Where this occurs in the precedence chain depends on which descriptor methods
were defined. Note that descriptors are only invoked for new-style objects or
classes (a class is new-style if it inherits from :class:`object` or
:class:`type`).

Descriptors are a powerful, general purpose protocol. They are the mechanism
behind properties, methods, static methods, class methods, and :func:`super()`.
They are used throughout Python itself to implement the new-style classes
introduced in version 2.2. Descriptors simplify the underlying C code and offer
a flexible set of new tools for everyday Python programs.


Descriptor Protocol
-------------------

``descr.__get__(self, obj, type=None) --> value``

``descr.__set__(self, obj, value) --> None``

``descr.__delete__(self, obj) --> None``

That is all there is to it. Define any of these methods and an object is
considered a descriptor and can override default behavior upon being looked up
as an attribute.

If an object defines both :meth:`__get__` and :meth:`__set__`, it is considered
a data descriptor. Descriptors that only define :meth:`__get__` are called
non-data descriptors (they are typically used for methods but other uses are
possible).

Data and non-data descriptors differ in how overrides are calculated with
respect to entries in an instance's dictionary. If an instance's dictionary
has an entry with the same name as a data descriptor, the data descriptor
takes precedence. If an instance's dictionary has an entry with the same
name as a non-data descriptor, the dictionary entry takes precedence.

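The precedence rule for data and non-data descriptors can be checked with a
small experiment. The following is only a minimal sketch; the names
``DataDesc``, ``NonDataDesc``, and ``Demo`` are made up for illustration. It
writes to the instance dictionary directly, so :meth:`__set__` is never
involved:

```python
class DataDesc(object):
    "Hypothetical data descriptor: defines both __get__ and __set__."

    def __get__(self, obj, objtype=None):
        return 'from data descriptor'

    def __set__(self, obj, value):
        raise AttributeError("read-only in this sketch")


class NonDataDesc(object):
    "Hypothetical non-data descriptor: defines only __get__."

    def __get__(self, obj, objtype=None):
        return 'from non-data descriptor'


class Demo(object):
    d = DataDesc()
    n = NonDataDesc()


demo = Demo()
# Create instance dictionary entries with the same names as the descriptors.
demo.__dict__['d'] = 'from instance dict'
demo.__dict__['n'] = 'from instance dict'

print(demo.d)   # the data descriptor takes precedence
print(demo.n)   # the instance dictionary entry takes precedence
```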
To make a read-only data descriptor, define both :meth:`__get__` and
:meth:`__set__` with the :meth:`__set__` raising an :exc:`AttributeError` when
called. Defining the :meth:`__set__` method with an exception raising
placeholder is enough to make it a data descriptor.


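As an illustration of the read-only recipe, here is a small sketch. The
``ReadOnly`` and ``Polygon`` classes are hypothetical and exist only for this
example. Because :meth:`__set__` is defined, the object is a data descriptor,
so even a direct write to the instance dictionary does not shadow it:

```python
class ReadOnly(object):
    "Hypothetical read-only data descriptor holding a fixed value."

    def __init__(self, value):
        self.value = value

    def __get__(self, obj, objtype=None):
        return self.value

    def __set__(self, obj, value):
        # Placeholder that only raises; its presence makes this a data descriptor.
        raise AttributeError("can't set attribute")


class Polygon(object):
    sides = ReadOnly(3)


p = Polygon()
try:
    p.sides = 4
    outcome = 'assignment succeeded'
except AttributeError as exc:
    outcome = 'AttributeError: %s' % exc

# A direct write to the instance dictionary is ignored on lookup.
p.__dict__['sides'] = 4

print(outcome)
print(p.sides)
```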
Invoking Descriptors
--------------------

A descriptor can be called directly by its method name. For example,
``d.__get__(obj)``.

Alternatively, it is more common for a descriptor to be invoked automatically
upon attribute access. For example, ``obj.d`` looks up ``d`` in the dictionary
of ``obj``. If ``d`` defines the method :meth:`__get__`, then ``d.__get__(obj)``
is invoked according to the precedence rules listed below.

The details of invocation depend on whether ``obj`` is an object or a class.
Either way, descriptors only work for new-style objects and classes. A class is
new-style if it is a subclass of :class:`object`.

For objects, the machinery is in :meth:`object.__getattribute__` which
transforms ``b.x`` into ``type(b).__dict__['x'].__get__(b, type(b))``. The
implementation works through a precedence chain that gives data descriptors
priority over instance variables, instance variables priority over non-data
descriptors, and assigns lowest priority to :meth:`__getattr__` if provided. The
full C implementation can be found in :c:func:`PyObject_GenericGetAttr()` in
`Objects/object.c <http://svn.python.org/view/python/trunk/Objects/object.c?view=markup>`_\.

For classes, the machinery is in :meth:`type.__getattribute__` which transforms
``B.x`` into ``B.__dict__['x'].__get__(None, B)``. In pure Python, it looks
like::

    def __getattribute__(self, key):
        "Emulate type_getattro() in Objects/typeobject.c"
        v = object.__getattribute__(self, key)
        if hasattr(v, '__get__'):
            return v.__get__(None, self)
        return v

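The two transformations can be observed with a descriptor that simply reports
the arguments it receives. This is a sketch for illustration; ``Reporter`` is a
hypothetical name, and ``B`` mirrors the class name used in the text:

```python
class Reporter(object):
    "Hypothetical non-data descriptor that returns the arguments passed to __get__."

    def __get__(self, obj, objtype=None):
        return (obj, objtype)


class B(object):
    x = Reporter()


b = B()

# Access through the class: __get__ receives None as the instance.
from_class = B.x
by_hand_class = B.__dict__['x'].__get__(None, B)

# Access through an instance: __get__ receives the instance and its type.
from_instance = b.x
by_hand_instance = type(b).__dict__['x'].__get__(b, type(b))

print(from_class == by_hand_class)
print(from_instance == by_hand_instance)
```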
The important points to remember are:

* descriptors are invoked by the :meth:`__getattribute__` method
* overriding :meth:`__getattribute__` prevents automatic descriptor calls
* :meth:`__getattribute__` is only available with new-style classes and objects
* :meth:`object.__getattribute__` and :meth:`type.__getattribute__` make
  different calls to :meth:`__get__`
* data descriptors always override instance dictionaries
* non-data descriptors may be overridden by instance dictionaries

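The second point in the list can be demonstrated with a class whose
:meth:`__getattribute__` returns the raw class dictionary entry. This is a
deliberately simplistic sketch (it ignores instance dictionaries and base
classes), and the names ``Loud``, ``Normal``, and ``Raw`` are hypothetical:

```python
class Loud(object):
    "Hypothetical non-data descriptor."

    def __get__(self, obj, objtype=None):
        return 'descriptor invoked'


class Normal(object):
    attr = Loud()


class Raw(object):
    attr = Loud()

    def __getattribute__(self, name):
        # Return the class dictionary entry as-is; __get__ is never called.
        return type(self).__dict__[name]


normal_result = Normal().attr   # goes through object.__getattribute__
raw_result = Raw().attr         # goes through the override above

print(normal_result)
print(type(raw_result).__name__)
```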
The object returned by ``super()`` also has a custom :meth:`__getattribute__`
method for invoking descriptors. The call ``super(B, obj).m()`` searches
``obj.__class__.__mro__`` for the base class ``A`` immediately following ``B``
and then returns ``A.__dict__['m'].__get__(obj, A)``. If not a descriptor,
``m`` is returned unchanged. If not in the dictionary, ``m`` reverts to a
search using :meth:`object.__getattribute__`.

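A short sketch of the lookup described above, using the same class names ``A``
and ``B``. For a plain function, the class argument passed to :meth:`__get__`
does not affect the result, so the manual call below is only an approximation
of what the C code does:

```python
class A(object):
    def m(self):
        return 'A.m'


class B(A):
    def m(self):
        return 'B.m'


obj = B()

# Lookup through super() skips B and finds m in A.
via_super = super(B, obj).m

# The equivalent manual lookup and binding.
by_hand = A.__dict__['m'].__get__(obj, A)

print(via_super())
print(by_hand())
```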
Note, in Python 2.2, ``super(B, obj).m()`` would only invoke :meth:`__get__` if
``m`` was a data descriptor. In Python 2.3, non-data descriptors also get
invoked unless an old-style class is involved. The implementation details are
in :c:func:`super_getattro()` in
`Objects/typeobject.c <http://svn.python.org/view/python/trunk/Objects/typeobject.c?view=markup>`_
and a pure Python equivalent can be found in `Guido's Tutorial`_.

.. _`Guido's Tutorial`: http://www.python.org/2.2.3/descrintro.html#cooperation

The details above show that the mechanism for descriptors is embedded in the
:meth:`__getattribute__()` methods for :class:`object`, :class:`type`, and
:func:`super`. Classes inherit this machinery when they derive from
:class:`object` or if they have a metaclass providing similar functionality.
Likewise, classes can turn off descriptor invocation by overriding
:meth:`__getattribute__()`.


Descriptor Example
------------------

The following code creates a class whose objects are data descriptors which
print a message for each get or set. Overriding :meth:`__getattribute__` is an
alternate approach that could do this for every attribute. However, this
descriptor is useful for monitoring just a few chosen attributes::

    class RevealAccess(object):
        """A data descriptor that sets and returns values
           normally and prints a message logging their access.
        """

        def __init__(self, initval=None, name='var'):
            self.val = initval
            self.name = name

        def __get__(self, obj, objtype):
            print('Retrieving', self.name)
            return self.val

        def __set__(self, obj, val):
            print('Updating', self.name)
            self.val = val

    >>> class MyClass(object):
    ...     x = RevealAccess(10, 'var "x"')
    ...     y = 5
    ...
    >>> m = MyClass()
    >>> m.x
    Retrieving var "x"
    10
    >>> m.x = 20
    Updating var "x"
    >>> m.x
    Retrieving var "x"
    20
    >>> m.y
    5

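One consequence of the ``RevealAccess`` design is that the value is kept on the
descriptor object itself, which is a class attribute, so all instances of
``MyClass`` share it. When per-instance values are wanted, one possible
variant stores them in each instance's dictionary instead. The sketch below
is an assumption about how that could be done, not part of the original
example; ``PerInstance`` is a hypothetical name and the logging is omitted for
brevity:

```python
class PerInstance(object):
    "Hypothetical data descriptor that stores values on each instance."

    def __init__(self, name, default=None):
        self.name = name          # key used in the instance dictionary
        self.default = default

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self           # accessed through the class
        return obj.__dict__.get(self.name, self.default)

    def __set__(self, obj, val):
        obj.__dict__[self.name] = val


class MyClass(object):
    x = PerInstance('x', 10)


a = MyClass()
b = MyClass()
a.x = 20      # only changes the value seen through a

print(a.x)
print(b.x)
```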
The protocol is simple and offers exciting possibilities. Several use cases are
so common that they have been packaged into individual function calls.
Properties, bound and unbound methods, static methods, and class methods are all
based on the descriptor protocol.


Properties
----------

Calling :func:`property` is a succinct way of building a data descriptor that
triggers function calls upon access to an attribute. Its signature is::

    property(fget=None, fset=None, fdel=None, doc=None) -> property attribute

The documentation shows a typical use to define a managed attribute ``x``::

    class C(object):
        def getx(self): return self.__x
        def setx(self, value): self.__x = value
        def delx(self): del self.__x
        x = property(getx, setx, delx, "I'm the 'x' property.")

To see how :func:`property` is implemented in terms of the descriptor protocol,
here is a pure Python equivalent::

    class Property(object):
        "Emulate PyProperty_Type() in Objects/descrobject.c"

        def __init__(self, fget=None, fset=None, fdel=None, doc=None):
            self.fget = fget
            self.fset = fset
            self.fdel = fdel
            self.__doc__ = doc

        def __get__(self, obj, objtype=None):
            if obj is None:
                return self
            if self.fget is None:
                raise AttributeError("unreadable attribute")
            return self.fget(obj)

        def __set__(self, obj, value):
            if self.fset is None:
                raise AttributeError("can't set attribute")
            self.fset(obj, value)

        def __delete__(self, obj):
            if self.fdel is None:
                raise AttributeError("can't delete attribute")
            self.fdel(obj)

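To try the emulation, the sketch below repeats a trimmed copy of the
``Property`` class (getter and setter only) so that it runs on its own, and
applies it to a hypothetical ``Temperature`` class. The class and attribute
names are invented for this example:

```python
class Property(object):
    "Trimmed copy of the pure Python emulation: getter and setter only."

    def __init__(self, fget=None, fset=None):
        self.fget = fget
        self.fset = fset

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(obj)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)


class Temperature(object):
    "Hypothetical class with one read-write and one read-only managed attribute."

    def __init__(self):
        self._celsius = 0.0

    def get_celsius(self):
        return self._celsius

    def set_celsius(self, value):
        self._celsius = float(value)

    def get_fahrenheit(self):
        return self._celsius * 9.0 / 5.0 + 32.0

    celsius = Property(get_celsius, set_celsius)
    fahrenheit = Property(get_fahrenheit)      # no setter, so read-only


t = Temperature()
t.celsius = 100                 # routed through set_celsius()
try:
    t.fahrenheit = 0            # no fset was supplied
    error = None
except AttributeError as exc:
    error = str(exc)

print(t.celsius, t.fahrenheit, error)
```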
The :func:`property` builtin helps whenever a user interface has granted
attribute access and then subsequent changes require the intervention of a
method.

For instance, a spreadsheet class may grant access to a cell value through
``Cell('b10').value``. Subsequent improvements to the program require the cell
to be recalculated on every access; however, the programmer does not want to
affect existing client code accessing the attribute directly. The solution is
to wrap access to the value attribute in a property data descriptor::

    class Cell(object):
        . . .
        def getvalue(self):
            "Recalculate cell before returning value"
            self.recalc()
            return self._value
        value = property(getvalue)


Functions and Methods
---------------------

Python's object-oriented features are built upon a function-based environment.
Using non-data descriptors, the two are merged seamlessly.

Class dictionaries store methods as functions. In a class definition, methods
are written using :keyword:`def` and :keyword:`lambda`, the usual tools for
creating functions. The only difference from regular functions is that the
first argument is reserved for the object instance. By Python convention, the
instance reference is called *self* but may be called *this* or any other
variable name.

To support method calls, functions include the :meth:`__get__` method for
binding methods during attribute access. This means that all functions are
non-data descriptors which return bound or unbound methods depending on whether
they are invoked from an object or a class. In pure Python, it works like
this::

    class Function(object):
        . . .
        def __get__(self, obj, objtype=None):
            "Simulate func_descr_get() in Objects/funcobject.c"
            return types.MethodType(self, obj, objtype)

Running the interpreter shows how the function descriptor works in practice::

    >>> class D(object):
    ...     def f(self, x):
    ...         return x
    ...
    >>> d = D()
    >>> D.__dict__['f']  # Stored internally as a function
    <function f at 0x00C45070>
    >>> D.f              # Get from a class becomes an unbound method
    <unbound method D.f>
    >>> d.f              # Get from an instance becomes a bound method
    <bound method D.f of <__main__.D object at 0x00B18C90>>

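The binding step for instances can also be performed by hand. The sketch below
reuses the class ``D`` from the session and calls the function's
:meth:`__get__` directly. It only covers access through an instance, since the
object returned for access through a class depends on the Python version:

```python
class D(object):
    def f(self, x):
        return x


d = D()

func = D.__dict__['f']        # the plain function stored in the class dictionary
bound = func.__get__(d, D)    # the same binding that d.f performs

print(bound(3))
print(bound.__self__ is d)      # the bound method remembers the instance
print(bound.__func__ is func)   # and the underlying function
```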
The output suggests that bound and unbound methods are two different types.
While they could have been implemented that way, the actual C implementation of
:c:type:`PyMethod_Type` in
`Objects/classobject.c <http://svn.python.org/view/python/trunk/Objects/classobject.c?view=markup>`_
is a single object with two different representations depending on whether the
:attr:`im_self` field is set or is *NULL* (the C equivalent of *None*).

Likewise, the effects of calling a method object depend on the :attr:`im_self`
field. If set (meaning bound), the original function (stored in the
:attr:`im_func` field) is called as expected with the first argument set to the
instance. If unbound, all of the arguments are passed unchanged to the original
function. The actual C implementation of :c:func:`instancemethod_call()` is only
slightly more complex in that it includes some type checking.


Static Methods and Class Methods
--------------------------------

Non-data descriptors provide a simple mechanism for variations on the usual
patterns of binding functions into methods.

To recap, functions have a :meth:`__get__` method so that they can be converted
to a method when accessed as attributes. The non-data descriptor transforms an
``obj.f(*args)`` call into ``f(obj, *args)``. Calling ``klass.f(*args)``
becomes ``f(*args)``.

This chart summarizes the binding and its two most useful variants:

+-----------------+----------------------+------------------+
| Transformation  | Called from an       | Called from a    |
|                 | Object               | Class            |
+=================+======================+==================+
| function        | f(obj, \*args)       | f(\*args)        |
+-----------------+----------------------+------------------+
| staticmethod    | f(\*args)            | f(\*args)        |
+-----------------+----------------------+------------------+
| classmethod     | f(type(obj), \*args) | f(klass, \*args) |
+-----------------+----------------------+------------------+

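Each row of the chart can be checked with one small class. This is an
illustrative sketch; the method names ``plain``, ``static``, and ``klassy`` are
made up, and each method returns its arguments so the transformation is
visible:

```python
class E(object):
    def plain(self, x):
        return (self, x)

    def static(x):
        return x
    static = staticmethod(static)

    def klassy(klass, x):
        return (klass, x)
    klassy = classmethod(klassy)


e = E()

# function: the instance is prepended when called from an object;
# from the class, the instance has to be passed explicitly.
print(e.plain(1) == (e, 1))
print(E.plain(e, 1) == (e, 1))

# staticmethod: nothing is prepended in either case.
print(e.static(1) == E.static(1) == 1)

# classmethod: the class is prepended in either case.
print(e.klassy(1) == E.klassy(1) == (E, 1))
```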
Static methods return the underlying function without changes. Calling either
``c.f`` or ``C.f`` is the equivalent of a direct lookup into
``object.__getattribute__(c, "f")`` or ``object.__getattribute__(C, "f")``. As a
result, the function becomes identically accessible from either an object or a
class.

Good candidates for static methods are methods that do not reference the
``self`` variable.

For instance, a statistics package may include a container class for
experimental data. The class provides normal methods for computing the mean,
median, and other descriptive statistics that depend on the data. However,
there may be useful functions which are conceptually related but do not depend
on the data. For instance, ``erf(x)`` is a handy conversion routine that comes up
in statistical work but does not directly depend on a particular dataset.
It can be called either from an object or the class: ``s.erf(1.5) --> .9661`` or
``Sample.erf(1.5) --> .9661``.

Since staticmethods return the underlying function with no changes, the example
calls are unexciting::

    >>> class E(object):
    ...     def f(x):
    ...         print(x)
    ...     f = staticmethod(f)
    ...
    >>> E.f(3)
    3
    >>> E().f(3)
    3

Using the non-data descriptor protocol, a pure Python version of
:func:`staticmethod` would look like this::

    class StaticMethod(object):
        "Emulate PyStaticMethod_Type() in Objects/funcobject.c"

        def __init__(self, f):
            self.f = f

        def __get__(self, obj, objtype=None):
            return self.f

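The emulation can be exercised directly. The sketch below repeats the
``StaticMethod`` class so that it runs on its own; ``Calc`` and ``double`` are
hypothetical names used only for this example:

```python
class StaticMethod(object):
    "Copy of the pure Python emulation of staticmethod."

    def __init__(self, f):
        self.f = f

    def __get__(self, obj, objtype=None):
        # Ignore both the instance and the class; hand back the raw function.
        return self.f


class Calc(object):
    def double(x):
        return 2 * x
    double = StaticMethod(double)


from_class = Calc.double(4)
from_instance = Calc().double(4)

print(from_class, from_instance)
print(Calc.double is Calc().double)   # the very same function object
```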
Unlike static methods, class methods prepend the class reference to the
argument list before calling the function. This format is the same
whether the caller is an object or a class::

    >>> class E(object):
    ...     def f(klass, x):
    ...         return klass.__name__, x
    ...     f = classmethod(f)
    ...
    >>> print(E.f(3))
    ('E', 3)
    >>> print(E().f(3))
    ('E', 3)


This behavior is useful whenever the function only needs to have a class
reference and does not care about any underlying data. One use for classmethods
is to create alternate class constructors. In Python 2.3, the classmethod
:func:`dict.fromkeys` creates a new dictionary from a list of keys. The pure
Python equivalent is::

    class Dict:
        . . .
        def fromkeys(klass, iterable, value=None):
            "Emulate dict_fromkeys() in Objects/dictobject.c"
            d = klass()
            for key in iterable:
                d[key] = value
            return d
        fromkeys = classmethod(fromkeys)

Now a new dictionary of unique keys can be constructed like this::

    >>> Dict.fromkeys('abracadabra')
    {'a': None, 'r': None, 'b': None, 'c': None, 'd': None}

Using the non-data descriptor protocol, a pure Python version of
:func:`classmethod` would look like this::

    class ClassMethod(object):
        "Emulate PyClassMethod_Type() in Objects/funcobject.c"

        def __init__(self, f):
            self.f = f

        def __get__(self, obj, klass=None):
            if klass is None:
                klass = type(obj)
            def newfunc(*args):
                return self.f(klass, *args)
            return newfunc

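As with the other emulations, ``ClassMethod`` can be tried on a small example.
The sketch below repeats the class so that it runs on its own and uses it to
build an alternate constructor; ``Point``, ``Pixel``, and ``from_pair`` are
hypothetical names chosen for illustration:

```python
class ClassMethod(object):
    "Copy of the pure Python emulation of classmethod."

    def __init__(self, f):
        self.f = f

    def __get__(self, obj, klass=None):
        if klass is None:
            klass = type(obj)
        def newfunc(*args):
            # Prepend the class, whichever way the attribute was reached.
            return self.f(klass, *args)
        return newfunc


class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def from_pair(klass, pair):
        "Alternate constructor taking a single (x, y) tuple."
        return klass(*pair)
    from_pair = ClassMethod(from_pair)


class Pixel(Point):
    pass


p = Point.from_pair((1, 2))              # called from the class
q = Point(0, 0).from_pair((3, 4))        # called from an instance
r = Pixel.from_pair((5, 6))              # a subclass receives itself as klass

print(type(p).__name__, type(q).__name__, type(r).__name__)
```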