Raymond Hettinger | f11363d | 2010-04-11 08:14:45 +0000 | [diff] [blame] | 1 | ====================== |
| 2 | Descriptor HowTo Guide |
| 3 | ====================== |
| 4 | |
| 5 | :Author: Raymond Hettinger |
| 6 | :Contact: <python at rcn dot com> |
| 7 | |
| 8 | .. Contents:: |
| 9 | |
| 10 | Abstract |
| 11 | -------- |
| 12 | |
| 13 | Defines descriptors, summarizes the protocol, and shows how descriptors are |
| 14 | called. Examines a custom descriptor and several built-in python descriptors |
| 15 | including functions, properties, static methods, and class methods. Shows how |
| 16 | each works by giving a pure Python equivalent and a sample application. |
| 17 | |
| 18 | Learning about descriptors not only provides access to a larger toolset, it |
| 19 | creates a deeper understanding of how Python works and an appreciation for the |
| 20 | elegance of its design. |
| 21 | |
| 22 | |
| 23 | Definition and Introduction |
| 24 | --------------------------- |
| 25 | |
| 26 | In general, a descriptor is an object attribute with "binding behavior", one |
| 27 | whose attribute access has been overridden by methods in the descriptor |
| 28 | protocol. Those methods are :meth:`__get__`, :meth:`__set__`, and |
| 29 | :meth:`__delete__`. If any of those methods are defined for an object, it is |
| 30 | said to be a descriptor. |
| 31 | |
| 32 | The default behavior for attribute access is to get, set, or delete the |
| 33 | attribute from an object's dictionary. For instance, ``a.x`` has a lookup chain |
| 34 | starting with ``a.__dict__['x']``, then ``type(a).__dict__['x']``, and |
| 35 | continuing through the base classes of ``type(a)`` excluding metaclasses. If the |
| 36 | looked-up value is an object defining one of the descriptor methods, then Python |
| 37 | may override the default behavior and invoke the descriptor method instead. |
| 38 | Where this occurs in the precedence chain depends on which descriptor methods |
| 39 | were defined. Note that descriptors are only invoked for new style objects or |
| 40 | classes (a class is new style if it inherits from :class:`object` or |
| 41 | :class:`type`). |
| 42 | |
| 43 | Descriptors are a powerful, general purpose protocol. They are the mechanism |
| 44 | behind properties, methods, static methods, class methods, and :func:`super()`. |
Ezio Melotti | c1fc2e1 | 2011-07-31 22:48:45 +0300 | [diff] [blame] | 45 | They are used throughout Python itself to implement the new style classes |
Raymond Hettinger | f11363d | 2010-04-11 08:14:45 +0000 | [diff] [blame] | 46 | introduced in version 2.2. Descriptors simplify the underlying C-code and offer |
| 47 | a flexible set of new tools for everyday Python programs. |
| 48 | |
| 49 | |
| 50 | Descriptor Protocol |
| 51 | ------------------- |
| 52 | |
| 53 | ``descr.__get__(self, obj, type=None) --> value`` |
| 54 | |
| 55 | ``descr.__set__(self, obj, value) --> None`` |
| 56 | |
| 57 | ``descr.__delete__(self, obj) --> None`` |
| 58 | |
| 59 | That is all there is to it. Define any of these methods and an object is |
| 60 | considered a descriptor and can override default behavior upon being looked up |
| 61 | as an attribute. |
| 62 | |
| 63 | If an object defines both :meth:`__get__` and :meth:`__set__`, it is considered |
| 64 | a data descriptor. Descriptors that only define :meth:`__get__` are called |
| 65 | non-data descriptors (they are typically used for methods but other uses are |
| 66 | possible). |
| 67 | |
| 68 | Data and non-data descriptors differ in how overrides are calculated with |
| 69 | respect to entries in an instance's dictionary. If an instance's dictionary |
| 70 | has an entry with the same name as a data descriptor, the data descriptor |
| 71 | takes precedence. If an instance's dictionary has an entry with the same |
| 72 | name as a non-data descriptor, the dictionary entry takes precedence. |
| 73 | |
| 74 | To make a read-only data descriptor, define both :meth:`__get__` and |
| 75 | :meth:`__set__` with the :meth:`__set__` raising an :exc:`AttributeError` when |
| 76 | called. Defining the :meth:`__set__` method with an exception raising |
| 77 | placeholder is enough to make it a data descriptor. |
| 78 | |
| 79 | |
| 80 | Invoking Descriptors |
| 81 | -------------------- |
| 82 | |
| 83 | A descriptor can be called directly by its method name. For example, |
| 84 | ``d.__get__(obj)``. |
| 85 | |
| 86 | Alternatively, it is more common for a descriptor to be invoked automatically |
| 87 | upon attribute access. For example, ``obj.d`` looks up ``d`` in the dictionary |
| 88 | of ``obj``. If ``d`` defines the method :meth:`__get__`, then ``d.__get__(obj)`` |
| 89 | is invoked according to the precedence rules listed below. |
| 90 | |
| 91 | The details of invocation depend on whether ``obj`` is an object or a class. |
| 92 | Either way, descriptors only work for new style objects and classes. A class is |
| 93 | new style if it is a subclass of :class:`object`. |
| 94 | |
| 95 | For objects, the machinery is in :meth:`object.__getattribute__` which |
| 96 | transforms ``b.x`` into ``type(b).__dict__['x'].__get__(b, type(b))``. The |
| 97 | implementation works through a precedence chain that gives data descriptors |
| 98 | priority over instance variables, instance variables priority over non-data |
Benjamin Peterson | 05179d5 | 2014-10-06 21:10:25 -0400 | [diff] [blame] | 99 | descriptors, and assigns lowest priority to :meth:`__getattr__` if provided. |
| 100 | The full C implementation can be found in :c:func:`PyObject_GenericGetAttr()` in |
| 101 | :source:`Objects/object.c`. |
Raymond Hettinger | f11363d | 2010-04-11 08:14:45 +0000 | [diff] [blame] | 102 | |
| 103 | For classes, the machinery is in :meth:`type.__getattribute__` which transforms |
| 104 | ``B.x`` into ``B.__dict__['x'].__get__(None, B)``. In pure Python, it looks |
| 105 | like:: |
| 106 | |
| 107 | def __getattribute__(self, key): |
| 108 | "Emulate type_getattro() in Objects/typeobject.c" |
| 109 | v = object.__getattribute__(self, key) |
| 110 | if hasattr(v, '__get__'): |
| 111 | return v.__get__(None, self) |
| 112 | return v |
| 113 | |
| 114 | The important points to remember are: |
| 115 | |
| 116 | * descriptors are invoked by the :meth:`__getattribute__` method |
| 117 | * overriding :meth:`__getattribute__` prevents automatic descriptor calls |
| 118 | * :meth:`__getattribute__` is only available with new style classes and objects |
| 119 | * :meth:`object.__getattribute__` and :meth:`type.__getattribute__` make |
| 120 | different calls to :meth:`__get__`. |
| 121 | * data descriptors always override instance dictionaries. |
| 122 | * non-data descriptors may be overridden by instance dictionaries. |
| 123 | |
| 124 | The object returned by ``super()`` also has a custom :meth:`__getattribute__` |
| 125 | method for invoking descriptors. The call ``super(B, obj).m()`` searches |
| 126 | ``obj.__class__.__mro__`` for the base class ``A`` immediately following ``B`` |
Benjamin Peterson | 97b3618 | 2013-10-18 12:57:55 -0400 | [diff] [blame] | 127 | and then returns ``A.__dict__['m'].__get__(obj, B)``. If not a descriptor, |
Raymond Hettinger | f11363d | 2010-04-11 08:14:45 +0000 | [diff] [blame] | 128 | ``m`` is returned unchanged. If not in the dictionary, ``m`` reverts to a |
| 129 | search using :meth:`object.__getattribute__`. |
| 130 | |
| 131 | Note, in Python 2.2, ``super(B, obj).m()`` would only invoke :meth:`__get__` if |
| 132 | ``m`` was a data descriptor. In Python 2.3, non-data descriptors also get |
| 133 | invoked unless an old-style class is involved. The implementation details are |
Benjamin Peterson | 05179d5 | 2014-10-06 21:10:25 -0400 | [diff] [blame] | 134 | in :c:func:`super_getattro()` in :source:`Objects/typeobject.c`. |
Raymond Hettinger | f11363d | 2010-04-11 08:14:45 +0000 | [diff] [blame] | 135 | |
Georg Brandl | 0ffb462 | 2014-10-29 09:37:43 +0100 | [diff] [blame] | 136 | .. _`Guido's Tutorial`: https://www.python.org/download/releases/2.2.3/descrintro/#cooperation |
Raymond Hettinger | f11363d | 2010-04-11 08:14:45 +0000 | [diff] [blame] | 137 | |
| 138 | The details above show that the mechanism for descriptors is embedded in the |
| 139 | :meth:`__getattribute__()` methods for :class:`object`, :class:`type`, and |
| 140 | :func:`super`. Classes inherit this machinery when they derive from |
| 141 | :class:`object` or if they have a meta-class providing similar functionality. |
| 142 | Likewise, classes can turn-off descriptor invocation by overriding |
| 143 | :meth:`__getattribute__()`. |
| 144 | |
| 145 | |
| 146 | Descriptor Example |
| 147 | ------------------ |
| 148 | |
| 149 | The following code creates a class whose objects are data descriptors which |
| 150 | print a message for each get or set. Overriding :meth:`__getattribute__` is |
| 151 | alternate approach that could do this for every attribute. However, this |
| 152 | descriptor is useful for monitoring just a few chosen attributes:: |
| 153 | |
| 154 | class RevealAccess(object): |
| 155 | """A data descriptor that sets and returns values |
| 156 | normally and prints a message logging their access. |
| 157 | """ |
| 158 | |
| 159 | def __init__(self, initval=None, name='var'): |
| 160 | self.val = initval |
| 161 | self.name = name |
| 162 | |
| 163 | def __get__(self, obj, objtype): |
| 164 | print 'Retrieving', self.name |
| 165 | return self.val |
| 166 | |
| 167 | def __set__(self, obj, val): |
Serhiy Storchaka | 610f84a | 2013-12-23 18:19:34 +0200 | [diff] [blame] | 168 | print 'Updating', self.name |
Raymond Hettinger | f11363d | 2010-04-11 08:14:45 +0000 | [diff] [blame] | 169 | self.val = val |
| 170 | |
| 171 | >>> class MyClass(object): |
| 172 | x = RevealAccess(10, 'var "x"') |
| 173 | y = 5 |
| 174 | |
| 175 | >>> m = MyClass() |
| 176 | >>> m.x |
| 177 | Retrieving var "x" |
| 178 | 10 |
| 179 | >>> m.x = 20 |
| 180 | Updating var "x" |
| 181 | >>> m.x |
| 182 | Retrieving var "x" |
| 183 | 20 |
| 184 | >>> m.y |
| 185 | 5 |
| 186 | |
| 187 | The protocol is simple and offers exciting possibilities. Several use cases are |
| 188 | so common that they have been packaged into individual function calls. |
| 189 | Properties, bound and unbound methods, static methods, and class methods are all |
| 190 | based on the descriptor protocol. |
| 191 | |
| 192 | |
| 193 | Properties |
| 194 | ---------- |
| 195 | |
| 196 | Calling :func:`property` is a succinct way of building a data descriptor that |
| 197 | triggers function calls upon access to an attribute. Its signature is:: |
| 198 | |
| 199 | property(fget=None, fset=None, fdel=None, doc=None) -> property attribute |
| 200 | |
| 201 | The documentation shows a typical use to define a managed attribute ``x``:: |
| 202 | |
| 203 | class C(object): |
| 204 | def getx(self): return self.__x |
| 205 | def setx(self, value): self.__x = value |
| 206 | def delx(self): del self.__x |
| 207 | x = property(getx, setx, delx, "I'm the 'x' property.") |
| 208 | |
| 209 | To see how :func:`property` is implemented in terms of the descriptor protocol, |
| 210 | here is a pure Python equivalent:: |
| 211 | |
| 212 | class Property(object): |
| 213 | "Emulate PyProperty_Type() in Objects/descrobject.c" |
| 214 | |
| 215 | def __init__(self, fget=None, fset=None, fdel=None, doc=None): |
| 216 | self.fget = fget |
| 217 | self.fset = fset |
| 218 | self.fdel = fdel |
Raymond Hettinger | 8c5c3e3 | 2013-03-10 09:36:52 -0700 | [diff] [blame] | 219 | if doc is None and fget is not None: |
| 220 | doc = fget.__doc__ |
Raymond Hettinger | f11363d | 2010-04-11 08:14:45 +0000 | [diff] [blame] | 221 | self.__doc__ = doc |
| 222 | |
| 223 | def __get__(self, obj, objtype=None): |
| 224 | if obj is None: |
| 225 | return self |
| 226 | if self.fget is None: |
Raymond Hettinger | 8c5c3e3 | 2013-03-10 09:36:52 -0700 | [diff] [blame] | 227 | raise AttributeError("unreadable attribute") |
Raymond Hettinger | f11363d | 2010-04-11 08:14:45 +0000 | [diff] [blame] | 228 | return self.fget(obj) |
| 229 | |
| 230 | def __set__(self, obj, value): |
| 231 | if self.fset is None: |
Raymond Hettinger | 8c5c3e3 | 2013-03-10 09:36:52 -0700 | [diff] [blame] | 232 | raise AttributeError("can't set attribute") |
Raymond Hettinger | f11363d | 2010-04-11 08:14:45 +0000 | [diff] [blame] | 233 | self.fset(obj, value) |
| 234 | |
| 235 | def __delete__(self, obj): |
| 236 | if self.fdel is None: |
Raymond Hettinger | 8c5c3e3 | 2013-03-10 09:36:52 -0700 | [diff] [blame] | 237 | raise AttributeError("can't delete attribute") |
Raymond Hettinger | f11363d | 2010-04-11 08:14:45 +0000 | [diff] [blame] | 238 | self.fdel(obj) |
| 239 | |
Raymond Hettinger | 8c5c3e3 | 2013-03-10 09:36:52 -0700 | [diff] [blame] | 240 | def getter(self, fget): |
| 241 | return type(self)(fget, self.fset, self.fdel, self.__doc__) |
| 242 | |
| 243 | def setter(self, fset): |
| 244 | return type(self)(self.fget, fset, self.fdel, self.__doc__) |
| 245 | |
| 246 | def deleter(self, fdel): |
| 247 | return type(self)(self.fget, self.fset, fdel, self.__doc__) |
| 248 | |
Raymond Hettinger | f11363d | 2010-04-11 08:14:45 +0000 | [diff] [blame] | 249 | The :func:`property` builtin helps whenever a user interface has granted |
| 250 | attribute access and then subsequent changes require the intervention of a |
| 251 | method. |
| 252 | |
| 253 | For instance, a spreadsheet class may grant access to a cell value through |
| 254 | ``Cell('b10').value``. Subsequent improvements to the program require the cell |
| 255 | to be recalculated on every access; however, the programmer does not want to |
| 256 | affect existing client code accessing the attribute directly. The solution is |
| 257 | to wrap access to the value attribute in a property data descriptor:: |
| 258 | |
| 259 | class Cell(object): |
| 260 | . . . |
| 261 | def getvalue(self, obj): |
| 262 | "Recalculate cell before returning value" |
| 263 | self.recalc() |
| 264 | return obj._value |
| 265 | value = property(getvalue) |
| 266 | |
| 267 | |
| 268 | Functions and Methods |
| 269 | --------------------- |
| 270 | |
| 271 | Python's object oriented features are built upon a function based environment. |
| 272 | Using non-data descriptors, the two are merged seamlessly. |
| 273 | |
| 274 | Class dictionaries store methods as functions. In a class definition, methods |
| 275 | are written using :keyword:`def` and :keyword:`lambda`, the usual tools for |
| 276 | creating functions. The only difference from regular functions is that the |
| 277 | first argument is reserved for the object instance. By Python convention, the |
| 278 | instance reference is called *self* but may be called *this* or any other |
| 279 | variable name. |
| 280 | |
| 281 | To support method calls, functions include the :meth:`__get__` method for |
| 282 | binding methods during attribute access. This means that all functions are |
| 283 | non-data descriptors which return bound or unbound methods depending whether |
| 284 | they are invoked from an object or a class. In pure python, it works like |
| 285 | this:: |
| 286 | |
| 287 | class Function(object): |
| 288 | . . . |
| 289 | def __get__(self, obj, objtype=None): |
| 290 | "Simulate func_descr_get() in Objects/funcobject.c" |
| 291 | return types.MethodType(self, obj, objtype) |
| 292 | |
| 293 | Running the interpreter shows how the function descriptor works in practice:: |
| 294 | |
| 295 | >>> class D(object): |
| 296 | def f(self, x): |
| 297 | return x |
| 298 | |
| 299 | >>> d = D() |
| 300 | >>> D.__dict__['f'] # Stored internally as a function |
| 301 | <function f at 0x00C45070> |
| 302 | >>> D.f # Get from a class becomes an unbound method |
| 303 | <unbound method D.f> |
| 304 | >>> d.f # Get from an instance becomes a bound method |
| 305 | <bound method D.f of <__main__.D object at 0x00B18C90>> |
| 306 | |
| 307 | The output suggests that bound and unbound methods are two different types. |
Georg Brandl | 0930228 | 2010-10-06 09:32:48 +0000 | [diff] [blame] | 308 | While they could have been implemented that way, the actual C implementation of |
Benjamin Peterson | 05179d5 | 2014-10-06 21:10:25 -0400 | [diff] [blame] | 309 | :c:type:`PyMethod_Type` in :source:`Objects/classobject.c` is a single object |
| 310 | with two different representations depending on whether the :attr:`im_self` |
| 311 | field is set or is *NULL* (the C equivalent of *None*). |
Raymond Hettinger | f11363d | 2010-04-11 08:14:45 +0000 | [diff] [blame] | 312 | |
| 313 | Likewise, the effects of calling a method object depend on the :attr:`im_self` |
| 314 | field. If set (meaning bound), the original function (stored in the |
| 315 | :attr:`im_func` field) is called as expected with the first argument set to the |
| 316 | instance. If unbound, all of the arguments are passed unchanged to the original |
| 317 | function. The actual C implementation of :func:`instancemethod_call()` is only |
| 318 | slightly more complex in that it includes some type checking. |
| 319 | |
| 320 | |
| 321 | Static Methods and Class Methods |
| 322 | -------------------------------- |
| 323 | |
| 324 | Non-data descriptors provide a simple mechanism for variations on the usual |
| 325 | patterns of binding functions into methods. |
| 326 | |
| 327 | To recap, functions have a :meth:`__get__` method so that they can be converted |
| 328 | to a method when accessed as attributes. The non-data descriptor transforms a |
| 329 | ``obj.f(*args)`` call into ``f(obj, *args)``. Calling ``klass.f(*args)`` |
| 330 | becomes ``f(*args)``. |
| 331 | |
| 332 | This chart summarizes the binding and its two most useful variants: |
| 333 | |
Georg Brandl | d073107 | 2010-04-13 06:43:54 +0000 | [diff] [blame] | 334 | +-----------------+----------------------+------------------+ |
| 335 | | Transformation | Called from an | Called from a | |
| 336 | | | Object | Class | |
| 337 | +=================+======================+==================+ |
| 338 | | function | f(obj, \*args) | f(\*args) | |
| 339 | +-----------------+----------------------+------------------+ |
| 340 | | staticmethod | f(\*args) | f(\*args) | |
| 341 | +-----------------+----------------------+------------------+ |
| 342 | | classmethod | f(type(obj), \*args) | f(klass, \*args) | |
| 343 | +-----------------+----------------------+------------------+ |
Raymond Hettinger | f11363d | 2010-04-11 08:14:45 +0000 | [diff] [blame] | 344 | |
| 345 | Static methods return the underlying function without changes. Calling either |
| 346 | ``c.f`` or ``C.f`` is the equivalent of a direct lookup into |
| 347 | ``object.__getattribute__(c, "f")`` or ``object.__getattribute__(C, "f")``. As a |
| 348 | result, the function becomes identically accessible from either an object or a |
| 349 | class. |
| 350 | |
| 351 | Good candidates for static methods are methods that do not reference the |
| 352 | ``self`` variable. |
| 353 | |
| 354 | For instance, a statistics package may include a container class for |
| 355 | experimental data. The class provides normal methods for computing the average, |
| 356 | mean, median, and other descriptive statistics that depend on the data. However, |
| 357 | there may be useful functions which are conceptually related but do not depend |
| 358 | on the data. For instance, ``erf(x)`` is handy conversion routine that comes up |
| 359 | in statistical work but does not directly depend on a particular dataset. |
| 360 | It can be called either from an object or the class: ``s.erf(1.5) --> .9332`` or |
| 361 | ``Sample.erf(1.5) --> .9332``. |
| 362 | |
| 363 | Since staticmethods return the underlying function with no changes, the example |
| 364 | calls are unexciting:: |
| 365 | |
| 366 | >>> class E(object): |
| 367 | def f(x): |
| 368 | print x |
| 369 | f = staticmethod(f) |
| 370 | |
| 371 | >>> print E.f(3) |
| 372 | 3 |
| 373 | >>> print E().f(3) |
| 374 | 3 |
| 375 | |
| 376 | Using the non-data descriptor protocol, a pure Python version of |
| 377 | :func:`staticmethod` would look like this:: |
| 378 | |
| 379 | class StaticMethod(object): |
| 380 | "Emulate PyStaticMethod_Type() in Objects/funcobject.c" |
| 381 | |
| 382 | def __init__(self, f): |
| 383 | self.f = f |
| 384 | |
| 385 | def __get__(self, obj, objtype=None): |
| 386 | return self.f |
| 387 | |
| 388 | Unlike static methods, class methods prepend the class reference to the |
| 389 | argument list before calling the function. This format is the same |
| 390 | for whether the caller is an object or a class:: |
| 391 | |
| 392 | >>> class E(object): |
| 393 | def f(klass, x): |
| 394 | return klass.__name__, x |
| 395 | f = classmethod(f) |
| 396 | |
| 397 | >>> print E.f(3) |
| 398 | ('E', 3) |
| 399 | >>> print E().f(3) |
| 400 | ('E', 3) |
| 401 | |
| 402 | |
| 403 | This behavior is useful whenever the function only needs to have a class |
| 404 | reference and does not care about any underlying data. One use for classmethods |
| 405 | is to create alternate class constructors. In Python 2.3, the classmethod |
| 406 | :func:`dict.fromkeys` creates a new dictionary from a list of keys. The pure |
| 407 | Python equivalent is:: |
| 408 | |
Raymond Hettinger | 7ed6f7e | 2013-03-10 09:49:08 -0700 | [diff] [blame] | 409 | class Dict(object): |
Raymond Hettinger | f11363d | 2010-04-11 08:14:45 +0000 | [diff] [blame] | 410 | . . . |
| 411 | def fromkeys(klass, iterable, value=None): |
| 412 | "Emulate dict_fromkeys() in Objects/dictobject.c" |
| 413 | d = klass() |
| 414 | for key in iterable: |
| 415 | d[key] = value |
| 416 | return d |
| 417 | fromkeys = classmethod(fromkeys) |
| 418 | |
| 419 | Now a new dictionary of unique keys can be constructed like this:: |
| 420 | |
| 421 | >>> Dict.fromkeys('abracadabra') |
| 422 | {'a': None, 'r': None, 'b': None, 'c': None, 'd': None} |
| 423 | |
| 424 | Using the non-data descriptor protocol, a pure Python version of |
| 425 | :func:`classmethod` would look like this:: |
| 426 | |
| 427 | class ClassMethod(object): |
| 428 | "Emulate PyClassMethod_Type() in Objects/funcobject.c" |
| 429 | |
| 430 | def __init__(self, f): |
| 431 | self.f = f |
| 432 | |
| 433 | def __get__(self, obj, klass=None): |
| 434 | if klass is None: |
| 435 | klass = type(obj) |
| 436 | def newfunc(*args): |
| 437 | return self.f(klass, *args) |
| 438 | return newfunc |
| 439 | |