<HTML>

<HEAD>
<TITLE>Metaclasses in Python 1.5</TITLE>
</HEAD>

<BODY BGCOLOR="FFFFFF">

<H1>Metaclasses in Python 1.5</H1>
<H2>(A.k.a. The Killer Joke :-)</H2>

<P><b>Note: this document describes a feature only released in <A
HREF="../../1.5a3/">Python 1.5a3</A>.</b>

<P>In previous Python releases (and still in 1.5), there is something
called the ``Don Beaudry hook'', after its inventor and champion.
This allows C extensions to provide alternate class behavior, thereby
allowing the Python class syntax to be used to define other class-like
entities. Don Beaudry has used this in his infamous <A
HREF="http://maigret.cog.brown.edu/pyutil/">MESS</A> package; Jim
Fulton has used it in his <A
HREF="http://www.digicool.com/papers/ExtensionClass.html">Extension
Classes</A> package. (It has also been referred to as the ``Don
Beaudry <i>hack</i>,'' but that's a misnomer. There's nothing hackish
about it -- in fact, it is rather elegant and deep, even though
there's something dark to it.)

<P>(On first reading, you may want to skip directly to the examples in
the section "Writing Metaclasses in Python" below, unless you want
your head to explode.) (XXX I should really restructure this document
to place the historic notes last. After 1.5a4 is released...)

<P>

<HR>

<P>Documentation of the Don Beaudry hook has purposefully been kept
minimal, since it is a feature of incredible power, and is easily
abused. Basically, it checks whether the <b>type of the base
class</b> is callable, and if so, it is called to create the new
class.

<P>Note the two indirection levels. Take a simple example:

<PRE>
class B:
    pass

class C(B):
    pass
</PRE>

Take a look at the second class definition, and try to fathom ``the
type of the base class is callable.''

<P>(Types are not classes, by the way. See questions 4.2, 4.19 and in
particular 6.22 in the <A
HREF="http://grail.cnri.reston.va.us/cgi-bin/faqw.py" >Python FAQ</A>
for more on this topic.)

<P>

<UL>

<LI>The <b>base class</b> is B; this one's easy.<P>

<LI>Since B is a class, its type is ``class''; so the <b>type of the
base class</b> is the type ``class''. This is also known as
types.ClassType, assuming the standard module <code>types</code> has
been imported.<P>

<LI>Now is the type ``class'' <b>callable</b>? No, because types (in
core Python) are never callable. Classes are callable (calling a
class creates a new instance) but types aren't.<P>

</UL>

<P>So our conclusion is that in our example, the type of the base
class (of C) is not callable. So the Don Beaudry hook does not apply,
and the default class creation mechanism is used (which is also used
when there is no base class). In fact, the Don Beaudry hook never
applies when using only core Python, since the type of a core object
is never callable.

<P>So what do Don and Jim do in order to use Don's hook? Write an
extension that defines at least two new Python object types. The
first would be the type for ``class-like'' objects usable as a base
class, to trigger Don's hook. This type must be made callable.
That's why we need a second type. Whether an object is callable
depends on its type. So whether a type object is callable depends on
<i>its</i> type, which is a <i>meta-type</i>. (In core Python there
is only one meta-type, the type ``type'' (types.TypeType), which is
the type of all type objects, even itself.) A new meta-type must
be defined that makes the type of the class-like objects callable.
(Normally, a third type would also be needed, the new ``instance''
type, but this is not an absolute requirement -- the new class type
could return an object of some existing type when invoked to create an
instance.)

<P>Still confused? Here's a simple device due to Don himself to
explain metaclasses. Take a simple class definition; assume B is a
special class that triggers Don's hook:

<PRE>
class C(B):
    a = 1
    b = 2
</PRE>

This can be thought of as equivalent to:

<PRE>
C = type(B)('C', (B,), {'a': 1, 'b': 2})
</PRE>

If that's too dense for you, here's the same thing written out using
temporary variables:

<PRE>
creator = type(B)               # The type of the base class
name = 'C'                      # The name of the new class
bases = (B,)                    # A tuple containing the base class(es)
namespace = {'a': 1, 'b': 2}    # The namespace of the class statement
C = creator(name, bases, namespace)
</PRE>

This is analogous to what happens without the Don Beaudry hook, except
that in that case the creator function is set to the default class
creator.
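<P>The three-argument creator call can be tried directly on a current
interpreter. The following is only a sketch under one loud assumption:
it uses Python 3 rather than 1.5, where the type of an ordinary class
is the built-in <code>type</code>, and that type is callable. In
Python 1.5 itself, as explained above, the type of a core class is not
callable.

```python
# Sketch only: Python 3 syntax, not Python 1.5.
# Assumption: type(B) is the built-in `type`, which is callable here.
class B:
    pass

creator = type(B)               # the type of the base class
name = 'C'                      # the name of the new class
bases = (B,)                    # a tuple containing the base class(es)
namespace = {'a': 1, 'b': 2}    # the namespace of the class statement
C = creator(name, bases, namespace)

# C behaves like a class written with an ordinary class statement.
print(C.__name__, C.__bases__ == (B,), C.a, C.b)
```

The resulting C is a subclass of B with class attributes a and b, just
as if it had been written with a class statement.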

<P>In either case, the creator is called with three arguments. The
first one, <i>name</i>, is the name of the new class (as given at the
top of the class statement). The <i>bases</i> argument is a tuple of
base classes (a singleton tuple if there's only one base class, as in
the example). Finally, <i>namespace</i> is a dictionary containing
the local variables collected during execution of the class statement.

<P>Note that the contents of the namespace dictionary are simply
whatever names were defined in the class statement. A little-known
fact is that when Python executes a class statement, it enters a new
local namespace, and all assignments and function definitions take
place in this namespace. Thus, after executing the following class
statement:

<PRE>
class C:
    a = 1
    def f(s): pass
</PRE>

the class namespace's contents would be {'a': 1, 'f': &lt;function f
...&gt;}.
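<P>To see the namespace for yourself, a creator can simply record what
it is handed. This is a hedged sketch in Python 3 syntax (an
assumption; the <code>metaclass=</code> keyword does not exist in
1.5), and <code>Recorder</code> and <code>captured</code> are
hypothetical names. Python 3 also adds a few bookkeeping entries such
as <code>__module__</code>, so the sketch filters out names that start
with two underscores.

```python
# Sketch only: Python 3 syntax; Recorder and captured are hypothetical names.
captured = {}

class Recorder(type):
    def __new__(meta, name, bases, namespace):
        # Keep a copy of the namespace collected by the class statement.
        captured[name] = dict(namespace)
        return super().__new__(meta, name, bases, namespace)

class C(metaclass=Recorder):
    a = 1
    def f(s): pass

# Ignore interpreter bookkeeping such as __module__ and __qualname__.
user_names = sorted(k for k in captured['C'] if not k.startswith('__'))
print(user_names)
```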

<P>But enough already about writing Python metaclasses in C; read the
documentation of <A
HREF="http://maigret.cog.brown.edu/pyutil/">MESS</A> or <A
HREF="http://www.digicool.com/papers/ExtensionClass.html" >Extension
Classes</A> for more information.

<P>

<HR>

<H2>Writing Metaclasses in Python</H2>

<P>In Python 1.5, the requirement to write a C extension in order to
write metaclasses has been dropped (though you can still do
it, of course). In addition to the check ``is the type of the base
class callable,'' there's a check ``does the base class have a
__class__ attribute.'' If so, it is assumed that the __class__
attribute refers to a class.

<P>Let's repeat our simple example from above:

<PRE>
class C(B):
    a = 1
    b = 2
</PRE>

Assuming B has a __class__ attribute, this translates into:

<PRE>
C = B.__class__('C', (B,), {'a': 1, 'b': 2})
</PRE>

This is exactly the same as before except that instead of type(B),
B.__class__ is invoked. If you have read <A HREF=
"http://grail.cnri.reston.va.us/cgi-bin/faqw.py?req=show&file=faq06.022.htp"
>FAQ question 6.22</A> you will understand that while there is a big
technical difference between type(B) and B.__class__, they play the
same role at different abstraction levels. And perhaps at some point
in the future they will really be the same thing (at which point you
would be able to derive subclasses from built-in types).

<P>At this point it may be worth mentioning that C.__class__ is the
same object as B.__class__, i.e., C's metaclass is the same as B's
metaclass. In other words, subclassing an existing class creates a
new (meta)instance of the base class's metaclass.

<P>Going back to the example, the class B.__class__ is instantiated,
passing its constructor the same three arguments that are passed to
the default class constructor or to an extension's metaclass:
<i>name</i>, <i>bases</i>, and <i>namespace</i>.

<P>It is easy to be confused by what exactly happens when using a
metaclass, because we lose the absolute distinction between classes
and instances: a class is an instance of a metaclass (a
``metainstance''), but technically (i.e. in the eyes of the Python
runtime system), the metaclass is just a class, and the metainstance
is just an instance. At the end of the class statement, the metaclass
whose metainstance is used as a base class is instantiated, yielding a
second metainstance (of the same metaclass). This metainstance is
then used as a (normal, non-meta) class; instantiation of the class
means calling the metainstance, and this will return a real instance.
And what class is that an instance of? Conceptually, it is of course
an instance of our metainstance; but in most cases the Python runtime
system will see it as an instance of a helper class used by the
metaclass to implement its (non-meta) instances...

<P>Hopefully an example will make things clearer. Let's presume we
have a metaclass MetaClass1. Its helper class (for non-meta
instances) is called HelperClass1. We now (manually) instantiate
MetaClass1 once to get an empty special base class:

<PRE>
BaseClass1 = MetaClass1("BaseClass1", (), {})
</PRE>

We can now use BaseClass1 as a base class in a class statement:

<PRE>
class MySpecialClass(BaseClass1):
    i = 1
    def f(s): pass
</PRE>

At this point, MySpecialClass is defined; it is a metainstance of
MetaClass1 just like BaseClass1, and in fact the expression
``BaseClass1.__class__ == MySpecialClass.__class__ == MetaClass1''
yields true.

<P>We are now ready to create instances of MySpecialClass. Let's
assume that no constructor arguments are required:

<PRE>
x = MySpecialClass()
y = MySpecialClass()
print x.__class__, y.__class__
</PRE>

The print statement shows that x and y are instances of HelperClass1.
How did this happen? MySpecialClass is an instance of MetaClass1
(``meta'' is irrelevant here); when an instance is called, its
__call__ method is invoked, and presumably the __call__ method defined
by MetaClass1 returns an instance of HelperClass1.
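<P>The text never shows MetaClass1 or HelperClass1, so here is one
possible minimal pair, offered as a sketch rather than as the actual
definitions. It is written in Python 3 syntax (an assumption), and it
relies on Python 3 still calling the class of the first base with
(name, bases, namespace) when that base is an instance of an ordinary
class. The attribute names <code>name</code>, <code>bases</code>,
<code>namespace</code> and <code>klass</code> are hypothetical.

```python
# Sketch only: Python 3 syntax; the bodies of both classes are guesses.
class HelperClass1:
    """Plays the role of the (non-meta) instance type."""
    def __init__(self, klass):
        self.klass = klass          # remember which metainstance made us

class MetaClass1:
    """An ordinary class used as a metaclass."""
    def __init__(self, name, bases, namespace):
        self.name = name
        self.bases = bases
        self.namespace = namespace
    def __call__(self):
        # "Instantiating" a metainstance returns a helper instance.
        return HelperClass1(self)

BaseClass1 = MetaClass1("BaseClass1", (), {})

class MySpecialClass(BaseClass1):   # calls MetaClass1(name, bases, namespace)
    i = 1
    def f(s): pass

x = MySpecialClass()
y = MySpecialClass()
print(type(MySpecialClass).__name__, type(x).__name__, type(y).__name__)
```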

<P>Now let's see how we could use metaclasses -- what can we do
with metaclasses that we can't easily do without them? Here's one
idea: a metaclass could automatically insert trace calls for all
method calls. Let's first develop a simplified example, without
support for inheritance or other ``advanced'' Python features (we'll
add those later).

<PRE>
import types

class Tracing:
    def __init__(self, name, bases, namespace):
        """Create a new class."""
        self.__name__ = name
        self.__bases__ = bases
        self.__namespace__ = namespace
    def __call__(self):
        """Create a new instance."""
        return Instance(self)

class Instance:
    def __init__(self, klass):
        self.__klass__ = klass
    def __getattr__(self, name):
        try:
            value = self.__klass__.__namespace__[name]
        except KeyError:
            raise AttributeError, name
        if type(value) is not types.FunctionType:
            return value
        return BoundMethod(value, self)

class BoundMethod:
    def __init__(self, function, instance):
        self.function = function
        self.instance = instance
    def __call__(self, *args):
        print "calling", self.function, "for", self.instance, "with", args
        return apply(self.function, (self.instance,) + args)

Trace = Tracing('Trace', (), {})

class MyTracedClass(Trace):
    def method1(self, a):
        self.a = a
    def method2(self):
        return self.a

aninstance = MyTracedClass()

aninstance.method1(10)

print "the answer is %d" % aninstance.method2()
</PRE>

Confused already? The intention is to read this from top down. The
Tracing class is the metaclass we're defining. Its structure is
really simple.

<P>

<UL>

<LI>The __init__ method is invoked when a new Tracing instance is
created, e.g. the definition of class MyTracedClass later in the
example. It simply saves the class name, base classes and namespace
as instance variables.<P>

<LI>The __call__ method is invoked when a Tracing instance is called,
e.g. the creation of aninstance later in the example. It returns an
instance of the class Instance, which is defined next.<P>

</UL>

<P>The class Instance is the class used for all instances of classes
built using the Tracing metaclass, e.g. aninstance. It has two
methods:

<P>

<UL>

<LI>The __init__ method is invoked from the Tracing.__call__ method
above to initialize a new instance. It saves the class reference as
an instance variable. It uses a funny name because the user's
instance variables (e.g. self.a later in the example) live in the same
namespace.<P>

<LI>The __getattr__ method is invoked whenever the user code
references an attribute of the instance that is not an instance
variable (nor a class variable; but except for __init__ and
__getattr__ there are no class variables). It will be called, for
example, when aninstance.method1 is referenced in the example, with
self set to aninstance and name set to the string "method1".<P>

</UL>

<P>The __getattr__ method looks the name up in the __namespace__
dictionary. If it isn't found, it raises an AttributeError exception.
(In a more realistic example, it would first have to look through the
base classes as well.) If it is found, there are two possibilities:
it's either a function or it isn't. If it's not a function, it is
assumed to be a class variable, and its value is returned. If it's a
function, we have to ``wrap'' it in an instance of yet another helper
class, BoundMethod.

<P>The BoundMethod class is needed to implement a familiar feature:
when a method is defined, it has an initial argument, self, which is
automatically bound to the relevant instance when it is called. For
example, aninstance.method1(10) is equivalent to method1(aninstance,
10). To execute this call in the example, first a temporary BoundMethod
instance is created with the following constructor call: temp =
BoundMethod(method1, aninstance); then this instance is called as
temp(10). After the call, the temporary instance is discarded.

<P>

<UL>

<LI>The __init__ method is invoked for the constructor call
BoundMethod(method1, aninstance). It simply saves away its
arguments.<P>

<LI>The __call__ method is invoked when the bound method instance is
called, as in temp(10). It needs to call method1(aninstance, 10).
However, even though self.function is now method1 and self.instance is
aninstance, it can't call self.function(self.instance, args) directly,
because it should work regardless of the number of arguments passed.
(For simplicity, support for keyword arguments has been omitted.)<P>

</UL>

<P>In order to be able to support arbitrary argument lists, the
__call__ method first constructs a new argument tuple. Conveniently,
because of the notation *args in __call__'s own argument list, the
arguments to __call__ (except for self) are placed in the tuple args.
To construct the desired argument list, we concatenate a singleton
tuple containing the instance with the args tuple: (self.instance,) +
args. (Note the trailing comma used to construct the singleton
tuple.) In our example, the resulting argument tuple is (aninstance,
10).

<P>The intrinsic function apply() takes a function and an argument
tuple and calls the function with it. In our example, we are calling
apply(method1, (aninstance, 10)) which is equivalent to calling
method1(aninstance, 10).
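<P>On a current interpreter apply() is no longer available; the same
effect is obtained by unpacking the tuple in the call. The sketch
below assumes Python 3, and uses a stand-in string for the instance
and a stand-in function body, both hypothetical, just to show the
tuple construction and the call.

```python
# Sketch only: Python 3 syntax; apply(f, args) is spelled f(*args) here.
def method1(instance, a):
    # Stand-in body: return what was received so the call can be inspected.
    return (instance, a)

instance = "aninstance"          # stand-in for the real instance object
args = (10,)                     # what *args collects in __call__

call_args = (instance,) + args   # note the trailing comma in (instance,)
result = method1(*call_args)     # same as method1(instance, 10)
print(call_args, result)
```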

<P>From here on, things should come together quite easily. The output
of the example code is something like this:

<PRE>
calling &lt;function method1 at ae8d8&gt; for &lt;Instance instance at 95ab0&gt; with (10,)
calling &lt;function method2 at ae900&gt; for &lt;Instance instance at 95ab0&gt; with ()
the answer is 10
</PRE>
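<P>For readers on a current interpreter, here is a hedged Python 3
port of the tracing example. The differences are assumptions of this
sketch and not part of the original: the raise statement uses call
syntax, apply() is replaced by argument unpacking, and trace lines go
into a hypothetical list <code>trace_log</code> instead of being
printed, so that the result is easy to inspect. It also relies on
Python 3 still invoking the class of the base object, here Tracing,
to create MyTracedClass.

```python
# Sketch only: a Python 3 port; trace_log is a hypothetical addition.
import types

trace_log = []

class Tracing:
    def __init__(self, name, bases, namespace):
        """Create a new class."""
        self.__name__ = name
        self.__bases__ = bases
        self.__namespace__ = namespace
    def __call__(self):
        """Create a new instance."""
        return Instance(self)

class Instance:
    def __init__(self, klass):
        self.__klass__ = klass
    def __getattr__(self, name):
        # Only reached when normal instance lookup fails.
        try:
            value = self.__klass__.__namespace__[name]
        except KeyError:
            raise AttributeError(name)
        if type(value) is not types.FunctionType:
            return value
        return BoundMethod(value, self)

class BoundMethod:
    def __init__(self, function, instance):
        self.function = function
        self.instance = instance
    def __call__(self, *args):
        trace_log.append("calling %s with %r" % (self.function.__name__, args))
        return self.function(self.instance, *args)

Trace = Tracing('Trace', (), {})

class MyTracedClass(Trace):
    def method1(self, a):
        self.a = a
    def method2(self):
        return self.a

aninstance = MyTracedClass()
aninstance.method1(10)
answer = aninstance.method2()
print(trace_log, answer)
```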

<P>That was about the shortest meaningful example that I could come up
with. A real tracing metaclass (for example, <A
HREF="#Trace">Trace.py</A> discussed below) needs to be more
complicated in two dimensions.

<P>First, it needs to support more advanced Python features such as
class variables, inheritance, __init__ methods, and keyword arguments.

<P>Second, it needs to provide a more flexible way to handle the
actual tracing information; perhaps it should be possible to write
your own tracing function that gets called, perhaps it should be
possible to enable and disable tracing on a per-class or per-instance
basis, and perhaps a filter so that only interesting calls are traced;
it should also be able to trace the return value of the call (or the
exception it raised if an error occurs). Even the Trace.py example
doesn't support all these features yet.
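<P>As an illustration of the second dimension only, the bound-method
wrapper could report calls, return values and exceptions to a
user-supplied tracing function. This is a sketch in Python 3 syntax
(an assumption) and is not how Trace.py works; the names
<code>TracedCall</code>, <code>record</code> and <code>events</code>
are hypothetical.

```python
# Sketch only: Python 3 syntax; TracedCall, record and events are hypothetical.
class TracedCall:
    """A bound-method wrapper that reports to a pluggable tracer."""
    def __init__(self, function, instance, tracer):
        self.function = function
        self.instance = instance
        self.tracer = tracer
    def __call__(self, *args, **kwargs):
        name = self.function.__name__
        self.tracer("call", name, (args, kwargs))
        try:
            result = self.function(self.instance, *args, **kwargs)
        except Exception as exc:
            self.tracer("raise", name, exc)     # report, then re-raise
            raise
        self.tracer("return", name, result)
        return result

events = []

def record(kind, name, detail):
    """A user-written tracing function: here it just collects events."""
    events.append((kind, name, detail))

class Dummy:
    pass

def halve(self, n):
    return n // 2

def fail(self):
    raise ValueError("boom")

obj = Dummy()
half = TracedCall(halve, obj, record)(10)
try:
    TracedCall(fail, obj, record)()
except ValueError:
    pass

kinds = [kind for kind, name, detail in events]
print(half, kinds)
```

Passing a different tracer per class or per instance would give the
per-class or per-instance control mentioned above; filtering could be
done inside the tracer itself.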

<P>

<HR>

<H2>Real-life Examples</H2>

<P>Have a look at some very preliminary examples that I coded up to
teach myself how to write metaclasses:

<DL>

<DT><A HREF="Enum.py">Enum.py</A>

<DD>This (ab)uses the class syntax as an elegant way to define
enumerated types. The resulting classes are never instantiated --
rather, their class attributes are the enumerated values. For
example:

<PRE>
class Color(Enum):
    red = 1
    green = 2
    blue = 3
print Color.red
</PRE>

will print the string ``Color.red'', while ``Color.red==1'' is true,
and ``Color.red + 1'' raises a TypeError exception.

<P>

<DT><A NAME=Trace></A><A HREF="Trace.py">Trace.py</A>

<DD>The resulting classes work much like standard
classes, but by setting a special class or instance attribute
__trace_output__ to point to a file, all calls to the class's methods
are traced. It was a bit of a struggle to get this right. This
should probably be redone using the generic metaclass below.

<P>

<DT><A HREF="Meta.py">Meta.py</A>

<DD>A generic metaclass. This is an attempt at finding out how much
standard class behavior can be mimicked by a metaclass. The
preliminary answer appears to be that everything's fine as long as the
class (or its clients) doesn't look at the instance's __class__
attribute, nor at the class's __dict__ attribute. The use of
__getattr__ internally makes the classic implementation of __getattr__
hooks tough; we provide a similar hook _getattr_ instead.
(__setattr__ and __delattr__ are not affected.)
(XXX Hm. Could detect presence of __getattr__ and rename it.)

<P>

<DT><A HREF="Eiffel.py">Eiffel.py</A>

<DD>Uses the above generic metaclass to implement Eiffel style
pre-conditions and post-conditions.

<P>

<DT><A HREF="Synch.py">Synch.py</A>

<DD>Uses the above generic metaclass to implement synchronized
methods.

<P>

<DT><A HREF="Simple.py">Simple.py</A>

<DD>The example module used above.

<P>

</DL>
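<P>The source of Enum.py is not reproduced in this document, so the
following is only a guess at how the behavior described in the Enum.py
entry above could be obtained. It is a sketch in Python 3 syntax (an
assumption), with hypothetical names <code>EnumMeta</code> and
<code>EnumValue</code>; the real module may work quite differently.

```python
# Sketch only: Python 3 syntax; EnumMeta and EnumValue are hypothetical.
class EnumValue:
    """Wraps an integer so that it prints as ClassName.attribute."""
    def __init__(self, classname, name, value):
        self._classname = classname
        self._name = name
        self._value = value
    def __repr__(self):
        return "%s.%s" % (self._classname, self._name)
    __str__ = __repr__
    def __eq__(self, other):
        if isinstance(other, EnumValue):
            return self._value == other._value
        return self._value == other
    def __hash__(self):
        return hash(self._value)
    # No __add__ is defined, so Color.red + 1 raises TypeError.

class EnumMeta(type):
    def __new__(meta, name, bases, namespace):
        wrapped = {}
        for key, value in namespace.items():
            # Wrap public integer attributes; leave everything else alone.
            if isinstance(value, int) and not key.startswith("_"):
                value = EnumValue(name, key, value)
            wrapped[key] = value
        return super().__new__(meta, name, bases, wrapped)

class Enum(metaclass=EnumMeta):
    pass

class Color(Enum):
    red = 1
    green = 2
    blue = 3

print(Color.red, Color.red == 1)
```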

<P>A pattern seems to be emerging: almost all these uses of
metaclasses (except for Enum, which is probably more cute than useful)
mostly work by placing wrappers around method calls. An obvious
problem with that is that it's not easy to combine the features of
different metaclasses, while this would actually be quite useful: for
example, I wouldn't mind getting a trace from the test run of the
Synch module, and it would be interesting to add preconditions to it
as well. This needs more research. Perhaps a metaclass could be
provided that allows stackable wrappers...

<P>

<HR>

<H2>Things You Could Do With Metaclasses</H2>

<P>There are lots of things you could do with metaclasses. Most of
these can also be done with creative use of __getattr__, but
metaclasses make it easier to modify the attribute lookup behavior of
classes. Here's a partial list.

<P>

<UL>

<LI>Enforce different inheritance semantics, e.g. automatically call
base class methods when a derived class overrides them<P>

<LI>Implement class methods (e.g. if the first argument is not named
'self')<P>

<LI>Implement that each instance is initialized with <b>copies</b> of
all class variables<P>

<LI>Implement a different way to store instance variables (e.g. in a
list kept outside the instance but indexed by the instance's id())<P>

<LI>Automatically wrap or trap all or certain methods

<UL>

<LI>for tracing

<LI>for precondition and postcondition checking

<LI>for synchronized methods

<LI>for automatic value caching

</UL>
<P>

<LI>When an attribute is a parameterless function, call it on
reference (to mimic it being an instance variable); same on assignment<P>

<LI>Instrumentation: see how many times various attributes are used<P>

<LI>Different semantics for __setattr__ and __getattr__ (e.g. disable
them when they are being used recursively)<P>

<LI>Abuse class syntax for other things<P>

<LI>Experiment with automatic type checking<P>

<LI>Delegation (or acquisition)<P>

<LI>Dynamic inheritance patterns<P>

<LI>Automatic caching of methods<P>

</UL>
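<P>To make one item from the list above concrete, here is a sketch of
the instrumentation idea, restricted to counting method calls. It is
written in Python 3 syntax with a metaclass derived from
<code>type</code> (both assumptions; neither is available in 1.5), and
the names <code>CountingMeta</code>, <code>call_counts</code> and
<code>Account</code> are hypothetical.

```python
# Sketch only: Python 3 syntax; all names here are hypothetical.
import functools
import types

class CountingMeta(type):
    def __new__(meta, name, bases, namespace):
        counts = {}
        wrapped = dict(namespace)
        for key, value in namespace.items():
            # Wrap every plain function defined in the class statement.
            if isinstance(value, types.FunctionType):
                wrapped[key] = meta._count_calls(key, value, counts)
        cls = super().__new__(meta, name, bases, wrapped)
        cls.call_counts = counts
        return cls

    @staticmethod
    def _count_calls(key, function, counts):
        counts[key] = 0
        @functools.wraps(function)
        def wrapper(*args, **kwargs):
            counts[key] += 1
            return function(*args, **kwargs)
        return wrapper

class Account(metaclass=CountingMeta):
    def __init__(self):
        self.balance = 0
    def deposit(self, amount):
        self.balance += amount

acct = Account()
acct.deposit(5)
acct.deposit(7)
print(acct.balance, Account.call_counts)
```

The same wrapping loop could install tracing, precondition checks or
caching instead of a counter.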

<P>

<HR>

<H4>Credits</H4>

<P>Many thanks to David Ascher and Donald Beaudry for their comments
on an earlier draft of this paper. Also thanks to Matt Conway and Tommy
Burnette for putting a seed for the idea of metaclasses in my
mind, nearly three years ago, even though at the time my response was
``you can do that with __getattr__ hooks...'' :-)

<P>

<HR>

</BODY>

</HTML>