Major speedup for instance creation of new-style classes. Turns out
there was some trampolining going on with the tp_new descriptor: the
inherited PyType_GenericNew was overwritten with the much slower
slot_tp_new, which would end up calling tp_new_wrapper, which would
eventually call PyType_GenericNew. Add a special case for this to
update_one_slot().
XXX Hope there isn't a loophole in this. I'll buy the first person to
point out a bug in the reasoning a beer.
Backport candidate (but I won't do it).
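
To make the trampolining above concrete, here is a toy model in plain C.
It is only a sketch: toy_type, generic_new, new_wrapper, slot_new,
update_new_slot and count_hops are hypothetical stand-ins invented for
illustration, not the real CPython structures or functions, and the
model assumes single inheritance. It counts how many C-level calls one
instance creation costs with and without the special case.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names that mimic the shape of the
   type machinery, not the real CPython definitions. */
typedef struct toy_type toy_type;
typedef void *(*toy_newfunc)(toy_type *);

struct toy_type {
    const char *name;
    toy_type *base;       /* single inheritance is enough here */
    toy_newfunc tp_new;   /* the C-level slot */
    int is_static;        /* 1 for a built-in (static) type */
    int defines_new;      /* 1 if the class body defines __new__ */
};

static int hops;          /* C-level calls made by one creation */

/* Stand-in for PyType_GenericNew: the cheap allocator. */
static void *generic_new(toy_type *t)
{
    hops++;
    return t;
}

/* Stand-in for tp_new_wrapper: find the nearest static base and
   call its tp_new. */
static void *new_wrapper(toy_type *t)
{
    toy_type *s = t;
    hops++;
    while (s != NULL && !s->is_static)
        s = s->base;
    assert(s != NULL);
    return s->tp_new(t);
}

/* Stand-in for slot_tp_new: look up __new__ and call it. A
   user-defined __new__ would run here; in this toy nobody defines
   one, so the lookup always lands on the wrapper. */
static void *slot_new(toy_type *t)
{
    hops++;
    return new_wrapper(t);
}

/* Stand-in for the __new__ handling in update_one_slot(). With the
   special case enabled, a heap type whose __new__ is just the
   inherited wrapper reuses its base's tp_new directly. */
static void update_new_slot(toy_type *t, int use_special_case)
{
    toy_type *k;
    int overridden = 0;
    assert(t->base != NULL);
    for (k = t; k != NULL; k = k->base) {
        if (k->defines_new) {
            overridden = 1;
            break;
        }
    }
    if (use_special_case && !overridden)
        t->tp_new = t->base->tp_new;
    else
        t->tp_new = slot_new;
}

/* Create one instance and report how many calls it took. */
static int count_hops(toy_type *t)
{
    hops = 0;
    t->tp_new(t);
    return hops;
}
```

In this model, a heap type C deriving from a static root whose tp_new is
generic_new goes through slot_new, new_wrapper and generic_new when the
special case is off, and straight to generic_new when it is on.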
diff --git a/Objects/typeobject.c b/Objects/typeobject.c
index f46734b..020cbf2 100644
--- a/Objects/typeobject.c
+++ b/Objects/typeobject.c
@@ -4081,6 +4081,28 @@
 					use_generic = 1;
 			}
 		}
+		else if (descr->ob_type == &PyCFunction_Type &&
+			 PyCFunction_GET_FUNCTION(descr) ==
+			 (PyCFunction)tp_new_wrapper &&
+			 strcmp(p->name, "__new__") == 0)
+		{
+			/* The __new__ wrapper is not a wrapper descriptor,
+			   so must be special-cased differently.
+			   If we don't do this, creating an instance will
+			   always use slot_tp_new which will look up
+			   __new__ in the MRO which will call tp_new_wrapper
+			   which will look through the base classes looking
+			   for a static base and call its tp_new (usually
+			   PyType_GenericNew), after performing various
+			   sanity checks and constructing a new argument
+			   list.  Cut all that nonsense short -- this speeds
+			   up instance creation tremendously. */
+			specific = type->tp_new;
+			/* XXX I'm not 100% sure that there isn't a hole
+			   in this reasoning that requires additional
+			   sanity checks.  I'll buy the first person to
+			   point out a bug in this reasoning a beer. */
+		}
 		else {
 			use_generic = 1;
 			generic = p->function;