Guido suggests, and I agree, that we insist SIZEOF_VOID_P be a power of 2.
This simplifies the rounding in _PyObject_VAR_SIZE, lets us restore the
pre-rounding calling sequence, and allows some nice little simplifications
in its callers.  I'm still making it return a size_t, though.
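
For illustration, here is a minimal sketch of why the power-of-2
requirement helps.  The names round_up_pow2 and var_size are hypothetical
and are not code from this patch.  When the alignment is a power of 2,
rounding up is just an add and a mask, so the computation can be a plain
expression that yields a size_t instead of a statement that assigns to an
output argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch: round n up to the next
   multiple of align.  The mask trick is only valid when align is a
   power of 2, which is why the precondition is asserted. */
static size_t
round_up_pow2(size_t n, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (n + align - 1) & ~(align - 1);
}

/* Hypothetical stand-in for the size computation: the basic size plus
   nitems items, rounded up to pointer alignment. */
static size_t
var_size(size_t basicsize, size_t itemsize, size_t nitems)
{
	return round_up_pow2(basicsize + nitems * itemsize, sizeof(void *));
}
```

Because the result is an ordinary value, a caller can initialize a const
local from it in one line, as the PyType_GenericAlloc hunk does.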
diff --git a/Objects/typeobject.c b/Objects/typeobject.c
index 0342e71..0ec8175 100644
--- a/Objects/typeobject.c
+++ b/Objects/typeobject.c
@@ -191,9 +191,7 @@
 PyType_GenericAlloc(PyTypeObject *type, int nitems)
 {
 	PyObject *obj;
-	size_t size;
-
-	_PyObject_VAR_SIZE(size, type, nitems);
+	const size_t size = _PyObject_VAR_SIZE(type, nitems);
 
 	if (PyType_IS_GC(type))
 		obj = _PyObject_GC_Malloc(type, nitems);