_PyObject_VAR_SIZE:  always round up to a multiple-of-pointer-size value.
As Guido suggested, this makes the new subclassing code substantially
simpler.  But the mechanics of doing it w/ C macro semantics are a mess,
and _PyObject_VAR_SIZE has a new calling sequence now.
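
For illustration only, here is a minimal sketch of the rounding idea and of
a statement-style macro that stores its result in its first argument, as the
new _PyObject_VAR_SIZE(size, type, nitems) call in the patch does.  The names
SKETCH_PTRSIZE, SKETCH_VAR_SIZE and sketch_var_size are hypothetical, and the
basicsize/itemsize parameters stand in for the type's fields; this is not the
actual macro from object.h.

```c
#include <stddef.h>

/* Hypothetical stand-in for the pointer size used as the rounding unit. */
#define SKETCH_PTRSIZE (sizeof(void *))

/* Statement-style macro (assumed shape, not the CPython definition):
   computes basicsize + nitems * itemsize, rounds it up to the next
   multiple of the pointer size, and stores it into `result`. */
#define SKETCH_VAR_SIZE(result, basicsize, itemsize, nitems)             \
    do {                                                                 \
        size_t sketch_raw_ = (size_t)(basicsize) +                       \
                             (size_t)(nitems) * (size_t)(itemsize);      \
        size_t sketch_rem_ = sketch_raw_ % SKETCH_PTRSIZE;               \
        (result) = sketch_rem_                                           \
                       ? sketch_raw_ + (SKETCH_PTRSIZE - sketch_rem_)    \
                       : sketch_raw_;                                    \
    } while (0)

/* Function wrapper so the macro can be used where a value is needed. */
static size_t
sketch_var_size(size_t basicsize, size_t itemsize, size_t nitems)
{
    size_t size;
    SKETCH_VAR_SIZE(size, basicsize, itemsize, nitems);
    return size;
}
```

Because the macro is a statement rather than an expression, a caller declares
the size variable first and then invokes the macro, which is the shape of the
change to PyType_GenericAlloc in the diff.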

Question:  The PyObject_NEW_VAR macro appears to be part of the public API.
Regardless of what it expands to, the notion that it has to round up the
memory it allocates is new, and extensions containing the old
_PyObject_VAR_SIZE macro expansion (which was embedded in the
PyObject_NEW_VAR expansion) won't do this rounding.  But the rounding
isn't actually *needed* except for new-style instances with dict pointers
after a variable-length blob of embedded data.  So my guess is that we do
not need to bump the API version for this (as the rounding isn't needed
for anything an extension can do unless it's recompiled anyway).  What's
your guess?
diff --git a/Objects/typeobject.c b/Objects/typeobject.c
index 59ec588..0342e71 100644
--- a/Objects/typeobject.c
+++ b/Objects/typeobject.c
@@ -190,28 +190,13 @@
 PyObject *
 PyType_GenericAlloc(PyTypeObject *type, int nitems)
 {
-#define PTRSIZE (sizeof(PyObject *))
-
-	size_t size = (size_t)_PyObject_VAR_SIZE(type, nitems);
-	size_t padding = 0;
 	PyObject *obj;
+	size_t size;
 
-	/* Round up size, if necessary, so that the __dict__ pointer
-	   following the variable part is properly aligned for the platform.
-	   This is needed only for types with a vrbl number of items
-	   before the __dict__ pointer == types that record the dict offset
-	   as a negative offset from the end of the object.  If tp_dictoffset
-	   is 0, there is no __dict__; if positive, tp_dict was declared in a C
-	   struct so the compiler already took care of aligning it. */
-        if (type->tp_dictoffset < 0) {
-		padding = PTRSIZE - size % PTRSIZE;
-		if (padding == PTRSIZE)
-			padding = 0;
-		size += padding;
-	}
+	_PyObject_VAR_SIZE(size, type, nitems);
 
 	if (PyType_IS_GC(type))
-		obj = _PyObject_GC_Malloc(type, nitems, padding);
+		obj = _PyObject_GC_Malloc(type, nitems);
 	else
 		obj = PyObject_MALLOC(size);