Restore dicts' tp_compare slot, and change dict_richcompare to say it
doesn't know how to do LE, LT, GE, GT.  dict_richcompare can't do
those any faster than dict_compare can.  More importantly, for
cmp(dict1, dict2), Python *first* tries rich compares with EQ, LT, and
GT one at a time, even if the tp_compare slot is defined, and
dict_richcompare called dict_compare for the latter two because
it couldn't do them itself.  The result was a lot of wasted calls to
dict_compare.  Now dict_richcompare returns Py_NotImplemented at once when
Python calls it with LT or GT from try_rich_to_3way_compare(), and dict_compare
is called only once (when Python gets around to trying the tp_compare
slot).
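
For illustration only, the call pattern described above can be modeled
roughly as follows.  This is a simplified sketch, not the interpreter's
actual code:  plain ints stand in for dicts, and every name in it
(model_cmp, model_richcompare, model_compare, compare_calls, the
old_behavior flag) is hypothetical.  It assumes that the rich-compare
attempts run in the order EQ, LT, GT and that the 3-way compare is tried
only after all of them fail.

```c
#include <assert.h>

/* Simplified model only: every name here is hypothetical, and plain ints
 * stand in for dicts.  This is not CPython's actual code. */
enum model_op { MODEL_EQ, MODEL_LT, MODEL_GT };

#define MODEL_NOT_IMPLEMENTED (-1)

int compare_calls = 0;  /* number of calls to the expensive 3-way compare */

/* Stand-in for dict_compare: returns -1, 0, or 1. */
int model_compare(int a, int b)
{
    compare_calls++;
    return (a > b) - (a < b);
}

/* Stand-in for dict_richcompare.  EQ is answered directly; for LT and GT
 * the old behavior delegates to model_compare, the new one gives up. */
int model_richcompare(int a, int b, enum model_op op, int old_behavior)
{
    int c;
    assert(op == MODEL_EQ || op == MODEL_LT || op == MODEL_GT);
    if (op == MODEL_EQ)
        return a == b;
    if (!old_behavior)
        return MODEL_NOT_IMPLEMENTED;
    c = model_compare(a, b);
    return op == MODEL_LT ? c < 0 : c > 0;
}

/* Stand-in for cmp(): try EQ, LT, GT rich compares in turn, then fall
 * back to the 3-way compare.  With ints and old_behavior set, one of the
 * three attempts always succeeds, so the fallback is reached only in the
 * new behavior. */
int model_cmp(int a, int b, int old_behavior)
{
    static const struct { enum model_op op; int outcome; } tries[3] = {
        {MODEL_EQ, 0}, {MODEL_LT, -1}, {MODEL_GT, 1},
    };
    int i;
    for (i = 0; i < 3; i++) {
        if (model_richcompare(a, b, tries[i].op, old_behavior) == 1)
            return tries[i].outcome;
    }
    return model_compare(a, b);
}
```

In this model, model_cmp(2, 1, 1) reaches model_compare twice (once for
LT, once for GT), while model_cmp(2, 1, 0) reaches it once, through the
fallback.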

Continued mystery:  although this cut the number of calls to
dict_compare roughly in half in test_mutants.py, that test still
runs amazingly slowly.  Running under the debugger doesn't show excessive
activity in the dict comparison code anymore, so I'm guessing the culprit
is somewhere else -- but where?  Perhaps in the element (key/value)
comparison code?  We clearly spend a lot of time figuring out how to
compare things.
diff --git a/Objects/dictobject.c b/Objects/dictobject.c
index b699153..9b5b9f4 100644
--- a/Objects/dictobject.c
+++ b/Objects/dictobject.c
@@ -1160,20 +1160,8 @@
 			return NULL;
 		res = (cmp == (op == Py_EQ)) ? Py_True : Py_False;
 	}
-	else {
-		cmp = dict_compare((dictobject *)v, (dictobject *)w);
-		if (cmp < 0 && PyErr_Occurred())
-			return NULL;
-		switch (op) {
-			case Py_LT: cmp = cmp <  0; break;
-			case Py_LE: cmp = cmp <= 0; break;
-			case Py_GT: cmp = cmp >  0; break;
-			case Py_GE: cmp = cmp >= 0; break;
-			default:
-				assert(!"op unexpected");
-		}
-		res = cmp ? Py_True : Py_False;
-	}
+	else
+		res = Py_NotImplemented;
 	Py_INCREF(res);
 	return res;
  }
@@ -1541,7 +1529,7 @@
 	(printfunc)dict_print,			/* tp_print */
 	(getattrfunc)dict_getattr,		/* tp_getattr */
 	0,					/* tp_setattr */
-	0,					/* tp_compare */
+	(cmpfunc)dict_compare,			/* tp_compare */
 	(reprfunc)dict_repr,			/* tp_repr */
 	0,					/* tp_as_number */
 	&dict_as_sequence,			/* tp_as_sequence */