One last tweak to the tracing machinery: this now computes what I intended
all along.  Before, instr_lb tended to be too high.

I don't think this actually makes any difference, given what the compiler
produces, but it makes me a bit happier.
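
To illustrate the difference, here is a minimal standalone sketch of the
scan loop, not the code in ceval.c.  The names scan_lnotab and struct
bounds are hypothetical, the table is assumed to be pairs of
(address increment, line increment) bytes, and new_lb starts at 0 here
purely for illustration.  It records both the old lower bound (the last
address reached) and the new one (the last address at which the line
number actually changed).

```c
/* Hypothetical illustration only; these names do not exist in ceval.c. */
struct bounds {
    int line;    /* line number reached by the scan */
    int old_lb;  /* lower bound as computed before this change */
    int new_lb;  /* lower bound as computed after this change */
};

/* Walk `size` (addr_incr, line_incr) byte pairs, stopping before the
 * first pair that would move past `lasti`. */
struct bounds
scan_lnotab(const unsigned char *p, int size, int first_line, int lasti)
{
    struct bounds b;
    int addr = 0;
    int line = first_line;

    b.new_lb = 0;  /* assumption: start of code if no line change is seen */
    while (size > 0) {
        if (addr + p[0] > lasti)
            break;
        addr += p[0];
        if (p[1])              /* only a real line change moves the bound */
            b.new_lb = addr;
        line += p[1];
        p += 2;
        --size;
    }
    b.line = line;
    b.old_lb = addr;           /* old behaviour: wherever the scan stopped */
    return b;
}
```

With a made-up table of (6,1) (255,0) (10,1), a first line of 1 and
lasti = 263, the scan stops at address 261 on line 2.  The old bound is
261, while the new bound is 6, the address where line 2 really begins.
The zero line increment is chosen only to make the two results differ.
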
diff --git a/Python/ceval.c b/Python/ceval.c
index 4930433..afc480e 100644
--- a/Python/ceval.c
+++ b/Python/ceval.c
@@ -2966,15 +2966,17 @@
 			if (addr + *p > frame->f_lasti)
 				break;
 			addr += *p++;
+			if (*p) *instr_lb = addr;
 			line += *p++;
 			--size;
 		}
+
 		if (addr == frame->f_lasti) {
 			frame->f_lineno = line;
 			call_trace(func, obj, frame, 
 				   PyTrace_LINE, Py_None);
 		}
-		*instr_lb = addr;
+
 		if (size > 0) {
 			while (--size >= 0) {
 				addr += *p++;