x_mul():  Made life easier for C optimizers in the "grade school"
algorithm.  MSVC 6 wasn't impressed <wink>.
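
For illustration only, here is a minimal sketch of the kind of change this
makes (simplified types and a standalone school_mul() helper invented for
this note; it is not the actual longobject.c code, which is in the diff
below):  the point is to hoist the z->ob_digit + i address into a local
pointer so the compiler needn't recompute an indexed address on every trip
around the inner loops.

    /* Sketch only:  "digit" is a 15-bit value stored in 16 bits, as in
     * CPython;  z must hold size_a + size_b digits, all initially 0. */
    #include <stdint.h>

    typedef uint16_t digit;
    typedef uint32_t twodigits;
    #define SHIFT 15
    #define MASK  ((1U << SHIFT) - 1)

    static void
    school_mul(digit *z, const digit *a, int size_a,
               const digit *b, int size_b)
    {
        int i;
        for (i = 0; i < size_a; ++i) {
            twodigits carry = 0;
            twodigits f = a[i];
            digit *pz = z + i;    /* hoisted; the old code indexed z[i+j] */
            int j;

            /* multiply-and-add row i of the "grade school" tableau */
            for (j = 0; j < size_b; ++j) {
                carry += *pz + b[j] * f;
                *pz++ = (digit)(carry & MASK);
                carry >>= SHIFT;
            }
            /* propagate any leftover carry into higher digits */
            for (; carry != 0; ++j) {
                carry += *pz;
                *pz++ = (digit)(carry & MASK);
                carry >>= SHIFT;
            }
        }
    }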

Something odd:  the x_mul algorithm appears to get substantially worse
than quadratic time as the inputs grow larger:

bits in each input   x_mul time   k_mul time
------------------   ----------   ----------
             15360         0.01         0.00
             30720         0.04         0.01
             61440         0.16         0.04
            122880         0.64         0.14
            245760         2.56         0.40
            491520        10.76         1.23
            983040        71.28         3.69
           1966080       459.31        11.07

That is, x_mul is perfectly quadratic-time (each doubling of the input
size multiplies the time by 4) until a little burp at the 2.56 -> 10.76
step, and after that it goes to hell in a hurry.  Under Karatsuba,
doubling the input size "should" take 3 times longer instead of 4, and
that remains the case throughout this range.  I conclude that my "be
nice to the cache" reworkings of k_mul() are paying off.
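
For reference, the expected growth recurrences (T_school and T_kara are
just labels for this note, not names in the code):

    T_school(2n)  =  4 * T_school(n)   =>  T_school(n) = Theta(n^2)
    T_kara(2n)   ~=  3 * T_kara(n)     =>  T_kara(n)  ~= Theta(n^1.585)
                                            (exponent is log2(3))

The k_mul column tracks the factor-of-3 prediction exactly:
1.23 * 3 = 3.69 and 3.69 * 3 = 11.07.
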
diff --git a/Objects/longobject.c b/Objects/longobject.c
index 0eefc90..0801e64 100644
--- a/Objects/longobject.c
+++ b/Objects/longobject.c
@@ -1539,20 +1539,21 @@
 		twodigits carry = 0;
 		twodigits f = a->ob_digit[i];
 		int j;
+		digit *pz = z->ob_digit + i;
 
 		SIGCHECK({
 			Py_DECREF(z);
 			return NULL;
 		})
 		for (j = 0; j < size_b; ++j) {
-			carry += z->ob_digit[i+j] + b->ob_digit[j] * f;
-			z->ob_digit[i+j] = (digit) (carry & MASK);
+			carry += *pz + b->ob_digit[j] * f;
+			*pz++ = (digit) (carry & MASK);
 			carry >>= SHIFT;
 		}
 		for (; carry != 0; ++j) {
 			assert(i+j < z->ob_size);
-			carry += z->ob_digit[i+j];
-			z->ob_digit[i+j] = (digit) (carry & MASK);
+			carry += *pz;
+			*pz++ = (digit) (carry & MASK);
 			carry >>= SHIFT;
 		}
 	}