bpo-41513: Save unnecessary steps in the hypot() calculation (#21994)

commit: 27de28607a248e5ffb8838162fca466a58c2e284 [log] [tgz]
author: Raymond Hettinger <rhettinger@users.noreply.github.com> Sat Aug 29 09:11:04 2020 -0700
committer: GitHub <noreply@github.com> Sat Aug 29 09:11:04 2020 -0700
tree: 6670673aaddeae849604ebeed2931960b0606f77
parent: e6dcd371b2c54a94584dd124e8c592a496d46a47 [diff] [blame]
diff --git a/Modules/mathmodule.c b/Modules/mathmodule.c
index 1704d8e..4ff2a06 100644
--- a/Modules/mathmodule.c
+++ b/Modules/mathmodule.c

@@ -2447,6 +2447,14 @@
 never goes above 1.0.  And when values at or below 1.0 are squared,
 they remain at or below 1.0, thus preserving the summation invariant.
 
+Another interesting assertion is that csum+lo*lo == csum. In the loop,
+each scaled vector element has a magnitude less than 1.0.  After the
+Veltkamp split, *lo* has a maximum value of 2**-27.  So the maximum
+value of *lo* squared is 2**-54.  The value of ulp(1.0)/2.0 is 2**-53.
+Given that csum >= 1.0, we have:
+    lo**2 <= 2**-54 < 2**-53 == 1/2*ulp(1.0) <= ulp(csum)/2
+Since lo**2 is less than 1/2 ulp(csum), we have csum+lo*lo == csum.
+
 The square root differential correction is needed because a
 correctly rounded square root of a correctly rounded sum of
 squares can still be off by as much as one ulp.
@@ -2519,11 +2527,8 @@
             csum += x;
             frac += (oldcsum - csum) + x;
 
-            x = lo * lo;
-            assert(fabs(csum) >= fabs(x));
-            oldcsum = csum;
-            csum += x;
-            frac += (oldcsum - csum) + x;
+            assert(csum + lo * lo == csum);
+            frac += lo * lo;
         }
         h = sqrt(csum - 1.0 + frac);
commit	27de28607a248e5ffb8838162fca466a58c2e284	[log] [tgz]
author	Raymond Hettinger <rhettinger@users.noreply.github.com>	Sat Aug 29 09:11:04 2020 -0700
committer	GitHub <noreply@github.com>	Sat Aug 29 09:11:04 2020 -0700
tree	6670673aaddeae849604ebeed2931960b0606f77
parent	e6dcd371b2c54a94584dd124e8c592a496d46a47 [diff] [blame]