Implement an old idea of Christian Tismer's: use polynomial division
instead of multiplication to generate the probe sequence. The idea is
recorded in Python-Dev for Dec 2000, but that version is prone to rare
infinite loops.
The value is in getting *all* the bits of the hash code to participate;
and, e.g., this speeds up querying every key in a dict with keys
[i << 16 for i in range(20000)] by a factor of 500. Should be equally
valuable in any bad case where the high-order hash bits were getting
ignored.
Also wrote up some of the motivations behind Python's ever-more-subtle
hash table strategy.
diff --git a/Misc/NEWS b/Misc/NEWS
index 3a15837..b960117 100644
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -116,6 +116,14 @@
to crash if the element comparison routines for the dict keys and/or
values mutated the dicts. Making the code bulletproof slowed it down.
+- Collisions in dicts now use polynomial division instead of multiplication
+ to generate the probe sequence, following an idea of Christian Tismer's.
+ This allows all bits of the hash code to come into play. It should have
+ little or no effect on speed in ordinary cases, but can help dramatically
+ in bad cases. For example, looking up every key in a dict d with
+ d.keys() = [i << 16 for i in range(20000)] is approximately 500x faster
+ now.
+
Library
- calendar.py uses month and day names based on the current locale.