blob: f34ee175ba657433ce5c2bce266cb0fd3d28c400 [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001:mod:`bisect` --- Array bisection algorithm
2===========================================
3
4.. module:: bisect
5 :synopsis: Array bisection algorithms for binary searching.
6.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
Raymond Hettinger20933e02010-09-01 06:58:25 +00007.. sectionauthor:: Raymond Hettinger <python at rcn.com>
Christian Heimes5b5e81c2007-12-31 16:14:33 +00008.. example based on the PyModules FAQ entry by Aaron Watters <arw@pythonpros.com>
Georg Brandl116aa622007-08-15 14:28:22 +00009
Raymond Hettinger10480942011-01-10 03:26:08 +000010**Source code:** :source:`Lib/bisect.py`
11
Raymond Hettinger4f707fd2011-01-10 19:54:11 +000012--------------
13
Georg Brandl116aa622007-08-15 14:28:22 +000014This module provides support for maintaining a list in sorted order without
15having to sort the list after each insertion. For long lists of items with
16expensive comparison operations, this can be an improvement over the more common
17approach. The module is called :mod:`bisect` because it uses a basic bisection
18algorithm to do its work. The source code may be most useful as a working
19example of the algorithm (the boundary conditions are already right!).
20
21The following functions are provided:
22
23
Raymond Hettinger871934d2020-10-19 22:04:01 -070024.. function:: bisect_left(a, x, lo=0, hi=len(a), *, key=None)
Georg Brandl116aa622007-08-15 14:28:22 +000025
Raymond Hettinger20933e02010-09-01 06:58:25 +000026 Locate the insertion point for *x* in *a* to maintain sorted order.
Georg Brandl0d8f0732009-04-05 22:20:44 +000027 The parameters *lo* and *hi* may be used to specify a subset of the list
28 which should be considered; by default the entire list is used. If *x* is
29 already present in *a*, the insertion point will be before (to the left of)
30 any existing entries. The return value is suitable for use as the first
Raymond Hettinger20933e02010-09-01 06:58:25 +000031 parameter to ``list.insert()`` assuming that *a* is already sorted.
Georg Brandl116aa622007-08-15 14:28:22 +000032
Raymond Hettinger20933e02010-09-01 06:58:25 +000033 The returned insertion point *i* partitions the array *a* into two halves so
Raymond Hettinger871934d2020-10-19 22:04:01 -070034 that ``all(val < x for val in a[lo : i])`` for the left side and
35 ``all(val >= x for val in a[i : hi])`` for the right side.
Georg Brandl116aa622007-08-15 14:28:22 +000036
Raymond Hettinger871934d2020-10-19 22:04:01 -070037 *key* specifies a :term:`key function` of one argument that is used to
38 extract a comparison key from each input element. The default value is
39 ``None`` (compare the elements directly).
40
41 .. versionchanged:: 3.10
42 Added the *key* parameter.
43
44
45.. function:: bisect_right(a, x, lo=0, hi=len(a), *, key=None)
Georg Brandl0d8f0732009-04-05 22:20:44 +000046 bisect(a, x, lo=0, hi=len(a))
Georg Brandl116aa622007-08-15 14:28:22 +000047
Georg Brandl0d8f0732009-04-05 22:20:44 +000048 Similar to :func:`bisect_left`, but returns an insertion point which comes
49 after (to the right of) any existing entries of *x* in *a*.
Georg Brandl116aa622007-08-15 14:28:22 +000050
Raymond Hettinger20933e02010-09-01 06:58:25 +000051 The returned insertion point *i* partitions the array *a* into two halves so
Raymond Hettinger871934d2020-10-19 22:04:01 -070052 that ``all(val <= x for val in a[lo : i])`` for the left side and
53 ``all(val > x for val in a[i : hi])`` for the right side.
Georg Brandl116aa622007-08-15 14:28:22 +000054
Raymond Hettinger871934d2020-10-19 22:04:01 -070055 *key* specifies a :term:`key function` of one argument that is used to
56 extract a comparison key from each input element. The default value is
57 ``None`` (compare the elements directly).
Georg Brandl116aa622007-08-15 14:28:22 +000058
Raymond Hettinger871934d2020-10-19 22:04:01 -070059 .. versionchanged:: 3.10
60 Added the *key* parameter.
Georg Brandl116aa622007-08-15 14:28:22 +000061
Raymond Hettinger871934d2020-10-19 22:04:01 -070062
63.. function:: insort_left(a, x, lo=0, hi=len(a), *, key=None)
64
65 Insert *x* in *a* in sorted order.
66
67 *key* specifies a :term:`key function` of one argument that is used to
68 extract a comparison key from each input element. The default value is
69 ``None`` (compare the elements directly).
70
71 This function first runs :func:`bisect_left` to locate an insertion point.
72 Next, it runs the :meth:`insert` method on *a* to insert *x* at the
73 appropriate position to maintain sort order.
74
75 Keep in mind that the ``O(log n)`` search is dominated by the slow O(n)
76 insertion step.
77
78 .. versionchanged:: 3.10
79 Added the *key* parameter.
80
81
82.. function:: insort_right(a, x, lo=0, hi=len(a), *, key=None)
Georg Brandl0d8f0732009-04-05 22:20:44 +000083 insort(a, x, lo=0, hi=len(a))
Georg Brandl116aa622007-08-15 14:28:22 +000084
Georg Brandl0d8f0732009-04-05 22:20:44 +000085 Similar to :func:`insort_left`, but inserting *x* in *a* after any existing
86 entries of *x*.
Georg Brandl116aa622007-08-15 14:28:22 +000087
Raymond Hettinger871934d2020-10-19 22:04:01 -070088 *key* specifies a :term:`key function` of one argument that is used to
89 extract a comparison key from each input element. The default value is
90 ``None`` (compare the elements directly).
91
92 This function first runs :func:`bisect_right` to locate an insertion point.
93 Next, it runs the :meth:`insert` method on *a* to insert *x* at the
94 appropriate position to maintain sort order.
95
96 Keep in mind that the ``O(log n)`` search is dominated by the slow O(n)
97 insertion step.
98
99 .. versionchanged:: 3.10
100 Added the *key* parameter.
101
102
103Performance Notes
104-----------------
105
106When writing time sensitive code using *bisect()* and *insort()*, keep these
107thoughts in mind:
108
109* Bisection is effective for searching ranges of values.
110 For locating specific values, dictionaries are more performant.
111
112* The *insort()* functions are ``O(n)`` because the logarithmic search step
113 is dominated by the linear time insertion step.
114
115* The search functions are stateless and discard key function results after
116 they are used. Consequently, if the search functions are used in a loop,
117 the key function may be called again and again on the same array elements.
118 If the key function isn't fast, consider wrapping it with
119 :func:`functools.cache` to avoid duplicate computations. Alternatively,
120 consider searching an array of precomputed keys to locate the insertion
121 point (as shown in the examples section below).
122
Raymond Hettinger20933e02010-09-01 06:58:25 +0000123.. seealso::
124
Raymond Hettinger871934d2020-10-19 22:04:01 -0700125 * `Sorted Collections
126 <http://www.grantjenks.com/docs/sortedcollections/>`_ is a high performance
127 module that uses *bisect* to managed sorted collections of data.
128
129 * The `SortedCollection recipe
130 <https://code.activestate.com/recipes/577197-sortedcollection/>`_ uses
131 bisect to build a full-featured collection class with straight-forward search
132 methods and support for a key-function. The keys are precomputed to save
133 unnecessary calls to the key function during searches.
Raymond Hettinger20933e02010-09-01 06:58:25 +0000134
Georg Brandl116aa622007-08-15 14:28:22 +0000135
Raymond Hettinger87c9d6c2010-08-07 07:36:55 +0000136Searching Sorted Lists
137----------------------
138
Raymond Hettinger20933e02010-09-01 06:58:25 +0000139The above :func:`bisect` functions are useful for finding insertion points but
140can be tricky or awkward to use for common searching tasks. The following five
Raymond Hettinger87c9d6c2010-08-07 07:36:55 +0000141functions show how to transform them into the standard lookups for sorted
142lists::
143
Raymond Hettinger20933e02010-09-01 06:58:25 +0000144 def index(a, x):
145 'Locate the leftmost value exactly equal to x'
146 i = bisect_left(a, x)
147 if i != len(a) and a[i] == x:
148 return i
149 raise ValueError
Raymond Hettinger87c9d6c2010-08-07 07:36:55 +0000150
Raymond Hettinger20933e02010-09-01 06:58:25 +0000151 def find_lt(a, x):
152 'Find rightmost value less than x'
153 i = bisect_left(a, x)
154 if i:
155 return a[i-1]
156 raise ValueError
157
158 def find_le(a, x):
159 'Find rightmost value less than or equal to x'
160 i = bisect_right(a, x)
161 if i:
162 return a[i-1]
163 raise ValueError
164
165 def find_gt(a, x):
166 'Find leftmost value greater than x'
167 i = bisect_right(a, x)
168 if i != len(a):
Raymond Hettinger87c9d6c2010-08-07 07:36:55 +0000169 return a[i]
Raymond Hettinger20933e02010-09-01 06:58:25 +0000170 raise ValueError
Raymond Hettinger87c9d6c2010-08-07 07:36:55 +0000171
Raymond Hettinger20933e02010-09-01 06:58:25 +0000172 def find_ge(a, x):
173 'Find leftmost item greater than or equal to x'
174 i = bisect_left(a, x)
175 if i != len(a):
Raymond Hettinger87c9d6c2010-08-07 07:36:55 +0000176 return a[i]
Raymond Hettinger20933e02010-09-01 06:58:25 +0000177 raise ValueError
Raymond Hettinger87c9d6c2010-08-07 07:36:55 +0000178
Raymond Hettinger87c9d6c2010-08-07 07:36:55 +0000179
Raymond Hettinger871934d2020-10-19 22:04:01 -0700180Examples
181--------
Georg Brandl116aa622007-08-15 14:28:22 +0000182
183.. _bisect-example:
184
Raymond Hettinger20933e02010-09-01 06:58:25 +0000185The :func:`bisect` function can be useful for numeric table lookups. This
186example uses :func:`bisect` to look up a letter grade for an exam score (say)
187based on a set of ordered numeric breakpoints: 90 and up is an 'A', 80 to 89 is
188a 'B', and so on::
Georg Brandl116aa622007-08-15 14:28:22 +0000189
Raymond Hettinger20933e02010-09-01 06:58:25 +0000190 >>> def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'):
191 ... i = bisect(breakpoints, score)
192 ... return grades[i]
Georg Brandl116aa622007-08-15 14:28:22 +0000193 ...
Raymond Hettinger20933e02010-09-01 06:58:25 +0000194 >>> [grade(score) for score in [33, 99, 77, 70, 89, 90, 100]]
195 ['F', 'A', 'C', 'C', 'B', 'A', 'A']
Georg Brandl116aa622007-08-15 14:28:22 +0000196
Raymond Hettinger871934d2020-10-19 22:04:01 -0700197One technique to avoid repeated calls to a key function is to search a list of
198precomputed keys to find the index of a record::
Raymond Hettingere046d2a2009-06-11 22:01:24 +0000199
200 >>> data = [('red', 5), ('blue', 1), ('yellow', 8), ('black', 0)]
Raymond Hettinger871934d2020-10-19 22:04:01 -0700201 >>> data.sort(key=lambda r: r[1]) # Or use operator.itemgetter(1).
202 >>> keys = [r[1] for r in data] # Precompute a list of keys.
Raymond Hettingere046d2a2009-06-11 22:01:24 +0000203 >>> data[bisect_left(keys, 0)]
204 ('black', 0)
205 >>> data[bisect_left(keys, 1)]
206 ('blue', 1)
207 >>> data[bisect_left(keys, 5)]
208 ('red', 5)
209 >>> data[bisect_left(keys, 8)]
210 ('yellow', 8)
Raymond Hettinger87c9d6c2010-08-07 07:36:55 +0000211