blob: ca853e0f8229d3a2b6d21d1f37bed96e1a0a0dfd [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001:mod:`bisect` --- Array bisection algorithm
2===========================================
3
4.. module:: bisect
5 :synopsis: Array bisection algorithms for binary searching.
6.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
Raymond Hettinger20933e02010-09-01 06:58:25 +00007.. sectionauthor:: Raymond Hettinger <python at rcn.com>
Christian Heimes5b5e81c2007-12-31 16:14:33 +00008.. example based on the PyModules FAQ entry by Aaron Watters <arw@pythonpros.com>
Georg Brandl116aa622007-08-15 14:28:22 +00009
10This module provides support for maintaining a list in sorted order without
11having to sort the list after each insertion. For long lists of items with
12expensive comparison operations, this can be an improvement over the more common
13approach. The module is called :mod:`bisect` because it uses a basic bisection
14algorithm to do its work. The source code may be most useful as a working
15example of the algorithm (the boundary conditions are already right!).
16
Éric Araujo6e6cb8e2010-11-16 19:13:50 +000017.. seealso::
18
19 Latest version of the :source:`bisect module Python source code
20 <Lib/bisect.py>`
21
Georg Brandl116aa622007-08-15 14:28:22 +000022The following functions are provided:
23
24
Georg Brandl0d8f0732009-04-05 22:20:44 +000025.. function:: bisect_left(a, x, lo=0, hi=len(a))
Georg Brandl116aa622007-08-15 14:28:22 +000026
Raymond Hettinger20933e02010-09-01 06:58:25 +000027 Locate the insertion point for *x* in *a* to maintain sorted order.
Georg Brandl0d8f0732009-04-05 22:20:44 +000028 The parameters *lo* and *hi* may be used to specify a subset of the list
29 which should be considered; by default the entire list is used. If *x* is
30 already present in *a*, the insertion point will be before (to the left of)
31 any existing entries. The return value is suitable for use as the first
Raymond Hettinger20933e02010-09-01 06:58:25 +000032 parameter to ``list.insert()`` assuming that *a* is already sorted.
Georg Brandl116aa622007-08-15 14:28:22 +000033
Raymond Hettinger20933e02010-09-01 06:58:25 +000034 The returned insertion point *i* partitions the array *a* into two halves so
35 that ``all(val < x for val in a[lo:i])`` for the left side and
36 ``all(val >= x for val in a[i:hi])`` for the right side.
Georg Brandl116aa622007-08-15 14:28:22 +000037
Georg Brandl0d8f0732009-04-05 22:20:44 +000038.. function:: bisect_right(a, x, lo=0, hi=len(a))
39 bisect(a, x, lo=0, hi=len(a))
Georg Brandl116aa622007-08-15 14:28:22 +000040
Georg Brandl0d8f0732009-04-05 22:20:44 +000041 Similar to :func:`bisect_left`, but returns an insertion point which comes
42 after (to the right of) any existing entries of *x* in *a*.
Georg Brandl116aa622007-08-15 14:28:22 +000043
Raymond Hettinger20933e02010-09-01 06:58:25 +000044 The returned insertion point *i* partitions the array *a* into two halves so
45 that ``all(val <= x for val in a[lo:i])`` for the left side and
46 ``all(val > x for val in a[i:hi])`` for the right side.
Georg Brandl116aa622007-08-15 14:28:22 +000047
Georg Brandl0d8f0732009-04-05 22:20:44 +000048.. function:: insort_left(a, x, lo=0, hi=len(a))
Georg Brandl116aa622007-08-15 14:28:22 +000049
Georg Brandl0d8f0732009-04-05 22:20:44 +000050 Insert *x* in *a* in sorted order. This is equivalent to
Raymond Hettinger20933e02010-09-01 06:58:25 +000051 ``a.insert(bisect.bisect_left(a, x, lo, hi), x)`` assuming that *a* is
52 already sorted. Keep in mind that the O(log n) search is dominated by
53 the slow O(n) insertion step.
Georg Brandl116aa622007-08-15 14:28:22 +000054
Georg Brandl0d8f0732009-04-05 22:20:44 +000055.. function:: insort_right(a, x, lo=0, hi=len(a))
56 insort(a, x, lo=0, hi=len(a))
Georg Brandl116aa622007-08-15 14:28:22 +000057
Georg Brandl0d8f0732009-04-05 22:20:44 +000058 Similar to :func:`insort_left`, but inserting *x* in *a* after any existing
59 entries of *x*.
Georg Brandl116aa622007-08-15 14:28:22 +000060
Raymond Hettinger20933e02010-09-01 06:58:25 +000061.. seealso::
62
63 `SortedCollection recipe
64 <http://code.activestate.com/recipes/577197-sortedcollection/>`_ that uses
65 bisect to build a full-featured collection class with straight-forward search
66 methods and support for a key-function. The keys are precomputed to save
67 unnecessary calls to the key function during searches.
68
Georg Brandl116aa622007-08-15 14:28:22 +000069
Raymond Hettinger87c9d6c2010-08-07 07:36:55 +000070Searching Sorted Lists
71----------------------
72
Raymond Hettinger20933e02010-09-01 06:58:25 +000073The above :func:`bisect` functions are useful for finding insertion points but
74can be tricky or awkward to use for common searching tasks. The following five
Raymond Hettinger87c9d6c2010-08-07 07:36:55 +000075functions show how to transform them into the standard lookups for sorted
76lists::
77
Raymond Hettinger20933e02010-09-01 06:58:25 +000078 def index(a, x):
79 'Locate the leftmost value exactly equal to x'
80 i = bisect_left(a, x)
81 if i != len(a) and a[i] == x:
82 return i
83 raise ValueError
Raymond Hettinger87c9d6c2010-08-07 07:36:55 +000084
Raymond Hettinger20933e02010-09-01 06:58:25 +000085 def find_lt(a, x):
86 'Find rightmost value less than x'
87 i = bisect_left(a, x)
88 if i:
89 return a[i-1]
90 raise ValueError
91
92 def find_le(a, x):
93 'Find rightmost value less than or equal to x'
94 i = bisect_right(a, x)
95 if i:
96 return a[i-1]
97 raise ValueError
98
99 def find_gt(a, x):
100 'Find leftmost value greater than x'
101 i = bisect_right(a, x)
102 if i != len(a):
Raymond Hettinger87c9d6c2010-08-07 07:36:55 +0000103 return a[i]
Raymond Hettinger20933e02010-09-01 06:58:25 +0000104 raise ValueError
Raymond Hettinger87c9d6c2010-08-07 07:36:55 +0000105
Raymond Hettinger20933e02010-09-01 06:58:25 +0000106 def find_ge(a, x):
107 'Find leftmost item greater than or equal to x'
108 i = bisect_left(a, x)
109 if i != len(a):
Raymond Hettinger87c9d6c2010-08-07 07:36:55 +0000110 return a[i]
Raymond Hettinger20933e02010-09-01 06:58:25 +0000111 raise ValueError
Raymond Hettinger87c9d6c2010-08-07 07:36:55 +0000112
Raymond Hettinger87c9d6c2010-08-07 07:36:55 +0000113
114Other Examples
115--------------
Georg Brandl116aa622007-08-15 14:28:22 +0000116
117.. _bisect-example:
118
Raymond Hettinger20933e02010-09-01 06:58:25 +0000119The :func:`bisect` function can be useful for numeric table lookups. This
120example uses :func:`bisect` to look up a letter grade for an exam score (say)
121based on a set of ordered numeric breakpoints: 90 and up is an 'A', 80 to 89 is
122a 'B', and so on::
Georg Brandl116aa622007-08-15 14:28:22 +0000123
Raymond Hettinger20933e02010-09-01 06:58:25 +0000124 >>> def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'):
125 ... i = bisect(breakpoints, score)
126 ... return grades[i]
Georg Brandl116aa622007-08-15 14:28:22 +0000127 ...
Raymond Hettinger20933e02010-09-01 06:58:25 +0000128 >>> [grade(score) for score in [33, 99, 77, 70, 89, 90, 100]]
129 ['F', 'A', 'C', 'C', 'B', 'A', 'A']
Georg Brandl116aa622007-08-15 14:28:22 +0000130
Raymond Hettingere046d2a2009-06-11 22:01:24 +0000131Unlike the :func:`sorted` function, it does not make sense for the :func:`bisect`
132functions to have *key* or *reversed* arguments because that would lead to an
Georg Brandl6faee4e2010-09-21 14:48:28 +0000133inefficient design (successive calls to bisect functions would not "remember"
Raymond Hettingere046d2a2009-06-11 22:01:24 +0000134all of the previous key lookups).
Georg Brandl116aa622007-08-15 14:28:22 +0000135
Raymond Hettingere046d2a2009-06-11 22:01:24 +0000136Instead, it is better to search a list of precomputed keys to find the index
137of the record in question::
138
139 >>> data = [('red', 5), ('blue', 1), ('yellow', 8), ('black', 0)]
Raymond Hettinger27352a52009-06-11 22:06:06 +0000140 >>> data.sort(key=lambda r: r[1])
141 >>> keys = [r[1] for r in data] # precomputed list of keys
Raymond Hettingere046d2a2009-06-11 22:01:24 +0000142 >>> data[bisect_left(keys, 0)]
143 ('black', 0)
144 >>> data[bisect_left(keys, 1)]
145 ('blue', 1)
146 >>> data[bisect_left(keys, 5)]
147 ('red', 5)
148 >>> data[bisect_left(keys, 8)]
149 ('yellow', 8)
Raymond Hettinger87c9d6c2010-08-07 07:36:55 +0000150