Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 1 | |
| 2 | :mod:`bisect` --- Array bisection algorithm |
| 3 | =========================================== |
| 4 | |
| 5 | .. module:: bisect |
| 6 | :synopsis: Array bisection algorithms for binary searching. |
| 7 | .. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> |
Georg Brandl | b19be57 | 2007-12-29 10:57:00 +0000 | [diff] [blame] | 8 | .. example based on the PyModules FAQ entry by Aaron Watters <arw@pythonpros.com> |
Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 9 | |
| 10 | This module provides support for maintaining a list in sorted order without |
| 11 | having to sort the list after each insertion. For long lists of items with |
| 12 | expensive comparison operations, this can be an improvement over the more common |
| 13 | approach. The module is called :mod:`bisect` because it uses a basic bisection |
| 14 | algorithm to do its work. The source code may be most useful as a working |
| 15 | example of the algorithm (the boundary conditions are already right!). |
| 16 | |
Raymond Hettinger | 47ed1c1 | 2010-08-07 21:55:06 +0000 | [diff] [blame] | 17 | .. versionadded:: 2.1 |
| 18 | |
Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 19 | The following functions are provided: |
| 20 | |
| 21 | |
| 22 | .. function:: bisect_left(list, item[, lo[, hi]]) |
| 23 | |
| 24 | Locate the proper insertion point for *item* in *list* to maintain sorted order. |
| 25 | The parameters *lo* and *hi* may be used to specify a subset of the list which |
| 26 | should be considered; by default the entire list is used. If *item* is already |
| 27 | present in *list*, the insertion point will be before (to the left of) any |
| 28 | existing entries. The return value is suitable for use as the first parameter |
| 29 | to ``list.insert()``. This assumes that *list* is already sorted. |
| 30 | |
Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 31 | |
| 32 | .. function:: bisect_right(list, item[, lo[, hi]]) |
Raymond Hettinger | 47ed1c1 | 2010-08-07 21:55:06 +0000 | [diff] [blame] | 33 | .. function:: bisect(list, item[, lo[, hi]]) |
Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 34 | |
| 35 | Similar to :func:`bisect_left`, but returns an insertion point which comes after |
| 36 | (to the right of) any existing entries of *item* in *list*. |
| 37 | |
Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 38 | |
| 39 | .. function:: insort_left(list, item[, lo[, hi]]) |
| 40 | |
| 41 | Insert *item* in *list* in sorted order. This is equivalent to |
| 42 | ``list.insert(bisect.bisect_left(list, item, lo, hi), item)``. This assumes |
| 43 | that *list* is already sorted. |
| 44 | |
Raymond Hettinger | 47ed1c1 | 2010-08-07 21:55:06 +0000 | [diff] [blame] | 45 | Also note that while the fast search step is O(log n), the slower insertion |
| 46 | step is O(n), so the overall operation is slow. |
Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 47 | |
| 48 | .. function:: insort_right(list, item[, lo[, hi]]) |
Raymond Hettinger | 47ed1c1 | 2010-08-07 21:55:06 +0000 | [diff] [blame] | 49 | insort(a, x, lo=0, hi=len(a)) |
Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 50 | |
| 51 | Similar to :func:`insort_left`, but inserting *item* in *list* after any |
| 52 | existing entries of *item*. |
| 53 | |
Raymond Hettinger | 47ed1c1 | 2010-08-07 21:55:06 +0000 | [diff] [blame] | 54 | Also note that while the fast search step is O(log n), the slower insertion |
| 55 | step is O(n), so the overall operation is slow. |
Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 56 | |
Raymond Hettinger | 47ed1c1 | 2010-08-07 21:55:06 +0000 | [diff] [blame] | 57 | Searching Sorted Lists |
| 58 | ---------------------- |
Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 59 | |
Raymond Hettinger | 47ed1c1 | 2010-08-07 21:55:06 +0000 | [diff] [blame] | 60 | The above :func:`bisect` functions are useful for finding insertion points, but |
| 61 | can be tricky or awkward to use for common searching tasks. The following three |
| 62 | functions show how to transform them into the standard lookups for sorted |
| 63 | lists:: |
Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 64 | |
Raymond Hettinger | 47ed1c1 | 2010-08-07 21:55:06 +0000 | [diff] [blame] | 65 | def find(a, key): |
| 66 | '''Find leftmost item exact equal to the key. |
| 67 | Raise ValueError if no such item exists. |
Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 68 | |
Raymond Hettinger | 47ed1c1 | 2010-08-07 21:55:06 +0000 | [diff] [blame] | 69 | ''' |
| 70 | i = bisect_left(a, key) |
| 71 | if i < len(a) and a[i] == key: |
| 72 | return a[i] |
| 73 | raise ValueError('No item found with key equal to: %r' % (key,)) |
Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 74 | |
Raymond Hettinger | 47ed1c1 | 2010-08-07 21:55:06 +0000 | [diff] [blame] | 75 | def find_le(a, key): |
| 76 | '''Find largest item less-than or equal to key. |
| 77 | Raise ValueError if no such item exists. |
| 78 | If multiple keys are equal, return the leftmost. |
| 79 | |
| 80 | ''' |
| 81 | i = bisect_left(a, key) |
| 82 | if i < len(a) and a[i] == key: |
| 83 | return a[i] |
| 84 | if i == 0: |
| 85 | raise ValueError('No item found with key at or below: %r' % (key,)) |
| 86 | return a[i-1] |
| 87 | |
| 88 | def find_ge(a, key): |
| 89 | '''Find smallest item greater-than or equal to key. |
| 90 | Raise ValueError if no such item exists. |
| 91 | If multiple keys are equal, return the leftmost. |
| 92 | |
| 93 | ''' |
| 94 | i = bisect_left(a, key) |
| 95 | if i == len(a): |
| 96 | raise ValueError('No item found with key at or above: %r' % (key,)) |
| 97 | return a[i] |
| 98 | |
| 99 | Other Examples |
| 100 | -------------- |
Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 101 | |
| 102 | .. _bisect-example: |
| 103 | |
| 104 | The :func:`bisect` function is generally useful for categorizing numeric data. |
| 105 | This example uses :func:`bisect` to look up a letter grade for an exam total |
| 106 | (say) based on a set of ordered numeric breakpoints: 85 and up is an 'A', 75..84 |
Georg Brandl | e8f1b00 | 2008-03-22 22:04:10 +0000 | [diff] [blame] | 107 | is a 'B', etc. |
Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 108 | |
| 109 | >>> grades = "FEDCBA" |
| 110 | >>> breakpoints = [30, 44, 66, 75, 85] |
| 111 | >>> from bisect import bisect |
| 112 | >>> def grade(total): |
| 113 | ... return grades[bisect(breakpoints, total)] |
| 114 | ... |
| 115 | >>> grade(66) |
| 116 | 'C' |
| 117 | >>> map(grade, [33, 99, 77, 44, 12, 88]) |
| 118 | ['E', 'A', 'B', 'D', 'F', 'A'] |
| 119 | |
Raymond Hettinger | 87be88c | 2009-06-11 22:04:00 +0000 | [diff] [blame] | 120 | Unlike the :func:`sorted` function, it does not make sense for the :func:`bisect` |
| 121 | functions to have *key* or *reversed* arguments because that would lead to an |
| 122 | inefficent design (successive calls to bisect functions would not "remember" |
| 123 | all of the previous key lookups). |
Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 124 | |
Raymond Hettinger | 87be88c | 2009-06-11 22:04:00 +0000 | [diff] [blame] | 125 | Instead, it is better to search a list of precomputed keys to find the index |
| 126 | of the record in question:: |
| 127 | |
| 128 | >>> data = [('red', 5), ('blue', 1), ('yellow', 8), ('black', 0)] |
Raymond Hettinger | 2742e7e | 2009-06-11 22:08:10 +0000 | [diff] [blame] | 129 | >>> data.sort(key=lambda r: r[1]) |
| 130 | >>> keys = [r[1] for r in data] # precomputed list of keys |
Raymond Hettinger | 87be88c | 2009-06-11 22:04:00 +0000 | [diff] [blame] | 131 | >>> data[bisect_left(keys, 0)] |
| 132 | ('black', 0) |
| 133 | >>> data[bisect_left(keys, 1)] |
| 134 | ('blue', 1) |
| 135 | >>> data[bisect_left(keys, 5)] |
| 136 | ('red', 5) |
| 137 | >>> data[bisect_left(keys, 8)] |
| 138 | ('yellow', 8) |
Raymond Hettinger | 47ed1c1 | 2010-08-07 21:55:06 +0000 | [diff] [blame] | 139 | |
| 140 | .. seealso:: |
| 141 | |
| 142 | `SortedCollection recipe |
| 143 | <http://code.activestate.com/recipes/577197-sortedcollection/>`_ that |
| 144 | encapsulates precomputed keys, allowing straight-forward insertion and |
| 145 | searching using a *key* function. |