blob: 10cd6a0d364db3698583b757e6a6d718dab209cc [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001
2:mod:`sets` --- Unordered collections of unique elements
3========================================================
4
5.. module:: sets
6 :synopsis: Implementation of sets of unique elements.
7.. moduleauthor:: Greg V. Wilson <gvwilson@nevex.com>
8.. moduleauthor:: Alex Martelli <aleax@aleax.it>
9.. moduleauthor:: Guido van Rossum <guido@python.org>
10.. sectionauthor:: Raymond D. Hettinger <python@rcn.com>
11
12
13.. versionadded:: 2.3
14
15.. deprecated:: 2.6
16 The built-in ``set``/``frozenset`` types replace this module.
17
18The :mod:`sets` module provides classes for constructing and manipulating
19unordered collections of unique elements. Common uses include membership
20testing, removing duplicates from a sequence, and computing standard math
21operations on sets such as intersection, union, difference, and symmetric
22difference.
23
24Like other collections, sets support ``x in set``, ``len(set)``, and ``for x in
25set``. Being an unordered collection, sets do not record element position or
26order of insertion. Accordingly, sets do not support indexing, slicing, or
27other sequence-like behavior.
28
29Most set applications use the :class:`Set` class which provides every set method
30except for :meth:`__hash__`. For advanced applications requiring a hash method,
31the :class:`ImmutableSet` class adds a :meth:`__hash__` method but omits methods
32which alter the contents of the set. Both :class:`Set` and :class:`ImmutableSet`
33derive from :class:`BaseSet`, an abstract class useful for determining whether
34something is a set: ``isinstance(obj, BaseSet)``.
35
36The set classes are implemented using dictionaries. Accordingly, the
37requirements for set elements are the same as those for dictionary keys; namely,
38that the element defines both :meth:`__eq__` and :meth:`__hash__`. As a result,
39sets cannot contain mutable elements such as lists or dictionaries. However,
40they can contain immutable collections such as tuples or instances of
41:class:`ImmutableSet`. For convenience in implementing sets of sets, inner sets
42are automatically converted to immutable form, for example,
43``Set([Set(['dog'])])`` is transformed to ``Set([ImmutableSet(['dog'])])``.
44
45
46.. class:: Set([iterable])
47
48 Constructs a new empty :class:`Set` object. If the optional *iterable*
49 parameter is supplied, updates the set with elements obtained from iteration.
50 All of the elements in *iterable* should be immutable or be transformable to an
51 immutable using the protocol described in section :ref:`immutable-transforms`.
52
53
54.. class:: ImmutableSet([iterable])
55
56 Constructs a new empty :class:`ImmutableSet` object. If the optional *iterable*
57 parameter is supplied, updates the set with elements obtained from iteration.
58 All of the elements in *iterable* should be immutable or be transformable to an
59 immutable using the protocol described in section :ref:`immutable-transforms`.
60
61 Because :class:`ImmutableSet` objects provide a :meth:`__hash__` method, they
62 can be used as set elements or as dictionary keys. :class:`ImmutableSet`
63 objects do not have methods for adding or removing elements, so all of the
64 elements must be known when the constructor is called.
65
66
67.. _set-objects:
68
69Set Objects
70-----------
71
72Instances of :class:`Set` and :class:`ImmutableSet` both provide the following
73operations:
74
75+-------------------------------+------------+---------------------------------+
76| Operation | Equivalent | Result |
77+===============================+============+=================================+
78| ``len(s)`` | | cardinality of set *s* |
79+-------------------------------+------------+---------------------------------+
80| ``x in s`` | | test *x* for membership in *s* |
81+-------------------------------+------------+---------------------------------+
82| ``x not in s`` | | test *x* for non-membership in |
83| | | *s* |
84+-------------------------------+------------+---------------------------------+
85| ``s.issubset(t)`` | ``s <= t`` | test whether every element in |
86| | | *s* is in *t* |
87+-------------------------------+------------+---------------------------------+
88| ``s.issuperset(t)`` | ``s >= t`` | test whether every element in |
89| | | *t* is in *s* |
90+-------------------------------+------------+---------------------------------+
91| ``s.union(t)`` | ``s | t`` | new set with elements from both |
92| | | *s* and *t* |
93+-------------------------------+------------+---------------------------------+
94| ``s.intersection(t)`` | ``s & t`` | new set with elements common to |
95| | | *s* and *t* |
96+-------------------------------+------------+---------------------------------+
97| ``s.difference(t)`` | ``s - t`` | new set with elements in *s* |
98| | | but not in *t* |
99+-------------------------------+------------+---------------------------------+
100| ``s.symmetric_difference(t)`` | ``s ^ t`` | new set with elements in either |
101| | | *s* or *t* but not both |
102+-------------------------------+------------+---------------------------------+
103| ``s.copy()`` | | new set with a shallow copy of |
104| | | *s* |
105+-------------------------------+------------+---------------------------------+
106
107Note, the non-operator versions of :meth:`union`, :meth:`intersection`,
108:meth:`difference`, and :meth:`symmetric_difference` will accept any iterable as
109an argument. In contrast, their operator based counterparts require their
110arguments to be sets. This precludes error-prone constructions like
111``Set('abc') & 'cbs'`` in favor of the more readable
112``Set('abc').intersection('cbs')``.
113
114.. versionchanged:: 2.3.1
115 Formerly all arguments were required to be sets.
116
117In addition, both :class:`Set` and :class:`ImmutableSet` support set to set
118comparisons. Two sets are equal if and only if every element of each set is
119contained in the other (each is a subset of the other). A set is less than
120another set if and only if the first set is a proper subset of the second set
121(is a subset, but is not equal). A set is greater than another set if and only
122if the first set is a proper superset of the second set (is a superset, but is
123not equal).
124
125The subset and equality comparisons do not generalize to a complete ordering
126function. For example, any two disjoint sets are not equal and are not subsets
127of each other, so *all* of the following return ``False``: ``a<b``, ``a==b``,
128or ``a>b``. Accordingly, sets do not implement the :meth:`__cmp__` method.
129
130Since sets only define partial ordering (subset relationships), the output of
131the :meth:`list.sort` method is undefined for lists of sets.
132
133The following table lists operations available in :class:`ImmutableSet` but not
134found in :class:`Set`:
135
136+-------------+------------------------------+
137| Operation | Result |
138+=============+==============================+
139| ``hash(s)`` | returns a hash value for *s* |
140+-------------+------------------------------+
141
142The following table lists operations available in :class:`Set` but not found in
143:class:`ImmutableSet`:
144
145+--------------------------------------+-------------+---------------------------------+
146| Operation | Equivalent | Result |
147+======================================+=============+=================================+
148| ``s.update(t)`` | *s* \|= *t* | return set *s* with elements |
149| | | added from *t* |
150+--------------------------------------+-------------+---------------------------------+
151| ``s.intersection_update(t)`` | *s* &= *t* | return set *s* keeping only |
152| | | elements also found in *t* |
153+--------------------------------------+-------------+---------------------------------+
154| ``s.difference_update(t)`` | *s* -= *t* | return set *s* after removing |
155| | | elements found in *t* |
156+--------------------------------------+-------------+---------------------------------+
157| ``s.symmetric_difference_update(t)`` | *s* ^= *t* | return set *s* with elements |
158| | | from *s* or *t* but not both |
159+--------------------------------------+-------------+---------------------------------+
160| ``s.add(x)`` | | add element *x* to set *s* |
161+--------------------------------------+-------------+---------------------------------+
162| ``s.remove(x)`` | | remove *x* from set *s*; raises |
163| | | :exc:`KeyError` if not present |
164+--------------------------------------+-------------+---------------------------------+
165| ``s.discard(x)`` | | removes *x* from set *s* if |
166| | | present |
167+--------------------------------------+-------------+---------------------------------+
168| ``s.pop()`` | | remove and return an arbitrary |
169| | | element from *s*; raises |
170| | | :exc:`KeyError` if empty |
171+--------------------------------------+-------------+---------------------------------+
172| ``s.clear()`` | | remove all elements from set |
173| | | *s* |
174+--------------------------------------+-------------+---------------------------------+
175
176Note, the non-operator versions of :meth:`update`, :meth:`intersection_update`,
177:meth:`difference_update`, and :meth:`symmetric_difference_update` will accept
178any iterable as an argument.
179
180.. versionchanged:: 2.3.1
181 Formerly all arguments were required to be sets.
182
183Also note, the module also includes a :meth:`union_update` method which is an
184alias for :meth:`update`. The method is included for backwards compatibility.
185Programmers should prefer the :meth:`update` method because it is supported by
186the builtin :class:`set()` and :class:`frozenset()` types.
187
188
189.. _set-example:
190
191Example
192-------
193
194::
195
196 >>> from sets import Set
197 >>> engineers = Set(['John', 'Jane', 'Jack', 'Janice'])
198 >>> programmers = Set(['Jack', 'Sam', 'Susan', 'Janice'])
199 >>> managers = Set(['Jane', 'Jack', 'Susan', 'Zack'])
200 >>> employees = engineers | programmers | managers # union
201 >>> engineering_management = engineers & managers # intersection
202 >>> fulltime_management = managers - engineers - programmers # difference
203 >>> engineers.add('Marvin') # add element
204 >>> print engineers
205 Set(['Jane', 'Marvin', 'Janice', 'John', 'Jack'])
206 >>> employees.issuperset(engineers) # superset test
207 False
208 >>> employees.update(engineers) # update from another set
209 >>> employees.issuperset(engineers)
210 True
211 >>> for group in [engineers, programmers, managers, employees]:
212 ... group.discard('Susan') # unconditionally remove element
213 ... print group
214 ...
215 Set(['Jane', 'Marvin', 'Janice', 'John', 'Jack'])
216 Set(['Janice', 'Jack', 'Sam'])
217 Set(['Jane', 'Zack', 'Jack'])
218 Set(['Jack', 'Sam', 'Jane', 'Marvin', 'Janice', 'John', 'Zack'])
219
220
221.. _immutable-transforms:
222
223Protocol for automatic conversion to immutable
224----------------------------------------------
225
226Sets can only contain immutable elements. For convenience, mutable :class:`Set`
227objects are automatically copied to an :class:`ImmutableSet` before being added
228as a set element.
229
230The mechanism is to always add a hashable element, or if it is not hashable, the
231element is checked to see if it has an :meth:`__as_immutable__` method which
232returns an immutable equivalent.
233
234Since :class:`Set` objects have a :meth:`__as_immutable__` method returning an
235instance of :class:`ImmutableSet`, it is possible to construct sets of sets.
236
237A similar mechanism is needed by the :meth:`__contains__` and :meth:`remove`
238methods which need to hash an element to check for membership in a set. Those
239methods check an element for hashability and, if not, check for a
240:meth:`__as_temporarily_immutable__` method which returns the element wrapped by
241a class that provides temporary methods for :meth:`__hash__`, :meth:`__eq__`,
242and :meth:`__ne__`.
243
244The alternate mechanism spares the need to build a separate copy of the original
245mutable object.
246
247:class:`Set` objects implement the :meth:`__as_temporarily_immutable__` method
248which returns the :class:`Set` object wrapped by a new class
249:class:`_TemporarilyImmutableSet`.
250
251The two mechanisms for adding hashability are normally invisible to the user;
252however, a conflict can arise in a multi-threaded environment where one thread
253is updating a set while another has temporarily wrapped it in
254:class:`_TemporarilyImmutableSet`. In other words, sets of mutable sets are not
255thread-safe.
256
257
258.. _comparison-to-builtin-set:
259
260Comparison to the built-in :class:`set` types
261---------------------------------------------
262
263The built-in :class:`set` and :class:`frozenset` types were designed based on
264lessons learned from the :mod:`sets` module. The key differences are:
265
266* :class:`Set` and :class:`ImmutableSet` were renamed to :class:`set` and
267 :class:`frozenset`.
268
269* There is no equivalent to :class:`BaseSet`. Instead, use ``isinstance(x,
270 (set, frozenset))``.
271
272* The hash algorithm for the built-ins performs significantly better (fewer
273 collisions) for most datasets.
274
275* The built-in versions have more space efficient pickles.
276
277* The built-in versions do not have a :meth:`union_update` method. Instead, use
278 the :meth:`update` method which is equivalent.
279
280* The built-in versions do not have a ``_repr(sorted=True)`` method.
281 Instead, use the built-in :func:`repr` and :func:`sorted` functions:
282 ``repr(sorted(s))``.
283
284* The built-in version does not have a protocol for automatic conversion to
285 immutable. Many found this feature to be confusing and no one in the community
286 reported having found real uses for it.
287