\section{\module{sets} ---
         Unordered collections of unique elements}

\declaremodule{standard}{sets}
\modulesynopsis{Implementation of sets of unique elements.}
\moduleauthor{Greg V. Wilson}{gvwilson@nevex.com}
\moduleauthor{Alex Martelli}{aleax@aleax.it}
\moduleauthor{Guido van Rossum}{guido@python.org}
\sectionauthor{Raymond D. Hettinger}{python@rcn.com}

\versionadded{2.3}

The \module{sets} module provides classes for constructing and manipulating
unordered collections of unique elements. Common uses include membership
testing, removing duplicates from a sequence, and computing standard math
operations on sets such as intersection, union, difference, and symmetric
difference.

Like other collections, sets support \code{\var{x} in \var{set}},
\code{len(\var{set})}, and \code{for \var{x} in \var{set}}. Being an
unordered collection, sets do not record element position or order of
insertion. Accordingly, sets do not support indexing, slicing, or
other sequence-like behavior.

Most set applications use the \class{Set} class which provides every set
method except for \method{__hash__()}. For advanced applications requiring
a hash method, the \class{ImmutableSet} class adds a \method{__hash__()}
method but omits methods which alter the contents of the set. Both
\class{Set} and \class{ImmutableSet} derive from \class{BaseSet}, an
abstract class useful for determining whether something is a set:
\code{isinstance(\var{obj}, BaseSet)}.

The set classes are implemented using dictionaries. As a result, sets
cannot contain mutable elements such as lists or dictionaries.
However, they can contain immutable collections such as tuples or
instances of \class{ImmutableSet}. For convenience in implementing
sets of sets, inner sets are automatically converted to immutable
form; for example, \code{Set([Set(['dog'])])} is transformed to
\code{Set([ImmutableSet(['dog'])])}.

\begin{classdesc}{Set}{\optional{iterable}}
Constructs a new empty \class{Set} object. If the optional \var{iterable}
parameter is supplied, updates the set with elements obtained from iteration.
All of the elements in \var{iterable} should be immutable or be transformable
to an immutable using the protocol described in
section~\ref{immutable-transforms}.
\end{classdesc}

\begin{classdesc}{ImmutableSet}{\optional{iterable}}
Constructs a new empty \class{ImmutableSet} object. If the optional
\var{iterable} parameter is supplied, updates the set with elements obtained
from iteration. All of the elements in \var{iterable} should be immutable or
be transformable to an immutable using the protocol described in
section~\ref{immutable-transforms}.

Because \class{ImmutableSet} objects provide a \method{__hash__()} method,
they can be used as set elements or as dictionary keys. \class{ImmutableSet}
objects do not have methods for adding or removing elements, so all of the
elements must be known when the constructor is called.
\end{classdesc}
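
The hashability described for \class{ImmutableSet} can be sketched as
follows. This is an illustration only: the names \code{primes},
\code{evens}, \code{labels}, and \code{family} are made up for the example,
and the fallback to the built-in \code{set} and \code{frozenset} types is an
assumption that lets the sketch run on interpreters without the
\module{sets} module.

```python
try:
    from sets import Set, ImmutableSet      # available where the sets module exists
except ImportError:
    # Assumption: the built-in set and frozenset types stand in for
    # Set and ImmutableSet on interpreters that lack the sets module.
    Set, ImmutableSet = set, frozenset

primes = ImmutableSet([2, 3, 5, 7])
evens = ImmutableSet([2, 4, 6, 8])

# ImmutableSet instances are hashable, so they can serve as dictionary keys.
labels = {primes: 'primes', evens: 'evens'}

# An equal set built in a different order finds the same entry.
looked_up = labels[ImmutableSet([7, 5, 3, 2])]

# They can also be stored directly as elements of another set.
family = Set([primes, evens])
has_primes = primes in family
```

Because all elements must be known at construction time, the sketch builds
each \class{ImmutableSet} from a complete list rather than adding elements
afterwards.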


\subsection{Set Objects \label{set-objects}}

Instances of \class{Set} and \class{ImmutableSet} both provide
the following operations:

\begin{tableii}{c|l}{code}{Operation}{Result}
  \lineii{len(\var{s})}{cardinality of set \var{s}}

  \hline
  \lineii{\var{x} in \var{s}}
         {test \var{x} for membership in \var{s}}
  \lineii{\var{x} not in \var{s}}
         {test \var{x} for non-membership in \var{s}}
  \lineii{\var{s}.issubset(\var{t})}
         {test whether every element in \var{s} is in \var{t};
          \code{\var{s} <= \var{t}} is equivalent}
  \lineii{\var{s}.issuperset(\var{t})}
         {test whether every element in \var{t} is in \var{s};
          \code{\var{s} >= \var{t}} is equivalent}

  \hline
  \lineii{\var{s} | \var{t}}
         {new set with elements from both \var{s} and \var{t}}
  \lineii{\var{s}.union(\var{t})}
         {new set with elements from both \var{s} and \var{t}}
  \lineii{\var{s} \&\ \var{t}}
         {new set with elements common to \var{s} and \var{t}}
  \lineii{\var{s}.intersection(\var{t})}
         {new set with elements common to \var{s} and \var{t}}
  \lineii{\var{s} - \var{t}}
         {new set with elements in \var{s} but not in \var{t}}
  \lineii{\var{s}.difference(\var{t})}
         {new set with elements in \var{s} but not in \var{t}}
  \lineii{\var{s} \textasciicircum\ \var{t}}
         {new set with elements in either \var{s} or \var{t} but not both}
  \lineii{\var{s}.symmetric_difference(\var{t})}
         {new set with elements in either \var{s} or \var{t} but not both}
  \lineii{\var{s}.copy()}
         {new set with a shallow copy of \var{s}}
\end{tableii}

In addition, both \class{Set} and \class{ImmutableSet}
support set-to-set comparisons. Two sets are equal if and only if
every element of each set is contained in the other (each is a subset
of the other).
A set is less than another set if and only if the first set is a proper
subset of the second set (is a subset, but is not equal).
A set is greater than another set if and only if the first set is a proper
superset of the second set (is a superset, but is not equal).

The subset and equality comparisons do not generalize to a complete
ordering function. For example, any two disjoint sets are not equal and
are not subsets of each other, so \emph{none} of the following are true:
\code{\var{a}<\var{b}}, \code{\var{a}==\var{b}}, or \code{\var{a}>\var{b}}.
Accordingly, sets do not implement the \method{__cmp__()} method.

Since sets define only a partial ordering (subset relationships), the output
of the \method{list.sort()} method is undefined for lists of sets.
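
The comparison rules above can be checked with a short sketch. The sets
\code{a}, \code{b}, and \code{c} are illustrative values, and the fallback
to the built-in \code{set} type is an assumption for interpreters without
the \module{sets} module.

```python
try:
    from sets import Set                    # available where the sets module exists
except ImportError:
    Set = set                               # assumption: built-in stand-in

a = Set([1, 2])
b = Set([3, 4])                             # disjoint from a
c = Set([1, 2, 3])                          # proper superset of a

# For disjoint sets, none of <, ==, > holds.
disjoint_results = (a < b, a == b, a > b)

# "Less than" means proper subset, so a set is never less than itself.
proper_subset = a < c
subset_of_self = (a <= a, a < a)
```

Because \code{a} and \code{b} are unordered with respect to each other, a
sort that relies on \code{<} has no consistent answer for them.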

The following table lists operations available in \class{ImmutableSet}
but not found in \class{Set}:

\begin{tableii}{c|l}{code}{Operation}{Result}
  \lineii{hash(\var{s})}{returns a hash value for \var{s}}
\end{tableii}

The following table lists operations available in \class{Set}
but not found in \class{ImmutableSet}:

\begin{tableii}{c|l}{code}{Operation}{Result}
  \lineii{\var{s} |= \var{t}}
         {return set \var{s} with elements added from \var{t}}
  \lineii{\var{s}.union_update(\var{t})}
         {return set \var{s} with elements added from \var{t}}
  \lineii{\var{s} \&= \var{t}}
         {return set \var{s} keeping only elements also found in \var{t}}
  \lineii{\var{s}.intersection_update(\var{t})}
         {return set \var{s} keeping only elements also found in \var{t}}
  \lineii{\var{s} -= \var{t}}
         {return set \var{s} after removing elements found in \var{t}}
  \lineii{\var{s}.difference_update(\var{t})}
         {return set \var{s} after removing elements found in \var{t}}
  \lineii{\var{s} \textasciicircum= \var{t}}
         {return set \var{s} with elements from \var{s} or \var{t}
          but not both}
  \lineii{\var{s}.symmetric_difference_update(\var{t})}
         {return set \var{s} with elements from \var{s} or \var{t}
          but not both}

  \hline
  \lineii{\var{s}.add(\var{x})}
         {add element \var{x} to set \var{s}}
  \lineii{\var{s}.remove(\var{x})}
         {remove \var{x} from set \var{s}}
  \lineii{\var{s}.discard(\var{x})}
         {remove \var{x} from set \var{s} if present}
  \lineii{\var{s}.pop()}
         {remove and return an arbitrary element from \var{s}}
  \lineii{\var{s}.update(\var{t})}
         {add elements from \var{t} to set \var{s}}
  \lineii{\var{s}.clear()}
         {remove all elements from set \var{s}}
\end{tableii}
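
The difference between the in-place operations in this table and the
operations that build a new set can be sketched as follows. The names
\code{s}, \code{alias}, \code{t}, and \code{merged} are illustrative, and
the fallback to the built-in \code{set} type is an assumption for
interpreters without the \module{sets} module. The \exception{KeyError}
raised by \method{remove()} for a missing element is not stated in the
table; it is noted here as the behavior that distinguishes it from
\method{discard()}.

```python
try:
    from sets import Set                    # available where the sets module exists
except ImportError:
    Set = set                               # assumption: built-in stand-in

s = Set(['a', 'b'])
alias = s                                   # second name for the same object
t = Set(['b', 'c'])

merged = s | t                              # builds a new set; s is unchanged
size_before_update = len(s)

s |= t                                      # updates s in place; alias sees it

s.discard('zzz')                            # no error when the element is absent
try:
    s.remove('zzz')                         # raises KeyError when absent
    remove_succeeded = True
except KeyError:
    remove_succeeded = False
```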


\subsection{Example \label{set-example}}

\begin{verbatim}
>>> from sets import Set
>>> engineers = Set(['John', 'Jane', 'Jack', 'Janice'])
>>> programmers = Set(['Jack', 'Sam', 'Susan', 'Janice'])
>>> management = Set(['Jane', 'Jack', 'Susan', 'Zack'])
>>> employees = engineers | programmers | management           # union
>>> engineering_management = engineers & management            # intersection
>>> fulltime_management = management - engineers - programmers # difference
>>> engineers.add('Marvin')                                    # add element
>>> print engineers
Set(['Jane', 'Marvin', 'Janice', 'John', 'Jack'])
>>> employees.issuperset(engineers)           # superset test
False
>>> employees.update(engineers)               # update from another set
>>> employees.issuperset(engineers)
True
>>> for group in [engineers, programmers, management, employees]:
...     group.discard('Susan')                # unconditionally remove element
...     print group
...
Set(['Jane', 'Marvin', 'Janice', 'John', 'Jack'])
Set(['Janice', 'Jack', 'Sam'])
Set(['Jane', 'Zack', 'Jack'])
Set(['Jack', 'Sam', 'Jane', 'Marvin', 'Janice', 'John', 'Zack'])
\end{verbatim}


\subsection{Protocol for automatic conversion to immutable
            \label{immutable-transforms}}

Sets can only contain immutable elements. For convenience, mutable
\class{Set} objects are automatically copied to an \class{ImmutableSet}
before being added as a set element.

The mechanism is to always add a hashable element as is. If an element
is not hashable, it is checked for an \method{_as_immutable()} method
which returns an immutable equivalent, and that equivalent is added instead.

Since \class{Set} objects have a \method{_as_immutable()} method
returning an instance of \class{ImmutableSet}, it is possible to
construct sets of sets.
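
The conversion step can be sketched with a small stand-alone example. This
is not the module's own code: the \code{Row} class and the
\code{add_element()} helper are hypothetical names invented for the
illustration, and a plain dictionary stands in for the storage that the set
classes use internally.

```python
class Row(list):
    """Hypothetical unhashable type that opts in to the protocol."""

    def _as_immutable(self):
        # Return a hashable equivalent of this object.
        return tuple(self)


def add_element(data, element):
    """Store element as a key of the dict data, converting it if needed."""
    try:
        data[element] = True                # works for any hashable element
    except TypeError:
        transform = getattr(element, '_as_immutable', None)
        if transform is None:
            raise                           # unhashable, no conversion hook
        data[transform()] = True            # store the immutable equivalent


data = {}
add_element(data, 'dog')                    # hashable: stored as is
add_element(data, Row([1, 2]))              # unhashable: stored as (1, 2)
```

An unhashable object without an \method{_as_immutable()} method, such as a
plain list, still raises \exception{TypeError} in this sketch.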

A similar mechanism is needed by the \method{__contains__()} and
\method{remove()} methods which need to hash an element to check
for membership in a set. Those methods check an element for hashability
and, if it is not hashable, check for a \method{_as_temporarily_immutable()}
method which returns the element wrapped by a class that provides temporary
methods for \method{__hash__()}, \method{__eq__()}, and \method{__ne__()}.

The alternate mechanism spares the need to build a separate copy of
the original mutable object.

\class{Set} objects implement the \method{_as_temporarily_immutable()}
method which returns the \class{Set} object wrapped by a new class
\class{_TemporarilyImmutableSet}.

The two mechanisms for adding hashability are normally invisible to the
user; however, a conflict can arise in a multi-threaded environment
where one thread is updating a set while another has temporarily wrapped it
in \class{_TemporarilyImmutableSet}. In other words, sets of mutable sets
are not thread-safe.
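
The temporary wrapping used for membership tests can be sketched in the
same stand-alone style. The classes \code{TemporarilyHashable} and
\code{MutableSet} and the helper \code{contains()} are hypothetical names,
not the module's implementation, and the built-in \code{set} and
\code{frozenset} types are assumed as stand-ins for \class{Set} and
\class{ImmutableSet}.

```python
class TemporarilyHashable(object):
    """Hypothetical wrapper that hashes a mutable set by its current contents."""

    def __init__(self, wrapped):
        self.wrapped = wrapped              # no copy of the elements is stored

    def __hash__(self):
        return hash(frozenset(self.wrapped))

    def __eq__(self, other):
        return frozenset(self.wrapped) == other

    def __ne__(self, other):
        return not self == other


class MutableSet(set):
    """Hypothetical unhashable set that opts in to the protocol."""

    def _as_temporarily_immutable(self):
        return TemporarilyHashable(self)


def contains(data, element):
    """Test whether element is a key of the dict data, wrapping it if needed."""
    try:
        return element in data
    except TypeError:
        transform = getattr(element, '_as_temporarily_immutable', None)
        if transform is None:
            raise
        return transform() in data


data = {frozenset(['dog']): True}
inner = MutableSet(['dog'])
found = contains(data, inner)

# The wrapper reflects the wrapped set's contents at lookup time, so a
# change made in between (for example by another thread) alters the result.
inner.add('cat')
found_after_change = contains(data, inner)
```

The second lookup illustrates why the wrapper is only safe while the
wrapped set is left unchanged for the duration of the membership test.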