blob: 8ce62c88980efcc5920265b07a2a38ad2e228250 [file] [log] [blame]
Raymond Hettinger584cb192002-08-23 15:18:38 +00001\section{\module{sets} ---
2 Unordered collections of unique elements}
3
4\declaremodule{standard}{sets}
5\modulesynopsis{Implementation of sets of unique elements.}
6\moduleauthor{Greg V. Wilson}{gvwilson@nevex.com}
7\moduleauthor{Alex Martelli}{aleax@aleax.it}
8\moduleauthor{Guido van Rossum}{guido@python.org}
9\sectionauthor{Raymond D. Hettinger}{python@rcn.com}
10
11\versionadded{2.3}
12
13The \module{sets} module provides classes for constructing and manipulating
14unordered collections of unique elements. Common uses include membership
15testing, removing duplicates from a sequence, and computing standard math
16operations on sets such as intersection, union, difference, and symmetric
17difference.
18
Fred Draked10c6c92002-08-23 17:22:36 +000019Like other collections, sets support \code{\var{x} in \var{set}},
20\code{len(\var{set})}, and \code{for \var{x} in \var{set}}. Being an
21unordered collection, sets do not record element position or order of
22insertion. Accordingly, sets do not support indexing, slicing, or
23other sequence-like behavior.
Raymond Hettinger584cb192002-08-23 15:18:38 +000024
25Most set applications use the \class{Set} class which provides every set
26method except for \method{__hash__()}. For advanced applications requiring
27a hash method, the \class{ImmutableSet} class adds a \method{__hash__()}
28method but omits methods which alter the contents of the set. Both
29\class{Set} and \class{ImmutableSet} derive from \class{BaseSet}, an
30abstract class useful for determining whether something is a set:
Fred Draked10c6c92002-08-23 17:22:36 +000031\code{isinstance(\var{obj}, BaseSet)}.
Raymond Hettinger584cb192002-08-23 15:18:38 +000032
Fred Draked10c6c92002-08-23 17:22:36 +000033The set classes are implemented using dictionaries. As a result, sets
34cannot contain mutable elements such as lists or dictionaries.
35However, they can contain immutable collections such as tuples or
Raymond Hettingerfa8dd5f2002-08-23 18:10:54 +000036instances of \class{ImmutableSet}. For convenience in implementing
Fred Draked10c6c92002-08-23 17:22:36 +000037sets of sets, inner sets are automatically converted to immutable
38form, for example, \code{Set([Set(['dog'])])} is transformed to
Raymond Hettinger584cb192002-08-23 15:18:38 +000039\code{Set([ImmutableSet(['dog'])])}.
40
41\begin{classdesc}{Set}{\optional{iterable}}
42Constructs a new empty \class{Set} object. If the optional \var{iterable}
43parameter is supplied, updates the set with elements obtained from iteration.
44All of the elements in \var{iterable} should be immutable or be transformable
Fred Draked10c6c92002-08-23 17:22:36 +000045to an immutable using the protocol described in
46section~\ref{immutable-transforms}.
Raymond Hettinger584cb192002-08-23 15:18:38 +000047\end{classdesc}
48
49\begin{classdesc}{ImmutableSet}{\optional{iterable}}
50Constructs a new empty \class{ImmutableSet} object. If the optional
51\var{iterable} parameter is supplied, updates the set with elements obtained
52from iteration. All of the elements in \var{iterable} should be immutable or
Fred Draked10c6c92002-08-23 17:22:36 +000053be transformable to an immutable using the protocol described in
54section~\ref{immutable-transforms}.
Raymond Hettinger584cb192002-08-23 15:18:38 +000055
56Because \class{ImmutableSet} objects provide a \method{__hash__()} method,
57they can be used as set elements or as dictionary keys. \class{ImmutableSet}
58objects do not have methods for adding or removing elements, so all of the
59elements must be known when the constructor is called.
60\end{classdesc}
61
62
Fred Draked10c6c92002-08-23 17:22:36 +000063\subsection{Set Objects}
Raymond Hettinger584cb192002-08-23 15:18:38 +000064
65Instances of \class{Set} and \class{ImmutableSet} both provide
66the following operations:
67
Fred Draked10c6c92002-08-23 17:22:36 +000068\begin{tableii}{c|l}{code}{Operation}{Result}
69 \lineii{len(\var{s})}{cardinality of set \var{s}}
Raymond Hettinger584cb192002-08-23 15:18:38 +000070
71 \hline
Fred Draked10c6c92002-08-23 17:22:36 +000072 \lineii{\var{x} in \var{s}}
73 {test \var{x} for membership in \var{s}}
74 \lineii{\var{x} not in \var{s}}
75 {test \var{x} for non-membership in \var{s}}
76 \lineii{\var{s}.issubset(\var{t})}
77 {test whether every element in \var{s} is in \var{t}}
78 \lineii{\var{s}.issuperset(\var{t})}
79 {test whether every element in \var{t} is in \var{s}}
Raymond Hettinger584cb192002-08-23 15:18:38 +000080
81 \hline
Fred Draked10c6c92002-08-23 17:22:36 +000082 \lineii{\var{s} | \var{t}}
83 {new set with elements from both \var{s} and \var{t}}
84 \lineii{\var{s}.union(\var{t})}
85 {new set with elements from both \var{s} and \var{t}}
86 \lineii{\var{s} \&\ \var{t}}
87 {new set with elements common to \var{s} and \var{t}}
88 \lineii{\var{s}.intersection(\var{t})}
89 {new set with elements common to \var{s} and \var{t}}
90 \lineii{\var{s} - \var{t}}
91 {new set with elements in \var{s} but not in \var{t}}
92 \lineii{\var{s}.difference(\var{t})}
93 {new set with elements in \var{s} but not in \var{t}}
94 \lineii{\var{s} \textasciicircum\ \var{t}}
95 {new set with elements in either \var{s} or \var{t} but not both}
96 \lineii{\var{s}.symmetric_difference(\var{t})}
97 {new set with elements in either \var{s} or \var{t} but not both}
98 \lineii{\var{s}.copy()}
99 {new set with a shallow copy of \var{s}}
100\end{tableii}
Raymond Hettinger584cb192002-08-23 15:18:38 +0000101
102In addition to the above operations, both \class{Set} and \class{ImmutableSet}
Raymond Hettingere87ab3f2002-08-24 07:33:06 +0000103support set to set equality comparisons. Two sets are equal if and only if
104every element of each set is contained in the other.
Raymond Hettinger584cb192002-08-23 15:18:38 +0000105
106The following table lists operations available in \class{ImmutableSet}
107but not found in \class{Set}:
108
Fred Draked10c6c92002-08-23 17:22:36 +0000109\begin{tableii}{c|l|c}{code}{Operation}{Result}
110 \lineii{hash(\var{s})}{returns a hash value for \var{s}}
111\end{tableii}
Raymond Hettinger584cb192002-08-23 15:18:38 +0000112
113The following table lists operations available in \class{Set}
114but not found in \class{ImmutableSet}:
115
Fred Draked10c6c92002-08-23 17:22:36 +0000116\begin{tableii}{c|l}{code}{Operation}{Result}
117 \lineii{\var{s} |= \var{t}}
118 {return set \var{s} with elements added from \var{t}}
119 \lineii{\var{s}.union_update(\var{t})}
120 {return set \var{s} with elements added from \var{t}}
121 \lineii{\var{s} \&= \var{t}}
122 {return set \var{s} keeping only elements also found in \var{t}}
123 \lineii{\var{s}.intersection_update(\var{t})}
124 {return set \var{s} keeping only elements also found in \var{t}}
125 \lineii{\var{s} -= \var{t}}
126 {return set \var{s} after removing elements found in \var{t}}
127 \lineii{\var{s}.difference_update(\var{t})}
128 {return set \var{s} after removing elements found in \var{t}}
129 \lineii{\var{s} \textasciicircum= \var{t}}
130 {return set \var{s} with elements from \var{s} or \var{t}
131 but not both}
132 \lineii{\var{s}.symmetric_difference_update(\var{t})}
133 {return set \var{s} with elements from \var{s} or \var{t}
134 but not both}
Raymond Hettinger584cb192002-08-23 15:18:38 +0000135
136 \hline
Fred Draked10c6c92002-08-23 17:22:36 +0000137 \lineii{\var{s}.add(\var{x})}
138 {Add element \var{x} to set \var{s}}
139 \lineii{\var{s}.remove(\var{x})}
140 {Remove element \var{x} from set \var{s}}
141 \lineii{\var{s}.discard(\var{x})}
Raymond Hettingerfa8dd5f2002-08-23 18:10:54 +0000142 {Removes element \var{x} from set \var{s}. Like \var{s}.remove(\var{x})
143 but does not raise KeyError if \var{x} is not in \var{s}}
Fred Draked10c6c92002-08-23 17:22:36 +0000144 \lineii{\var{s}.pop()}
Tim Peters54fd3e62002-08-23 17:45:43 +0000145 {Remove and return an element from \var{s}; no guarantee is
146 made about which element is removed}
Fred Draked10c6c92002-08-23 17:22:36 +0000147 \lineii{\var{s}.update(\var{t})}
148 {Add elements from \var{t} to set \var{s}}
149 \lineii{\var{s}.clear()}
150 {Remove all elements from set \var{s}}
151\end{tableii}
Raymond Hettinger584cb192002-08-23 15:18:38 +0000152
153
154\subsection{Example}
155
156\begin{verbatim}
157>>> from sets import Set
158>>> engineers = Set(['John', 'Jane', 'Jack', 'Janice'])
159>>> programmers = Set(['Jack', 'Sam', 'Susan', 'Janice'])
160>>> management = Set(['Jane', 'Jack', 'Susan', 'Zack'])
161>>> employees = engineers | programmers | management # union
162>>> engineering_management = engineers & programmers # intersection
163>>> fulltime_management = management - engineers - programmers # difference
164>>> engineers.add('Marvin') # add element
165>>> print engineers
166Set(['Jane', 'Marvin', 'Janice', 'John', 'Jack'])
167>>> employees.issuperset(engineers) # superset test
168False
169>>> employees.update(engineers) # update from another set
170>>> employees.issuperset(engineers)
171True
172>>> for group in [engineers, programmers, management, employees]:
173 group.discard('Susan') # unconditionally remove element
174 print group
175
176Set(['Jane', 'Marvin', 'Janice', 'John', 'Jack'])
177Set(['Janice', 'Jack', 'Sam'])
178Set(['Jane', 'Zack', 'Jack'])
179Set(['Jack', 'Sam', 'Jane', 'Marvin', 'Janice', 'John', 'Zack'])
180\end{verbatim}
181
182
183\subsection{Protocol for automatic conversion to immutable
184 \label{immutable-transforms}}
185
186Sets can only contain immutable elements. For convenience, mutable
187\class{Set} objects are automatically copied to an \class{ImmutableSet}
188before being added as a set element.
189
Fred Draked10c6c92002-08-23 17:22:36 +0000190The mechanism is to always add a hashable element, or if it is not
191hashable, the element is checked to see if it has an
192\method{_as_immutable()} method which returns an immutable equivalent.
Raymond Hettinger584cb192002-08-23 15:18:38 +0000193
194Since \class{Set} objects have a \method{_as_immutable()} method
195returning an instance of \class{ImmutableSet}, it is possible to
196construct sets of sets.
197
198A similar mechanism is needed by the \method{__contains__()} and
199\method{remove()} methods which need to hash an element to check
200for membership in a set. Those methods check an element for hashability
Tim Petersb81b2522002-08-23 17:48:23 +0000201and, if not, check for a \method{_as_temporarily_immutable()} method
Raymond Hettinger584cb192002-08-23 15:18:38 +0000202which returns the element wrapped by a class that provides temporary
203methods for \method{__hash__()}, \method{__eq__()}, and \method{__ne__()}.
204
205The alternate mechanism spares the need to build a separate copy of
206the original mutable object.
207
Tim Petersb81b2522002-08-23 17:48:23 +0000208\class{Set} objects implement the \method{_as_temporarily_immutable()}
Fred Draked10c6c92002-08-23 17:22:36 +0000209method which returns the \class{Set} object wrapped by a new class
Raymond Hettinger584cb192002-08-23 15:18:38 +0000210\class{_TemporarilyImmutableSet}.
211
212The two mechanisms for adding hashability are normally invisible to the
213user; however, a conflict can arise in a multi-threaded environment
Raymond Hettingerfa8dd5f2002-08-23 18:10:54 +0000214where one thread is updating a set while another has temporarily wrapped it
Raymond Hettinger584cb192002-08-23 15:18:38 +0000215in \class{_TemporarilyImmutableSet}. In other words, sets of mutable sets
216are not thread-safe.