\documentclass{howto}
\usepackage{distutils}
% $Id$

\title{What's New in Python 2.4}
\release{0.0}
\author{A.M.\ Kuchling}
\authoraddress{\email{amk@amk.ca}}

\begin{document}
\maketitle
\tableofcontents

This article explains the new features in Python 2.4.  No release date
for Python 2.4 has been set; expect the release to happen in mid-2004.

While Python 2.3 was primarily a library development release, Python
2.4 may extend the core language and interpreter in
as-yet-undetermined ways.

This article doesn't attempt to provide a complete specification of
the new features, but instead provides a convenient overview.  For
full details, you should refer to the documentation for Python 2.4,
such as the \citetitle[../lib/lib.html]{Python Library Reference} and
the \citetitle[../ref/ref.html]{Python Reference Manual}.
If you want to understand the complete implementation and design
rationale, refer to the PEP for a particular new feature.


%======================================================================
\section{PEP 218: Built-In Set Objects}

Two new built-in types, \function{set(iterable)} and
\function{frozenset(iterable)}, provide high-speed data types for
membership testing, for eliminating duplicates from sequences, and
for mathematical operations like unions, intersections, differences,
and symmetric differences.

\begin{verbatim}
>>> a = set('abracadabra')              # form a set from a string
>>> 'z' in a                            # fast membership testing
False
>>> a                                   # unique letters in a
set(['a', 'r', 'b', 'c', 'd'])
>>> ''.join(a)                          # convert back into a string
'arbcd'

>>> b = set('alacazam')                 # form a second set
>>> a - b                               # letters in a but not in b
set(['r', 'd', 'b'])
>>> a | b                               # letters in either a or b
set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])
>>> a & b                               # letters in both a and b
set(['a', 'c'])
>>> a ^ b                               # letters in a or b but not both
set(['r', 'd', 'b', 'm', 'z', 'l'])

>>> a.add('z')                          # add a new element
>>> a.update('wxy')                     # add multiple new elements
>>> a
set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'x', 'z'])
>>> a.remove('x')                       # take one element out
>>> a
set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'z'])
\end{verbatim}

The type \function{frozenset()} is an immutable version of \function{set()}.
Since it is immutable and hashable, it may be used as a dictionary key or
as a member of another set.  Accordingly, it does not have methods
like \method{add()} and \method{remove()} that could alter its contents.

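
As a minimal sketch of why hashability matters, the following script uses
\function{frozenset()} values as a dictionary key and as members of another
set.  The variable names and sample data are invented for illustration
and are not part of any API.

\begin{verbatim}
# Illustrative sketch; all names and data below are hypothetical.
# A frozenset is hashable, so it can serve as a dictionary key.
pair_scores = {}
pair_scores[frozenset(['alice', 'bob'])] = 3

# Element order does not matter when rebuilding the key.
lookup = pair_scores[frozenset(['bob', 'alice'])]

# Frozensets can also be members of an ordinary set.
groups = set()
groups.add(frozenset('ab'))
groups.add(frozenset('ba'))     # same contents, so no new member
group_count = len(groups)

# An ordinary set is mutable and unhashable, so it is rejected.
try:
    groups.add(set('cd'))
    rejected = False
except TypeError:
    rejected = True
\end{verbatim}

The lookup succeeds because two frozensets with the same elements hash
and compare equal, regardless of the order in which they were built.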
% XXX what happens to the sets module?

\begin{seealso}
\seepep{218}{Adding a Built-In Set Object Type}{Originally proposed by
Greg Wilson and ultimately implemented by Raymond Hettinger.}
\end{seealso}

%======================================================================
\section{PEP 237: Unifying Long Integers and Integers}

XXX write this.

%======================================================================
\section{PEP 322: Reverse Iteration}

A new built-in function, \function{reversed(seq)}, takes a sequence
and returns an iterator that yields the elements of the sequence
in reverse order.

\begin{verbatim}
>>> for i in reversed(xrange(1,4)):
...     print i
...
3
2
1
\end{verbatim}

Compared to extended slicing, \code{range(1,4)[::-1]}, \function{reversed()}
is easier to read, runs faster, and uses substantially less memory.

Note that \function{reversed()} only accepts sequences, not arbitrary
iterators.  If you want to reverse an iterator, first convert it to
a list with \function{list()}.

\begin{verbatim}
>>> input = open('/etc/passwd', 'r')
>>> for line in reversed(list(input)):
...     print line
...
root:*:0:0:System Administrator:/var/root:/bin/tcsh
  ...
\end{verbatim}

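
The same restriction applies to generators.  The sketch below, which
uses a made-up generator function purely for illustration, shows
\function{reversed()} rejecting an iterator and then accepting the
list built from it.

\begin{verbatim}
# Illustrative sketch; count_up() is a hypothetical generator.
def count_up(n):
    i = 1
    while i <= n:
        yield i
        i += 1

# A generator is an iterator, not a sequence, so reversed() refuses it.
try:
    reversed(count_up(3))
    accepted = True
except TypeError:
    accepted = False

# Materializing the values in a list first makes them reversible.
backwards = list(reversed(list(count_up(3))))
\end{verbatim}

Keep in mind that the \function{list()} call holds every value in
memory at once, so this approach suits iterators of modest size.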
\begin{seealso}
\seepep{322}{Reverse Iteration}{Written and implemented by Raymond Hettinger.}

\end{seealso}


%======================================================================
\section{Other Language Changes}

Here are all of the changes that Python 2.4 makes to the core Python
language.

\begin{itemize}

\item The string methods \method{ljust()}, \method{rjust()}, and
\method{center()} now take an optional argument for specifying a
fill character other than a space.

\item Strings also gained an \method{rsplit()} method that
works like the \method{split()} method but splits from the end of the string.

\begin{verbatim}
>>> 'a b c'.split(None, 1)
['a', 'b c']
>>> 'a b c'.rsplit(None, 1)
['a b', 'c']
\end{verbatim}

\item The \method{sort()} method of lists gained three keyword
arguments, \var{cmp}, \var{key}, and \var{reverse}.  These arguments
make some common usages of \method{sort()} simpler.  All are optional.

\var{cmp} is the same as the previous single argument to
\method{sort()}; if provided, the value should be a comparison
function that takes two arguments and returns -1, 0, or +1 depending
on how the arguments compare.

\var{key} should be a single-argument function that takes a list
element and returns a comparison key for the element.  The list is
then sorted using the comparison keys.  The following example sorts a
list case-insensitively:

\begin{verbatim}
>>> L = ['A', 'b', 'c', 'D']
>>> L.sort()                 # Case-sensitive sort
>>> L
['A', 'D', 'b', 'c']
>>> L.sort(key=lambda x: x.lower())
>>> L
['A', 'b', 'c', 'D']
>>> L.sort(cmp=lambda x,y: cmp(x.lower(), y.lower()))
>>> L
['A', 'b', 'c', 'D']
\end{verbatim}

The last example, which uses the \var{cmp} parameter, is the old way
to perform a case-insensitive sort.  It works, but is slower than
using a \var{key} parameter.  Using \var{key} results in calling the
\method{lower()} method once for each element in the list, while using
\var{cmp} will call the method twice for each comparison.

For simple key functions and comparison functions, it is often
possible to avoid a \keyword{lambda} expression by using an unbound
method instead.  For example, the above case-insensitive sort is best
coded as:

\begin{verbatim}
>>> L.sort(key=str.lower)
>>> L
['A', 'b', 'c', 'D']
\end{verbatim}

The \var{reverse} parameter should have a Boolean value.  If the value is
\constant{True}, the list will be sorted into reverse order.  Instead
of \code{L.sort(lambda x,y: cmp(y.score, x.score))}, you can now write
\code{L.sort(key=lambda x: x.score, reverse=True)}.

The results of sorting are now guaranteed to be stable.  This means
that two entries with equal keys will be returned in the same order as
they were input.  For example, you can sort a list of people by name,
and then sort the list by age, resulting in a list sorted by age where
people with the same age are in name-sorted order.

\item There is a new built-in function \function{sorted(iterable)} that works
like the in-place \method{list.sort()} method but has been made suitable
for use in expressions.  The differences are:
  \begin{itemize}
  \item the input may be any iterable;
  \item a newly formed copy is sorted, leaving the original intact; and
  \item the expression returns the new sorted copy.
  \end{itemize}

\begin{verbatim}
>>> L = [9,7,8,3,2,4,1,6,5]
>>> [10+i for i in sorted(L)]       # usable in a list comprehension
[11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> L                               # original is left unchanged
[9, 7, 8, 3, 2, 4, 1, 6, 5]

>>> sorted('Monte Python')          # any iterable may be an input
[' ', 'M', 'P', 'e', 'h', 'n', 'n', 'o', 'o', 't', 't', 'y']

>>> # List the contents of a dict sorted by key values
>>> colormap = dict(red=1, blue=2, green=3, black=4, yellow=5)
>>> for k, v in sorted(colormap.iteritems()):
...     print k, v
...
black 4
blue 2
green 3
red 1
yellow 5
\end{verbatim}

\item The \function{zip()} built-in function and \function{itertools.izip()}
  now return an empty list instead of raising a \exception{TypeError}
  exception if called with no arguments.  This makes the functions more
  suitable for use with variable-length argument lists:

\begin{verbatim}
>>> def transpose(array):
...     return zip(*array)
...
>>> transpose([(1,2,3), (4,5,6)])
[(1, 4), (2, 5), (3, 6)]
>>> transpose([])
[]
\end{verbatim}

\end{itemize}


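
The stability guarantee described in the \method{sort()} item above can
be sketched with two successive sorts.  This is only an illustration:
the records are invented, and \function{sorted()} with a \var{key}
argument is used so that the input list is left untouched.

\begin{verbatim}
# Illustrative sketch; the (name, age) records are made up.
people = [('Carol', 30), ('Alice', 25), ('Bob', 30), ('Dave', 25)]

# First pass: order by the secondary key (name).
by_name = sorted(people, key=lambda person: person[0])

# Second pass: order by the primary key (age).  Because the sort is
# stable, records with equal ages keep their name order from pass one.
by_age_then_name = sorted(by_name, key=lambda person: person[1])
\end{verbatim}

Sorting by the least significant key first and the most significant
key last is what makes the final ordering come out as intended.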
%======================================================================
\subsection{Optimizations}

\begin{itemize}

\item Optimizations should be described here.

\end{itemize}

The net result of the 2.4 optimizations is that Python 2.4 runs the
pystone benchmark around XX\% faster than Python 2.3 and YY\% faster
than Python 2.2.


%======================================================================
\section{New, Improved, and Deprecated Modules}

As usual, Python's standard library received a number of enhancements and
bug fixes.  Here's a partial list of the most notable changes, sorted
alphabetically by module name.  Consult the
\file{Misc/NEWS} file in the source tree for a more
complete list of changes, or look through the CVS logs for all the
details.

\begin{itemize}

\item The \module{curses} module now supports the ncurses extension
  \function{use_default_colors()}.  On platforms where the terminal
  supports transparency, this makes it possible to use a transparent background.
  (Contributed by J\"org Lehmann.)

\item The \module{heapq} module has been converted to C.  The resulting
  tenfold improvement in speed makes the module suitable for handling
  high volumes of data.

\item The \module{imaplib} module now supports IMAP's THREAD command.
(Contributed by Yves Dionne.)

\item The \module{itertools} module gained a
  \function{groupby(\var{iterable}\optional{, \var{func}})} function,
  inspired by the GROUP BY clause from SQL.
  \var{iterable} returns a succession of elements, and the optional
  \var{func} is a function that takes an element and returns a key
  value; if omitted, the key is simply the element itself.
  \function{groupby()} then groups consecutive elements into subsequences
  that have matching values of the key, and returns a series of 2-tuples
  containing the key value and an iterator over the subsequence.

Here's an example.  The \var{func} function simply returns whether a
number is even or odd, so the result of \function{groupby()} is to
return consecutive runs of odd or even numbers.

\begin{verbatim}
>>> import itertools
>>> L = [2,4,6, 7,8,9,11, 12, 14]
>>> for key_val, it in itertools.groupby(L, lambda x: x % 2):
...     print key_val, list(it)
...
0 [2, 4, 6]
1 [7]
0 [8]
1 [9, 11]
0 [12, 14]
\end{verbatim}

Like its SQL counterpart, \function{groupby()} is typically used with
sorted input.  The logic for \function{groupby()} is similar to the
\UNIX{} \code{uniq} filter, which makes it handy for eliminating,
counting, or identifying duplicate elements:

\begin{verbatim}
>>> word = 'abracadabra'
>>> letters = sorted(word)   # Turn string into sorted list of letters
>>> letters
['a', 'a', 'a', 'a', 'a', 'b', 'b', 'c', 'd', 'r', 'r']
>>> # List unique letters
>>> [k for k, g in itertools.groupby(letters)]
['a', 'b', 'c', 'd', 'r']
>>> # Count letter occurrences
>>> [(k, len(list(g))) for k, g in itertools.groupby(letters)]
[('a', 5), ('b', 2), ('c', 1), ('d', 1), ('r', 2)]
>>> # List duplicate letters
>>> [k for k, g in itertools.groupby(letters) if len(list(g)) > 1]
['a', 'b', 'r']
\end{verbatim}

\item \module{itertools} also gained a function named
\function{tee(\var{iterator}, \var{N})} that returns \var{N} independent
iterators that replicate \var{iterator}.  If \var{N} is omitted, the
default is 2.

\begin{verbatim}
>>> L = [1,2,3]
>>> i1, i2 = itertools.tee(L)
>>> i1,i2
(<itertools.tee object at 0x402c2080>, <itertools.tee object at 0x402c2090>)
>>> list(i1)
[1, 2, 3]
>>> list(i2)
[1, 2, 3]
\end{verbatim}

Note that \function{tee()} has to keep copies of the values returned
by the iterator; in the worst case it may need to keep all of them.
\function{tee()} should therefore be used carefully if \var{iterator}
returns a very large stream of results.

\item A new \function{getsid()} function was added to the
\module{posix} module that underlies the \module{os} module.
(Contributed by J. Raynor.)

\item The \module{operator} module gained two new functions,
\function{attrgetter(\var{attr})} and \function{itemgetter(\var{index})}.
Both functions return callables that take a single argument and return
the corresponding attribute or item; these callables are handy for use
with \function{map()} or \method{list.sort()}.  For example:

\begin{verbatim}
>>> import operator
>>> L = [('c', 2), ('d', 1), ('a', 4), ('b', 3)]
>>> map(operator.itemgetter(0), L)
['c', 'd', 'a', 'b']
>>> map(operator.itemgetter(1), L)
[2, 1, 4, 3]
>>> L.sort(key=operator.itemgetter(1))   # Sort list by second item in tuples
>>> L
[('d', 1), ('c', 2), ('b', 3), ('a', 4)]
\end{verbatim}

\item The \module{random} module has a new method called
  \method{getrandbits(N)} that returns an N-bit long integer.  This
  method supports the existing
  \method{randrange()} method, making it possible to efficiently generate
  arbitrarily large random numbers (suitable for prime number generation in
  RSA applications).

\item The regular expression language accepted by the \module{re} module
  was extended with simple conditional expressions, written as
  \code{(?(\var{group})\var{A}|\var{B})}.  \var{group} is either a
  numeric group ID or a group name defined with \code{(?P<group>...)}
  earlier in the expression.  If the specified group matched, the
  regular expression pattern \var{A} will be tested against the string; if
  the group didn't match, the pattern \var{B} will be used instead.

\end{itemize}


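
To make the conditional syntax of the \module{re} module more concrete,
here is a small sketch.  The pattern and the sample strings are
assumptions chosen for illustration: group 1 optionally matches an
opening angle bracket, and the conditional then demands a closing
bracket only when the opening one was present.

\begin{verbatim}
import re

# Illustrative sketch; the pattern and sample strings are made up.
# Group 1 optionally matches an opening '<'.  The conditional
# (?(1)>|$) requires a closing '>' if group 1 matched, and requires
# the end of the string otherwise.
address = re.compile(r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)')

samples = ['<user@host.com>', 'user@host.com',
           '<user@host.com', 'user@host.com>']
results = [address.match(s) is not None for s in samples]
\end{verbatim}

The first two samples match and the last two do not, because their
brackets are unbalanced.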
%======================================================================
% whole new modules get described in \subsections here


% ======================================================================
\section{Build and C API Changes}

Changes to Python's build process and to the C API include:

\begin{itemize}

  \item Three new convenience macros were added for common return
  values from extension functions: \csimplemacro{Py_RETURN_NONE},
  \csimplemacro{Py_RETURN_TRUE}, and \csimplemacro{Py_RETURN_FALSE}.

  \item A new function, \cfunction{PyTuple_Pack(N, obj1, obj2, ...,
  objN)}, constructs tuples from a variable-length argument list of
  Python objects.

  \item A new function, \cfunction{PyDict_Contains(d, k)}, implements
  fast dictionary lookups without masking exceptions raised during the
  lookup process.

\end{itemize}


%======================================================================
\subsection{Port-Specific Changes}

Platform-specific changes go here.


%======================================================================
\section{Other Changes and Fixes \label{section-other}}

As usual, there were a bunch of other improvements and bugfixes
scattered throughout the source tree.  A search through the CVS change
logs finds there were XXX patches applied and YYY bugs fixed between
Python 2.3 and 2.4.  Both figures are likely to be underestimates.

Some of the more notable changes are:

\begin{itemize}

\item Details go here.

\end{itemize}


%======================================================================
\section{Porting to Python 2.4}

This section lists previously described changes that may require
changes to your code:

\begin{itemize}

\item The \function{zip()} built-in function and \function{itertools.izip()}
  now return an empty list instead of raising a \exception{TypeError}
  exception if called with no arguments.

\item \function{dircache.listdir()} now passes exceptions to the caller
  instead of returning empty lists.

\end{itemize}


%======================================================================
\section{Acknowledgements \label{acks}}

The author would like to thank the following people for offering
suggestions, corrections and assistance with various drafts of this
article: Raymond Hettinger.

\end{document}