\documentclass{howto}
\usepackage{distutils}
% $Id$

% Don't write extensive text for new sections; I'll do that.
% Feel free to add commented-out reminders of things that need
% to be covered. --amk

% XXX pydoc can display links to module docs -- but when?
%

\title{What's New in Python 2.4}
\release{0.2}
\author{A.M.\ Kuchling}
\authoraddress{
        \strong{Python Software Foundation}\\
        Email: \email{amk@amk.ca}
}

\begin{document}
\maketitle
\tableofcontents

This article explains the new features in Python 2.4 alpha2, scheduled
for release in late July 2004. The final version of Python 2.4 is
expected to be released around September 2004.

Python 2.4 is a medium-sized release. It doesn't introduce as many
changes as the radical Python 2.2, but introduces more features than
the conservative 2.3 release did. The most significant new language
feature (as of this writing) is the addition of generator expressions;
most other changes are to the standard library.

This article doesn't attempt to provide a complete specification of
every single new feature, but instead provides a convenient overview.
For full details, you should refer to the documentation for Python
2.4, such as the \citetitle[../lib/lib.html]{Python Library Reference}
and the \citetitle[../ref/ref.html]{Python Reference Manual}. If you
want to understand the complete implementation and design rationale,
refer to the PEP for a particular new feature or to the module
documentation.


%======================================================================
\section{PEP 218: Built-In Set Objects}

Python 2.3 introduced the \module{sets} module. C implementations of
set data types have now been added to the Python core as two new
built-in types, \function{set(\var{iterable})} and
\function{frozenset(\var{iterable})}. They provide high-speed
operations for membership testing, for eliminating duplicates from
sequences, and for mathematical operations like unions, intersections,
differences, and symmetric differences.

\begin{verbatim}
>>> a = set('abracadabra')              # form a set from a string
>>> 'z' in a                            # fast membership testing
False
>>> a                                   # unique letters in a
set(['a', 'r', 'b', 'c', 'd'])
>>> ''.join(a)                          # convert back into a string
'arbcd'

>>> b = set('alacazam')                 # form a second set
>>> a - b                               # letters in a but not in b
set(['r', 'd', 'b'])
>>> a | b                               # letters in either a or b
set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])
>>> a & b                               # letters in both a and b
set(['a', 'c'])
>>> a ^ b                               # letters in a or b but not both
set(['r', 'd', 'b', 'm', 'z', 'l'])

>>> a.add('z')                          # add a new element
>>> a.update('wxy')                     # add multiple new elements
>>> a
set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'x', 'z'])
>>> a.remove('x')                       # take one element out
>>> a
set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'z'])
\end{verbatim}

The \function{frozenset} type is an immutable version of \function{set}.
Since it is immutable and hashable, it may be used as a dictionary key or
as a member of another set.
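
As a minimal sketch of what hashability allows, the following uses a
\function{frozenset} as a dictionary key and as a set member. The
variable names (\var{vowels}, \var{labels}, \var{groups}) are
illustrative only.

\begin{verbatim}
# frozenset is hashable, so it can be a dictionary key.
vowels = frozenset('aeiou')
labels = {vowels: 'vowels'}

# Equal frozensets hash alike, so a fresh one finds the same entry.
found = labels[frozenset('uoiea')]

# frozensets can also be members of another set.
groups = set([vowels, frozenset('xyz')])
is_member = frozenset('zyx') in groups

# A mutable set is unhashable and is rejected as a key.
try:
    {set('aeiou'): 'vowels'}
except TypeError:
    mutable_rejected = True
else:
    mutable_rejected = False
\end{verbatim}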

The \module{sets} module remains in the standard library, and may be
useful if you wish to subclass the \class{Set} or \class{ImmutableSet}
classes. There are currently no plans to deprecate the module.

\begin{seealso}
\seepep{218}{Adding a Built-In Set Object Type}{Originally proposed by
Greg Wilson and ultimately implemented by Raymond Hettinger.}
\end{seealso}

%======================================================================
\section{PEP 237: Unifying Long Integers and Integers}

The lengthy transition process for this PEP, begun in Python 2.2,
takes another step forward in Python 2.4. In 2.3, certain integer
operations that would behave differently after int/long unification
triggered \exception{FutureWarning} warnings and returned values
limited to 32 or 64 bits (depending on your platform). In 2.4, these
expressions no longer produce a warning and instead produce a
different result that's usually a long integer.

The problematic expressions are primarily left shifts and lengthy
hexadecimal and octal constants. For example,
\code{2 \textless{}\textless{} 32} results
in a warning in 2.3, evaluating to 0 on 32-bit platforms. In Python
2.4, this expression now returns the correct answer, 8589934592.
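
The new behaviour can be checked with a short sketch; the variable
names are illustrative, and the values in the comments are the ones
Python 2.4 and later versions produce.

\begin{verbatim}
# A left shift is no longer truncated to the platform word size.
shifted = 2 << 32            # 8589934592

# A hexadecimal constant with the top bit set is a positive number.
big_hex = 0xFFFFFFFF         # 4294967295
\end{verbatim}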

\begin{seealso}
\seepep{237}{Unifying Long Integers and Integers}{Original PEP
written by Moshe Zadka and GvR. The changes for 2.4 were implemented by
Kalle Svensson.}
\end{seealso}

%======================================================================
\section{PEP 289: Generator Expressions}

The iterator feature introduced in Python 2.2 makes it easier to write
programs that loop through large data sets without having the entire
data set in memory at one time. Programmers can use iterators and the
\module{itertools} module to write code in a fairly functional style.

List comprehensions don't fit into this picture very well because they
produce a Python list object containing all of the items, unavoidably
pulling them all into memory. When trying to write a
functionally-styled program, it would be natural to write something
like:

\begin{verbatim}
links = [link for link in get_all_links() if not link.followed]
for link in links:
    ...
\end{verbatim}

instead of

\begin{verbatim}
for link in get_all_links():
    if link.followed:
        continue
    ...
\end{verbatim}

The first form is more concise and perhaps more readable, but if
you're dealing with a large number of link objects you'd have to write
the second form to avoid having all of them in memory at the same time.

Generator expressions work similarly to list comprehensions but don't
materialize the entire list; instead they create a generator that will
return elements one by one. The above example could be written as:

\begin{verbatim}
links = (link for link in get_all_links() if not link.followed)
for link in links:
    ...
\end{verbatim}

Generator expressions always have to be written inside parentheses, as
in the above example. The parentheses signalling a function call also
count, so if you want to create an iterator that will be immediately
passed to a function you could write:

\begin{verbatim}
print sum(obj.count for obj in list_all_objects())
\end{verbatim}

Generator expressions differ from list comprehensions in various small
ways. Most notably, the loop variable (\var{obj} in the above
example) is not accessible outside of the generator expression. List
comprehensions leave the variable assigned to its last value; future
versions of Python will change this, making list comprehensions match
generator expressions in this respect.
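
The scoping rule can be seen in a small self-contained sketch. The
names \var{values}, \var{item} and \var{leaked} are illustrative
only.

\begin{verbatim}
values = [3, 1, 4, 1, 5]

# sum() consumes the generator expression lazily; no list is built.
total = sum(item * item for item in values)

# The loop variable 'item' was never bound in the enclosing scope.
try:
    item
except NameError:
    leaked = False
else:
    leaked = True
\end{verbatim}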

\begin{seealso}
\seepep{289}{Generator Expressions}{Proposed by Raymond Hettinger and
implemented by Jiwon Seo with early efforts steered by Hye-Shik Chang.}
\end{seealso}

%======================================================================
\section{PEP 318: Decorators for Functions, Methods and Classes}

Python 2.2 extended Python's object model by adding static methods and
class methods, but it didn't extend Python's syntax to provide any new
way of defining static or class methods. Instead, you had to write a
\keyword{def} statement in the usual way, and pass the resulting
method to a \function{staticmethod()} or \function{classmethod()}
function that would wrap up the function as a method of the new type.
Your code would look like this:

\begin{verbatim}
class C:
    def meth (cls):
        ...

    meth = classmethod(meth)   # Rebind name to wrapped-up class method
\end{verbatim}

If the method was very long, it would be easy to miss or forget the
\function{classmethod()} invocation after the function body.

The intention was always to add some syntax to make such definitions
more readable, but at the time of 2.2's release a good syntax was not
obvious. Years later, when Python 2.4 is coming out, a good syntax
\emph{still} isn't obvious but users are asking for easier access to
the feature, so a new syntactic feature has been added.

The feature is called ``function decorators''. The name comes from
the idea that \function{classmethod}, \function{staticmethod}, and
friends are storing additional information on a function object; they're
\emph{decorating} functions with more details.

The notation borrows from Java and uses the \samp{@} character as an
indicator. Using the new syntax, the example above would be written:

\begin{verbatim}
class C:

    @classmethod
    def meth (cls):
        ...

\end{verbatim}

The \code{@classmethod} is shorthand for the
\code{meth=classmethod(meth)} assignment. More generally, if you have
the following:

\begin{verbatim}
@A @B @C
def f ():
    ...
\end{verbatim}

It's equivalent to:

\begin{verbatim}
def f(): ...
f = C(B(A(f)))
\end{verbatim}

Decorators must come on the line before a function definition, and
can't be on the same line, meaning that \code{@A def f(): ...} is
illegal. You can only decorate function definitions, either at the
module-level or inside a class; you can't decorate class definitions.

A decorator is just a function that takes the function to be decorated
as an argument and returns either the same function or some new
callable thing. It's easy to write your own decorators. The
following simple example just sets an attribute on the function
object:

\begin{verbatim}
>>> def deco(func):
...    func.attr = 'decorated'
...    return func
...
>>> @deco
... def f(): pass
...
>>> f
<function f at 0x402ef0d4>
>>> f.attr
'decorated'
>>>
\end{verbatim}

As a slightly more realistic example, the following decorator checks
that the supplied argument is an integer:

\begin{verbatim}
def require_int (func):
    def wrapper (arg):
        assert isinstance(arg, int)
        return func(arg)

    return wrapper

@require_int
def p1 (arg):
    print arg

@require_int
def p2(arg):
    print arg*2
\end{verbatim}

An example in \pep{318} contains a fancier version of this idea that
lets you specify the required type and check the returned type as
well.

Decorator functions can take arguments. If arguments are supplied,
the decorator function is called with only those arguments and must
return a new decorator function; this new function must take a single
function and return a function, as previously described. In other
words, \code{@A @B @C(args)} becomes:

\begin{verbatim}
def f(): ...
_deco = C(args)
f = _deco(B(A(f)))
\end{verbatim}

Getting this right can be slightly brain-bending, but it's not too
difficult.
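
As a sketch of a decorator that takes an argument, the
\function{require_int()} example above can be generalized to any
type. \function{require_type()} and \function{double()} are
hypothetical names chosen for this illustration; this is not the
example from \pep{318}.

\begin{verbatim}
def require_type(expected):
    # Called first, with the decorator's own arguments;
    # must return the function that does the decorating.
    def decorator(func):
        def wrapper(arg):
            assert isinstance(arg, expected)
            return func(arg)
        return wrapper
    return decorator

@require_type(int)
def double(arg):
    return arg * 2
\end{verbatim}

Three levels of nesting are needed: the outer function receives the
decorator's arguments, the middle one receives the function being
decorated, and the innermost one replaces that function.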

The new syntax was provisionally added in 2.4alpha2, and is subject to
change during the 2.4alpha release cycle depending on the Python
community's reaction. Post-2.4 versions of Python will preserve
compatibility with whatever syntax is used in 2.4final.

\begin{seealso}
\seepep{318}{Decorators for Functions, Methods and Classes}{Written
by Kevin D. Smith, Jim Jewett, and Skip Montanaro. Several people
wrote patches implementing function decorators, but the one that was
actually checked in was patch \#979728, written by Mark Russell.}
\end{seealso}

%======================================================================
\section{PEP 322: Reverse Iteration}

A new built-in function, \function{reversed(\var{seq})}, takes a sequence
and returns an iterator that loops over the elements of the sequence
in reverse order.

\begin{verbatim}
>>> for i in reversed(xrange(1,4)):
...    print i
...
3
2
1
\end{verbatim}

Compared to extended slicing, such as \code{range(1,4)[::-1]},
\function{reversed()} is easier to read, runs faster, and uses
substantially less memory.

Note that \function{reversed()} only accepts sequences, not arbitrary
iterators. If you want to reverse an iterator, first convert it to
a list with \function{list()}.

\begin{verbatim}
>>> input = open('/etc/passwd', 'r')
>>> for line in reversed(list(input)):
...   print line
...
root:*:0:0:System Administrator:/var/root:/bin/tcsh
  ...
\end{verbatim}
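
The same restriction can be shown without reading a file. In this
sketch the names \var{words} and \var{shouting} are illustrative only.

\begin{verbatim}
words = ['alpha', 'beta', 'gamma']

# A list is a sequence, so reversed() accepts it directly.
backwards = list(reversed(words))

# A generator is an iterator, not a sequence, and is rejected.
shouting = (word.upper() for word in words)
try:
    reversed(shouting)
except TypeError:
    rejected = True
else:
    rejected = False

# Converting the iterator to a list first makes it reversible.
shouting_backwards = list(reversed(list(shouting)))
\end{verbatim}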

\begin{seealso}
\seepep{322}{Reverse Iteration}{Written and implemented by Raymond Hettinger.}

\end{seealso}


%======================================================================
\section{PEP 327: Decimal Data Type}

Python has always supported floating-point (FP) numbers as a data
type, based on the underlying C \ctype{double} type. However, while
most programming languages provide a floating-point type, most people
(even programmers) are unaware that computing with floating-point
numbers entails certain unavoidable inaccuracies. The new decimal
type avoids many of these inaccuracies by representing decimal
fractions exactly, up to a user-specified precision limit.

\subsection{Why is Decimal needed?}

The limitations arise from the representation used for floating-point numbers.
FP numbers are made up of three components:

\begin{itemize}
\item The sign, which is -1 or +1.
\item The mantissa, which is a single-digit binary number
followed by a fractional part. For example, \code{1.01} in base-2 notation
is \code{1 + 0/2 + 1/4}, or 1.25 in decimal notation.
\item The exponent, which tells where the decimal point is located in the number represented.
\end{itemize}

For example, the number 1.25 has sign +1, mantissa 1.01 (in binary),
and exponent of 0 (the decimal point doesn't need to be shifted). The
number 5 has the same sign and mantissa, but the exponent is 2
because the mantissa is multiplied by 4 (2 to the power of the exponent 2).
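
The decomposition can be reproduced with \function{math.frexp()}, as
in the following sketch. \function{sign_mantissa_exponent()} is a
hypothetical helper written for this illustration; it rescales the
result of \function{frexp()}, which uses a mantissa between 0.5 and 1,
to the convention used above.

\begin{verbatim}
import math

def sign_mantissa_exponent(value):
    """Return (sign, mantissa, exponent) with 1 <= mantissa < 2.

    Illustrative helper; assumes a nonzero, finite float.
    """
    if value < 0:
        sign = -1
    else:
        sign = 1
    # frexp() gives a mantissa m with 0.5 <= m < 1, so rescale it.
    m, e = math.frexp(abs(value))
    return sign, m * 2, e - 1

parts_1_25 = sign_mantissa_exponent(1.25)   # (1, 1.25, 0)
parts_5 = sign_mantissa_exponent(5.0)       # (1, 1.25, 2)
\end{verbatim}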

Modern systems usually provide floating-point support that conforms to
a relevant standard called IEEE 754. C's \ctype{double} type is
usually implemented as a 64-bit IEEE 754 number, which uses 52 bits of
space for the mantissa. This means that numbers can only be specified
to 52 bits of precision. If you're trying to represent numbers whose
expansion repeats endlessly, the expansion is cut off after 52 bits.
Unfortunately, most software needs to produce output in base 10, and
common base-10 fractions are often repeating fractions in binary. For
example, 1.1 decimal is binary \code{1.0001100110011 ...}; .1 = 1/16 +
1/32 + 1/256 plus an infinite number of additional terms. IEEE 754 has
to chop off that infinitely repeating expansion after 52 digits, so the
representation is slightly inaccurate.
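
The repeating pattern can be checked with plain integer arithmetic.
The following is a minimal sketch;
\function{binary_fraction_digits()} is a hypothetical helper written
for this illustration, performing long division in base 2.

\begin{verbatim}
def binary_fraction_digits(numerator, denominator, count):
    """Return the first count binary digits of numerator/denominator.

    Illustrative helper; assumes 0 <= numerator < denominator.
    """
    digits = []
    remainder = numerator
    for i in range(count):
        remainder = remainder * 2
        digits.append(str(remainder // denominator))
        remainder = remainder % denominator
    return ''.join(digits)

# Digits after the binary point of 1/10: the group 0011 repeats.
tenth = binary_fraction_digits(1, 10, 12)    # '000110011001'
\end{verbatim}

The ones fall in positions 4, 5, 8 and 9, matching the terms 1/16,
1/32 and 1/256 (and then 1/512) given above.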

Sometimes you can see this inaccuracy when the number is printed:
\begin{verbatim}
>>> 1.1
1.1000000000000001
\end{verbatim}

The inaccuracy isn't always visible when you print the number because
the FP-to-decimal-string conversion is provided by the C library and
most C libraries try to produce sensible output, but the inaccuracy is
still there and subsequent operations can magnify the error.
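
A short sketch of how repeated operations expose the error; the
variable names are illustrative only.

\begin{verbatim}
total = 0.0
for i in range(10):
    total = total + 0.1

# Ten small representation errors add up, so the sum is not 1.0.
exact = (total == 1.0)

# The error is tiny, which is why rounded output usually hides it.
error = abs(total - 1.0)
close_enough = error < 1e-9
\end{verbatim}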

For many applications this doesn't matter. If I'm plotting points and
displaying them on my monitor, the difference between 1.1 and
1.1000000000000001 is too small to be visible. Reports often limit
output to a certain number of decimal places, and if you round the
number to two or three or even eight decimal places, the error is
never apparent. However, for applications where it does matter,
it's a lot of work to implement your own custom arithmetic routines.

\subsection{The \class{Decimal} type}

A new module, \module{decimal}, was added to Python's standard library.
It contains two classes, \class{Decimal} and \class{Context}.
\class{Decimal} instances represent numbers, and
\class{Context} instances are used to wrap up various settings such as
the precision and default rounding mode.

\class{Decimal} instances, like regular Python integers and FP numbers,
are immutable; once one has been created, you can't change the value it
represents.
\class{Decimal} instances can be created from integers or strings:

\begin{verbatim}
>>> import decimal
>>> decimal.Decimal(1972)
Decimal("1972")
>>> decimal.Decimal("1.1")
Decimal("1.1")
\end{verbatim}

You can also provide tuples containing the sign, mantissa represented
as a tuple of decimal digits, and exponent:

\begin{verbatim}
>>> decimal.Decimal((1, (1, 4, 7, 5), -2))
Decimal("-14.75")
\end{verbatim}

Cautionary note: the sign bit is a Boolean value, so 0 is positive and 1 is negative.
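
To make the sign convention concrete, the following sketch builds the
same digits with both sign values and reads the tuple back with the
\method{as_tuple()} method. The variable names are illustrative only.

\begin{verbatim}
import decimal

# Sign 0 means positive, sign 1 means negative.
positive = decimal.Decimal((0, (1, 4, 7, 5), -2))
negative = decimal.Decimal((1, (1, 4, 7, 5), -2))

# as_tuple() recovers the (sign, digits, exponent) triple.
parts = negative.as_tuple()
\end{verbatim}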

Floating-point numbers posed a bit of a problem: should the FP number
representing 1.1 turn into the decimal number for exactly 1.1, or for
1.1 plus whatever inaccuracies are introduced? The decision was to
leave such a conversion out of the API. Instead, you should convert
the floating-point number into a string using the desired precision and
pass the string to the \class{Decimal} constructor:

\begin{verbatim}
>>> f = 1.1
>>> decimal.Decimal(str(f))
Decimal("1.1")
>>> decimal.Decimal('%.12f' % f)
Decimal("1.100000000000")
\end{verbatim}

Once you have \class{Decimal} instances, you can perform the usual
mathematical operations on them. One limitation: exponentiation
requires an integer exponent:

\begin{verbatim}
>>> a = decimal.Decimal('35.72')
>>> b = decimal.Decimal('1.73')
>>> a+b
Decimal("37.45")
>>> a-b
Decimal("33.99")
>>> a*b
Decimal("61.7956")
>>> a/b
Decimal("20.64739884393063583815028902")
>>> a ** 2
Decimal("1275.9184")
>>> a**b
Traceback (most recent call last):
  ...
decimal.InvalidOperation: x ** (non-integer)
\end{verbatim}

You can combine \class{Decimal} instances with integers, but not with
floating-point numbers:

\begin{verbatim}
>>> a + 4
Decimal("39.72")
>>> a + 4.5
Traceback (most recent call last):
  ...
TypeError: You can interact Decimal only with int, long or Decimal data types.
>>>
\end{verbatim}

\class{Decimal} numbers can be used with the \module{math} and
\module{cmath} modules, but note that they'll be immediately converted to
floating-point numbers before the operation is performed, resulting in
a possible loss of precision and accuracy. You'll also get back a
regular floating-point number and not a \class{Decimal}.

\begin{verbatim}
>>> import math, cmath
>>> d = decimal.Decimal('123456789012.345')
>>> math.sqrt(d)
351364.18288201344
>>> cmath.sqrt(-d)
351364.18288201344j
\end{verbatim}

Instances also have a \method{sqrt()} method that returns a
\class{Decimal}, but if you need other things such as trigonometric
functions you'll have to implement them.

\begin{verbatim}
>>> d.sqrt()
Decimal("351364.1828820134592177245001")
\end{verbatim}


\subsection{The \class{Context} type}

Instances of the \class{Context} class encapsulate several settings for
decimal operations:

\begin{itemize}
 \item \member{prec} is the precision, the number of significant
 decimal digits kept in results.
 \item \member{rounding} specifies the rounding mode. The \module{decimal}
 module has constants for the various possibilities:
 \constant{ROUND_DOWN}, \constant{ROUND_CEILING}, \constant{ROUND_HALF_EVEN}, and various others.
 \item \member{traps} is a dictionary specifying what happens on
encountering certain error conditions: either an exception is raised or
a value is returned. Some examples of error conditions are
division by zero, loss of precision, and overflow.
\end{itemize}
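
The effect of the rounding mode can be sketched with the
\method{quantize()} method of \class{Decimal}, which is not described
above; it rounds a value to the exponent of its first argument. The
variable names are illustrative only.

\begin{verbatim}
import decimal

value = decimal.Decimal('0.125')
cent = decimal.Decimal('0.01')      # round to two decimal places

# Ties go to the neighbour whose last digit is even.
half_even = value.quantize(cent, rounding=decimal.ROUND_HALF_EVEN)
# Ties go away from zero.
half_up = value.quantize(cent, rounding=decimal.ROUND_HALF_UP)
# Always truncate toward zero.
down = value.quantize(cent, rounding=decimal.ROUND_DOWN)
# Always round toward positive infinity.
ceiling = value.quantize(cent, rounding=decimal.ROUND_CEILING)
\end{verbatim}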

There's a thread-local default context available by calling
\function{getcontext()}; you can change the properties of this context
to alter the default precision, rounding, or trap handling.

\begin{verbatim}
>>> decimal.getcontext().prec
28
>>> decimal.Decimal(1) / decimal.Decimal(7)
Decimal("0.1428571428571428571428571429")
>>> decimal.getcontext().prec = 9
>>> decimal.Decimal(1) / decimal.Decimal(7)
Decimal("0.142857143")
\end{verbatim}
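
If changing the shared default is undesirable, one possible approach
is to create a separate \class{Context} and call its arithmetic
methods directly. The sketch below assumes the \method{divide()}
method of \class{Context}; the variable names are illustrative only.

\begin{verbatim}
import decimal

one = decimal.Decimal(1)
seven = decimal.Decimal(7)

# Remember the thread's default precision before doing anything.
default_prec = decimal.getcontext().prec

# A private context with six significant digits.
ctx = decimal.Context(prec=6)
short = ctx.divide(one, seven)

# The thread's default context was not modified.
unchanged = (decimal.getcontext().prec == default_prec)
\end{verbatim}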

The default action for error conditions is selectable; the module can
either return a special value such as infinity or not-a-number, or
exceptions can be raised:

\begin{verbatim}
>>> decimal.Decimal(1) / decimal.Decimal(0)
Traceback (most recent call last):
  ...
decimal.DivisionByZero: x / 0
>>> decimal.getcontext().traps[decimal.DivisionByZero] = False
>>> decimal.Decimal(1) / decimal.Decimal(0)
Decimal("Infinity")
>>>
\end{verbatim}

The \class{Context} instance also has various methods for formatting
numbers such as \method{to_eng_string()} and \method{to_sci_string()}.

For more information, see the documentation for the \module{decimal}
module, which includes a quick-start tutorial and a reference.

\begin{seealso}
\seepep{327}{Decimal Data Type}{Written by Facundo Batista and implemented
 by Facundo Batista, Eric Price, Raymond Hettinger, Aahz, and Tim Peters.}

\seeurl{http://research.microsoft.com/\textasciitilde hollasch/cgindex/coding/ieeefloat.html}
{A more detailed overview of the IEEE-754 representation.}

\seeurl{http://www.lahey.com/float.htm}
{The article uses Fortran code to illustrate many of the problems
that floating-point inaccuracy can cause.}

\seeurl{http://www2.hursley.ibm.com/decimal/}
{A description of a decimal-based representation. This representation
is being proposed as a standard, and underlies the new Python decimal
type. Much of this material was written by Mike Cowlishaw, designer of the
Rexx language.}

\end{seealso}


%======================================================================
\section{PEP 331: Locale-Independent Float/String Conversions}

The \module{locale} module lets Python software select various
conversions and display conventions that are localized to a particular
country or language. However, the module was careful to not change
the numeric locale because various functions in Python's
implementation required that the numeric locale remain set to the
\code{'C'} locale. Often this was because the code was using the C library's
\cfunction{atof()} function.

Not setting the numeric locale caused trouble for extensions that used
third-party C libraries, however, because they wouldn't have the
correct locale set. The motivating example was GTK+, whose user
interface widgets weren't displaying numbers in the current locale.

The solution described in the PEP is to add three new functions to the
Python API that perform ASCII-only conversions, ignoring the locale
setting:

\begin{itemize}
 \item \cfunction{PyOS_ascii_strtod(\var{str}, \var{ptr})}
and \cfunction{PyOS_ascii_atof(\var{str})}
both convert a string to a C \ctype{double}.
 \item \cfunction{PyOS_ascii_formatd(\var{buffer}, \var{buf_len}, \var{format}, \var{d})} converts a \ctype{double} to an ASCII string.
\end{itemize}

The code for these functions came from the GLib library
(\url{http://developer.gnome.org/arch/gtk/glib.html}), whose
developers kindly relicensed the relevant functions and donated them
to the Python Software Foundation. The \module{locale} module
can now change the numeric locale, letting extensions such as GTK+
produce the correct results.

\begin{seealso}
\seepep{331}{Locale-Independent Float/String Conversions}{Written by Christian R. Reis, and implemented by Gustavo Carneiro.}
\end{seealso}

%======================================================================
Fred Drakeed0fa3d2003-07-30 19:14:09 +0000636\section{Other Language Changes}
637
638Here are all of the changes that Python 2.4 makes to the core Python
639language.
640
641\begin{itemize}
Raymond Hettingerd4462302003-11-26 17:52:45 +0000642
Raymond Hettinger31017ae2004-03-04 08:25:44 +0000643\item The \method{dict.update()} method now accepts the same
644argument forms as the \class{dict} constructor. This includes any
Andrew M. Kuchlingd0b6d9d2004-07-04 15:35:00 +0000645mapping, any iterable of key/value pairs, and keyword arguments.
Raymond Hettinger31017ae2004-03-04 08:25:44 +0000646
\item The string methods \method{ljust()}, \method{rjust()}, and
\method{center()} now take an optional argument for specifying a
fill character other than a space.

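A short sketch of the new fill-character argument; the strings and
widths are arbitrary examples:

```python
left = 'py'.ljust(6, '.')           # 'py....'
right = '42'.rjust(6, '0')          # '000042'
middle = 'title'.center(11, '*')    # '***title***'
```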
\item Strings also gained an \method{rsplit()} method that
works like the \method{split()} method but splits from the end of
the string.

\begin{verbatim}
>>> 'www.python.org'.split('.', 1)
['www', 'python.org']
>>> 'www.python.org'.rsplit('.', 1)
['www.python', 'org']
\end{verbatim}

\item The \method{sort()} method of lists gained three keyword
arguments: \var{cmp}, \var{key}, and \var{reverse}. These arguments
make some common usages of \method{sort()} simpler. All are optional.

\var{cmp} is the same as the previous single argument to
\method{sort()}; if provided, the value should be a comparison
function that takes two arguments and returns -1, 0, or +1 depending
on how the arguments compare.

\var{key} should be a single-argument function that takes a list
element and returns a comparison key for the element. The list is
then sorted using the comparison keys. The following example sorts a
list case-insensitively:

\begin{verbatim}
>>> L = ['A', 'b', 'c', 'D']
>>> L.sort()                 # Case-sensitive sort
>>> L
['A', 'D', 'b', 'c']
>>> L.sort(key=lambda x: x.lower())
>>> L
['A', 'b', 'c', 'D']
>>> L.sort(cmp=lambda x,y: cmp(x.lower(), y.lower()))
>>> L
['A', 'b', 'c', 'D']
\end{verbatim}

The last example, which uses the \var{cmp} parameter, is the old way
to perform a case-insensitive sort. It works but is slower than
using a \var{key} parameter. Using \var{key} results in calling the
\method{lower()} method once for each element in the list, while using
\var{cmp} will call it twice for each comparison.

For simple key functions and comparison functions, it is often
possible to avoid a \keyword{lambda} expression by using an unbound
method instead. For example, the above case-insensitive sort is best
coded as:

\begin{verbatim}
>>> L.sort(key=str.lower)
>>> L
['A', 'b', 'c', 'D']
\end{verbatim}

The \var{reverse} parameter should have a Boolean value. If the value
is \constant{True}, the list will be sorted into reverse order.
Instead of \code{L.sort(lambda x,y: cmp(x.score, y.score));
L.reverse()}, you can now write
\code{L.sort(key=lambda x: x.score, reverse=True)}.

The results of sorting are now guaranteed to be stable. This means
that two entries with equal keys will be returned in the same order as
they were input. For example, you can sort a list of people by name,
and then sort the list by age, resulting in a list sorted by age where
people with the same age are in name-sorted order.

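The two-pass idiom described above can be sketched as follows. The
\code{people} records are hypothetical \code{(name, age)} tuples
invented for the illustration:

```python
# Hypothetical (name, age) records.
people = [('Carol', 30), ('alice', 25), ('Bob', 30), ('dave', 25)]

people.sort(key=lambda person: person[0].lower())   # first pass: by name
people.sort(key=lambda person: person[1])           # second pass: by age

# Stability keeps the name order inside each age group.
by_age = list(people)

# reverse=True also leaves equal elements in their existing order.
people.sort(key=lambda person: person[1], reverse=True)
oldest_first = list(people)
```

After the second pass the 25-year-olds come first, still in name order;
the reversed sort moves the 30-year-olds to the front without
disturbing the order within either group.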
\item There is a new built-in function
\function{sorted(\var{iterable})} that works like the in-place
\method{list.sort()} method but can be used in
expressions. The differences are:
  \begin{itemize}
  \item the input may be any iterable;
  \item a newly formed copy is sorted, leaving the original intact; and
  \item the expression returns the new sorted copy.
  \end{itemize}

\begin{verbatim}
>>> L = [9,7,8,3,2,4,1,6,5]
>>> [10+i for i in sorted(L)]       # usable in a list comprehension
[11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> L                               # original is left unchanged
[9, 7, 8, 3, 2, 4, 1, 6, 5]
>>> sorted('Monty Python')          # any iterable may be an input
[' ', 'M', 'P', 'h', 'n', 'n', 'o', 'o', 't', 't', 'y', 'y']

>>> # List the contents of a dict sorted by key values
>>> colormap = dict(red=1, blue=2, green=3, black=4, yellow=5)
>>> for k, v in sorted(colormap.iteritems()):
...     print k, v
...
black 4
blue 2
green 3
red 1
yellow 5
\end{verbatim}

\item The \function{eval(\var{expr}, \var{globals}, \var{locals})}
and \function{execfile(\var{filename}, \var{globals}, \var{locals})}
functions and the \keyword{exec} statement now accept any mapping type
for the \var{locals} argument. Previously this had to be a regular
Python dictionary. (Contributed by Raymond Hettinger.)

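A minimal sketch of what a non-dictionary \var{locals} makes possible.
The \class{DefaultingNamespace} class is hypothetical, written only for
this illustration; it supplies a value of 0 for any name it doesn't
know:

```python
class DefaultingNamespace(object):
    """Hypothetical read-only mapping: unknown names evaluate to 0."""

    def __init__(self, **known):
        self.known = known

    def __getitem__(self, name):
        # Called by eval() for every name lookup in the expression.
        return self.known.get(name, 0)

namespace = DefaultingNamespace(a=1, b=3)

# 'missing' is not defined anywhere, so the mapping supplies 0.
result = eval('a + b * 2 + missing', {}, namespace)   # 1 + 6 + 0
```

Only \var{locals} is relaxed in this way; the \var{globals} argument
here is still an ordinary dictionary.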
\item The \function{zip()} built-in function and \function{itertools.izip()}
 now return an empty list if called with no arguments.
 Previously they raised a \exception{TypeError}
 exception. This makes them more
 suitable for use with variable length argument lists:

\begin{verbatim}
>>> def transpose(array):
...     return zip(*array)
...
>>> transpose([(1,2,3), (4,5,6)])
[(1, 4), (2, 5), (3, 6)]
>>> transpose([])
[]
\end{verbatim}

\item Encountering a failure while importing a module no longer leaves
a partially-initialized module object in \code{sys.modules}. The
incomplete module object left behind would fool further imports of the
same module into succeeding, leading to confusing errors.

\item \constant{None} is now a constant; code that binds a new value to
the name \samp{None} is now a syntax error.

\end{itemize}


%======================================================================
\subsection{Optimizations}

\begin{itemize}

\item The inner loops for list and tuple slicing
 were optimized and now run about one-third faster. The inner loops
 were also optimized for dictionaries, resulting in performance boosts for
 \method{keys()}, \method{values()}, \method{items()},
 \method{iterkeys()}, \method{itervalues()}, and \method{iteritems()}.

\item The machinery for growing and shrinking lists was optimized for
 speed and for space efficiency. Appending and popping from lists now
 runs faster due to more efficient code paths and less frequent use of
 the underlying system \cfunction{realloc()}. List comprehensions
 also benefit. \method{list.extend()} was also optimized and no
 longer converts its argument into a temporary list before extending
 the base list.

\item \function{list()}, \function{tuple()}, \function{map()},
 \function{filter()}, and \function{zip()} now run several times
 faster with non-sequence arguments that supply a \method{__len__()}
 method.

\item The methods \method{list.__getitem__()},
 \method{dict.__getitem__()}, and \method{dict.__contains__()}
 are now implemented as \class{method_descriptor} objects rather
 than \class{wrapper_descriptor} objects. This form of optimized
 access doubles their performance and makes them more suitable for
 use as arguments to functionals:
 \samp{map(mydict.__getitem__, keylist)}.

\item Added a new opcode, \code{LIST_APPEND}, that simplifies
 the generated bytecode for list comprehensions and speeds them up
 by about a third.

\end{itemize}

The net result of the 2.4 optimizations is that Python 2.4 runs the
pystone benchmark around XX\% faster than Python 2.3 and YY\% faster
than Python 2.2.


%======================================================================
\section{New, Improved, and Deprecated Modules}

As usual, Python's standard library received a number of enhancements and
bug fixes. Here's a partial list of the most notable changes, sorted
alphabetically by module name. Consult the
\file{Misc/NEWS} file in the source tree for a more
complete list of changes, or look through the CVS logs for all the
details.

\begin{itemize}

% XXX new email parser

\item The \module{asyncore} module's \function{loop()} now has a
 \var{count} parameter that lets you perform a limited number
 of passes through the polling loop. The default is still to loop
 forever.

\item The \module{curses} module now supports the ncurses extension
 \function{use_default_colors()}. On platforms where the terminal
 supports transparency, this makes it possible to use a transparent
 background. (Contributed by J\"org Lehmann.)

\item The \module{bisect} module now has an underlying C implementation
 for improved performance.
 (Contributed by Dmitry Vasiliev.)

\item The CJKCodecs collection of East Asian codecs, maintained
by Hye-Shik Chang, was integrated into 2.4.
The new encodings are:

\begin{itemize}
 \item Chinese (PRC): gb2312, gbk, gb18030, big5hkscs, hz
 \item Chinese (ROC): big5, cp950
 \item Japanese: cp932, euc-jis-2004, euc-jp,
euc-jisx0213, iso-2022-jp, iso-2022-jp-1, iso-2022-jp-2,
 iso-2022-jp-3, iso-2022-jp-ext, iso-2022-jp-2004,
 shift-jis, shift-jisx0213, shift-jis-2004
 \item Korean: cp949, euc-kr, johab, iso-2022-kr
\end{itemize}

\item There is a new \module{collections} module for
 various specialized collection datatypes.
 Currently it contains just one type, \class{deque},
 a double-ended queue that supports efficiently adding and removing
 elements from either end.

\begin{verbatim}
>>> from collections import deque
>>> d = deque('ghi')        # make a new deque with three items
>>> d.append('j')           # add a new entry to the right side
>>> d.appendleft('f')       # add a new entry to the left side
>>> d                       # show the representation of the deque
deque(['f', 'g', 'h', 'i', 'j'])
>>> d.pop()                 # return and remove the rightmost item
'j'
>>> d.popleft()             # return and remove the leftmost item
'f'
>>> list(d)                 # list the contents of the deque
['g', 'h', 'i']
>>> 'h' in d                # search the deque
True
\end{verbatim}

Several modules now take advantage of \class{collections.deque} for
improved performance, such as the \module{Queue} and
\module{threading} modules.

\item The \module{ConfigParser} classes have been enhanced slightly.
 The \method{read()} method now returns a list of the files that
 were successfully parsed, and the \method{set()} method raises
 \exception{TypeError} if passed a \var{value} argument that isn't a
 string.

\item The \module{heapq} module has been converted to C. The resulting
 tenfold improvement in speed makes the module suitable for handling
 high volumes of data. In addition, the module has two new functions,
 \function{nlargest()} and \function{nsmallest()}, that use heaps to
 find the N largest or smallest values in a dataset without the
 expense of a full sort.

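A brief sketch of the two new functions; the \code{timings} list is
made-up sample data:

```python
import heapq

# Hypothetical response times in milliseconds.
timings = [120, 35, 980, 47, 310, 15, 640]

slowest = heapq.nlargest(3, timings)     # [980, 640, 310]
fastest = heapq.nsmallest(2, timings)    # [15, 35]
```

Both functions return a new list and leave the input untouched.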
\item The \module{imaplib} module now supports IMAP's THREAD command
(contributed by Yves Dionne) and new \method{deleteacl()} and
\method{myrights()} methods (contributed by Arnaud Mazin).

\item The \module{itertools} module gained a
 \function{groupby(\var{iterable}\optional{, \var{func}})} function.
 \var{iterable} returns a succession of elements, and the optional
 \var{func} is a function that takes an element and returns a key
 value; if omitted, the key is simply the element itself.
 \function{groupby()} then groups the elements into subsequences
 which have matching values of the key, and returns a series of 2-tuples
 containing the key value and an iterator over the subsequence.

Here's an example. The \var{key} function simply returns whether a
number is even or odd, so the result of \function{groupby()} is to
return consecutive runs of odd or even numbers.

\begin{verbatim}
>>> import itertools
>>> L = [2,4,6, 7,8,9,11, 12, 14]
>>> for key_val, it in itertools.groupby(L, lambda x: x % 2):
...     print key_val, list(it)
...
0 [2, 4, 6]
1 [7]
0 [8]
1 [9, 11]
0 [12, 14]
>>>
\end{verbatim}

\function{groupby()} is typically used with sorted input. The logic
for \function{groupby()} is similar to the \UNIX{} \code{uniq} filter,
which makes it handy for eliminating, counting, or identifying
duplicate elements:

\begin{verbatim}
>>> word = 'abracadabra'
>>> letters = sorted(word)   # Turn string into a sorted list of letters
>>> letters
['a', 'a', 'a', 'a', 'a', 'b', 'b', 'c', 'd', 'r', 'r']
>>> for k, g in itertools.groupby(letters):
...     print k, list(g)
...
a ['a', 'a', 'a', 'a', 'a']
b ['b', 'b']
c ['c']
d ['d']
r ['r', 'r']
>>> # List unique letters
>>> [k for k, g in itertools.groupby(letters)]
['a', 'b', 'c', 'd', 'r']
>>> # Count letter occurrences
>>> [(k, len(list(g))) for k, g in itertools.groupby(letters)]
[('a', 5), ('b', 2), ('c', 1), ('d', 1), ('r', 2)]
\end{verbatim}

\item \module{itertools} also gained a function named
\function{tee(\var{iterator}, \var{N})} that returns \var{N} independent
iterators that replicate \var{iterator}. If \var{N} is omitted, the
default is 2.

\begin{verbatim}
>>> L = [1,2,3]
>>> i1, i2 = itertools.tee(L)
>>> i1,i2
(<itertools.tee object at 0x402c2080>, <itertools.tee object at 0x402c2090>)
>>> list(i1)               # Run the first iterator to exhaustion
[1, 2, 3]
>>> list(i2)               # Run the second iterator to exhaustion
[1, 2, 3]
\end{verbatim}

Note that \function{tee()} has to keep copies of the values returned
by the iterator; in the worst case, it may need to keep all of them.
This should therefore be used carefully if the leading iterator
can run far ahead of the trailing iterator in a long stream of inputs.
If the separation is large, then you might as well use
\function{list()} instead. When the iterators track closely with one
another, \function{tee()} is ideal. Possible applications include
bookmarking, windowing, or lookahead iterators.

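As one possible lookahead application, the sketch below pairs each
element with its successor. The \function{pairwise()} helper is a
hypothetical name, not something this release provides; because the
two iterators stay one item apart, \function{tee()} only has to buffer
a single value at a time.

```python
import itertools

def pairwise(iterable):
    """Return overlapping pairs (s0, s1), (s1, s2), ... in one pass."""
    leading, trailing = itertools.tee(iterable)
    for _ in leading:        # advance the leading iterator by one item
        break
    return zip(trailing, leading)

pairs = list(pairwise([1, 2, 3, 4]))    # [(1, 2), (2, 3), (3, 4)]
```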
\item A number of functions were added to the \module{locale}
module, such as \function{bind_textdomain_codeset()} to specify a
particular encoding, and a family of \function{l*gettext()} functions
that return messages in the chosen encoding.
(Contributed by Gustavo Niemeyer.)

\item The \module{logging} package's \function{basicConfig()} function
gained some keyword arguments to simplify log configuration. The
default behavior is to log messages to standard error, but
various keyword arguments can be specified to log to a particular file,
change the logging format, or set the logging level. For example:

\begin{verbatim}
import logging
logging.basicConfig(filename='/var/log/application.log',
    level=0,  # Log all messages, including debugging
    format='%(levelname)s:%(process)d:%(thread)d:%(message)s')
\end{verbatim}

Another addition to \module{logging} is a
\class{TimedRotatingFileHandler} class which rotates its log files at
a timed interval. The module already had \class{RotatingFileHandler},
which rotated logs once the file exceeded a certain size. Both
classes derive from a new \class{BaseRotatingHandler} class that can
be used to implement other rotating handlers.

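A minimal sketch of attaching a timed handler to a logger. The log
path, logger name, and rotation settings are assumptions chosen for the
example; it writes to a temporary directory so that it can be run
without special permissions.

```python
import logging
import logging.handlers
import os
import tempfile

# Hypothetical log location; a real application would pick a fixed path.
log_path = os.path.join(tempfile.mkdtemp(), 'application.log')

# Start a new file at midnight and keep the seven most recent old files.
handler = logging.handlers.TimedRotatingFileHandler(
    log_path, when='midnight', backupCount=7)
handler.setFormatter(logging.Formatter('%(levelname)s:%(message)s'))

logger = logging.getLogger('whatsnew.demo')    # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)
logger.info('rotation configured')

logger.removeHandler(handler)
handler.close()                                # flush and release the file
```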
\item The \module{nntplib} module's \class{NNTP} class gained
\method{description()} and \method{descriptions()} methods to retrieve
newsgroup descriptions for a single group or for a range of groups.
(Contributed by J\"urgen A. Erhard.)

\item The \module{operator} module gained two new functions,
\function{attrgetter(\var{attr})} and \function{itemgetter(\var{index})}.
Both functions return callables that take a single argument and return
the corresponding attribute or item; these callables make excellent
data extractors when used with \function{map()} or
\function{sorted()}. For example:

\begin{verbatim}
>>> import operator
>>> L = [('c', 2), ('d', 1), ('a', 4), ('b', 3)]
>>> map(operator.itemgetter(0), L)
['c', 'd', 'a', 'b']
>>> map(operator.itemgetter(1), L)
[2, 1, 4, 3]
>>> sorted(L, key=operator.itemgetter(1)) # Sort list by second tuple item
[('d', 1), ('c', 2), ('b', 3), ('a', 4)]
\end{verbatim}

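\function{attrgetter()} works the same way for attributes. A small
sketch, using a hypothetical \class{Student} class defined only for the
illustration:

```python
import operator

class Student(object):
    """Hypothetical record type used only for this illustration."""

    def __init__(self, name, score):
        self.name = name
        self.score = score

students = [Student('Ann', 82), Student('Raj', 95), Student('Lee', 77)]

# Sort by the 'score' attribute, highest first.
ranked = sorted(students, key=operator.attrgetter('score'), reverse=True)
names = [student.name for student in ranked]    # ['Raj', 'Ann', 'Lee']
```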
\item A new \function{getsid()} function was added to the
\module{posix} module that underlies the \module{os} module.
(Contributed by J. Raynor.)

\item The \module{poplib} module now supports POP over SSL.

\item The \module{profile} module can now profile C extension functions.
% XXX more to say about this?

\item The \module{random} module has a new method called \method{getrandbits(N)}
 which returns an N-bit long integer. This method supports the existing
 \method{randrange()} method, making it possible to efficiently generate
 arbitrarily large random numbers.

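A short sketch of the new method; the bit counts, the seed, and the
upper bound are arbitrary values picked for the example:

```python
import random

token = random.getrandbits(128)     # non-negative integer below 2**128

# A seeded generator (hypothetical seed) gives repeatable results.
generator = random.Random(2004)
repeatable = generator.getrandbits(16)

# randrange() can draw from ranges far larger than a machine word.
large = random.randrange(10 ** 30)
```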
\item The regular expression language accepted by the \module{re} module
 was extended with simple conditional expressions, written as
 \code{(?(\var{group})\var{A}|\var{B})}. \var{group} is either a
 numeric group ID or a group name defined with \code{(?P<group>...)}
 earlier in the expression. If the specified group matched, the
 regular expression pattern \var{A} will be tested against the string; if
 the group didn't match, the pattern \var{B} will be used instead.

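As an illustrative sketch, the pattern below accepts an e-mail address
either bare or wrapped in angle brackets, but rejects an opening
bracket without its partner. The pattern and the sample addresses are
examples chosen for this illustration and are deliberately simplistic.

```python
import re

# Group 1 optionally matches '<'. If it matched, the conditional requires
# a closing '>'; otherwise it requires the end of the string.
address = re.compile(r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)')

bracketed = address.match('<user@host.com>')
bare = address.match('user@host.com')
unbalanced = address.match('<user@host.com')    # None: '>' is missing
```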
% XXX sre is now non-recursive.

\item The \module{threading} module now has an elegantly simple way to support
thread-local data. The module contains a \class{local} class whose
attribute values are local to different threads.

\begin{verbatim}
import threading

data = threading.local()
data.number = 42
data.url = ('www.python.org', 80)
\end{verbatim}

Other threads can assign and retrieve their own values for the
\member{number} and \member{url} attributes. You can subclass
\class{local} to initialize attributes or to add methods.
(Contributed by Jim Fulton.)

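The isolation between threads can be sketched as follows. The
\function{worker()} function and the \code{observed} list are
hypothetical names used only to record what the second thread sees:

```python
import threading

data = threading.local()
data.number = 42                  # set in the main thread only

observed = []                     # ordinary list shared by both threads

def worker():
    # The main thread's attribute is not visible in this thread.
    observed.append(hasattr(data, 'number'))
    data.number = 7               # private to the worker thread
    observed.append(data.number)

thread = threading.Thread(target=worker)
thread.start()
thread.join()
```

After the worker finishes, \code{observed} holds \code{[False, 7]} and
the main thread's \code{data.number} is still 42.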
\item The \module{weakref} module now supports a wider variety of objects
 including Python functions, class instances, sets, frozensets, deques,
 arrays, files, sockets, and regular expression pattern objects.

\item The \module{xmlrpclib} module now supports a multi-call extension for
transmitting multiple XML-RPC calls in a single HTTP operation.

\end{itemize}


%======================================================================
% whole new modules get described in subsections here

\subsection{cookielib}

The \module{cookielib} library supports client-side handling for HTTP
cookies, just as the \module{Cookie} module provides server-side cookie
support in CGI scripts. Cookies are stored in cookie jars; the library
transparently stores cookies offered by the web server in the cookie
jar, and fetches the cookie from the jar when connecting to the
server. As in web browsers, policy objects control whether
cookies are accepted or not.

In order to store cookies across sessions, two implementations of
cookie jars are provided: one that stores cookies in the Netscape
format, so applications can use the Mozilla or Lynx cookie jars, and
one that stores cookies in the same format as the Perl libwww library.

\module{urllib2} has been changed to interact with \module{cookielib}:
\class{HTTPCookieProcessor} manages a cookie jar that is used when
accessing URLs.

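A minimal sketch of wiring a cookie jar into \module{urllib2}. It uses
an in-memory \class{CookieJar} and makes no network request; the URL in
the comment is a placeholder. The \exception{ImportError} fallback is
an addition for readers on Python 3, where the two modules were
renamed, and is not part of the 2.4 API.

```python
try:
    import cookielib                      # module names as of Python 2.4
    import urllib2
except ImportError:
    # Python 3 renamed both modules; alias them to the 2.4 names.
    import http.cookiejar as cookielib
    import urllib.request as urllib2

jar = cookielib.CookieJar()               # in-memory jar, nothing persisted
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))

# opener.open('http://www.example.com/') would send any stored cookies and
# record new ones in jar; no request is made in this sketch.
stored = len(jar)                         # 0 until a server sets a cookie
```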
% ======================================================================
\section{Build and C API Changes}

Changes to Python's build process and to the C API include:

\begin{itemize}

 \item Three new convenience macros were added for common return
 values from extension functions: \csimplemacro{Py_RETURN_NONE},
 \csimplemacro{Py_RETURN_TRUE}, and \csimplemacro{Py_RETURN_FALSE}.

 \item Another new macro, \csimplemacro{Py_CLEAR(\var{obj})},
 decreases the reference count of \var{obj} and sets \var{obj} to the
 null pointer.

 \item A new function, \cfunction{PyTuple_Pack(\var{N}, \var{obj1},
 \var{obj2}, ..., \var{objN})}, constructs tuples from a variable
 length argument list of Python objects.

 \item A new function, \cfunction{PyDict_Contains(\var{d}, \var{k})},
 implements fast dictionary lookups without masking exceptions raised
 during the look-up process.

 \item A new method flag, \constant{METH_COEXIST}, allows a function
 defined in slots to co-exist with a \ctype{PyCFunction} having the
 same name. This can halve the access time for a method such as
 \method{set.__contains__()}.

 \item Python can now be built with additional profiling for the interpreter
 itself. This is intended for people developing on the Python core.
 Providing \longprogramopt{--enable-profiling} to the
 \program{configure} script will let you profile the interpreter with
 \program{gprof}, and providing the \longprogramopt{--with-tsc} switch
 enables profiling using the Pentium's Time-Stamp-Counter register.

 \item The \ctype{tracebackobject} type has been renamed to \ctype{PyTracebackObject}.

\end{itemize}


%======================================================================
\subsection{Port-Specific Changes}

\begin{itemize}

\item The Windows port now builds under MSVC++ 7.1 as well as version 6.

\end{itemize}


%======================================================================
\section{Other Changes and Fixes \label{section-other}}

As usual, there were a bunch of other improvements and bugfixes
scattered throughout the source tree. A search through the CVS change
logs finds there were XXX patches applied and YYY bugs fixed between
Python 2.3 and 2.4. Both figures are likely to be underestimates.

Some of the more notable changes are:

\begin{itemize}

\item The \module{timeit} module now automatically disables periodic
 garbage collection during the timing loop. This change makes
 consecutive timings more comparable.

\item The \module{base64} module now has more complete RFC 3548 support
 for Base64, Base32, and Base16 encoding and decoding, including
 optional case folding and optional alternative alphabets.
 (Contributed by Barry Warsaw.)

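A brief sketch of the RFC 3548 functions, showing case folding and the
URL-safe alphabet. The payload is an arbitrary example, and the
\method{encode()} call is there only so that the same lines also run on
later Python versions that distinguish byte strings from text.

```python
import base64

payload = 'hello'.encode('ascii')     # the functions operate on byte strings

b32 = base64.b32encode(payload)
b16 = base64.b16encode(payload)

# Passing True for casefold lets the decoders accept lowercase input.
from_b16 = base64.b16decode(b16.lower(), True)
from_b32 = base64.b32decode(b32.lower(), True)

# Alternative Base64 alphabet that is safe in URLs and filenames.
url_token = base64.urlsafe_b64encode(payload)
round_trip = base64.urlsafe_b64decode(url_token)
```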
\end{itemize}


%======================================================================
\section{Porting to Python 2.4}

This section lists previously described changes that may require
changes to your code:

\begin{itemize}

\item The \function{zip()} built-in function and \function{itertools.izip()}
 now return an empty list instead of raising a \exception{TypeError}
 exception if called with no arguments.

\item \function{dircache.listdir()} now passes exceptions to the caller
 instead of returning empty lists.

\item \function{LexicalHandler.startDTD()} used to receive the public and
 system IDs in the wrong order. This has been corrected; applications
 relying on the wrong order need to be fixed.

\item \function{fcntl.ioctl()} now warns if the \var{mutate}
 argument is omitted and relevant.

\end{itemize}


%======================================================================
\section{Acknowledgements \label{acks}}

The author would like to thank the following people for offering
suggestions, corrections and assistance with various drafts of this
article: Hye-Shik Chang, Michael Dyck, Raymond Hettinger.

\end{document}