\documentclass{howto}
\usepackage{distutils}
% $Id$

% Fix XXX comments
% Distutils upload (PEP 243)
% The easy_install stuff
% Access to ASTs with compile() flag
% Stateful codec changes
% ASCII is now default encoding for modules

\title{What's New in Python 2.5}
\release{0.1}
\author{A.M. Kuchling}
\authoraddress{\email{amk@amk.ca}}

\begin{document}
\maketitle
\tableofcontents

This article explains the new features in Python 2.5.  No release date
for Python 2.5 has been set; it will probably be released in the
autumn of 2006.  \pep{356} describes the planned release schedule.

(This is an early draft, and some sections are skeletal or
completely missing.  Comments on the present material are
welcome.)

% XXX Compare with previous release in 2 - 3 sentences here.

This article doesn't attempt to provide a complete specification of
the new features, but instead provides a convenient overview.  For
full details, you should refer to the documentation for Python 2.5.
% XXX add hyperlink when the documentation becomes available online.
If you want to understand the complete implementation and design
rationale, refer to the PEP for a particular new feature.


%======================================================================
\section{PEP 308: Conditional Expressions}

For a long time, people have been requesting a way to write
conditional expressions, expressions that return value A or value B
depending on whether a Boolean value is true or false.  A conditional
expression lets you write a single assignment statement that has the
same effect as the following:

\begin{verbatim}
if condition:
    x = true_value
else:
    x = false_value
\end{verbatim}

There have been endless tedious discussions of syntax on both
python-dev and comp.lang.python.  A vote was even held that found the
majority of voters wanted conditional expressions in some form,
but there was no syntax that was preferred by a clear majority.
Candidates included C's \code{cond ? true_v : false_v},
\code{if cond then true_v else false_v}, and 16 other variations.

GvR eventually chose a surprising syntax:

\begin{verbatim}
x = true_value if condition else false_value
\end{verbatim}

Evaluation is still lazy as in existing Boolean expressions, so the
order of evaluation jumps around a bit.  The \var{condition}
expression in the middle is evaluated first, and the \var{true_value}
expression is evaluated only if the condition was true.  Similarly,
the \var{false_value} expression is only evaluated when the condition
is false.
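
The evaluation order can be checked with a minimal sketch like the
following.  The \function{record()} helper and the \var{calls} list are
hypothetical names introduced only for this illustration; they log
which operands actually get evaluated.

\begin{verbatim}
calls = []

def record(name, value):
    "Hypothetical helper: note that 'name' was evaluated, return 'value'."
    calls.append(name)
    return value

x = (record('true_value', 1) if record('condition', True)
     else record('false_value', 0))

# The condition is evaluated first; the false branch is never touched.
print(x)
print(calls)
\end{verbatim}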

This syntax may seem strange and backwards; why does the condition go
in the \emph{middle} of the expression, and not in the front as in C's
\code{c ? x : y}?  The decision was checked by applying the new syntax
to the modules in the standard library and seeing how the resulting
code read.  In many cases where a conditional expression is used, one
value seems to be the ``common case'' and one value is an ``exceptional
case'', used only on rarer occasions when the condition isn't met.  The
conditional syntax makes this pattern a bit more obvious:

\begin{verbatim}
contents = ((doc + '\n') if doc else '')
\end{verbatim}

I read the above statement as meaning ``here \var{contents} is
usually assigned a value of \code{doc+'\e n'}; sometimes
\var{doc} is empty, in which special case an empty string is returned.''
I doubt I will use conditional expressions very often where there
isn't a clear common and uncommon case.

There was some discussion of whether the language should require
surrounding conditional expressions with parentheses.  The decision
was made to \emph{not} require parentheses in the Python language's
grammar, but as a matter of style I think you should always use them.
Consider these two statements:

\begin{verbatim}
# First version -- no parens
level = 1 if logging else 0

# Second version -- with parens
level = (1 if logging else 0)
\end{verbatim}

In the first version, I think a reader's eye might group the statement
into ``level = 1'', ``if logging'', ``else 0'', and think that the condition
decides whether the assignment to \var{level} is performed.  The
second version reads better, in my opinion, because it makes it clear
that the assignment is always performed and the choice is being made
between two values.

Another reason for including the parentheses: a few odd combinations of
list comprehensions and lambdas could look like incorrect conditional
expressions.  See \pep{308} for some examples.  If you put parentheses
around your conditional expressions, you won't run into this case.


\begin{seealso}

\seepep{308}{Conditional Expressions}{PEP written by
Guido van Rossum and Raymond D. Hettinger; implemented by Thomas
Wouters.}

\end{seealso}


%======================================================================
\section{PEP 309: Partial Function Application}

The \module{functional} module is intended to contain tools for
functional-style programming.  Currently it only contains
\class{partial}, but new functions will probably be added
in future versions of Python.

For programs written in a functional style, it can be useful to
construct variants of existing functions that have some of the
parameters filled in.  Consider a Python function \code{f(a, b, c)};
you could create a new function \code{g(b, c)} that was equivalent to
\code{f(1, b, c)}.  This is called ``partial function application'',
and is provided by the \class{partial} class in the new
\module{functional} module.

The constructor for \class{partial} takes the arguments
\code{(\var{function}, \var{arg1}, \var{arg2}, ...
\var{kwarg1}=\var{value1}, \var{kwarg2}=\var{value2})}.  The resulting
object is callable, so you can just call it to invoke \var{function}
with the filled-in arguments.
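
As a minimal sketch of how positional and keyword arguments are filled
in, consider the following.  It assumes \class{partial} is importable
from a module named \module{functools}, the name under which the module
shipped in the final Python 2.5 release; with the draft module name used
in this article, only the import line would differ.  The function
\function{f()} is a made-up example.

\begin{verbatim}
import functools  # assumption: released name of the 'functional' module

def f(a, b, c):
    return (a, b, c)

# Fill in the first positional argument: g(b, c) == f(1, b, c)
g = functools.partial(f, 1)
print(g(2, 3))

# Fill in a keyword argument; callers may still override it.
h = functools.partial(f, c=30)
print(h(10, 20))
print(h(10, 20, c=99))
\end{verbatim}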

Here's a small but realistic example:

\begin{verbatim}
import functional

def log (message, subsystem):
    "Write the contents of 'message' to the specified subsystem."
    print '%s: %s' % (subsystem, message)
    ...

server_log = functional.partial(log, subsystem='server')
server_log('Unable to open socket')
\end{verbatim}

Here's another example, from a program that uses PyGTK.  Here a
context-sensitive pop-up menu is being constructed dynamically.  The
callback provided for the menu option is a partially applied version
of the \method{open_item()} method, where the first argument has been
provided.

\begin{verbatim}
...
class Application:
    def open_item(self, path):
        ...
    def init (self):
        open_func = functional.partial(self.open_item, item_path)
        popup_menu.append( ("Open", open_func, 1) )
\end{verbatim}


\begin{seealso}

\seepep{309}{Partial Function Application}{PEP proposed and written by
Peter Harris; implemented by Hye-Shik Chang, with adaptations by
Raymond Hettinger.}

\end{seealso}


%======================================================================
\section{PEP 314: Metadata for Python Software Packages v1.1}

Some simple dependency support was added to Distutils.  The
\function{setup()} function now has \code{requires}, \code{provides},
and \code{obsoletes} keyword parameters.  When you build a source
distribution using the \code{sdist} command, the dependency
information will be recorded in the \file{PKG-INFO} file.

Another new keyword parameter is \code{download_url}, which should be
set to a URL for the package's source code.  This means it's now
possible to look up an entry in the package index, determine the
dependencies for a package, and download the required packages.
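
A \file{setup.py} using the new parameters might look like the
following sketch.  The package name, version, dependency strings, and
URL are all invented for illustration; the keyword arguments are
collected in a dictionary so that the fragment can be inspected without
actually running a Distutils command.

\begin{verbatim}
VERSION = '1.0'

# All values below are hypothetical examples.
metadata = dict(
    name='PyPackage',
    version=VERSION,
    requires=['numarray', 'zlib (>=1.1.4)'],
    provides=['PyPackage'],
    obsoletes=['OldPackage'],
    download_url=('http://www.example.com/pypackage/dist/pkg-%s.tar.gz'
                  % VERSION),
)

# In a real setup.py you would finish with:
#     from distutils.core import setup
#     setup(**metadata)
\end{verbatim}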

% XXX put example here

\begin{seealso}

\seepep{314}{Metadata for Python Software Packages v1.1}{PEP proposed
and written by A.M. Kuchling, Richard Jones, and Fred Drake;
implemented by Richard Jones and Fred Drake.}

\end{seealso}


%======================================================================
\section{PEP 328: Absolute and Relative Imports}

The simpler part of PEP 328 was implemented in Python 2.4: parentheses
could now be used to enclose the names imported from a module using
the \code{from ... import ...} statement, making it easier to import
many different names.

The more complicated part has been implemented in Python 2.5:
you can now specify whether an import is absolute or
package-relative.  The plan is to move toward making absolute
imports the default in future versions of Python.

Let's say you have a package directory like this:
\begin{verbatim}
pkg/
pkg/__init__.py
pkg/main.py
pkg/string.py
\end{verbatim}

This defines a package named \module{pkg} containing the
\module{pkg.main} and \module{pkg.string} submodules.

Consider the code in the \file{main.py} module.  What happens if it
executes the statement \code{import string}?  In Python 2.4 and
earlier, it will first look in the package's directory to perform a
relative import, find \file{pkg/string.py}, import the contents of
that file as the \module{pkg.string} module, and bind that module
to the name \samp{string} in the \module{pkg.main} module's namespace.

That's fine if \module{pkg.string} was what you wanted.  But what if
you wanted Python's standard \module{string} module?  There's no clean
way to ignore \module{pkg.string} and look for the standard module;
generally you had to look at the contents of \code{sys.modules}, which
is slightly unclean.
Holger Krekel's \module{py.std} package provides a tidier way to perform
imports from the standard library, \code{import py ; py.std.string.join()},
but that package isn't available on all Python installations.

Reading code which relies on relative imports is also less clear,
because a reader may be confused about which module, \module{string}
or \module{pkg.string}, is intended to be used.  Python users soon
learned not to duplicate the names of standard library modules in the
names of their packages' submodules, but you can't protect against
having your submodule's name being used for a new module added in a
future version of Python.

In Python 2.5, you can switch \keyword{import}'s behaviour to
absolute imports using a \code{from __future__ import absolute_import}
directive.  This absolute-import behaviour will become the default in
a future version (probably Python 2.7).  Once absolute imports
are the default, \code{import string} will
always find the standard library's version.
It's suggested that users should begin using absolute imports as much
as possible, so it's preferable to begin writing \code{from pkg import
string} in your code.

Relative imports are still possible by adding a leading period
to the module name when using the \code{from ... import} form:

\begin{verbatim}
# Import names from pkg.string
from .string import name1, name2
# Import pkg.string
from . import string
\end{verbatim}

This imports the \module{string} module relative to the current
package, so in \module{pkg.main} this will import \var{name1} and
\var{name2} from \module{pkg.string}.  Additional leading periods
perform the relative import starting from the parent of the current
package.  For example, code in the \module{A.B.C} module can do:

\begin{verbatim}
from . import D   # Imports A.B.D
from .. import E  # Imports A.E
from ..F import G # Imports A.F.G
\end{verbatim}

Leading periods cannot be used with the \code{import \var{modname}}
form of the import statement, only the \code{from ... import} form.
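
To see both kinds of import side by side, here is a self-contained
sketch that builds a throwaway package in a temporary directory and
then imports it.  The package name \module{demo_pkg} and the \var{WHO}
constant are hypothetical, chosen only for this illustration; the
layout mirrors the \module{pkg} example above.

\begin{verbatim}
import os
import sys
import tempfile

# Build demo_pkg/ with a submodule that shadows the standard 'string'.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, 'demo_pkg')
os.mkdir(pkg_dir)

sources = {
    '__init__.py': '',
    'string.py': 'WHO = "demo_pkg.string"\n',
    'main.py': ('from __future__ import absolute_import\n'
                'import string as std_string\n'        # absolute import
                'from . import string as pkg_string\n' # relative import
                ),
}
for filename, text in sources.items():
    f = open(os.path.join(pkg_dir, filename), 'w')
    f.write(text)
    f.close()

sys.path.insert(0, root)
import demo_pkg.main as main

# The relative import found the package's own submodule...
print(main.pkg_string.WHO)
# ...while the absolute import found the standard library module.
print(hasattr(main.std_string, 'ascii_letters'))
\end{verbatim}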

\begin{seealso}

\seepep{328}{Imports: Multi-Line and Absolute/Relative}
{PEP written by Aahz; implemented by Thomas Wouters.}

\seeurl{http://codespeak.net/py/current/doc/index.html}
{The py library by Holger Krekel, which contains the \module{py.std} package.}

\end{seealso}


%======================================================================
\section{PEP 338: Executing Modules as Scripts}

The \programopt{-m} switch added in Python 2.4 to execute a module as
a script gained a few more abilities.  Instead of being implemented in
C code inside the Python interpreter, the switch now uses an
implementation in a new module, \module{runpy}.

The \module{runpy} module implements a more sophisticated import
mechanism so that it's now possible to run modules in a package such
as \module{pychecker.checker}.  The module also supports alternative
import mechanisms such as the \module{zipimport} module.  (This means
you can add a .zip archive's path to \code{sys.path} and then use the
\programopt{-m} switch to execute code from the archive.)
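
The same machinery can be driven from Python code.  The following is a
minimal sketch using \function{runpy.run_module()}; the module name
\module{demo_script} and its \var{RESULT} variable are hypothetical and
are created on the fly in a temporary directory.

\begin{verbatim}
import os
import runpy
import sys
import tempfile

# Write a throwaway module into a temporary directory.
root = tempfile.mkdtemp()
f = open(os.path.join(root, 'demo_script.py'), 'w')
f.write('RESULT = "running as " + __name__\n')
f.close()

# Make it importable, then execute it as a script would be executed.
sys.path.insert(0, root)
namespace = runpy.run_module('demo_script', run_name='__main__')

# run_module() returns the globals of the executed module.
print(namespace['RESULT'])
\end{verbatim}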


\begin{seealso}

\seepep{338}{Executing modules as scripts}{PEP written and
implemented by Nick Coghlan.}

\end{seealso}


%======================================================================
\section{PEP 341: Unified try/except/finally}

Until Python 2.5, the \keyword{try} statement came in two
flavours.  You could use a \keyword{finally} block to ensure that code
is always executed, or a number of \keyword{except} blocks to catch an
exception.  You couldn't combine both \keyword{except} blocks and a
\keyword{finally} block, because generating the right bytecode for the
combined version was complicated and it wasn't clear what the
semantics of the combined statement should be.

GvR spent some time working with Java, which does support the
equivalent of combining \keyword{except} blocks and a
\keyword{finally} block, and this clarified what the statement should
mean.  In Python 2.5, you can now write:

\begin{verbatim}
try:
    block-1 ...
except Exception1:
    handler-1 ...
except Exception2:
    handler-2 ...
else:
    else-block
finally:
    final-block
\end{verbatim}

The code in \var{block-1} is executed.  If the code raises an
exception, the handlers are tried in order: \var{handler-1},
\var{handler-2}, ...  If no exception is raised, the \var{else-block}
is executed.  No matter what happened previously, the
\var{final-block} is executed once the code block is complete and any
raised exceptions handled.  Even if there's an error in an exception
handler or the \var{else-block} and a new exception is raised, the
\var{final-block} is still executed.
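
The order in which the clauses run can be traced with a small runnable
sketch.  The \function{divide()} function and its \var{log} argument
are hypothetical, introduced only to record which clauses execute.

\begin{verbatim}
def divide(a, b, log):
    "Hypothetical helper that records which clauses run in 'log'."
    try:
        result = a / b
    except ZeroDivisionError:
        log.append('handler')
        result = None
    else:
        log.append('else')
    finally:
        log.append('finally')
    return result

ok_log = []
print(divide(6, 3, ok_log))
print(ok_log)       # the else clause ran, then the finally clause

err_log = []
print(divide(6, 0, err_log))
print(err_log)      # the handler ran, then the finally clause
\end{verbatim}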

\begin{seealso}

\seepep{341}{Unifying try-except and try-finally}{PEP written by Georg Brandl;
implementation by Thomas Lee.}

\end{seealso}


%======================================================================
\section{PEP 342: New Generator Features}

Python 2.5 adds a simple way to pass values \emph{into} a generator.
As introduced in Python 2.3, generators only produce output; once a
generator's code is invoked to create an iterator, there's no way to
pass any new information into the function when its execution is
resumed.  Sometimes the ability to pass in some information would be
useful.  Hackish solutions to this include making the generator's code
look at a global variable and then changing the global variable's
value, or passing in some mutable object that callers then modify.

To refresh your memory of basic generators, here's a simple example:

\begin{verbatim}
def counter (maximum):
    i = 0
    while i < maximum:
        yield i
        i += 1
\end{verbatim}

When you call \code{counter(10)}, the result is an iterator that
returns the values from 0 up to 9.  On encountering the
\keyword{yield} statement, the iterator returns the provided value and
suspends the function's execution, preserving the local variables.
Execution resumes on the following call to the iterator's
\method{next()} method, picking up after the \keyword{yield} statement.

In Python 2.3, \keyword{yield} was a statement; it didn't return any
value.  In 2.5, \keyword{yield} is now an expression, returning a
value that can be assigned to a variable or otherwise operated on:

\begin{verbatim}
val = (yield i)
\end{verbatim}

I recommend that you always put parentheses around a \keyword{yield}
expression when you're doing something with the returned value, as in
the above example.  The parentheses aren't always necessary, but it's
easier to always add them instead of having to remember when they're
needed.\footnote{The exact rules are that a \keyword{yield}-expression must
always be parenthesized except when it is the top-level
expression on the right-hand side of an assignment, meaning you can
write \code{val = yield i} but have to use parentheses when there's an
operation, as in \code{val = (yield i) + 12}.}

Values are sent into a generator by calling its
\method{send(\var{value})} method.  The generator's code is then
resumed and the \keyword{yield} expression returns the specified
\var{value}.  If the regular \method{next()} method is called, the
\keyword{yield} returns \constant{None}.

Here's the previous example, modified to allow changing the value of
the internal counter.

\begin{verbatim}
def counter (maximum):
    i = 0
    while i < maximum:
        val = (yield i)
        # If value provided, change counter
        if val is not None:
            i = val
        else:
            i += 1
\end{verbatim}

And here's an example of changing the counter:

\begin{verbatim}
>>> it = counter(10)
>>> print it.next()
0
>>> print it.next()
1
>>> print it.send(8)
8
>>> print it.next()
9
>>> print it.next()
Traceback (most recent call last):
  File "t.py", line 15, in ?
    print it.next()
StopIteration
\end{verbatim}

Because \keyword{yield} will often be returning \constant{None}, you
should always check for this case.  Don't just use its value in
expressions unless you're sure that the \method{send()} method
will be the only method used to resume your generator function.
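
As a further illustration of checking for \constant{None}, here is a
minimal sketch of a generator that keeps a running total.  The
\function{running_total()} name is made up for this example.  The
generator is started with \code{send(None)}, which is permitted on a
generator that hasn't begun running and has the same effect as calling
\method{next()}.

\begin{verbatim}
def running_total():
    "Hypothetical accumulator: send() adds a number to the total."
    total = 0
    while True:
        value = (yield total)
        # A plain next() call delivers None; leave the total alone.
        if value is not None:
            total += value

acc = running_total()
acc.send(None)          # start the generator, up to the first yield
print(acc.send(5))
print(acc.send(10))
print(acc.send(None))   # nothing added; the total is unchanged
\end{verbatim}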

In addition to \method{send()}, there are two other new methods on
generators:

\begin{itemize}

  \item \method{throw(\var{type}, \var{value}=None,
  \var{traceback}=None)} is used to raise an exception inside the
  generator; the exception is raised by the \keyword{yield} expression
  where the generator's execution is paused.

  \item \method{close()} raises a new \exception{GeneratorExit}
  exception inside the generator to terminate the iteration.
  On receiving this
  exception, the generator's code must either raise
  \exception{GeneratorExit} or \exception{StopIteration}; catching the
  exception and doing anything else is illegal and will trigger
  a \exception{RuntimeError}.  \method{close()} will also be called by
  Python's garbage collector when the generator is garbage-collected.

  If you need to run cleanup code in case of a \exception{GeneratorExit},
  I suggest using a \code{try: ... finally:} suite instead of
  catching \exception{GeneratorExit}.

\end{itemize}
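
Both methods can be tried out with a short sketch.  The
\function{worker()} generator and the \var{events} list are
hypothetical names used only here; the generator recovers from a
\exception{ValueError} thrown into it and relies on a
\code{try: ... finally:} suite for cleanup, as suggested above.

\begin{verbatim}
events = []

def worker():
    "Hypothetical generator that tolerates ValueError and cleans up."
    try:
        while True:
            try:
                yield 'ready'
            except ValueError:
                events.append('recovered from ValueError')
    finally:
        events.append('cleanup')

gen = worker()
gen.send(None)              # run to the first yield

# The exception is raised at the paused yield; the generator handles
# it, loops, and yields again, so throw() returns the next value.
print(gen.throw(ValueError('bad input')))

# close() raises GeneratorExit at the yield; the finally suite runs.
gen.close()
print(events)
\end{verbatim}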

The cumulative effect of these changes is to turn generators from
one-way producers of information into both producers and consumers.

Generators also become \emph{coroutines}, a more generalized form of
subroutines.  Subroutines are entered at one point and exited at
another point (the top of the function, and a \keyword{return}
statement), but coroutines can be entered, exited, and resumed at
many different points (the \keyword{yield} statements).  We'll have to
figure out patterns for using coroutines effectively in Python.

The addition of the \method{close()} method has one side effect that
isn't obvious.  \method{close()} is called when a generator is
garbage-collected, so the generator's code gets one last
chance to run before the generator is destroyed.  This last chance
means that \code{try...finally} statements in generators can now be
guaranteed to work; the \keyword{finally} clause will now always get a
chance to run.  The syntactic restriction that you couldn't mix
\keyword{yield} statements with a \code{try...finally} suite has
therefore been removed.  This seems like a minor bit of language
trivia, but using generators and \code{try...finally} is actually
necessary in order to implement the \keyword{with} statement
described by PEP 343.  We'll look at this new statement in the following
section.

\begin{seealso}

\seepep{342}{Coroutines via Enhanced Generators}{PEP written by
Guido van Rossum and Phillip J. Eby;
implemented by Phillip J. Eby.  Includes examples of
some fancier uses of generators as coroutines.}

\seeurl{http://en.wikipedia.org/wiki/Coroutine}{The Wikipedia entry for
coroutines.}

\seeurl{http://www.sidhe.org/\~{}dan/blog/archives/000178.html}{An
explanation of coroutines from a Perl point of view, written by Dan
Sugalski.}

\end{seealso}


%======================================================================
\section{PEP 343: The 'with' statement}

The \keyword{with} statement allows a clearer
version of code that uses \code{try...finally} blocks.

First, I'll discuss the statement as it will commonly be used, and
then I'll discuss the detailed implementation and how to write objects
(called ``context managers'') that can be used with this statement.
Most people, who will only use \keyword{with} in company with an
existing object, don't need to know these details and can
just use objects that are documented to work as context managers.
Authors of new context managers will need to understand the details of
the underlying implementation.

The \keyword{with} statement is a new control-flow structure whose
basic structure is:

\begin{verbatim}
with expression as variable:
    with-block
\end{verbatim}

The expression is evaluated, and it should result in a type of object
that's called a context manager.  The context manager can return a
value that will be bound to the name \var{variable}.  (Note carefully:
\var{variable} is \emph{not} assigned the result of \var{expression}.)
One method of the context manager is run before \var{with-block} is
executed, and another method is run after the block is done, even if
the block raised an exception.
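
The distinction between \var{expression} and \var{variable} can be
seen in the following minimal sketch.  It assumes the
\method{__enter__()} and \method{__exit__()} method names defined by
PEP 343; the \class{Tracker} class is a made-up example.  Under Python
2.5 the module would also need \code{from __future__ import
with_statement} as its first statement.

\begin{verbatim}
# Python 2.5 only: the module must begin with
#     from __future__ import with_statement

class Tracker(object):
    "Hypothetical context manager that records what happens to it."
    def __init__(self):
        self.events = []
    def __enter__(self):
        self.events.append('enter')
        return 'value from __enter__'
    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append('exit')
        return False        # don't suppress exceptions

tracker = Tracker()
with tracker as variable:
    tracker.events.append('body')

# 'variable' holds what __enter__() returned, not the Tracker itself.
print(variable)
print(tracker.events)
\end{verbatim}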

To enable the statement in Python 2.5, you need
to add the following directive to your module:

\begin{verbatim}
from __future__ import with_statement
\end{verbatim}

Some standard Python objects can now behave as context managers.  For
example, file objects:

\begin{verbatim}
with open('/etc/passwd', 'r') as f:
    for line in f:
        print line

# f has been automatically closed at this point.
\end{verbatim}

The \module{threading} module's locks and condition variables
also support the \keyword{with} statement:

\begin{verbatim}
import threading

lock = threading.Lock()
with lock:
    # Critical section of code
    ...
\end{verbatim}

The lock is acquired before the block is executed, and released once
the block is complete.

The \module{decimal} module's contexts, which encapsulate the desired
precision and rounding characteristics for computations, can also be
used as context managers.

\begin{verbatim}
import decimal

v1 = decimal.Decimal('578')

# Displays with default precision of 28 digits
print v1.sqrt()

with decimal.Context(prec=16):
    # All code in this block uses a precision of 16 digits.
    # The original context is restored on exiting the block.
    print v1.sqrt()
\end{verbatim}

\subsection{Writing Context Managers}

% XXX write this

This section still needs to be written.

The new \module{contextlib} module provides some functions and a
decorator that are useful for writing context managers.
Future versions of this article will go into more detail.
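
In the meantime, here is a minimal sketch of the decorator approach,
assuming the \function{contextmanager()} decorator that
\module{contextlib} provides.  The \function{tag()} generator is a
hypothetical example: the code before the \keyword{yield} runs on
entering the block, and the \keyword{finally} suite runs on leaving it.
Under Python 2.5 the module would also need \code{from __future__
import with_statement} as its first statement.

\begin{verbatim}
# Python 2.5 only: the module must begin with
#     from __future__ import with_statement
import contextlib

@contextlib.contextmanager
def tag(name, out):
    "Hypothetical example: wrap whatever the block appends in a tag."
    out.append('<%s>' % name)
    try:
        yield out           # this value is bound by 'as'
    finally:
        out.append('</%s>' % name)

pieces = []
with tag('p', pieces) as sink:
    sink.append('hello')

print(''.join(pieces))
\end{verbatim}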

% XXX describe further

\begin{seealso}

\seepep{343}{The ``with'' statement}{PEP written by
Guido van Rossum and Nick Coghlan.}

\end{seealso}
637
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000638
639%======================================================================
Andrew M. Kuchling8f4d2552006-03-08 01:50:20 +0000640\section{PEP 352: Exceptions as New-Style Classes}
641
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000642Exception classes can now be new-style classes, not just classic
643classes, and the built-in \exception{Exception} class and all the
644standard built-in exceptions (\exception{NameError},
645\exception{ValueError}, etc.) are now new-style classes.
Andrew M. Kuchlingaeadf952006-03-09 19:06:05 +0000646
647The inheritance hierarchy for exceptions has been rearranged a bit.
648In 2.5, the inheritance relationships are:
649
650\begin{verbatim}
651BaseException # New in Python 2.5
652|- KeyboardInterrupt
653|- SystemExit
654|- Exception
655 |- (all other current built-in exceptions)
656\end{verbatim}
657
658This rearrangement was done because people often want to catch all
659exceptions that indicate program errors. \exception{KeyboardInterrupt} and
660\exception{SystemExit} aren't errors, though, and usually represent an explicit
661action such as the user hitting Control-C or code calling
662\function{sys.exit()}. A bare \code{except:} will catch all exceptions,
663so you commonly need to list \exception{KeyboardInterrupt} and
664\exception{SystemExit} in order to re-raise them. The usual pattern is:
665
666\begin{verbatim}
667try:
668 ...
669except (KeyboardInterrupt, SystemExit):
670 raise
671except:
672 # Log error...
673 # Continue running program...
674\end{verbatim}
675
676In Python 2.5, you can now write \code{except Exception} to achieve
677the same result, catching all the exceptions that usually indicate errors
678but leaving \exception{KeyboardInterrupt} and
679\exception{SystemExit} alone. As in previous versions,
680a bare \code{except:} still catches all exceptions.
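
The difference can be sketched as follows.  This is only an
illustration: \function{run()}, \function{bad_value()}, and
\function{ask_to_exit()} are hypothetical helpers invented for the
example, and the code is written so that it should also run on later
Python versions.

\begin{verbatim}
import sys

def run(action):
    """Call action(), reporting ordinary errors but not exit requests."""
    try:
        action()
    except Exception:
        # Reached for ValueError, NameError, etc., but not for
        # KeyboardInterrupt or SystemExit.
        return 'error: %s' % sys.exc_info()[0].__name__
    return 'ok'

def bad_value():
    raise ValueError('not a number')

def ask_to_exit():
    sys.exit(1)

print(run(bad_value))        # the ValueError is handled inside run()

try:
    run(ask_to_exit)
except SystemExit:
    # SystemExit passed straight through the "except Exception" clause.
    print('SystemExit was not caught by run()')
\end{verbatim}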
681
682The goal for Python 3.0 is to require any class raised as an exception
683to derive from \exception{BaseException} or some descendant of
684\exception{BaseException}, and future releases in the
685Python 2.x series may begin to enforce this constraint. Therefore, I
686suggest you begin making all your exception classes derive from
687\exception{Exception} now. It's been suggested that the bare
688\code{except:} form should be removed in Python 3.0, but Guido van~Rossum
689hasn't decided whether to do this or not.
690
691Raising of strings as exceptions, as in the statement \code{raise
692"Error occurred"}, is deprecated in Python 2.5 and will trigger a
693warning. The aim is to be able to remove the string-exception feature
694in a few releases.
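
As a rough sketch of the recommended style, a string exception can be
replaced by a small class derived from \exception{Exception}.  The
names \exception{ConfigError} and \function{check_port()} are
hypothetical and exist only for this example.

\begin{verbatim}
class ConfigError(Exception):
    """Raised when a (hypothetical) configuration value is invalid."""

def check_port(port):
    if not 0 < port < 65536:
        # Instead of:  raise "Error occurred"
        raise ConfigError('invalid port: %r' % (port,))
    return port

try:
    check_port(0)
except ConfigError:
    print('caught ConfigError')
\end{verbatim}

Because \exception{ConfigError} derives from \exception{Exception}, it
is also caught by an \code{except Exception} clause.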
695
696
697\begin{seealso}
698
Andrew M. Kuchlingc3749a92006-04-04 19:14:41 +0000699\seepep{352}{Required Superclass for Exceptions}{PEP written by
Andrew M. Kuchlingaeadf952006-03-09 19:06:05 +0000700Brett Cannon and Guido van Rossum; implemented by Brett Cannon.}
701
702\end{seealso}
Andrew M. Kuchling8f4d2552006-03-08 01:50:20 +0000703
704
705%======================================================================
Andrew M. Kuchling4d8cd892006-04-06 13:03:04 +0000706\section{PEP 353: Using ssize_t as the index type\label{section-353}}
Andrew M. Kuchlingc3749a92006-04-04 19:14:41 +0000707
708A wide-ranging change to Python's C API, using a new
709\ctype{Py_ssize_t} type definition instead of \ctype{int},
710will permit the interpreter to handle more data on 64-bit platforms.
711This change doesn't affect Python's capacity on 32-bit platforms.
712
Various pieces of the Python interpreter used C's \ctype{int} type to
store sizes or counts; for example, the number of items in a list or
tuple was stored in an \ctype{int}.  The C compilers for most 64-bit
platforms still define \ctype{int} as a 32-bit type, so that meant
that lists could only hold up to \code{2**31 - 1} = 2147483647 items.
718(There are actually a few different programming models that 64-bit C
719compilers can use -- see
720\url{http://www.unix.org/version2/whatsnew/lp64_wp.html} for a
721discussion -- but the most commonly available model leaves \ctype{int}
722as 32 bits.)
723
724A limit of 2147483647 items doesn't really matter on a 32-bit platform
725because you'll run out of memory before hitting the length limit.
726Each list item requires space for a pointer, which is 4 bytes, plus
727space for a \ctype{PyObject} representing the item. 2147483647*4 is
728already more bytes than a 32-bit address space can contain.
729
730It's possible to address that much memory on a 64-bit platform,
731however. The pointers for a list that size would only require 16GiB
732of space, so it's not unreasonable that Python programmers might
733construct lists that large. Therefore, the Python interpreter had to
734be changed to use some type other than \ctype{int}, and this will be a
73564-bit type on 64-bit platforms. The change will cause
736incompatibilities on 64-bit machines, so it was deemed worth making
737the transition now, while the number of 64-bit users is still
738relatively small. (In 5 or 10 years, we may \emph{all} be on 64-bit
739machines, and the transition would be more painful then.)
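
The arithmetic behind these two paragraphs can be checked directly.
This is just a back-of-the-envelope sketch that assumes 4-byte
pointers on a 32-bit platform and 8-byte pointers on a 64-bit one; the
variable names are made up for the example.

\begin{verbatim}
max_items = 2**31 - 1            # largest value of a signed 32-bit int

# 32-bit platform: 4-byte pointers, 2**32 bytes of address space.
bytes_32 = max_items * 4
print(bytes_32 > 2**32)          # True: the pointers alone cannot fit

# 64-bit platform: 8-byte pointers.
bytes_64 = (max_items + 1) * 8   # rounded up to 2**31 pointers
print(bytes_64 // 2**30)         # 16, the pointer array's size in GiB
\end{verbatim}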
740
741This change most strongly affects authors of C extension modules.
742Python strings and container types such as lists and tuples
743now use \ctype{Py_ssize_t} to store their size.
744Functions such as \cfunction{PyList_Size()}
745now return \ctype{Py_ssize_t}. Code in extension modules
746may therefore need to have some variables changed to
747\ctype{Py_ssize_t}.
748
749The \cfunction{PyArg_ParseTuple()} and \cfunction{Py_BuildValue()} functions
750have a new conversion code, \samp{n}, for \ctype{Py_ssize_t}.
Andrew M. Kuchlinga4d651f2006-04-06 13:24:58 +0000751\cfunction{PyArg_ParseTuple()}'s \samp{s\#} and \samp{t\#} still output
Andrew M. Kuchling4d8cd892006-04-06 13:03:04 +0000752\ctype{int} by default, but you can define the macro
753\csimplemacro{PY_SSIZE_T_CLEAN} before including \file{Python.h}
754to make them return \ctype{Py_ssize_t}.
755
756\pep{353} has a section on conversion guidelines that
757extension authors should read to learn about supporting 64-bit
758platforms.
Andrew M. Kuchlingc3749a92006-04-04 19:14:41 +0000759
760\begin{seealso}
761
Andrew M. Kuchling4d8cd892006-04-06 13:03:04 +0000762\seepep{353}{Using ssize_t as the index type}{PEP written and implemented by Martin von L\"owis.}
Andrew M. Kuchlingc3749a92006-04-04 19:14:41 +0000763
764\end{seealso}
765
Andrew M. Kuchling4d8cd892006-04-06 13:03:04 +0000766
Andrew M. Kuchlingc3749a92006-04-04 19:14:41 +0000767%======================================================================
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000768\section{PEP 357: The '__index__' method}
769
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000770The NumPy developers had a problem that could only be solved by adding
771a new special method, \method{__index__}. When using slice notation,
Fred Drake1c0e3282006-04-02 03:30:06 +0000772as in \code{[\var{start}:\var{stop}:\var{step}]}, the values of the
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000773\var{start}, \var{stop}, and \var{step} indexes must all be either
774integers or long integers. NumPy defines a variety of specialized
775integer types corresponding to unsigned and signed integers of 8, 16,
77632, and 64 bits, but there was no way to signal that these types could
777be used as slice indexes.
778
779Slicing can't just use the existing \method{__int__} method because
780that method is also used to implement coercion to integers. If
781slicing used \method{__int__}, floating-point numbers would also
782become legal slice indexes and that's clearly an undesirable
783behaviour.
784
785Instead, a new special method called \method{__index__} was added. It
786takes no arguments and returns an integer giving the slice index to
787use. For example:
788
\begin{verbatim}
class C:
    def __index__(self):
        return self.value
\end{verbatim}
794
795The return value must be either a Python integer or long integer.
796The interpreter will check that the type returned is correct, and
797raises a \exception{TypeError} if this requirement isn't met.
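
Here is a slightly fuller sketch showing such an object being used for
slicing and indexing.  The \class{Offset} class is a hypothetical
stand-in for an integer-like type such as NumPy's; it isn't part of
any library.

\begin{verbatim}
import operator

class Offset:
    """A hypothetical integer-like type that can be used as an index."""
    def __init__(self, value):
        self.value = value
    def __index__(self):
        return self.value

letters = ['a', 'b', 'c', 'd', 'e']
start, stop = Offset(1), Offset(4)

print(letters[start:stop])       # same as letters[1:4]
print(letters[Offset(2)])        # same as letters[2]
print(operator.index(stop))      # calls stop.__index__()
\end{verbatim}

\function{operator.index()} is the Python-level way of invoking
\method{__index__}.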
798
799A corresponding \member{nb_index} slot was added to the C-level
800\ctype{PyNumberMethods} structure to let C extensions implement this
801protocol. \cfunction{PyNumber_Index(\var{obj})} can be used in
802extension code to call the \method{__index__} function and retrieve
803its result.
804
805\begin{seealso}
806
807\seepep{357}{Allowing Any Object to be Used for Slicing}{PEP written
Andrew M. Kuchling9c67ee02006-04-04 19:07:27 +0000808and implemented by Travis Oliphant.}
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000809
810\end{seealso}
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000811
812
813%======================================================================
Fred Drake2db76802004-12-01 05:05:47 +0000814\section{Other Language Changes}
815
816Here are all of the changes that Python 2.5 makes to the core Python
817language.
818
819\begin{itemize}
Andrew M. Kuchling1cae3f52004-12-03 14:57:21 +0000820
821\item The \function{min()} and \function{max()} built-in functions
822gained a \code{key} keyword argument analogous to the \code{key}
Andrew M. Kuchlinge9b1bf42005-03-20 19:26:30 +0000823argument for \method{sort()}. This argument supplies a function
Andrew M. Kuchling1cae3f52004-12-03 14:57:21 +0000824that takes a single argument and is called for every value in the list;
825\function{min()}/\function{max()} will return the element with the
826smallest/largest return value from this function.
827For example, to find the longest string in a list, you can do:
828
829\begin{verbatim}
830L = ['medium', 'longest', 'short']
831# Prints 'longest'
832print max(L, key=len)
833# Prints 'short', because lexicographically 'short' has the largest value
834print max(L)
835\end{verbatim}
836
837(Contributed by Steven Bethard and Raymond Hettinger.)
Fred Drake2db76802004-12-01 05:05:47 +0000838
Andrew M. Kuchling150e3492005-08-23 00:56:06 +0000839\item Two new built-in functions, \function{any()} and
840\function{all()}, evaluate whether an iterator contains any true or
841false values. \function{any()} returns \constant{True} if any value
842returned by the iterator is true; otherwise it will return
843\constant{False}. \function{all()} returns \constant{True} only if
844all of the values returned by the iterator evaluate as being true.
Andrew M. Kuchling6e3a66d2006-04-07 12:46:06 +0000845(Suggested by GvR, and implemented by Raymond Hettinger.)
Andrew M. Kuchling150e3492005-08-23 00:56:06 +0000846
Andrew M. Kuchlinge9b1bf42005-03-20 19:26:30 +0000847\item The list of base classes in a class definition can now be empty.
848As an example, this is now legal:
849
\begin{verbatim}
class C():
    pass
\end{verbatim}
854(Implemented by Brett Cannon.)
855
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000856% XXX __missing__ hook in dictionaries
857
Fred Drake2db76802004-12-01 05:05:47 +0000858\end{itemize}
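
The \function{any()} and \function{all()} functions described in the
list above can be tried out with a short sketch like this one; the
sample data is made up for illustration.

\begin{verbatim}
values = [0, '', None, 3]

print(any(values))               # True: 3 is a true value
print(all(values))               # False: 0 is a false value

# Both functions accept any iterable, including generator expressions.
words = ['medium', 'longest', 'short']
print(all(len(w) > 2 for w in words))            # True
print(any(w.startswith('x') for w in words))     # False
\end{verbatim}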
859
860
861%======================================================================
Andrew M. Kuchlingda376042006-03-17 15:56:41 +0000862\subsection{Interactive Interpreter Changes}
863
864In the interactive interpreter, \code{quit} and \code{exit}
865have long been strings so that new users get a somewhat helpful message
866when they try to quit:
867
868\begin{verbatim}
869>>> quit
870'Use Ctrl-D (i.e. EOF) to exit.'
871\end{verbatim}
872
873In Python 2.5, \code{quit} and \code{exit} are now objects that still
874produce string representations of themselves, but are also callable.
875Newbies who try \code{quit()} or \code{exit()} will now exit the
876interpreter as they expect. (Implemented by Georg Brandl.)
877
878
879%======================================================================
Fred Drake2db76802004-12-01 05:05:47 +0000880\subsection{Optimizations}
881
882\begin{itemize}
883
Andrew M. Kuchling150e3492005-08-23 00:56:06 +0000884\item When they were introduced
885in Python 2.4, the built-in \class{set} and \class{frozenset} types
886were built on top of Python's dictionary type.
887In 2.5 the internal data structure has been customized for implementing sets,
888and as a result sets will use a third less memory and are somewhat faster.
889(Implemented by Raymond Hettinger.)
Fred Drake2db76802004-12-01 05:05:47 +0000890
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000891\item The performance of some Unicode operations has been improved.
892% XXX provide details?
893
894\item The code generator's peephole optimizer now performs
895simple constant folding in expressions. If you write something like
896\code{a = 2+3}, the code generator will do the arithmetic and produce
897code corresponding to \code{a = 5}.
898
Fred Drake2db76802004-12-01 05:05:47 +0000899\end{itemize}
900
901The net result of the 2.5 optimizations is that Python 2.5 runs the
Andrew M. Kuchling9c67ee02006-04-04 19:07:27 +0000902pystone benchmark around XXX\% faster than Python 2.4.
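
One way to observe the constant folding mentioned above is to compile
a statement and look at the resulting code object.  This is a sketch
that assumes the CPython implementation; the sum \code{2000 + 3000} is
used instead of \code{2+3} purely for illustration, and the exact
disassembly differs between versions.

\begin{verbatim}
import dis

code = compile('a = 2000 + 3000', '<example>', 'exec')

# After folding, the constant 5000 appears among the code object's
# constants even though it was never written in the source.
print(5000 in code.co_consts)

# The disassembly shows a single load of the folded constant.
dis.dis(code)
\end{verbatim}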
Fred Drake2db76802004-12-01 05:05:47 +0000903
904
905%======================================================================
906\section{New, Improved, and Deprecated Modules}
907
908As usual, Python's standard library received a number of enhancements and
909bug fixes. Here's a partial list of the most notable changes, sorted
910alphabetically by module name. Consult the
911\file{Misc/NEWS} file in the source tree for a more
Andrew M. Kuchlingf688cc52006-03-10 18:50:08 +0000912complete list of changes, or look through the SVN logs for all the
Fred Drake2db76802004-12-01 05:05:47 +0000913details.
914
915\begin{itemize}
916
Andrew M. Kuchling150e3492005-08-23 00:56:06 +0000917% collections.deque now has .remove()
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000918% collections.defaultdict
Andrew M. Kuchling150e3492005-08-23 00:56:06 +0000919
Andrew M. Kuchling3e41b052005-03-01 00:53:46 +0000920% the cPickle module no longer accepts the deprecated None option in the
921% args tuple returned by __reduce__().
922
923% csv module improvements
924
925% datetime.datetime() now has a strptime class method which can be used to
926% create datetime object using a string and format.
927
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000928% fileinput: opening hook used to control how files are opened.
929% .input() now has a mode parameter
930% now has a fileno() function
931% accepts Unicode filenames
932
Andrew M. Kuchlingda376042006-03-17 15:56:41 +0000933\item In the \module{gc} module, the new \function{get_count()} function
934returns a 3-tuple containing the current collection counts for the
935three GC generations. This is accounting information for the garbage
936collector; when these counts reach a specified threshold, a garbage
937collection sweep will be made. The existing \function{gc.collect()}
938function now takes an optional \var{generation} argument of 0, 1, or 2
939to specify which generation to collect.
940
\item The \function{nsmallest()} and
\function{nlargest()} functions in the \module{heapq} module
now support a \code{key} keyword argument similar to the one
provided by the \function{min()}/\function{max()} functions
and the \method{sort()} methods.  For example:
947
948\begin{verbatim}
949>>> import heapq
950>>> L = ["short", 'medium', 'longest', 'longer still']
951>>> heapq.nsmallest(2, L) # Return two lowest elements, lexicographically
952['longer still', 'longest']
953>>> heapq.nsmallest(2, L, key=len) # Return two shortest elements
954['short', 'medium']
955\end{verbatim}
956
957(Contributed by Raymond Hettinger.)
958
Andrew M. Kuchling511a3a82005-03-20 19:52:18 +0000959\item The \function{itertools.islice()} function now accepts
960\code{None} for the start and step arguments. This makes it more
961compatible with the attributes of slice objects, so that you can now write
962the following:
963
964\begin{verbatim}
965s = slice(5) # Create slice object
966itertools.islice(iterable, s.start, s.stop, s.step)
967\end{verbatim}
968
969(Contributed by Raymond Hettinger.)
Andrew M. Kuchling3e41b052005-03-01 00:53:46 +0000970
Andrew M. Kuchling150e3492005-08-23 00:56:06 +0000971\item The \module{operator} module's \function{itemgetter()}
972and \function{attrgetter()} functions now support multiple fields.
973A call such as \code{operator.attrgetter('a', 'b')}
974will return a function
975that retrieves the \member{a} and \member{b} attributes. Combining
976this new feature with the \method{sort()} method's \code{key} parameter
977lets you easily sort lists using multiple fields.
Andrew M. Kuchling6e3a66d2006-04-07 12:46:06 +0000978(Contributed by Raymond Hettinger.)
Andrew M. Kuchling150e3492005-08-23 00:56:06 +0000979
Andrew M. Kuchling3e41b052005-03-01 00:53:46 +0000980
Andrew M. Kuchlinge9b1bf42005-03-20 19:26:30 +0000981\item The \module{os} module underwent a number of changes. The
982\member{stat_float_times} variable now defaults to true, meaning that
983\function{os.stat()} will now return time values as floats. (This
984doesn't necessarily mean that \function{os.stat()} will return times
985that are precise to fractions of a second; not all systems support
986such precision.)
Andrew M. Kuchling3e41b052005-03-01 00:53:46 +0000987
Andrew M. Kuchling150e3492005-08-23 00:56:06 +0000988Constants named \member{os.SEEK_SET}, \member{os.SEEK_CUR}, and
Andrew M. Kuchlinge9b1bf42005-03-20 19:26:30 +0000989\member{os.SEEK_END} have been added; these are the parameters to the
Andrew M. Kuchling150e3492005-08-23 00:56:06 +0000990\function{os.lseek()} function. Two new constants for locking are
991\member{os.O_SHLOCK} and \member{os.O_EXLOCK}.
992
Two new functions, \function{wait3()} and \function{wait4()}, were
added.  They're similar to the \function{waitpid()} function, which waits
for a child process to exit and returns a tuple of the process ID and
996its exit status, but \function{wait3()} and \function{wait4()} return
997additional information. \function{wait3()} doesn't take a process ID
998as input, so it waits for any child process to exit and returns a
9993-tuple of \var{process-id}, \var{exit-status}, \var{resource-usage}
1000as returned from the \function{resource.getrusage()} function.
1001\function{wait4(\var{pid})} does take a process ID.
Andrew M. Kuchling6e3a66d2006-04-07 12:46:06 +00001002(Contributed by Chad J. Schroeder.)
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00001003
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001004On FreeBSD, the \function{os.stat()} function now returns
1005times with nanosecond resolution, and the returned object
1006now has \member{st_gen} and \member{st_birthtime}.
1007The \member{st_flags} member is also available, if the platform supports it.
Andrew M. Kuchling6e3a66d2006-04-07 12:46:06 +00001008(Contributed by Antti Louko and Diego Petten\`o.)
1009% (Patch 1180695, 1212117)
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001010
Andrew M. Kuchling01e3d262006-03-17 15:38:39 +00001011\item The old \module{regex} and \module{regsub} modules, which have been
1012deprecated ever since Python 2.0, have finally been deleted.
Andrew M. Kuchlingf4b06602006-03-17 15:39:52 +00001013Other deleted modules: \module{statcache}, \module{tzparse},
1014\module{whrandom}.
Andrew M. Kuchling01e3d262006-03-17 15:38:39 +00001015
1016\item The \file{lib-old} directory,
1017which includes ancient modules such as \module{dircmp} and
1018\module{ni}, was also deleted. \file{lib-old} wasn't on the default
1019\code{sys.path}, so unless your programs explicitly added the directory to
1020\code{sys.path}, this removal shouldn't affect your code.
1021
Andrew M. Kuchling4678dc82006-01-15 16:11:28 +00001022\item The \module{socket} module now supports \constant{AF_NETLINK}
1023sockets on Linux, thanks to a patch from Philippe Biondi.
1024Netlink sockets are a Linux-specific mechanism for communications
1025between a user-space process and kernel code; an introductory
1026article about them is at \url{http://www.linuxjournal.com/article/7356}.
1027In Python code, netlink addresses are represented as a tuple of 2 integers,
1028\code{(\var{pid}, \var{group_mask})}.
1029
Socket objects also gained the accessor methods \method{getfamily()},
\method{gettype()}, and \method{getproto()} to retrieve the
family, type, and protocol values for the socket.
1033
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001034\item New module: \module{spwd} provides functions for accessing the
1035shadow password database on systems that support it.
1036% XXX give example
Fred Drake2db76802004-12-01 05:05:47 +00001037
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00001038% XXX patch #1382163: sys.subversion, Py_GetBuildNumber()
1039
Andrew M. Kuchlinge9b1bf42005-03-20 19:26:30 +00001040\item The \class{TarFile} class in the \module{tarfile} module now has
Georg Brandl08c02db2005-07-22 18:39:19 +00001041an \method{extractall()} method that extracts all members from the
Andrew M. Kuchlinge9b1bf42005-03-20 19:26:30 +00001042archive into the current working directory. It's also possible to set
1043a different directory as the extraction target, and to unpack only a
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001044subset of the archive's members.
Andrew M. Kuchlinge9b1bf42005-03-20 19:26:30 +00001045
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001046A tarfile's compression can be autodetected by
1047using the mode \code{'r|*'}.
1048% patch 918101
1049(Contributed by Lars Gust\"abel.)
Gregory P. Smithf21a5f72005-08-21 18:45:59 +00001050
Andrew M. Kuchlingf688cc52006-03-10 18:50:08 +00001051\item The \module{unicodedata} module has been updated to use version 4.1.0
1052of the Unicode character database. Version 3.2.0 is required
1053by some specifications, so it's still available as
1054\member{unicodedata.db_3_2_0}.
1055
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00001056% patch #754022: Greatly enhanced webbrowser.py (by Oleg Broytmann).
1057
Fredrik Lundh7e0aef02005-12-12 18:54:55 +00001058
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001059\item The \module{xmlrpclib} module now supports returning
1060 \class{datetime} objects for the XML-RPC date type. Supply
1061 \code{use_datetime=True} to the \function{loads()} function
1062 or the \class{Unmarshaller} class to enable this feature.
Andrew M. Kuchling6e3a66d2006-04-07 12:46:06 +00001063 (Contributed by Skip Montanaro.)
1064% Patch 1120353
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001065
Gregory P. Smithf21a5f72005-08-21 18:45:59 +00001066
Fred Drake114b8ca2005-03-21 05:47:11 +00001067\end{itemize}
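
The \module{itertools} and \module{operator} changes from the list
above can be sketched as follows.  The sample records are invented for
the example, and \function{itemgetter()} is used where
\function{attrgetter()} would serve for objects with attributes.

\begin{verbatim}
import itertools
import operator

# islice() now accepts the None values found on a slice object.
s = slice(5)                     # s.start and s.step are both None
numbers = iter(range(100))
first_five = list(itertools.islice(numbers, s.start, s.stop, s.step))
print(first_five)                # [0, 1, 2, 3, 4]

# itemgetter() with several indexes returns a tuple, which makes a
# convenient multi-field sort key (here: last name, then first name).
records = [('smith', 'bob', 41), ('jones', 'amy', 35), ('jones', 'al', 52)]
records.sort(key=operator.itemgetter(0, 1))
print(records)
\end{verbatim}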
Andrew M. Kuchlinge9b1bf42005-03-20 19:26:30 +00001068
Fred Drake2db76802004-12-01 05:05:47 +00001069
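
The additions to the \module{gc} module can be explored
interactively.  The following is a minimal sketch; the numbers printed
depend entirely on what the interpreter has been doing, so no output
is shown.

\begin{verbatim}
import gc

counts = gc.get_count()          # (count0, count1, count2)
thresholds = gc.get_threshold()  # the limits those counts are compared to
print(counts)
print(thresholds)

# Collect only the youngest generation; the return value is the number
# of unreachable objects found.
found = gc.collect(0)
print(found)
\end{verbatim}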
1070
1071%======================================================================
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00001072% whole new modules get described in subsections here
Fred Drake2db76802004-12-01 05:05:47 +00001073
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001074% XXX new distutils features: upload
1075
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001076\subsection{The ctypes package}
1077
1078The \module{ctypes} package, written by Thomas Heller, has been added
1079to the standard library. \module{ctypes} lets you call arbitrary functions
1080in shared libraries or DLLs.
1081
1082In subsequent alpha releases of Python 2.5, I'll add a brief
1083introduction that shows some basic usage of the module.
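
In the meantime, here is a very small sketch of the idea.  It isn't
the promised introduction; it only assumes that
\code{ctypes.pythonapi}, which exposes the running interpreter's own C
API, is available, which avoids having to name a platform-specific
library.

\begin{verbatim}
import ctypes

# Look up a C function by name in the interpreter's own library.
get_version = ctypes.pythonapi.Py_GetVersion
get_version.restype = ctypes.c_char_p    # the C function returns char *
get_version.argtypes = []                # ... and takes no arguments

version = get_version()
print(version)
\end{verbatim}

Functions in other shared libraries are called the same way once the
library has been loaded, for example with \class{ctypes.CDLL}.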
1084
1085% XXX write introduction
1086
1087
1088\subsection{The ElementTree package}
1089
1090A subset of Fredrik Lundh's ElementTree library for processing XML has
Andrew M. Kuchling16ed5212006-04-10 22:28:11 +00001091been added to the standard library as \module{xmlcore.etree}. The
Georg Brandlce27a062006-04-11 06:27:12 +00001092available modules are
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001093\module{ElementTree}, \module{ElementPath}, and
Andrew M. Kuchling4d8cd892006-04-06 13:03:04 +00001094\module{ElementInclude} from ElementTree 1.2.6.
1095The \module{cElementTree} accelerator module is also included.
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001096
Andrew M. Kuchling16ed5212006-04-10 22:28:11 +00001097The rest of this section will provide a brief overview of using
1098ElementTree. Full documentation for ElementTree is available at
1099\url{http://effbot.org/zone/element-index.htm}.
1100
ElementTree represents an XML document as a tree of element nodes.
The text content of the document is stored as the \member{.text}
and \member{.tail} attributes of each element node.
(This is one of the major differences between ElementTree and
the Document Object Model; in the DOM there are many different
types of node, including \class{TextNode}.)
1107
The most commonly used parsing function is \function{parse()}, which
takes either a string (assumed to contain a filename) or a file-like
object and returns an \class{ElementTree} instance:

\begin{verbatim}
import urllib
from xmlcore.etree import ElementTree as ET

# Parse a file, given its name
tree = ET.parse('ex-1.xml')

# Parse a file-like object
feed = urllib.urlopen(
          'http://planet.python.org/rss10.xml')
tree = ET.parse(feed)
\end{verbatim}
1121
1122Once you have an \class{ElementTree} instance, you
1123can call its \method{getroot()} method to get the root \class{Element} node.
1124
1125There's also an \function{XML()} function that takes a string literal
1126and returns an \class{Element} node (not an \class{ElementTree}).
1127This function provides a tidy way to incorporate XML fragments,
1128approaching the convenience of an XML literal:
1129
\begin{verbatim}
svg = ET.XML("""<svg width="10px" version="1.0">
             </svg>""")
svg.set('height', '320px')
svg.append(elem1)      # elem1 is an existing Element
\end{verbatim}
1136
1137Each XML element supports some dictionary-like and some list-like
Andrew M. Kuchling075e0232006-04-11 13:14:56 +00001138access methods. Dictionary-like operations are used to access attribute
1139values, and list-like operations are used to access child nodes.
Andrew M. Kuchling16ed5212006-04-10 22:28:11 +00001140
Andrew M. Kuchling075e0232006-04-11 13:14:56 +00001141\begin{tableii}{c|l}{code}{Operation}{Result}
1142 \lineii{elem[n]}{Returns n'th child element.}
  \lineii{elem[m:n]}{Returns list of child elements from the m'th up to, but not including, the n'th.}
1144 \lineii{len(elem)}{Returns number of child elements.}
1145 \lineii{elem.getchildren()}{Returns list of child elements.}
1146 \lineii{elem.append(elem2)}{Adds \var{elem2} as a child.}
1147 \lineii{elem.insert(index, elem2)}{Inserts \var{elem2} at the specified location.}
1148 \lineii{del elem[n]}{Deletes n'th child element.}
1149 \lineii{elem.keys()}{Returns list of attribute names.}
1150 \lineii{elem.get(name)}{Returns value of attribute \var{name}.}
1151 \lineii{elem.set(name, value)}{Sets new value for attribute \var{name}.}
1152 \lineii{elem.attrib}{Retrieves the dictionary containing attributes.}
1153 \lineii{del elem.attrib[name]}{Deletes attribute \var{name}.}
1154\end{tableii}
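
The operations in the table can be tried on a small document.  This is
a sketch only: the XML fragment is invented, and the fallback import
of \module{xml.etree} is an assumption made so that the example also
runs on interpreters where the \module{xmlcore} name isn't available.

\begin{verbatim}
try:
    from xmlcore.etree import ElementTree as ET   # name used in this draft
except ImportError:
    from xml.etree import ElementTree as ET       # assumption: fallback name

root = ET.XML('<feed version="1.0"><entry/><entry/></feed>')

# List-like operations work on child elements.
print(len(root))                 # 2
root.append(ET.Element('entry'))
root.insert(0, ET.Element('title'))
print(root[0].tag)               # title
del root[1]                      # removes the first <entry>

# Dictionary-like operations work on attributes.
root.set('lang', 'en')
print(sorted(root.keys()))       # ['lang', 'version']
print(root.get('version'))       # 1.0
del root.attrib['version']
print(root.attrib)               # {'lang': 'en'}
\end{verbatim}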
1155
Comments and processing instructions are also represented as
\class{Element} nodes.  To check if a node is a comment or a processing
instruction:

\begin{verbatim}
if elem.tag is ET.Comment:
    ...
elif elem.tag is ET.ProcessingInstruction:
    ...
\end{verbatim}
Andrew M. Kuchling16ed5212006-04-10 22:28:11 +00001166
1167To generate XML output, you should call the
1168\method{ElementTree.write()} method. Like \function{parse()},
1169it can take either a string or a file-like object:
1170
1171\begin{verbatim}
1172# Encoding is US-ASCII
1173tree.write('output.xml')
1174
1175# Encoding is UTF-8
1176f = open('output.xml', 'w')
1177tree.write(f, 'utf-8')
1178\end{verbatim}
1179
(Caution: the default encoding used for output is ASCII, which isn't
very useful for general XML work because it raises an exception if
there are any characters with values greater than 127.  You should
always specify a different encoding, such as UTF-8, that can handle
any Unicode character.)
1185
Andrew M. Kuchling075e0232006-04-11 13:14:56 +00001186This section is only a partial description of the ElementTree interfaces.
1187Please read the package's official documentation for more details.
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001188
Andrew M. Kuchling16ed5212006-04-10 22:28:11 +00001189\begin{seealso}
1190
1191\seeurl{http://effbot.org/zone/element-index.htm}
1192{Official documentation for ElementTree.}
1193
1194
1195\end{seealso}
1196
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00001197
1198\subsection{The hashlib package}
1199
1200A new \module{hashlib} module has been added to replace the
1201\module{md5} and \module{sha} modules. \module{hashlib} adds support
1202for additional secure hashes (SHA-224, SHA-256, SHA-384, and SHA-512).
When available, the module uses OpenSSL for fast, platform-optimized
implementations of algorithms.
1205
1206The old \module{md5} and \module{sha} modules still exist as wrappers
1207around hashlib to preserve backwards compatibility. The new module's
1208interface is very close to that of the old modules, but not identical.
1209The most significant difference is that the constructor functions
1210for creating new hashing objects are named differently.
1211
1212\begin{verbatim}
1213# Old versions
1214h = md5.md5()
1215h = md5.new()
1216
1217# New version
1218h = hashlib.md5()
1219
1220# Old versions
1221h = sha.sha()
1222h = sha.new()
1223
1224# New version
1225h = hashlib.sha1()
1226
# Hashes that weren't previously available
1228h = hashlib.sha224()
1229h = hashlib.sha256()
1230h = hashlib.sha384()
1231h = hashlib.sha512()
1232
1233# Alternative form
1234h = hashlib.new('md5') # Provide algorithm as a string
1235\end{verbatim}
1236
1237Once a hash object has been created, its methods are the same as before:
1238\method{update(\var{string})} hashes the specified string into the
1239current digest state, \method{digest()} and \method{hexdigest()}
1240return the digest value as a binary string or a string of hex digits,
1241and \method{copy()} returns a new hashing object with the same digest state.
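
A short sketch of these methods in use follows.  The input strings are
arbitrary, and they are encoded explicitly only so that the example
also runs on later Python versions that distinguish text from bytes.

\begin{verbatim}
import hashlib

h = hashlib.sha1()
h.update('abc'.encode('ascii'))

# copy() forks the digest state, so a common prefix is hashed only once.
h2 = h.copy()
h2.update('def'.encode('ascii'))

print(h.hexdigest())             # digest of 'abc'
print(h2.hexdigest())            # digest of 'abcdef'
print(len(h.digest()))           # SHA-1 digests are 20 bytes long
\end{verbatim}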
1242
1243This module was contributed by Gregory P. Smith.
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001244
1245
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001246\subsection{The sqlite3 package}
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00001247
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001248The pysqlite module (\url{http://www.pysqlite.org}), a wrapper for the
1249SQLite embedded database, has been added to the standard library under
1250the package name \module{sqlite3}. SQLite is a C library that
1251provides a SQL-language database that stores data in disk files
1252without requiring a separate server process. pysqlite was written by
1253Gerhard H\"aring, and provides a SQL interface that complies with the
Andrew M. Kuchlingd58baf82006-04-10 21:40:16 +00001254DB-API 2.0 specification described by \pep{249}. This means that it
1255should be possible to write the first version of your applications
1256using SQLite for data storage and, if switching to a larger database
1257such as PostgreSQL or Oracle is necessary, the switch should be
1258relatively easy.
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001259
1260If you're compiling the Python source yourself, note that the source
1261tree doesn't include the SQLite code itself, only the wrapper module.
1262You'll need to have the SQLite libraries and headers installed before
1263compiling Python, and the build process will compile the module when
1264the necessary headers are available.
1265
Andrew M. Kuchlingd58baf82006-04-10 21:40:16 +00001266To use the module, you must first create a \class{Connection} object
1267that represents the database. Here the data will be stored in the
1268\file{/tmp/example} file:
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001269
\begin{verbatim}
import sqlite3

conn = sqlite3.connect('/tmp/example')
\end{verbatim}
1273
1274You can also supply the special name \samp{:memory:} to create
1275a database in RAM.
1276
1277Once you have a \class{Connection}, you can create a \class{Cursor}
1278object and call its \method{execute()} method to perform SQL commands:
1279
1280\begin{verbatim}
1281c = conn.cursor()
1282
1283# Create table
1284c.execute('''create table stocks
1285(date timestamp, trans varchar, symbol varchar,
1286 qty decimal, price decimal)''')
1287
1288# Insert a row of data
1289c.execute("""insert into stocks
1290 values ('2006-01-05','BUY','RHAT',100, 35.14)""")
1291\end{verbatim}

Usually your SQL queries will need to reflect the value of Python
variables. You shouldn't assemble your query using Python's string
operations because doing so is insecure; it makes your program
vulnerable to an SQL injection attack. Instead, use
SQLite's parameter substitution: put \samp{?} as a placeholder
wherever you want to use a value, and then provide a tuple of values
as the second argument to the cursor's \method{execute()} method. The
placeholder must not be enclosed in quotes, or SQLite will treat it as
a literal string. For example:

\begin{verbatim}
# Never do this -- insecure!
symbol = 'IBM'
c.execute("... where symbol = '%s'" % symbol)

# Do this instead
t = (symbol,)
c.execute("... where symbol = ?", t)

# Larger example
for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
          ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
          ('2006-04-06', 'SELL', 'IBM', 500, 53.00),
         ):
    c.execute('insert into stocks values (?,?,?,?,?)', t)
\end{verbatim}
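
To see why string formatting is dangerous, here is a small
self-contained sketch. It uses a cut-down, hypothetical version of the
\code{stocks} table and a hostile input value invented for the
example. The same query is run with string formatting and then with
parameter substitution:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()

# A reduced, hypothetical version of the stocks table.
c.execute('create table stocks (symbol varchar, qty decimal)')
c.execute("insert into stocks values ('IBM', 1000)")
c.execute("insert into stocks values ('RHAT', 100)")

# Hostile input, made up for this sketch.
symbol = "nosuch' or '1'='1"

# String formatting: the quote inside the input ends the SQL literal
# early, so the "or '1'='1'" clause becomes part of the query and
# every row matches.
c.execute("select symbol from stocks where symbol = '%s'" % symbol)
unsafe_rows = c.fetchall()

# Parameter substitution: the whole input is compared as one value,
# so nothing matches.
c.execute('select symbol from stocks where symbol = ?', (symbol,))
safe_rows = c.fetchall()

conn.close()
```

With formatting, the query returns both rows even though no stock has
that symbol. With the \samp{?} placeholder it returns none.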

To retrieve data after executing a SELECT statement, you can either
treat the cursor as an iterator, call the cursor's \method{fetchone()}
method to retrieve a single matching row,
or call \method{fetchall()} to get a list of the matching rows.

This example uses the iterator form:

\begin{verbatim}
>>> c = conn.cursor()
>>> c.execute('select * from stocks order by price')
>>> for row in c:
...     print row
...
(u'2006-01-05', u'BUY', u'RHAT', 100, 35.140000000000001)
(u'2006-03-28', u'BUY', u'IBM', 1000, 45.0)
(u'2006-04-06', u'SELL', u'IBM', 500, 53.0)
(u'2006-04-05', u'BUY', u'MSOFT', 1000, 72.0)
>>>
\end{verbatim}

You should also use parameter substitution with SELECT statements:

\begin{verbatim}
>>> c.execute('select * from stocks where symbol=?', ('IBM',))
>>> print c.fetchall()
[(u'2006-03-28', u'BUY', u'IBM', 1000, 45.0),
 (u'2006-04-06', u'SELL', u'IBM', 500, 53.0)]
\end{verbatim}
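
The \method{fetchone()} form can be sketched in the same way. The
two-column table below is a simplified stand-in for \code{stocks},
invented for this example. Each call returns the next matching row,
and \code{None} once the results are exhausted:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()

# A simplified, hypothetical table with just two columns.
c.execute('create table stocks (symbol varchar, price real)')
for t in (('RHAT', 35.14), ('IBM', 45.0)):
    c.execute('insert into stocks values (?,?)', t)

c.execute('select symbol, price from stocks order by price')
first = c.fetchone()     # row with the lowest price
second = c.fetchone()    # next row in price order
done = c.fetchone()      # None: no rows are left

conn.close()
```

\method{fetchone()} is convenient when you expect a single row or want
to stop reading early without building the full list that
\method{fetchall()} returns.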

For more information about the SQL dialect supported by SQLite, see
\url{http://www.sqlite.org}.

\begin{seealso}

\seeurl{http://www.pysqlite.org}
{The pysqlite web page.}

\seeurl{http://www.sqlite.org}
{The SQLite web page; the documentation describes the syntax and the
available data types for the supported SQL dialect.}

\seepep{249}{Database API Specification 2.0}{PEP written by
Marc-Andr\'e Lemburg.}

\end{seealso}


% ======================================================================
\section{Build and C API Changes}

Changes to Python's build process and to the C API include:

\begin{itemize}

\item The largest change to the C API came from \pep{353},
which modifies the interpreter to use a \ctype{Py_ssize_t} type
definition instead of \ctype{int}. See the earlier
section~\ref{section-353} for a discussion of this change.

\item The design of the bytecode compiler has changed a great deal; it
no longer generates bytecode by traversing the parse tree. Instead
the parse tree is converted to an abstract syntax tree (or AST), and it is
the abstract syntax tree that's traversed to produce the bytecode.

No documentation has been written for the AST code yet. To start
learning about it, read the definition of the various AST nodes in
\file{Parser/Python.asdl}. A Python script reads this file and
generates a set of C structure definitions in
\file{Include/Python-ast.h}. The \cfunction{PyParser_ASTFromString()}
and \cfunction{PyParser_ASTFromFile()} functions, defined in
\file{Include/pythonrun.h}, take Python source as input and return the
root of an AST representing the contents. This AST can then be turned
into a code object by \cfunction{PyAST_Compile()}. For more
information, read the source code, and then ask questions on
python-dev.

% List of names taken from Jeremy's python-dev post at
% http://mail.python.org/pipermail/python-dev/2005-October/057500.html
The AST code was developed under Jeremy Hylton's management, and
implemented by (in alphabetical order) Brett Cannon, Nick Coghlan,
Grant Edwards, John Ehresman, Kurt Kaiser, Neal Norwitz, Tim Peters,
Armin Rigo, and Neil Schemenauer, plus the participants in a number of
AST sprints at conferences such as PyCon.

\item The built-in set types now have an official C API. Call
\cfunction{PySet_New()} and \cfunction{PyFrozenSet_New()} to create a
new set, \cfunction{PySet_Add()} and \cfunction{PySet_Discard()} to
add and remove elements, and \cfunction{PySet_Contains()} and
\cfunction{PySet_Size()} to examine the set's state.

\item The \cfunction{PyRange_New()} function was removed. It was
never documented, never used in the core code, and had dangerously lax
error checking.

\end{itemize}
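
The source-to-AST-to-bytecode flow described in the list above can be
sketched from Python code. This is only an analogue of the C-level
steps, not the C API itself. It assumes the \module{ast} module found
in later Python releases, which this article does not describe, and
the file name \code{<example>} is a placeholder:

```python
import ast

source = "x = 6 * 7\n"

# Step 1: source text -> abstract syntax tree.
tree = ast.parse(source, '<example>', 'exec')

# The root is a Module node whose body holds one node per statement.
root_name = type(tree).__name__
stmt_name = type(tree.body[0]).__name__

# Step 2: AST -> code object; compile() accepts the tree directly.
code = compile(tree, '<example>', 'exec')

# Step 3: execute the resulting code object.
namespace = {}
exec(code, namespace)
```

Inspecting \code{tree} between the first and second steps is a quick
way to see the node types defined in \file{Parser/Python.asdl}.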


%======================================================================
%\subsection{Port-Specific Changes}

%Platform-specific changes go here.


%======================================================================
\section{Other Changes and Fixes \label{section-other}}

As usual, there were a bunch of other improvements and bugfixes
scattered throughout the source tree. A search through the SVN change
logs finds there were XXX patches applied and YYY bugs fixed between
Python 2.4 and 2.5. Both figures are likely to be underestimates.

Some of the more notable changes are:

\begin{itemize}

\item Evan Jones's patch to obmalloc, first described in a talk
at PyCon DC 2005, was applied. Python 2.4 allocated small objects in
256K-sized arenas, but never freed arenas. With this patch, Python
will free arenas when they're empty. The net effect is that on some
platforms, when you allocate many objects, Python's memory usage may
actually drop when you delete them, and the memory may be returned to
the operating system. (Implemented by Evan Jones, and reworked by Tim
Peters.)

\item Coverity, a company that markets a source code analysis tool
  called Prevent, provided the results of their examination of the Python
  source code. The analysis found a number of refcounting bugs, often
  in error-handling code. These bugs have been fixed.
  % XXX provide reference?

\end{itemize}


%======================================================================
\section{Porting to Python 2.5}

This section lists previously described changes that may require
changes to your code:

\begin{itemize}

\item The \module{pickle} module no longer uses the deprecated \var{bin} parameter.

\end{itemize}
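
Code that still passes \var{bin} to \module{pickle} can usually be
ported by passing a protocol number instead. The following is a
minimal sketch with sample data invented for the example. It assumes
the old call asked for the binary format, for which protocol 1 is the
counterpart:

```python
import pickle

# Sample data, made up for this sketch.
data = {'symbol': 'IBM', 'qty': 1000}

# Older code may have requested the binary format with bin=1.
# The protocol argument selects the format instead: protocol 1 is the
# older binary format.
blob = pickle.dumps(data, 1)
restored = pickle.loads(blob)

# pickle.HIGHEST_PROTOCOL picks the newest format available.
newest = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)
```

\function{pickle.loads()} detects the protocol from the data itself,
so only the code that writes pickles needs to change.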


%======================================================================
\section{Acknowledgements \label{acks}}

The author would like to thank the following people for offering
suggestions, corrections and assistance with various drafts of this
article: Mike Rovner, Thomas Wouters.

\end{document}