\documentclass{howto}
\usepackage{distutils}
% $Id$

% Fix XXX comments
% Count up the patches and bugs

\title{What's New in Python 2.5}
\release{0.2}
\author{A.M. Kuchling}
\authoraddress{\email{amk@amk.ca}}

\begin{document}
\maketitle
\tableofcontents

This article explains the new features in Python 2.5. No release date
for Python 2.5 has been set; it will probably be released in the
autumn of 2006. \pep{356} describes the planned release schedule.

Comments, suggestions, and error reports are welcome; please e-mail them
to the author or open a bug in the Python bug tracker.

% XXX Compare with previous release in 2 - 3 sentences here.

This article doesn't attempt to provide a complete specification of
the new features, but instead provides a convenient overview. For
full details, you should refer to the documentation for Python 2.5.
% XXX add hyperlink when the documentation becomes available online.
If you want to understand the complete implementation and design
rationale, refer to the PEP for a particular new feature.


%======================================================================
\section{PEP 308: Conditional Expressions\label{pep-308}}

For a long time, people have been requesting a way to write
conditional expressions, expressions that return value A or value B
depending on whether a Boolean value is true or false. A conditional
expression lets you write a single assignment statement that has the
same effect as the following:

\begin{verbatim}
if condition:
    x = true_value
else:
    x = false_value
\end{verbatim}

There have been endless tedious discussions of syntax on both
python-dev and comp.lang.python. A vote was even held that found the
majority of voters wanted conditional expressions in some form,
but no single syntax was preferred by a clear majority.
Candidates included C's \code{cond ? true_v : false_v},
\code{if cond then true_v else false_v}, and 16 other variations.

Guido van~Rossum eventually chose a surprising syntax:

\begin{verbatim}
x = true_value if condition else false_value
\end{verbatim}

Evaluation is still lazy as in existing Boolean expressions, so the
order of evaluation jumps around a bit. The \var{condition}
expression in the middle is evaluated first, and the \var{true_value}
expression is evaluated only if the condition was true. Similarly,
the \var{false_value} expression is only evaluated when the condition
is false.
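
A minimal sketch of this evaluation order follows. The
\function{note()} helper and the \var{trace} list are made up for
illustration; they simply record which sub-expressions actually run.

\begin{verbatim}
trace = []

def note(label, value):
    # Record that this sub-expression was evaluated, then pass the value on.
    trace.append(label)
    return value

# The condition in the middle runs first; only one of the two outer
# expressions is evaluated afterwards.
x = (note('true_value', 'A') if note('condition', False)
     else note('false_value', 'B'))
\end{verbatim}

Here \var{trace} ends up containing only the condition and the false
branch, so any side effects of the \var{true_value} expression never
happen.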

This syntax may seem strange and backwards; why does the condition go
in the \emph{middle} of the expression, and not in the front as in C's
\code{c ? x : y}? The decision was checked by applying the new syntax
to the modules in the standard library and seeing how the resulting
code read. In many cases where a conditional expression is used, one
value seems to be the ``common case'' and one value is an ``exceptional
case'', used only on rarer occasions when the condition isn't met. The
conditional syntax makes this pattern a bit more obvious:

\begin{verbatim}
contents = ((doc + '\n') if doc else '')
\end{verbatim}

I read the above statement as meaning ``here \var{contents} is
usually assigned a value of \code{doc+'\e n'}; sometimes
\var{doc} is empty, in which special case an empty string is returned.''
I doubt I will use conditional expressions very often where there
isn't a clear common and uncommon case.

There was some discussion of whether the language should require
surrounding conditional expressions with parentheses. The decision
was made to \emph{not} require parentheses in the Python language's
grammar, but as a matter of style I think you should always use them.
Consider these two statements:

\begin{verbatim}
# First version -- no parens
level = 1 if logging else 0

# Second version -- with parens
level = (1 if logging else 0)
\end{verbatim}

In the first version, I think a reader's eye might group the statement
into ``level = 1'', ``if logging'', ``else 0'', and think that the condition
decides whether the assignment to \var{level} is performed. The
second version reads better, in my opinion, because it makes it clear
that the assignment is always performed and the choice is being made
between two values.

Another reason for including the parentheses: a few odd combinations of
list comprehensions and lambdas could look like incorrect conditional
expressions. See \pep{308} for some examples. If you put parentheses
around your conditional expressions, you won't run into this case.


\begin{seealso}

\seepep{308}{Conditional Expressions}{PEP written by
Guido van~Rossum and Raymond D. Hettinger; implemented by Thomas
Wouters.}

\end{seealso}


%======================================================================
\section{PEP 309: Partial Function Application\label{pep-309}}

The \module{functools} module is intended to contain tools for
functional-style programming.

One useful tool in this module is the \function{partial()} function.
For programs written in a functional style, you'll sometimes want to
construct variants of existing functions that have some of the
parameters filled in. Consider a Python function \code{f(a, b, c)};
you could create a new function \code{g(b, c)} that was equivalent to
\code{f(1, b, c)}. This is called ``partial function application''.

\function{partial()} takes the arguments
\code{(\var{function}, \var{arg1}, \var{arg2}, ...
\var{kwarg1}=\var{value1}, \var{kwarg2}=\var{value2})}. The resulting
object is callable, so you can just call it to invoke \var{function}
with the filled-in arguments.

Here's a small but realistic example:

\begin{verbatim}
import functools

def log(message, subsystem):
    "Write the contents of 'message' to the specified subsystem."
    print '%s: %s' % (subsystem, message)
    ...

server_log = functools.partial(log, subsystem='server')
server_log('Unable to open socket')
\end{verbatim}
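
The \code{f(a, b, c)} case mentioned earlier can be sketched the same
way. The functions \function{f()}, \function{g()}, and \function{h()}
below are placeholders for illustration; the sketch fills in a
positional argument and then a keyword argument.

\begin{verbatim}
import functools

def f(a, b, c):
    return (a, b, c)

# g(b, c) behaves like f(1, b, c).
g = functools.partial(f, 1)
result = g(2, 3)

# Keyword arguments can be filled in too; a keyword supplied at call
# time takes precedence over the stored one.
h = functools.partial(f, 1, c=30)
default_c = h(2)
override_c = h(2, c=5)
\end{verbatim}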

Here's another example, from a program that uses PyGTK. Here a
context-sensitive pop-up menu is being constructed dynamically. The
callback provided for the menu option is a partially applied version
of the \method{open_item()} method, where the first argument has been
provided.

\begin{verbatim}
...
class Application:
    def open_item(self, path):
        ...
    def init (self):
        open_func = functools.partial(self.open_item, item_path)
        popup_menu.append( ("Open", open_func, 1) )
\end{verbatim}


Another function in the \module{functools} module is the
\function{update_wrapper(\var{wrapper}, \var{wrapped})} function that
helps you write well-behaved decorators. \function{update_wrapper()}
copies the name, module, and docstring attributes to a wrapper function
so that tracebacks inside the wrapped function are easier to
understand. For example, you might write:

\begin{verbatim}
import functools

def my_decorator(f):
    def wrapper(*args, **kwds):
        print 'Calling decorated function'
        return f(*args, **kwds)
    functools.update_wrapper(wrapper, f)
    return wrapper
\end{verbatim}

\function{wraps()} is a decorator that can be used inside your own
decorators to copy the wrapped function's information. An alternate
version of the previous example would be:

\begin{verbatim}
def my_decorator(f):
    @functools.wraps(f)
    def wrapper(*args, **kwds):
        print 'Calling decorated function'
        return f(*args, **kwds)
    return wrapper
\end{verbatim}
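
To see what gets copied, here is a small sketch that applies such a
decorator and then inspects the result. The \function{greet()}
function is invented for the example, and the wrapper omits the
\keyword{print} so that only the copied attributes are on show.

\begin{verbatim}
import functools

def my_decorator(f):
    @functools.wraps(f)
    def wrapper(*args, **kwds):
        return f(*args, **kwds)
    return wrapper

@my_decorator
def greet(name):
    "Return a greeting for 'name'."
    return 'Hello, ' + name

# Without wraps(), both of these would describe 'wrapper' instead.
copied_name = greet.__name__
copied_doc = greet.__doc__
\end{verbatim}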

\begin{seealso}

\seepep{309}{Partial Function Application}{PEP proposed and written by
Peter Harris; implemented by Hye-Shik Chang and Nick Coghlan, with
adaptations by Raymond Hettinger.}

\end{seealso}


%======================================================================
\section{PEP 314: Metadata for Python Software Packages v1.1\label{pep-314}}

Some simple dependency support was added to Distutils. The
\function{setup()} function now has \code{requires}, \code{provides},
and \code{obsoletes} keyword parameters. When you build a source
distribution using the \code{sdist} command, the dependency
information will be recorded in the \file{PKG-INFO} file.

Another new keyword parameter is \code{download_url}, which should be
set to a URL for the package's source code. This means it's now
possible to look up an entry in the package index, determine the
dependencies for a package, and download the required packages.

\begin{verbatim}
from distutils.core import setup

VERSION = '1.0'
setup(name='PyPackage',
      version=VERSION,
      requires=['numarray', 'zlib (>=1.1.4)'],
      obsoletes=['OldPackage'],
      download_url=('http://www.example.com/pypackage/dist/pkg-%s.tar.gz'
                    % VERSION),
     )
\end{verbatim}

Another new enhancement is that the Python package index at
\url{http://cheeseshop.python.org} now stores source and binary
archives for a package. The new \command{upload} Distutils command
will upload a package to the repository.

Before a package can be uploaded, you must be able to build a
distribution using the \command{sdist} Distutils command. Once that
works, you can run \code{python setup.py upload} to add your package
to the PyPI archive. Optionally you can GPG-sign the package by
supplying the \longprogramopt{sign} and
\longprogramopt{identity} options.

Package uploading was implemented by Martin von~L\"owis and Richard Jones.

\begin{seealso}

\seepep{314}{Metadata for Python Software Packages v1.1}{PEP proposed
and written by A.M. Kuchling, Richard Jones, and Fred Drake;
implemented by Richard Jones and Fred Drake.}

\end{seealso}


%======================================================================
\section{PEP 328: Absolute and Relative Imports\label{pep-328}}

The simpler part of PEP 328 was implemented in Python 2.4: parentheses
could now be used to enclose the names imported from a module using
the \code{from ... import ...} statement, making it easier to import
many different names.

The more complicated part has been implemented in Python 2.5:
importing a module can be specified to use absolute or
package-relative imports. The plan is to move toward making absolute
imports the default in future versions of Python.

Let's say you have a package directory like this:
\begin{verbatim}
pkg/
pkg/__init__.py
pkg/main.py
pkg/string.py
\end{verbatim}

This defines a package named \module{pkg} containing the
\module{pkg.main} and \module{pkg.string} submodules.

Consider the code in the \file{main.py} module. What happens if it
executes the statement \code{import string}? In Python 2.4 and
earlier, it will first look in the package's directory to perform a
relative import, find \file{pkg/string.py}, import the contents of
that file as the \module{pkg.string} module, and bind that module
to the name \samp{string} in the \module{pkg.main} module's namespace.

That's fine if \module{pkg.string} was what you wanted. But what if
you wanted Python's standard \module{string} module? There was no clean
way to ignore \module{pkg.string} and look for the standard module;
generally you had to look at the contents of \code{sys.modules}, which
is slightly unclean.
Holger Krekel's \module{py.std} package provides a tidier way to perform
imports from the standard library, \code{import py ; py.std.string.join()},
but that package isn't available on all Python installations.

Reading code which relies on relative imports is also less clear,
because a reader may be confused about which module, \module{string}
or \module{pkg.string}, is intended to be used. Python users soon
learned not to duplicate the names of standard library modules in the
names of their packages' submodules, but you can't protect against
having your submodule's name being used for a new module added in a
future version of Python.

In Python 2.5, you can switch \keyword{import}'s behaviour to
absolute imports using a \code{from __future__ import absolute_import}
directive. This absolute-import behaviour will become the default in
a future version (probably Python 2.7). Once absolute imports
are the default, \code{import string} will
always find the standard library's version.
It's suggested that users should begin using absolute imports as much
as possible, so it's preferable to begin writing \code{from pkg import
string} in your code.

Relative imports are still possible by adding a leading period
to the module name when using the \code{from ... import} form:

\begin{verbatim}
# Import names from pkg.string
from .string import name1, name2
# Import pkg.string
from . import string
\end{verbatim}

This imports the \module{string} module relative to the current
package, so in \module{pkg.main} this will import \var{name1} and
\var{name2} from \module{pkg.string}. Additional leading periods
perform the relative import starting from the parent of the current
package. For example, code in the \module{A.B.C} module can do:

\begin{verbatim}
from . import D                 # Imports A.B.D
from .. import E                # Imports A.E
from ..F import G               # Imports A.F.G
\end{verbatim}

Leading periods cannot be used with the \code{import \var{modname}}
form of the import statement, only the \code{from ... import} form.
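
The two behaviours can be tried out side by side with a throwaway
package. In this sketch the package name \module{demo_pkg} and its
contents are made up for illustration; the script writes the package
to a temporary directory, imports it, and checks which
\module{string} module each import statement found.

\begin{verbatim}
import os
import shutil
import sys
import tempfile

# Build a small package on disk; 'demo_pkg' is a made-up name.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, 'demo_pkg')
os.mkdir(pkg_dir)

sources = {
    '__init__.py': '',
    # A submodule whose name shadows a standard library module.
    'string.py': "ORIGIN = 'demo_pkg.string'\n",
    'main.py': ("from __future__ import absolute_import\n"
                "import string\n"
                "from . import string as pkg_string\n"),
}
for filename, text in sources.items():
    f = open(os.path.join(pkg_dir, filename), 'w')
    f.write(text)
    f.close()

sys.path.insert(0, root)
try:
    import demo_pkg.main as main
finally:
    sys.path.remove(root)
    shutil.rmtree(root)

std_name = main.string.__name__       # the standard library module
pkg_origin = main.pkg_string.ORIGIN   # the sibling submodule
\end{verbatim}

With the \code{absolute_import} directive in effect, the plain
\code{import string} finds the standard module and the leading-period
form finds the submodule.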

\begin{seealso}

\seepep{328}{Imports: Multi-Line and Absolute/Relative}
{PEP written by Aahz; implemented by Thomas Wouters.}

\seeurl{http://codespeak.net/py/current/doc/index.html}
{The py library by Holger Krekel, which contains the \module{py.std} package.}

\end{seealso}


%======================================================================
\section{PEP 338: Executing Modules as Scripts\label{pep-338}}

The \programopt{-m} switch added in Python 2.4 to execute a module as
a script gained a few more abilities. Instead of being implemented in
C code inside the Python interpreter, the switch now uses an
implementation in a new module, \module{runpy}.

The \module{runpy} module implements a more sophisticated import
mechanism so that it's now possible to run modules in a package such
as \module{pychecker.checker}. The module also supports alternative
import mechanisms such as the \module{zipimport} module. This means
you can add a .zip archive's path to \code{sys.path} and then use the
\programopt{-m} switch to execute code from the archive.
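
The \module{runpy} module can also be called from Python code. The
sketch below uses its \function{run_module()} function on a tiny
module written to a temporary directory; the module name
\module{hello_mod} is made up, and this is only an approximation of
what the \programopt{-m} switch does, not a description of the
interpreter's exact code path.

\begin{verbatim}
import os
import runpy
import shutil
import sys
import tempfile

# Write a one-line module; 'hello_mod' is a made-up name.
root = tempfile.mkdtemp()
f = open(os.path.join(root, 'hello_mod.py'), 'w')
f.write("RUN_AS = __name__\n")
f.close()

sys.path.insert(0, root)
try:
    # Locate the module through the import system and execute it
    # with __name__ set to '__main__', as a script would be.
    namespace = runpy.run_module('hello_mod', run_name='__main__')
finally:
    sys.path.remove(root)
    shutil.rmtree(root)

run_as = namespace['RUN_AS']
\end{verbatim}

\function{run_module()} returns the globals of the executed module,
which is how the sketch can check the name the module ran under.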


\begin{seealso}

\seepep{338}{Executing modules as scripts}{PEP written and
implemented by Nick Coghlan.}

\end{seealso}


%======================================================================
\section{PEP 341: Unified try/except/finally\label{pep-341}}

Until Python 2.5, the \keyword{try} statement came in two
flavours. You could use a \keyword{finally} block to ensure that code
is always executed, or one or more \keyword{except} blocks to catch
specific exceptions. You couldn't combine both \keyword{except} blocks and a
\keyword{finally} block, because generating the right bytecode for the
combined version was complicated and it wasn't clear what the
semantics of the combined statement should be.

Guido van~Rossum spent some time working with Java, which does support the
equivalent of combining \keyword{except} blocks and a
\keyword{finally} block, and this clarified what the statement should
mean. In Python 2.5, you can now write:

\begin{verbatim}
try:
    block-1 ...
except Exception1:
    handler-1 ...
except Exception2:
    handler-2 ...
else:
    else-block
finally:
    final-block
\end{verbatim}

The code in \var{block-1} is executed. If the code raises an
exception, the various \keyword{except} blocks are tested: if the
exception is of class \class{Exception1}, \var{handler-1} is executed;
otherwise if it's of class \class{Exception2}, \var{handler-2} is
executed, and so forth. If no exception is raised, the
\var{else-block} is executed.

No matter what happened previously, the \var{final-block} is executed
once the code block is complete and any raised exceptions handled.
Even if there's an error in an exception handler or the
\var{else-block} and a new exception is raised, the
code in the \var{final-block} is still run.
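
Here is a runnable sketch of that control flow. The
\function{run()} function and the three small actions passed to it
are invented for the example; each block just records its own name so
the order of execution can be inspected afterwards.

\begin{verbatim}
def run(action):
    steps = []
    try:
        steps.append('block-1')
        action()
    except ZeroDivisionError:
        steps.append('handler-1')
    except KeyError:
        steps.append('handler-2')
    else:
        steps.append('else-block')
    finally:
        # Runs on every path, after the handler or the else-block.
        steps.append('final-block')
    return steps

def ok():
    pass

def divide():
    return 1 / 0

def lookup():
    return {}['missing']

no_error = run(ok)
first_handler = run(divide)
second_handler = run(lookup)
\end{verbatim}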

\begin{seealso}

\seepep{341}{Unifying try-except and try-finally}{PEP written by Georg Brandl;
implementation by Thomas Lee.}

\end{seealso}


%======================================================================
\section{PEP 342: New Generator Features\label{pep-342}}

Python 2.5 adds a simple way to pass values \emph{into} a generator.
As introduced in Python 2.3, generators only produce output; once a
generator's code was invoked to create an iterator, there was no way to
pass any new information into the function when its execution was
resumed. Sometimes the ability to pass in some information would be
useful. Hackish solutions to this include making the generator's code
look at a global variable and then changing the global variable's
value, or passing in some mutable object that callers then modify.

To refresh your memory of basic generators, here's a simple example:

\begin{verbatim}
def counter (maximum):
    i = 0
    while i < maximum:
        yield i
        i += 1
\end{verbatim}

When you call \code{counter(10)}, the result is an iterator that
returns the values from 0 up to 9. On encountering the
\keyword{yield} statement, the iterator returns the provided value and
suspends the function's execution, preserving the local variables.
Execution resumes on the following call to the iterator's
\method{next()} method, picking up after the \keyword{yield} statement.

In Python 2.3, \keyword{yield} was a statement; it didn't return any
value. In 2.5, \keyword{yield} is now an expression, returning a
value that can be assigned to a variable or otherwise operated on:

\begin{verbatim}
val = (yield i)
\end{verbatim}

I recommend that you always put parentheses around a \keyword{yield}
expression when you're doing something with the returned value, as in
the above example. The parentheses aren't always necessary, but it's
easier to always add them instead of having to remember when they're
needed.

(\pep{342} explains the exact rules, which are that a
\keyword{yield}-expression must always be parenthesized except when it
occurs as the top-level expression on the right-hand side of an
assignment. This means you can write \code{val = yield i} but have to
use parentheses when there's an operation, as in \code{val = (yield i)
+ 12}.)

Values are sent into a generator by calling its
\method{send(\var{value})} method. The generator's code is then
resumed and the \keyword{yield} expression returns the specified
\var{value}. If the regular \method{next()} method is called, the
\keyword{yield} returns \constant{None}.

Here's the previous example, modified to allow changing the value of
the internal counter.

\begin{verbatim}
def counter (maximum):
    i = 0
    while i < maximum:
        val = (yield i)
        # If value provided, change counter
        if val is not None:
            i = val
        else:
            i += 1
\end{verbatim}

And here's an example of changing the counter:

\begin{verbatim}
>>> it = counter(10)
>>> print it.next()
0
>>> print it.next()
1
>>> print it.send(8)
8
>>> print it.next()
9
>>> print it.next()
Traceback (most recent call last):
  File "t.py", line 15, in ?
    print it.next()
StopIteration
\end{verbatim}

Because \keyword{yield} will often be returning \constant{None}, you
should always check for this case. Don't just use its value in
expressions unless you're sure that the \method{send()} method
will be the only method used to resume your generator function.

In addition to \method{send()}, there are two other new methods on
generators:

\begin{itemize}

  \item \method{throw(\var{type}, \var{value}=None,
  \var{traceback}=None)} is used to raise an exception inside the
  generator; the exception is raised by the \keyword{yield} expression
  where the generator's execution is paused.

  \item \method{close()} raises a new \exception{GeneratorExit}
  exception inside the generator to terminate the iteration.
  On receiving this
  exception, the generator's code must either raise
  \exception{GeneratorExit} or \exception{StopIteration}; catching the
  exception and doing anything else is illegal and will trigger
  a \exception{RuntimeError}. \method{close()} will also be called by
  Python's garbage collector when the generator is garbage-collected.

  If you need to run cleanup code when a \exception{GeneratorExit} occurs,
  I suggest using a \code{try: ... finally:} suite instead of
  catching \exception{GeneratorExit}.

\end{itemize}
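
Both methods can be seen at work in a short sketch. The
\function{reader()} generator and its \var{log} list are made up for
illustration. The generator is started with \code{send(None)}, which
has the same effect as the first \method{next()} call.

\begin{verbatim}
def reader(log):
    try:
        while True:
            try:
                yield 'ready'
            except ValueError:
                # throw(ValueError) surfaces here, at the paused yield.
                log.append('saw ValueError')
    finally:
        # Runs when close() raises GeneratorExit at the paused yield.
        log.append('cleanup')

log = []
gen = reader(log)
first = gen.send(None)           # run up to the first yield
second = gen.throw(ValueError)   # handled inside; the loop yields again
gen.close()                      # triggers the finally clause
\end{verbatim}

The clean-up lives in a \keyword{finally} clause rather than in an
\code{except GeneratorExit} handler, following the suggestion above.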

The cumulative effect of these changes is to turn generators from
one-way producers of information into both producers and consumers.

Generators also become \emph{coroutines}, a more generalized form of
subroutines. Subroutines are entered at one point and exited at
another point (the top of the function, and a \keyword{return}
statement), but coroutines can be entered, exited, and resumed at
many different points (the \keyword{yield} statements). We'll have to
figure out patterns for using coroutines effectively in Python.
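
As one possible pattern, here is a sketch of a generator that both
consumes and produces values: it keeps a running average of the
numbers sent to it. The \function{averager()} name and its behaviour
are my own illustration and are not taken from \pep{342}.

\begin{verbatim}
def averager():
    total = 0.0
    count = 0
    average = None
    while True:
        # Hand back the current average, then wait for the next value.
        value = (yield average)
        if value is not None:
            total += value
            count += 1
            average = total / count

avg = averager()
avg.send(None)                    # advance to the first yield
results = [avg.send(v) for v in (10, 20, 60)]
\end{verbatim}

Each \method{send()} call resumes the generator at the
\keyword{yield}, lets it update its state, and returns the new
average from the next \keyword{yield}.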

The addition of the \method{close()} method has one side effect that
isn't obvious. \method{close()} is called when a generator is
garbage-collected, so this means the generator's code gets one last
chance to run before the generator is destroyed. This last chance
means that \code{try...finally} statements in generators can now be
guaranteed to work; the \keyword{finally} clause will now always get a
chance to run. The syntactic restriction that you couldn't mix
\keyword{yield} statements with a \code{try...finally} suite has
therefore been removed. This seems like a minor bit of language
trivia, but using generators and \code{try...finally} is actually
necessary in order to implement the \keyword{with} statement
described by PEP 343. I'll look at this new statement in the following
section.

Another even more esoteric effect of this change: previously, the
\member{gi_frame} attribute of a generator was always a frame object.
It's now possible for \member{gi_frame} to be \code{None}
once the generator has been exhausted.
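
A quick way to observe this, reusing the simple
\function{counter()} generator from earlier in this section:

\begin{verbatim}
def counter(maximum):
    i = 0
    while i < maximum:
        yield i
        i += 1

it = counter(2)
frame_before = it.gi_frame   # a frame object while the generator is live
values = list(it)            # run the generator to exhaustion
frame_after = it.gi_frame    # None once the generator has finished
\end{verbatim}

Code that inspects \member{gi_frame} should therefore check for
\code{None} before using it.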

\begin{seealso}

\seepep{342}{Coroutines via Enhanced Generators}{PEP written by
Guido van~Rossum and Phillip J. Eby;
implemented by Phillip J. Eby. Includes examples of
some fancier uses of generators as coroutines.}

\seeurl{http://en.wikipedia.org/wiki/Coroutine}{The Wikipedia entry for
coroutines.}

\seeurl{http://www.sidhe.org/\~{}dan/blog/archives/000178.html}{An
explanation of coroutines from a Perl point of view, written by Dan
Sugalski.}

\end{seealso}


595%======================================================================
\section{PEP 343: The 'with' statement\label{pep-343}}

The '\keyword{with}' statement clarifies code that previously would
use \code{try...finally} blocks to ensure that clean-up code is
executed.  In this section, I'll discuss the statement as it will
commonly be used.  In the next section, I'll examine the
implementation details and show how to write objects for use with this
statement.

The '\keyword{with}' statement is a new control-flow structure whose
basic form is:

\begin{verbatim}
with expression [as variable]:
    with-block
\end{verbatim}

The expression is evaluated, and it should result in an object that
supports the context management protocol.  This object may return a
value that can optionally be bound to the name \var{variable}.  (Note
carefully that \var{variable} is \emph{not} assigned the result of
\var{expression}.)  The object can then run set-up code
before \var{with-block} is executed and clean-up code
after the block is done, even if the block raised an exception.

To enable the statement in Python 2.5, you need
to add the following directive to your module:

\begin{verbatim}
from __future__ import with_statement
\end{verbatim}

The statement will always be enabled in Python 2.6.

Some standard Python objects now support the context management
protocol and can be used with the '\keyword{with}' statement.  File
objects are one example:

\begin{verbatim}
with open('/etc/passwd', 'r') as f:
    for line in f:
        print line
        ... more processing code ...
\end{verbatim}

After this statement has executed, the file object in \var{f} will
have been automatically closed, even if the '\keyword{for}' loop
raised an exception part-way through the block.

The \module{threading} module's locks and condition variables
also support the '\keyword{with}' statement:

\begin{verbatim}
lock = threading.Lock()
with lock:
    # Critical section of code
    ...
\end{verbatim}

The lock is acquired before the block is executed and always released once
the block is complete.

The \module{decimal} module's contexts, which encapsulate the desired
precision and rounding characteristics for computations, can be put
into effect for the duration of a block with the module's
\function{localcontext()} function:

\begin{verbatim}
import decimal

# Displays with default precision of 28 digits
v1 = decimal.Decimal('578')
print v1.sqrt()

ctx = decimal.Context(prec=16)
with decimal.localcontext(ctx):
    # All code in this block uses a precision of 16 digits.
    # The original context is restored on exiting the block.
    print v1.sqrt()
\end{verbatim}

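To make the comparison with \code{try...finally} from the start of this
section concrete, here is a minimal sketch of the lock example written
both ways.  The function names and the shared counter are hypothetical
and exist only for illustration; on Python 2.5 the
\code{with_statement} directive shown earlier must also appear at the
top of the module.

```python
import threading

lock = threading.Lock()
counter = [0]   # hypothetical shared state, for illustration only

def increment_old_style():
    # Older idiom: acquire, then guarantee the release with try...finally
    lock.acquire()
    try:
        counter[0] += 1
    finally:
        lock.release()

def increment_with_statement():
    # Same behaviour: the lock is released even if the body raises
    with lock:
        counter[0] += 1

increment_old_style()
increment_with_statement()
print(counter[0])
```

Both functions leave the lock released afterwards; the second simply
moves the acquire/release bookkeeping into the lock object itself.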
\subsection{Writing Context Managers\label{context-managers}}

Under the hood, the '\keyword{with}' statement is fairly complicated.
Most people will only use '\keyword{with}' in company with existing
objects and don't need to know these details, so you can skip the rest
of this section if you like.  Authors of new objects will need to
understand the details of the underlying implementation and should
keep reading.

A high-level explanation of the context management protocol is:

\begin{itemize}

\item The expression is evaluated and should result in an object
called a ``context manager''.  The context manager must have
\method{__enter__()} and \method{__exit__()} methods.

\item The context manager's \method{__enter__()} method is called.  The value
returned is assigned to \var{VAR}.  If no \code{'as \var{VAR}'} clause
is present, the value is simply discarded.

\item The code in \var{BLOCK} is executed.

\item If \var{BLOCK} raises an exception, the
\method{__exit__(\var{type}, \var{value}, \var{traceback})} method is called
with the exception details, the same values returned by
\function{sys.exc_info()}.  The method's return value controls whether
the exception is re-raised: any false value re-raises the exception,
and any true value suppresses it.  You'll only rarely
want to suppress the exception, because if you do
the author of the code containing the
'\keyword{with}' statement will never realize anything went wrong.

\item If \var{BLOCK} didn't raise an exception,
the \method{__exit__()} method is still called,
but \var{type}, \var{value}, and \var{traceback} are all \code{None}.

\end{itemize}
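The steps above can be traced with a small sketch.  The
\class{Tracer} class and its \member{events} list are hypothetical
names invented for this illustration; the class only records the order
in which the protocol methods are called and what they receive.

```python
class Tracer(object):
    """Hypothetical context manager that records each protocol step."""
    def __init__(self, suppress=False):
        self.events = []
        self.suppress = suppress

    def __enter__(self):
        self.events.append('enter')
        return 'value from __enter__'

    def __exit__(self, type, value, traceback):
        # type is None when the block finished normally
        self.events.append(('exit', type))
        return self.suppress    # a true value suppresses the exception

# Normal completion: 'bound' is __enter__'s return value,
# not the Tracer instance itself.
t = Tracer()
with t as bound:
    t.events.append('block')

# A false return value from __exit__ lets the exception propagate.
loud = Tracer()
try:
    with loud:
        raise KeyError('re-raised')
except KeyError:
    loud.events.append('propagated')

# A true return value swallows the exception.
quiet = Tracer(suppress=True)
with quiet:
    raise ValueError('swallowed by __exit__')
```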

Let's think through an example.  I won't present detailed code but
will only sketch the methods necessary for a database that supports
transactions.

(For people unfamiliar with database terminology: a set of changes to
the database is grouped into a transaction.  Transactions can be
either committed, meaning that all the changes are written into the
database, or rolled back, meaning that the changes are all discarded
and the database is unchanged.  See any database textbook for more
information.)
% XXX find a shorter reference?

Let's assume there's an object representing a database connection.
Our goal will be to let the user write code like this:

\begin{verbatim}
db_connection = DatabaseConnection()
with db_connection as cursor:
    cursor.execute('insert into ...')
    cursor.execute('delete from ...')
    # ... more operations ...
\end{verbatim}

The transaction should be committed if the code in the block
runs flawlessly or rolled back if there's an exception.
Here's the basic interface
for \class{DatabaseConnection} that I'll assume:

\begin{verbatim}
class DatabaseConnection:
    # Database interface
    def cursor (self):
        "Returns a cursor object and starts a new transaction"
    def commit (self):
        "Commits current transaction"
    def rollback (self):
        "Rolls back current transaction"
\end{verbatim}

The \method{__enter__()} method is pretty easy, having only to start
a new transaction.  For this application the resulting cursor object
would be a useful result, so the method will return it.  The user can
then add \code{as cursor} to their '\keyword{with}' statement to bind
the cursor to a variable name.

\begin{verbatim}
class DatabaseConnection:
    ...
    def __enter__ (self):
        # Code to start a new transaction
        cursor = self.cursor()
        return cursor
\end{verbatim}

The \method{__exit__()} method is the most complicated because it's
where most of the work has to be done.  The method has to check if an
exception occurred.  If there was no exception, the transaction is
committed.  The transaction is rolled back if there was an exception.

In the code below, execution will just fall off the end of the
function, returning the default value of \code{None}.  \code{None} is
false, so the exception will be re-raised automatically.  If you
wished, you could be more explicit and add a \keyword{return}
statement at the marked location.

\begin{verbatim}
class DatabaseConnection:
    ...
    def __exit__ (self, type, value, tb):
        if tb is None:
            # No exception, so commit
            self.commit()
        else:
            # Exception occurred, so rollback.
            self.rollback()
            # return False
\end{verbatim}

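Putting the pieces of this section together gives something like the
following sketch.  The real database is replaced by a stand-in that
merely logs what it was asked to do; the \member{log} attribute, the
\method{execute()} method, and the choice to let the connection act as
its own cursor are all assumptions made so the example can run on its
own.

```python
class DatabaseConnection(object):
    """Stand-in connection: records calls instead of talking to a database."""
    def __init__(self):
        self.log = []

    # Database interface (simulated)
    def cursor(self):
        self.log.append('begin')
        return self          # this sketch lets the connection act as its cursor

    def execute(self, statement):
        self.log.append(statement)

    def commit(self):
        self.log.append('commit')

    def rollback(self):
        self.log.append('rollback')

    # Context management protocol
    def __enter__(self):
        return self.cursor()

    def __exit__(self, type, value, tb):
        if tb is None:
            self.commit()
        else:
            self.rollback()
        return False         # never suppress the exception

ok = DatabaseConnection()
with ok as cursor:
    cursor.execute('insert into ...')

failed = DatabaseConnection()
try:
    with failed as cursor:
        cursor.execute('delete from ...')
        raise RuntimeError('simulated failure')
except RuntimeError:
    pass
```

After this runs, \code{ok.log} ends with \code{'commit'} and
\code{failed.log} ends with \code{'rollback'}, and the simulated error
still reaches the caller.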
\subsection{The contextlib module\label{module-contextlib}}

The new \module{contextlib} module provides some functions and a
decorator that are useful for writing objects for use with the
'\keyword{with}' statement.

The decorator is called \function{contextmanager}, and lets you write
a single generator function instead of defining a new class.  The generator
should yield exactly one value.  The code up to the \keyword{yield}
will be executed as the \method{__enter__()} method, and the value
yielded will be the method's return value that will get bound to the
variable in the '\keyword{with}' statement's \keyword{as} clause, if
any.  The code after the \keyword{yield} will be executed in the
\method{__exit__()} method.  Any exception raised in the block will be
raised by the \keyword{yield} statement.

Our database example from the previous section could be written
using this decorator as:

\begin{verbatim}
from contextlib import contextmanager

@contextmanager
def db_transaction (connection):
    cursor = connection.cursor()
    try:
        yield cursor
    except:
        connection.rollback()
        raise
    else:
        connection.commit()

db = DatabaseConnection()
with db_transaction(db) as cursor:
    ...
\end{verbatim}
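The order in which the pieces of such a generator run can be checked
with a small sketch.  It imports the decorator as
\function{contextlib.contextmanager}, the name it carries in released
versions of Python; the \function{traced()} generator and the
\code{events} list are hypothetical and serve only to record what
happens.

```python
from contextlib import contextmanager

events = []

@contextmanager
def traced(name):
    events.append('enter ' + name)       # runs as __enter__()
    try:
        yield name.upper()               # value bound by the 'as' clause
    except ValueError:
        events.append('error ' + name)   # the block's exception surfaces here
        raise
    else:
        events.append('exit ' + name)    # runs when the block finishes normally

with traced('a') as value:
    events.append('block ' + value)

try:
    with traced('b'):
        raise ValueError('raised inside the block')
except ValueError:
    events.append('propagated')
```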

The \module{contextlib} module also has a \function{nested(\var{mgr1},
\var{mgr2}, ...)} function that combines a number of context managers so you
don't need to write nested '\keyword{with}' statements.  In this
example, the single '\keyword{with}' statement both starts a database
transaction and acquires a thread lock:

\begin{verbatim}
lock = threading.Lock()
with nested(db_transaction(db), lock) as (cursor, locked):
    ...
\end{verbatim}

Finally, the \function{closing(\var{object})} function
returns \var{object} so that it can be bound to a variable,
and calls \code{\var{object}.close()} at the end of the block.

\begin{verbatim}
import urllib, sys
from contextlib import closing

with closing(urllib.urlopen('http://www.yahoo.com')) as f:
    for line in f:
        sys.stdout.write(line)
\end{verbatim}

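The behaviour of \function{closing()} can also be seen without a
network connection.  In this sketch the \class{Resource} class is a
hypothetical stand-in for any object that offers a \method{close()}
method but doesn't itself support the '\keyword{with}' statement.

```python
from contextlib import closing

class Resource(object):
    """Hypothetical object with close() but no with-statement support."""
    def __init__(self):
        self.closed = False

    def read(self):
        return 'data'

    def close(self):
        self.closed = True

res = Resource()
with closing(res) as r:
    # r is the very object passed to closing()
    data = r.read()

# close() is also called when the block raises an exception
broken = Resource()
try:
    with closing(broken):
        raise IOError('simulated read failure')
except IOError:
    pass
```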
\begin{seealso}

\seepep{343}{The ``with'' statement}{PEP written by Guido van~Rossum
and Nick Coghlan; implemented by Mike Bland, Guido van~Rossum, and
Neal Norwitz.  The PEP shows the code generated for a '\keyword{with}'
statement, which can be helpful in learning how the statement works.}

\seeurl{../lib/module-contextlib.html}{The documentation
for the \module{contextlib} module.}

\end{seealso}


%======================================================================
\section{PEP 352: Exceptions as New-Style Classes\label{pep-352}}

Exception classes can now be new-style classes, not just classic
classes, and the built-in \exception{Exception} class and all the
standard built-in exceptions (\exception{NameError},
\exception{ValueError}, etc.) are now new-style classes.

The inheritance hierarchy for exceptions has been rearranged a bit.
In 2.5, the inheritance relationships are:

\begin{verbatim}
BaseException       # New in Python 2.5
|- KeyboardInterrupt
|- SystemExit
|- Exception
   |- (all other current built-in exceptions)
\end{verbatim}

This rearrangement was done because people often want to catch all
exceptions that indicate program errors.  \exception{KeyboardInterrupt} and
\exception{SystemExit} aren't errors, though, and usually represent an explicit
action such as the user hitting Control-C or code calling
\function{sys.exit()}.  A bare \code{except:} will catch all exceptions,
so you commonly need to list \exception{KeyboardInterrupt} and
\exception{SystemExit} in order to re-raise them.  The usual pattern is:

\begin{verbatim}
try:
    ...
except (KeyboardInterrupt, SystemExit):
    raise
except:
    # Log error...
    # Continue running program...
\end{verbatim}

In Python 2.5, you can now write \code{except Exception} to achieve
the same result, catching all the exceptions that usually indicate errors
but leaving \exception{KeyboardInterrupt} and
\exception{SystemExit} alone.  As in previous versions,
a bare \code{except:} still catches all exceptions.

The goal for Python 3.0 is to require any class raised as an exception
to derive from \exception{BaseException} or some descendant of
\exception{BaseException}, and future releases in the
Python 2.x series may begin to enforce this constraint.  Therefore, I
suggest you begin making all your exception classes derive from
\exception{Exception} now.  It's been suggested that the bare
\code{except:} form should be removed in Python 3.0, but Guido van~Rossum
hasn't decided whether to do this or not.

Raising strings as exceptions, as in the statement \code{raise
"Error occurred"}, is deprecated in Python 2.5 and will trigger a
warning.  The aim is to be able to remove the string-exception feature
in a few releases.

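The effect of the rearranged hierarchy on \code{except Exception} can
be checked directly.  In this sketch the \function{run()} helper and
the two task functions are hypothetical; they stand in for a program
that logs ordinary errors and keeps going, but lets exit requests
through.

```python
# KeyboardInterrupt and SystemExit sit beside Exception, not under it
print(issubclass(SystemExit, Exception))           # False
print(issubclass(KeyboardInterrupt, BaseException))  # True

def run(task):
    """Hypothetical helper: report ordinary errors, let exit requests through."""
    try:
        task()
    except Exception:
        return 'logged'
    return 'ok'

def failing_task():
    raise ValueError('ordinary error')

def exiting_task():
    raise SystemExit(1)

result = run(failing_task)      # caught by 'except Exception'

try:
    run(exiting_task)
    exit_seen = False
except SystemExit:
    exit_seen = True            # SystemExit passed straight through run()
```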
\begin{seealso}

\seepep{352}{Required Superclass for Exceptions}{PEP written by
Brett Cannon and Guido van~Rossum; implemented by Brett Cannon.}

\end{seealso}


%======================================================================
\section{PEP 353: Using ssize_t as the index type\label{pep-353}}

A wide-ranging change to Python's C API, using a new
\ctype{Py_ssize_t} type definition instead of \ctype{int},
will permit the interpreter to handle more data on 64-bit platforms.
This change doesn't affect Python's capacity on 32-bit platforms.

Various pieces of the Python interpreter used C's \ctype{int} type to
store sizes or counts; for example, the number of items in a list or
tuple was stored in an \ctype{int}.  The C compilers for most 64-bit
platforms still define \ctype{int} as a 32-bit type, so that meant
that lists could only hold up to \code{2**31 - 1} = 2147483647 items.
(There are actually a few different programming models that 64-bit C
compilers can use -- see
\url{http://www.unix.org/version2/whatsnew/lp64_wp.html} for a
discussion -- but the most commonly available model leaves \ctype{int}
as 32 bits.)

A limit of 2147483647 items doesn't really matter on a 32-bit platform
because you'll run out of memory before hitting the length limit.
Each list item requires space for a pointer, which is 4 bytes, plus
space for a \ctype{PyObject} representing the item.  2147483647*4 is
already more bytes than a 32-bit address space can contain.

It's possible to address that much memory on a 64-bit platform,
however.  The pointers for a list that size would only require 16~GiB
of space, so it's not unreasonable that Python programmers might
construct lists that large.  Therefore, the Python interpreter had to
be changed to use some type other than \ctype{int}, and this will be a
64-bit type on 64-bit platforms.  The change will cause
incompatibilities on 64-bit machines, so it was deemed worth making
the transition now, while the number of 64-bit users is still
relatively small.  (In 5 or 10 years, we may \emph{all} be on 64-bit
machines, and the transition would be more painful then.)

This change most strongly affects authors of C extension modules.
Python strings and container types such as lists and tuples
now use \ctype{Py_ssize_t} to store their size.
Functions such as \cfunction{PyList_Size()}
now return \ctype{Py_ssize_t}.  Code in extension modules
may therefore need to have some variables changed to
\ctype{Py_ssize_t}.

The \cfunction{PyArg_ParseTuple()} and \cfunction{Py_BuildValue()} functions
have a new conversion code, \samp{n}, for \ctype{Py_ssize_t}.
\cfunction{PyArg_ParseTuple()}'s \samp{s\#} and \samp{t\#} still output
\ctype{int} by default, but you can define the macro
\csimplemacro{PY_SSIZE_T_CLEAN} before including \file{Python.h}
to make them return \ctype{Py_ssize_t}.

\pep{353} has a section on conversion guidelines that
extension authors should read to learn about supporting 64-bit
platforms.

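The size arithmetic used earlier in this section is easy to check.
The sketch below assumes 4-byte pointers on a 32-bit platform, as
stated above, and 8-byte pointers on a 64-bit platform, which is what
the 16~GiB figure implies; the variable names are only illustrative.

```python
max_items = 2**31 - 1            # largest count a signed 32-bit int can hold
GiB = 2.0**30

# 32-bit platform: 4-byte pointers, 2**32 bytes of address space
pointers_32 = max_items * 4
print(pointers_32 / GiB)         # just under 8 GiB for the pointers alone
print(pointers_32 > 2**32)       # True: more than the address space can hold

# 64-bit platform: assumed 8-byte pointers
pointers_64 = max_items * 8
print(pointers_64 / GiB)         # just under 16 GiB
```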
\begin{seealso}

\seepep{353}{Using ssize_t as the index type}{PEP written and implemented by Martin von~L\"owis.}

\end{seealso}


%======================================================================
\section{PEP 357: The '__index__' method\label{pep-357}}

The NumPy developers had a problem that could only be solved by adding
a new special method, \method{__index__}.  When using slice notation,
as in \code{[\var{start}:\var{stop}:\var{step}]}, the values of the
\var{start}, \var{stop}, and \var{step} indexes must all be either
integers or long integers.  NumPy defines a variety of specialized
integer types corresponding to unsigned and signed integers of 8, 16,
32, and 64 bits, but there was no way to signal that these types could
be used as slice indexes.

Slicing can't just use the existing \method{__int__} method because
that method is also used to implement coercion to integers.  If
slicing used \method{__int__}, floating-point numbers would also
become legal slice indexes and that's clearly undesirable
behaviour.

Instead, a new special method called \method{__index__} was added.  It
takes no arguments and returns an integer giving the slice index to
use.  For example:

\begin{verbatim}
class C:
    def __index__ (self):
        return self.value
\end{verbatim}

The return value must be either a Python integer or long integer.
The interpreter will check that the type returned is correct, and
raises a \exception{TypeError} if this requirement isn't met.

A corresponding \member{nb_index} slot was added to the C-level
\ctype{PyNumberMethods} structure to let C extensions implement this
protocol.  \cfunction{PyNumber_Index(\var{obj})} can be used in
extension code to call the \method{__index__} method and retrieve
its result.

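A short sketch shows the difference between \method{__index__} and
\method{__int__} in practice.  The \class{Offset} and \class{Approx}
classes are hypothetical; \function{operator.index()} is the
Python-level way of asking an object for its index value.

```python
import operator

class Offset(object):
    """Hypothetical integer-like type usable as a slice index."""
    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value

class Approx(object):
    """Defines only __int__, so it is rejected as an index."""
    def __int__(self):
        return 2

letters = 'abcdefgh'
piece = letters[Offset(2):Offset(5)]     # same as letters[2:5]
as_int = operator.index(Offset(7))

try:
    letters[Approx():]
    rejected = False
except TypeError:
    rejected = True                      # __int__ alone is not enough
```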
\begin{seealso}

\seepep{357}{Allowing Any Object to be Used for Slicing}{PEP written
and implemented by Travis Oliphant.}

\end{seealso}


%======================================================================
\section{Other Language Changes\label{other-lang}}

Here are all of the changes that Python 2.5 makes to the core Python
language.

\begin{itemize}

\item The \class{dict} type has a new hook for letting subclasses
provide a default value when a key isn't contained in the dictionary.
When a key isn't found, the dictionary's
\method{__missing__(\var{key})}
method will be called.  This hook is used to implement
the new \class{defaultdict} class in the \module{collections}
module.  The following example defines a dictionary
that returns zero for any missing key:

\begin{verbatim}
class zerodict (dict):
    def __missing__ (self, key):
        return 0

d = zerodict({1:1, 2:2})
print d[1], d[2]   # Prints 1 2
print d[3], d[4]   # Prints 0 0
\end{verbatim}

\item Both 8-bit and Unicode strings have new \method{partition(sep)}
and \method{rpartition(sep)} methods that simplify a common use case.
The \method{find(sep)} method is often used to get an index which is
then used to slice the string and obtain the pieces that are before
and after the separator.

\method{partition(sep)} condenses this
pattern into a single method call that returns a 3-tuple containing
the substring before the separator, the separator itself, and the
substring after the separator.  If the separator isn't found, the
first element of the tuple is the entire string and the other two
elements are empty.  \method{rpartition(sep)} also returns a 3-tuple
but starts searching from the end of the string; the \samp{r} stands
for 'reverse'.

Some examples:

\begin{verbatim}
>>> ('http://www.python.org').partition('://')
('http', '://', 'www.python.org')
>>> (u'Subject: a quick question').partition(':')
(u'Subject', u':', u' a quick question')
>>> ('file:/usr/share/doc/index.html').partition('://')
('file:/usr/share/doc/index.html', '', '')
>>> 'www.python.org'.rpartition('.')
('www.python', '.', 'org')
\end{verbatim}

(Implemented by Fredrik Lundh following a suggestion by Raymond Hettinger.)

\item The \function{min()} and \function{max()} built-in functions
gained a \code{key} keyword parameter analogous to the \code{key}
argument for \method{sort()}.  This parameter supplies a function that
takes a single argument and is called for every value in the list;
\function{min()}/\function{max()} will return the element with the
smallest/largest return value from this function.
For example, to find the longest string in a list, you can do:

\begin{verbatim}
L = ['medium', 'longest', 'short']
# Prints 'longest'
print max(L, key=len)
# Prints 'short', because lexicographically 'short' has the largest value
print max(L)
\end{verbatim}

(Contributed by Steven Bethard and Raymond Hettinger.)

\item Two new built-in functions, \function{any()} and
\function{all()}, evaluate whether an iterator contains any true or
false values.  \function{any()} returns \constant{True} if any value
returned by the iterator is true; otherwise it will return
\constant{False}.  \function{all()} returns \constant{True} only if
all of the values returned by the iterator evaluate as being true.
(Suggested by Guido van~Rossum, and implemented by Raymond Hettinger.)

\item ASCII is now the default encoding for modules.  It's now
a syntax error if a module contains string literals with 8-bit
characters but doesn't have an encoding declaration.  In Python 2.4
this triggered a warning, not a syntax error.  See \pep{263}
for how to declare a module's encoding; for example, you might add
a line like this near the top of the source file:

\begin{verbatim}
# -*- coding: latin1 -*-
\end{verbatim}

\item One error that Python programmers sometimes make is forgetting
to include an \file{__init__.py} module in a package directory.
Debugging this mistake can be confusing, and usually requires running
Python with the \programopt{-v} switch to log all the paths searched.
In Python 2.5, a new \exception{ImportWarning} warning is raised when
an import would have picked up a directory as a package but no
\file{__init__.py} was found.  (Implemented by Thomas Wouters.)

\item The list of base classes in a class definition can now be empty.
As an example, this is now legal:

\begin{verbatim}
class C():
    pass
\end{verbatim}
(Implemented by Brett Cannon.)

\end{itemize}
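
Several of the items in the list above can be tried together in one
short sketch.  The helper functions, the \class{CountingDict} class,
and the sample data are hypothetical and exist only to put the new
features side by side with the older idioms they replace.

```python
def split_header_old(line):
    # Older idiom: find() followed by slicing
    i = line.find(':')
    if i == -1:
        return line, '', ''
    return line[:i], ':', line[i+1:]

def split_header_new(line):
    # partition() returns the same three pieces in one call
    return line.partition(':')

header = 'Subject: a quick question'
old = split_header_old(header)
new = split_header_new(header)

words = ['medium', 'longest', 'short']
shortest = min(words, key=len)                 # key= works for min() too
has_long = any(len(w) > 6 for w in words)
all_lower = all(w.islower() for w in words)

class CountingDict(dict):
    # __missing__ is consulted only when the key is absent
    def __missing__(self, key):
        return 0

counts = CountingDict()
for w in words:
    counts[w[0]] = counts[w[0]] + 1            # no KeyError on first use
```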


%======================================================================
\subsection{Interactive Interpreter Changes\label{interactive}}

In the interactive interpreter, \code{quit} and \code{exit}
have long been strings so that new users get a somewhat helpful message
when they try to quit:

\begin{verbatim}
>>> quit
'Use Ctrl-D (i.e. EOF) to exit.'
\end{verbatim}

In Python 2.5, \code{quit} and \code{exit} are now objects that still
produce string representations of themselves, but are also callable.
Newbies who try \code{quit()} or \code{exit()} will now exit the
interpreter as they expect.  (Implemented by Georg Brandl.)

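One way an object can be both a helpful message and a callable is
sketched below.  This is only an illustration of the idea, not
necessarily how the interpreter implements it; the \class{Quitter}
class name, the message text, and the \code{quit_hint} variable are
assumptions.

```python
class Quitter(object):
    """Sketch of an object that is both a readable hint and a callable."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        # Shown when the user types the bare name at the prompt
        return 'Use %s() or Ctrl-D (i.e. EOF) to exit' % self.name

    def __call__(self, code=None):
        # Calling the object ends the interpreter session
        raise SystemExit(code)

quit_hint = Quitter('quit')
message = repr(quit_hint)

try:
    quit_hint()
    exited = False
except SystemExit:
    exited = True
```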
%======================================================================
\subsection{Optimizations\label{opts}}

Several of the optimizations were developed at the NeedForSpeed
sprint, an event held in Reykjavik, Iceland, from May 21--28, 2006.
The sprint focused on speed enhancements to the CPython implementation
and was funded by EWT LLC with local support from CCP Games.  Those
optimizations added at this sprint are specially marked in the
following list.

\begin{itemize}

\item When they were introduced
in Python 2.4, the built-in \class{set} and \class{frozenset} types
were built on top of Python's dictionary type.
In 2.5 the internal data structure has been customized for implementing sets,
and as a result sets will use a third less memory and are somewhat faster.
(Implemented by Raymond Hettinger.)

Andrew M. Kuchling1985ff72006-06-05 00:08:09 +00001195\item The speed of some Unicode operations, such as finding
1196substrings, string splitting, and character map encoding and decoding,
1197has been improved. (Substring search and splitting improvements were
Andrew M. Kuchling150faff2006-05-23 19:29:38 +00001198added by Fredrik Lundh and Andrew Dalke at the NeedForSpeed
Andrew M. Kuchling1985ff72006-06-05 00:08:09 +00001199sprint. Character maps were improved by Walter D\"orwald and
1200Martin von~L\"owis.)
1201% Patch 1313939, 1359618
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00001202
Andrew M. Kuchling3cdf24b2006-05-25 00:23:03 +00001203\item The \function{long(\var{str}, \var{base})} function is now
1204faster on long digit strings because fewer intermediate results are
1205calculated. The peak is for strings of around 800--1000 digits where
1206the function is 6 times faster.
1207(Contributed by Alan McIntyre and committed at the NeedForSpeed sprint.)
1208% Patch 1442927
1209
Andrew M. Kuchling70bd1992006-05-23 19:32:35 +00001210\item The \module{struct} module now compiles structure format
1211strings into an internal representation and caches this
1212representation, yielding a 20\% speedup. (Contributed by Bob Ippolito
1213at the NeedForSpeed sprint.)
1214
Andrew M. Kuchling3b336c72006-06-07 17:03:46 +00001215\item The \module{re} module got a 1 or 2\% speedup by switching to
1216Python's allocator functions instead of the system's
1217\cfunction{malloc()} and \cfunction{free()}.
1218(Contributed by Jack Diederich at the NeedForSpeed sprint.)
1219
\item The code generator's peephole optimizer now performs
simple constant folding in expressions. If you write something like
\code{a = 2+3}, the code generator will do the arithmetic and produce
code corresponding to \code{a = 5}.

\item Function calls are now faster because code objects now keep
the most recently finished frame (a ``zombie frame'') in an internal
field of the code object, reusing it the next time the code object is
invoked. (Original patch by Michael Hudson, modified by Armin Rigo
and Richard Jones; committed at the NeedForSpeed sprint.)
% Patch 876206

Frame objects are also slightly smaller, which may improve cache locality
and reduce memory usage a bit. (Contributed by Neal Norwitz.)
% Patch 1337051

\item Python's built-in exceptions are now new-style classes, a change
that speeds up instantiation considerably. Exception handling in
Python 2.5 is therefore about 30\% faster than in 2.4.
(Contributed by Richard Jones, Georg Brandl and Sean Reifschneider at
the NeedForSpeed sprint.)

\item Importing now caches the paths tried, recording whether
they exist or not so that the interpreter makes fewer
\cfunction{open()} and \cfunction{stat()} calls on startup.
(Contributed by Martin von~L\"owis and Georg Brandl.)
% Patch 921466

\end{itemize}

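The constant folding described in the list above can be observed by
inspecting the constants stored in a compiled function. The following
is only a sketch: the function \code{f} and its numbers are invented
for illustration, and the exact contents of the \member{co_consts}
tuple are an implementation detail that can differ between
interpreter versions.

```python
def f():
    # Written as an addition in the source code...
    a = 1000 + 234
    return a

# ...but the compiler is expected to store the already-computed sum.
# getattr() covers both spellings of the code-object attribute
# (func_code on older interpreters, __code__ on newer ones).
code = getattr(f, '__code__', None) or f.func_code
consts = code.co_consts

print(consts)
```

If folding took place, the value 1234 appears among the printed constants.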
The net result of the 2.5 optimizations is that Python 2.5 runs the
pystone benchmark around XXX\% faster than Python 2.4.


%======================================================================
\section{New, Improved, and Removed Modules\label{modules}}

The standard library received many enhancements and bug fixes in
Python 2.5. Here's a partial list of the most notable changes, sorted
alphabetically by module name. Consult the \file{Misc/NEWS} file in
the source tree for a more complete list of changes, or look through
the SVN logs for all the details.

\begin{itemize}

\item The \module{audioop} module now supports the a-LAW encoding,
and the code for u-LAW encoding has been improved. (Contributed by
Lars Immisch.)

\item The \module{codecs} module gained support for incremental
codecs. The \function{codecs.lookup()} function now
returns a \class{CodecInfo} instance instead of a tuple.
\class{CodecInfo} instances behave like a 4-tuple to preserve backward
compatibility but also have the attributes \member{encode},
\member{decode}, \member{incrementalencoder}, \member{incrementaldecoder},
\member{streamwriter}, and \member{streamreader}. Incremental codecs
can receive input and produce output in multiple chunks; the output is
the same as if the entire input was fed to the non-incremental codec.
See the \module{codecs} module documentation for details.
(Designed and implemented by Walter D\"orwald.)
% Patch 1436130

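As a rough illustration of incremental decoding, the sketch below
feeds a UTF-8 byte string to an incremental decoder in two chunks,
deliberately splitting a two-byte character across the chunk boundary.
The sample text is made up for the example; the decoder is obtained
from the \member{incrementaldecoder} attribute mentioned above.

```python
import codecs

# The UTF-8 encoding of u'caf\xe9' is five bytes long; the last
# character takes two bytes.
data = u'caf\xe9'.encode('utf-8')

# Create an incremental decoder from the CodecInfo object.
info = codecs.lookup('utf-8')
decoder = info.incrementaldecoder()

# Feed the data in two chunks; the first chunk ends in the middle
# of the two-byte character, so the decoder holds that byte back.
first = decoder.decode(data[:4])
second = decoder.decode(data[4:], True)   # True marks the final chunk

text = first + second
```

The joined result matches what decoding the whole byte string at once
would produce, which is the guarantee incremental codecs are meant to
provide.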
\item The \module{collections} module gained a new type,
\class{defaultdict}, that subclasses the standard \class{dict}
type. The new type mostly behaves like a dictionary but constructs a
default value when a key isn't present, automatically adding it to the
dictionary for the requested key value.

The first argument to \class{defaultdict}'s constructor is a factory
function that gets called whenever a key is requested but not found.
This factory function receives no arguments, so you can use built-in
type constructors such as \function{list()} or \function{int()}. For
example,
you can make an index of words based on their initial letter like this:

\begin{verbatim}
from collections import defaultdict

words = """Nel mezzo del cammin di nostra vita
mi ritrovai per una selva oscura
che la diritta via era smarrita""".lower().split()

index = defaultdict(list)

for w in words:
    init_letter = w[0]
    index[init_letter].append(w)
\end{verbatim}

Printing \code{index} results in the following output:

\begin{verbatim}
defaultdict(<type 'list'>, {'c': ['cammin', 'che'], 'e': ['era'],
        'd': ['del', 'di', 'diritta'], 'm': ['mezzo', 'mi'],
        'l': ['la'], 'o': ['oscura'], 'n': ['nel', 'nostra'],
        'p': ['per'], 's': ['selva', 'smarrita'],
        'r': ['ritrovai'], 'u': ['una'], 'v': ['vita', 'via']})
\end{verbatim}

The \class{deque} double-ended queue type supplied by the
\module{collections} module now has a \method{remove(\var{value})}
method that removes the first occurrence of \var{value} in the queue,
raising \exception{ValueError} if the value isn't found.

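A short sketch of the \method{remove()} behaviour described above; the
queue contents are arbitrary sample values.

```python
from collections import deque

queue = deque(['spam', 'eggs', 'ham', 'eggs'])

# Only the first occurrence of the value is removed.
queue.remove('eggs')
remaining = list(queue)

# Removing a value that is not present raises ValueError.
try:
    queue.remove('toast')
except ValueError:
    missing = True
else:
    missing = False
```

After the call, the second \code{'eggs'} entry is still in the queue.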
\item New module: The \module{contextlib} module contains helper functions for use
with the new '\keyword{with}' statement. See
section~\ref{module-contextlib} for more about this module.

\item New module: The \module{cProfile} module is a C implementation of
the existing \module{profile} module that has much lower overhead.
The module's interface is the same as \module{profile}: you run
\code{cProfile.run('main()')} to profile a function, can save profile
data to a file, etc. It's not yet known if the Hotshot profiler,
which is also written in C but doesn't match the \module{profile}
module's interface, will continue to be maintained in future versions
of Python. (Contributed by Armin Rigo.)

Also, the \module{pstats} module for analyzing the data measured by
the profiler now supports directing the output to any file object
by supplying a \var{stream} argument to the \class{Stats} constructor.
(Contributed by Skip Montanaro.)

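The following sketch combines the two features: it profiles a function
with \module{cProfile} and sends the \module{pstats} report to an
in-memory file object through the \var{stream} argument. The
\function{work()} function is a made-up workload, and
\class{io.StringIO} is simply one convenient file-like object; any
object with a \method{write()} method should serve.

```python
import cProfile
import io
import pstats

def work():
    # Hypothetical workload; it only gives the profiler something to measure.
    return sum([i * i for i in range(10000)])

# Profile a single call and keep the function's return value.
profiler = cProfile.Profile()
result = profiler.runcall(work)

# Send the report to an in-memory file object instead of standard output.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats('cumulative').print_stats(5)

text = report.getvalue()
```

The report text can then be logged, written to disk, or searched like
any other string.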
\item The \module{csv} module, which parses files in
comma-separated value format, received several enhancements and a
number of bugfixes. You can now set the maximum size in bytes of a
field by calling the \method{csv.field_size_limit(\var{new_limit})}
function; omitting the \var{new_limit} argument will return the
currently-set limit. The \class{reader} class now has a
\member{line_num} attribute that counts the number of physical lines
read from the source; records can span multiple physical lines, so
\member{line_num} is not the same as the number of records read.
(Contributed by Skip Montanaro and Andrew McNamara.)

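To make the difference between physical lines and records concrete,
here is a small sketch using invented sample data in which one quoted
field contains a newline. It also exercises
\function{csv.field_size_limit()}; the new limit of 1000000 is an
arbitrary value chosen for the example.

```python
import csv

# Three physical lines that make up two records; the quoted field in
# the second record contains a newline.
lines = ['name,comment\n',
         'amk,"spans\n',
         'two lines"\n']

reader = csv.reader(lines)
records = []
for record in reader:
    # line_num counts physical lines consumed so far.
    records.append((reader.line_num, record))

# Set a new field size limit; the call returns the previous limit.
old_limit = csv.field_size_limit(1000000)
current = csv.field_size_limit()      # no argument: just query the limit
csv.field_size_limit(old_limit)       # restore the original setting
```

The second record is reported with a \member{line_num} of 3 even
though it is only the second record read.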
\item The \class{datetime} class in the \module{datetime}
module now has a \method{strptime(\var{string}, \var{format})}
method for parsing date strings, contributed by Josh Spoerri.
It uses the same format characters as \function{time.strptime()} and
\function{time.strftime()}:

\begin{verbatim}
from datetime import datetime

ts = datetime.strptime('10:13:15 2006-03-07',
                       '%H:%M:%S %Y-%m-%d')
\end{verbatim}

\item The \module{doctest} module gained a \code{SKIP} option that
keeps an example from being executed at all. This is intended for
code snippets that serve as usage examples for the reader and
aren't actually test cases.

\item The \module{fileinput} module was made more flexible.
Unicode filenames are now supported, and a \var{mode} parameter that
defaults to \code{"r"} was added to the
\function{input()} function to allow opening files in binary or
universal-newline mode. Another new parameter, \var{openhook},
lets you use a function other than \function{open()}
to open the input files. Once you're iterating over
the set of files, the \class{FileInput} object's new
\method{fileno()} returns the file descriptor for the currently opened file.
(Contributed by Georg Brandl.)

\item In the \module{gc} module, the new \function{get_count()} function
returns a 3-tuple containing the current collection counts for the
three GC generations. This is accounting information for the garbage
collector; when these counts reach a specified threshold, a garbage
collection sweep will be made. The existing \function{gc.collect()}
function now takes an optional \var{generation} argument of 0, 1, or 2
to specify which generation to collect.

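A minimal sketch of the two \module{gc} additions. The actual counts
depend entirely on what the interpreter has allocated, so no
particular values should be expected.

```python
import gc

# Current collection counts for generations 0, 1, and 2.
counts = gc.get_count()

# The thresholds that these counts are compared against.
thresholds = gc.get_threshold()

# Collect only the youngest generation; the return value is the
# number of unreachable objects found.
found = gc.collect(0)

# With no argument, a full collection is performed as before.
found_full = gc.collect()
```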
\item The \function{nsmallest()} and
\function{nlargest()} functions in the \module{heapq} module
now support a \code{key} keyword parameter similar to the one
provided by the \function{min()}/\function{max()} functions
and the \method{sort()} methods. For example:

\begin{verbatim}
>>> import heapq
>>> L = ["short", 'medium', 'longest', 'longer still']
>>> heapq.nsmallest(2, L)  # Return two lowest elements, lexicographically
['longer still', 'longest']
>>> heapq.nsmallest(2, L, key=len)   # Return two shortest elements
['short', 'medium']
\end{verbatim}

(Contributed by Raymond Hettinger.)

\item The \function{itertools.islice()} function now accepts
\code{None} for the start and step arguments. This makes it more
compatible with the attributes of slice objects, so that you can now write
the following:

\begin{verbatim}
import itertools

s = slice(5)     # Create slice object
itertools.islice(iterable, s.start, s.stop, s.step)
\end{verbatim}

(Contributed by Raymond Hettinger.)

\item The \module{mailbox} module underwent a massive rewrite to add
the capability to modify mailboxes in addition to reading them. A new
set of classes that include \class{mbox}, \class{MH}, and
\class{Maildir} are used to read mailboxes, and have an
\method{add(\var{message})} method to add messages,
\method{remove(\var{key})} to remove messages, and
\method{lock()}/\method{unlock()} to lock/unlock the mailbox. The
following example converts a maildir-format mailbox into an mbox-format one:

\begin{verbatim}
import mailbox

# 'factory=None' uses email.Message.Message as the class representing
# individual messages.
src = mailbox.Maildir('maildir', factory=None)
dest = mailbox.mbox('/tmp/mbox')

for msg in src:
    dest.add(msg)
\end{verbatim}

(Contributed by Gregory K. Johnson. Funding was provided by Google's
2005 Summer of Code.)

\item New module: the \module{msilib} module allows creating
Microsoft Installer \file{.msi} files and CAB files. Some support
for reading the \file{.msi} database is also included.
(Contributed by Martin von~L\"owis.)

\item The \module{nis} module now supports accessing domains other
than the system default domain by supplying a \var{domain} argument to
the \function{nis.match()} and \function{nis.maps()} functions.
(Contributed by Ben Bell.)

\item The \module{operator} module's \function{itemgetter()}
and \function{attrgetter()} functions now support multiple fields.
A call such as \code{operator.attrgetter('a', 'b')}
will return a function
that retrieves the \member{a} and \member{b} attributes. Combining
this new feature with the \method{sort()} method's \code{key} parameter
lets you easily sort lists using multiple fields.
(Contributed by Raymond Hettinger.)

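For instance, a list of records can be sorted on two fields at once.
The records below are invented sample data, and the use of a complex
number to demonstrate \function{attrgetter()} is just a convenient
object that happens to have two attributes.

```python
from operator import attrgetter, itemgetter

# Hypothetical records: (last name, first name, age).
people = [('Smith', 'Zoe', 31),
          ('Jones', 'Bob', 45),
          ('Smith', 'Al', 27)]

# With several indexes, itemgetter() returns a tuple of the fields.
name_of = itemgetter(0, 1)
first_key = name_of(people[0])

# Sort by last name, then by first name.
people.sort(key=name_of)

# attrgetter() works the same way for attributes.
parts = attrgetter('real', 'imag')(3 + 4j)
```

Because the key function returns a tuple, ties on the first field are
broken by the second.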
\item The \module{optparse} module was updated to version 1.5.1 of the
Optik library. The \class{OptionParser} class gained an
\member{epilog} attribute, a string that will be printed after the
help message, and a \method{destroy()} method to break reference
cycles created by the object. (Contributed by Greg Ward.)

\item The \module{os} module underwent several changes. The
\member{stat_float_times} variable now defaults to true, meaning that
\function{os.stat()} will now return time values as floats. (This
doesn't necessarily mean that \function{os.stat()} will return times
that are precise to fractions of a second; not all systems support
such precision.)

Constants named \member{os.SEEK_SET}, \member{os.SEEK_CUR}, and
\member{os.SEEK_END} have been added; these are the parameters to the
\function{os.lseek()} function. Two new constants for locking are
\member{os.O_SHLOCK} and \member{os.O_EXLOCK}.

Two new functions, \function{wait3()} and \function{wait4()}, were
added. They're similar to the \function{waitpid()} function, which waits
for a child process to exit and returns a tuple of the process ID and
its exit status, but \function{wait3()} and \function{wait4()} return
additional information. \function{wait3()} doesn't take a process ID
as input, so it waits for any child process to exit and returns a
3-tuple of \var{process-id}, \var{exit-status}, \var{resource-usage}
as returned from the \function{resource.getrusage()} function.
\function{wait4(\var{pid})} does take a process ID.
(Contributed by Chad J. Schroeder.)

On FreeBSD, the \function{os.stat()} function now returns
times with nanosecond resolution, and the returned object
now has \member{st_gen} and \member{st_birthtime}.
The \member{st_flags} member is also available, if the platform supports it.
(Contributed by Antti Louko and Diego Petten\`o.)
% (Patch 1180695, 1212117)

\item The Python debugger provided by the \module{pdb} module
can now store lists of commands to execute when a breakpoint is
reached and execution stops. Once breakpoint \#1 has been created,
enter \samp{commands 1} and enter a series of commands to be executed,
finishing the list with \samp{end}. The command list can include
commands that resume execution, such as \samp{continue} or
\samp{next}. (Contributed by Gr\'egoire Dooms.)
% Patch 790710

\item The \module{pickle} and \module{cPickle} modules no
longer accept a return value of \code{None} from the
\method{__reduce__()} method; the method must return a tuple of
arguments instead. The ability to return \code{None} was deprecated
in Python 2.4, so this completes the removal of the feature.

\item The \module{pkgutil} module, containing various utility
functions for finding packages, was enhanced to support PEP 302's
import hooks and now also works for packages stored in ZIP-format archives.
(Contributed by Phillip J. Eby.)

\item The pybench benchmark suite by Marc-Andr\'e~Lemburg is now
included in the \file{Tools/pybench} directory. The pybench suite is
an improvement on the commonly used \file{pystone.py} program because
pybench provides a more detailed measurement of the interpreter's
speed. It times particular operations such as function calls,
tuple slicing, method lookups, and numeric operations, instead of
performing many different operations and reducing the result to a
single number as \file{pystone.py} does.

\item The old \module{regex} and \module{regsub} modules, which have been
deprecated ever since Python 2.0, have finally been deleted.
Other deleted modules: \module{statcache}, \module{tzparse},
\module{whrandom}.

\item The \file{lib-old} directory,
which included ancient modules such as \module{dircmp} and
\module{ni}, was also removed. \file{lib-old} wasn't on the default
\code{sys.path}, so unless your programs explicitly added the directory to
\code{sys.path}, this removal shouldn't affect your code.

\item The \module{rlcompleter} module is no longer
dependent on importing the \module{readline} module and
therefore now works on non-{\UNIX} platforms.
(Patch from Robert Kiendl.)
% Patch #1472854

\item The \module{SimpleXMLRPCServer} and \module{DocXMLRPCServer}
classes now have a \member{rpc_paths} attribute that constrains
XML-RPC operations to a limited set of URL paths; the default is
to allow only \code{'/'} and \code{'/RPC2'}. Setting
\member{rpc_paths} to \code{None} or an empty tuple disables
this path checking.
% Bug #1473048

\item The \module{socket} module now supports \constant{AF_NETLINK}
sockets on Linux, thanks to a patch from Philippe Biondi.
Netlink sockets are a Linux-specific mechanism for communications
between a user-space process and kernel code; an introductory
article about them is at \url{http://www.linuxjournal.com/article/7356}.
In Python code, netlink addresses are represented as a tuple of 2 integers,
\code{(\var{pid}, \var{group_mask})}.

Two new methods on socket objects, \method{recv_buf(\var{buffer})} and
\method{recvfrom_buf(\var{buffer})}, store the received data in an object
that supports the buffer protocol instead of returning the data as a
string. This means you can put the data directly into an array or a
memory-mapped file.

Socket objects also gained \method{getfamily()}, \method{gettype()},
and \method{getproto()} accessor methods to retrieve the family, type,
and protocol values for the socket.

\item New module: the \module{spwd} module provides functions for
accessing the shadow password database on systems that support
shadow passwords.

\item The \module{struct} module is now faster because it
compiles format strings into \class{Struct} objects
with \method{pack()} and \method{unpack()} methods. This is similar
to how the \module{re} module lets you create compiled regular
expression objects. You can still use the module-level
\function{pack()} and \function{unpack()} functions; they'll create
\class{Struct} objects and cache them. Or you can use
\class{Struct} instances directly:

\begin{verbatim}
import struct

s = struct.Struct('ih3s')

data = s.pack(1972, 187, 'abc')
year, number, name = s.unpack(data)
\end{verbatim}

You can also pack and unpack data to and from buffer objects directly
using the \method{pack_to(\var{buffer}, \var{offset}, \var{v1},
\var{v2}, ...)} and \method{unpack_from(\var{buffer}, \var{offset})}
methods. This lets you store data directly into an array or a
memory-mapped file.

(\class{Struct} objects were implemented by Bob Ippolito at the
NeedForSpeed sprint. Support for buffer objects was added by Martin
Blais, also at the NeedForSpeed sprint.)

\item The Python developers switched from CVS to Subversion during the 2.5
development process. Information about the exact build version is
available as the \code{sys.subversion} variable, a 3-tuple of
\code{(\var{interpreter-name}, \var{branch-name},
\var{revision-range})}. For example, at the time of writing my copy
of 2.5 was reporting \code{('CPython', 'trunk', '45313:45315')}.

This information is also available to C extensions via the
\cfunction{Py_GetBuildInfo()} function that returns a
string of build information like this:
\code{"trunk:45355:45356M, Apr 13 2006, 07:42:19"}.
(Contributed by Barry Warsaw.)

\item The \class{TarFile} class in the \module{tarfile} module now has
an \method{extractall()} method that extracts all members from the
archive into the current working directory. It's also possible to set
a different directory as the extraction target, and to unpack only a
subset of the archive's members.

A tarfile's compression can be autodetected by
using the mode \code{'r|*'}.
% patch 918101
(Contributed by Lars Gust\"abel.)

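The sketch below puts both \module{tarfile} features together: it
builds a small gzip-compressed archive, reopens it with the
autodetecting \code{'r|*'} mode, and unpacks it into a chosen
directory. All file and directory names are invented for the example,
and the scratch directory created by \function{tempfile.mkdtemp()} is
left behind for inspection.

```python
import os
import tarfile
import tempfile

# Build a small gzip-compressed archive in a scratch directory.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, 'hello.txt')
f = open(source, 'w')
f.write('hello\n')
f.close()

archive = os.path.join(workdir, 'example.tar.gz')
tar = tarfile.open(archive, 'w:gz')
tar.add(source, arcname='hello.txt')
tar.close()

# 'r|*' reads the archive as a stream and detects the compression.
target = os.path.join(workdir, 'unpacked')
os.mkdir(target)
tar = tarfile.open(archive, 'r|*')
tar.extractall(path=target)    # extract into 'target', not the cwd
tar.close()

extracted = os.listdir(target)
```

Passing a \var{path} keeps the extracted files out of the current
working directory, which is usually what a script wants.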
\item The \module{unicodedata} module has been updated to use version 4.1.0
of the Unicode character database. Version 3.2.0 is required
by some specifications, so it's still available as
\member{unicodedata.ucd_3_2_0}.

\item The \module{webbrowser} module received a number of
enhancements.
It's now usable as a script with \code{python -m webbrowser}, taking a
URL as the argument; there are a number of switches
to control the behaviour (\programopt{-n} for a new browser window,
\programopt{-t} for a new tab). New module-level functions,
\function{open_new()} and \function{open_new_tab()}, were added
to support this. The module's \function{open()} function supports an
additional feature, an \var{autoraise} parameter that signals whether
to raise the open window when possible. A number of additional
browsers were added to the supported list such as Firefox, Opera,
Konqueror, and elinks. (Contributed by Oleg Broytmann and Georg
Brandl.)
% Patch #754022

\item The \module{xmlrpclib} module now supports returning
 \class{datetime} objects for the XML-RPC date type. Supply
 \code{use_datetime=True} to the \function{loads()} function
 or the \class{Unmarshaller} class to enable this feature.
 (Contributed by Skip Montanaro.)
% Patch 1120353

\item The \module{zlib} module's \class{Compress} and \class{Decompress}
objects now support a \method{copy()} method that makes a copy of the
object's internal state and returns a new
\class{Compress} or \class{Decompress} object.
(Contributed by Chris AtLee.)
% Patch 1435422

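One plausible use of \method{copy()} is compressing several messages
that share a common prefix: the prefix is compressed once, and the
compressor state is copied before each distinct tail is added. The
data below is invented for the sketch, and this is only one way the
method might be applied.

```python
import zlib

# Hypothetical data: two messages that share a long common prefix.
prefix = ('common header ' * 20).encode('ascii')
tail_a = 'first body'.encode('ascii')
tail_b = 'second body'.encode('ascii')

# Compress the shared prefix once.
comp = zlib.compressobj()
head = comp.compress(prefix)

# Copy the compressor state, then finish each stream separately.
comp_copy = comp.copy()
stream_a = head + comp.compress(tail_a) + comp.flush()
stream_b = head + comp_copy.compress(tail_b) + comp_copy.flush()
```

Each resulting stream decompresses to the prefix followed by its own
tail.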
\end{itemize}



%======================================================================
\subsection{The ctypes package\label{module-ctypes}}

The \module{ctypes} package, written by Thomas Heller, has been added
to the standard library. \module{ctypes} lets you call arbitrary functions
in shared libraries or DLLs. Long-time users may remember the
\module{dl} module, which provides functions for loading shared
libraries and calling functions in them. The \module{ctypes} package
is much fancier.

To load a shared library or DLL, you must create an instance of the
\class{CDLL} class and provide the name or path of the shared library
or DLL. Once that's done, you can call arbitrary functions
by accessing them as attributes of the \class{CDLL} object.

\begin{verbatim}
import ctypes

libc = ctypes.CDLL('libc.so.6')
result = libc.printf("Line of output\n")
\end{verbatim}

Type constructors for the various C types are provided: \function{c_int},
\function{c_float}, \function{c_double}, \function{c_char_p}
(equivalent to \ctype{char *}), and so forth. Unlike Python's types,
the C versions are all mutable; you can assign to their \member{value}
attribute to change the wrapped value. Python integers and strings
will be automatically converted to the corresponding C types, but for
other types you must call the correct type constructor. (And I mean
\emph{must}; getting it wrong will often result in the interpreter
crashing with a segmentation fault.)

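A brief sketch of the mutable wrapper objects described above. It
doesn't load any shared library, so it should behave the same on any
platform where \module{ctypes} is available; the numbers are arbitrary.

```python
import ctypes

# Wrap a Python integer in a C int; the wrapper object is mutable.
n = ctypes.c_int(42)
n.value = 7

# Floating-point values need an explicit constructor before they can
# be passed to a C function.
x = ctypes.c_double(2.5)
```

The same \member{value} attribute is used both to read and to change
the wrapped number.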
You shouldn't use \function{c_char_p} with a Python string when the
C function will be modifying the memory area, because Python strings
are supposed to be immutable; breaking this rule will cause puzzling
bugs. When you need a modifiable memory area,
use \function{create_string_buffer()}:

\begin{verbatim}
s = "this is a string"
buf = ctypes.create_string_buffer(s)
libc.strfry(buf)
\end{verbatim}

C functions are assumed to return integers, but you can set
the \member{restype} attribute of the function object to
change this:

\begin{verbatim}
>>> libc.atof('2.71828')
-1783957616
>>> libc.atof.restype = ctypes.c_double
>>> libc.atof('2.71828')
2.71828
\end{verbatim}

\module{ctypes} also provides a wrapper for Python's C API
as the \code{ctypes.pythonapi} object. This object does \emph{not}
release the global interpreter lock before calling a function, because
the lock must be held when calling into the interpreter's code.
There's a \class{py_object()} type constructor that will create a
\ctype{PyObject *} pointer. A simple usage:

\begin{verbatim}
import ctypes

d = {}
ctypes.pythonapi.PyObject_SetItem(ctypes.py_object(d),
          ctypes.py_object("abc"), ctypes.py_object(1))
# d is now {'abc': 1}.
\end{verbatim}

Don't forget to use \class{py_object()}; if it's omitted you end
up with a segmentation fault.

\module{ctypes} has been around for a while, but people still write
and distribute hand-coded extension modules because you can't rely on
\module{ctypes} being present.
Perhaps developers will begin to write
Python wrappers atop a library accessed through \module{ctypes} instead
of extension modules, now that \module{ctypes} is included with core Python.

\begin{seealso}

\seeurl{http://starship.python.net/crew/theller/ctypes/}
{The ctypes web page, with a tutorial, reference, and FAQ.}

\end{seealso}


%======================================================================
\subsection{The ElementTree package\label{module-etree}}

A subset of Fredrik Lundh's ElementTree library for processing XML has
been added to the standard library as \module{xml.etree}. The
available modules are
\module{ElementTree}, \module{ElementPath}, and
\module{ElementInclude} from ElementTree 1.2.6.
The \module{cElementTree} accelerator module is also included.

The rest of this section will provide a brief overview of using
ElementTree. Full documentation for ElementTree is available at
\url{http://effbot.org/zone/element-index.htm}.

ElementTree represents an XML document as a tree of element nodes.
The text content of the document is stored as the \member{.text}
and \member{.tail} attributes of the element nodes.
(This is one of the major differences between ElementTree and
the Document Object Model; in the DOM there are many different
types of node, including \class{TextNode}.)

The most commonly used parsing function is \function{parse()}, which
takes either a string (assumed to contain a filename) or a file-like
object and returns an \class{ElementTree} instance:

\begin{verbatim}
import urllib
from xml.etree import ElementTree as ET

tree = ET.parse('ex-1.xml')

feed = urllib.urlopen(
          'http://planet.python.org/rss10.xml')
tree = ET.parse(feed)
\end{verbatim}

Once you have an \class{ElementTree} instance, you
can call its \method{getroot()} method to get the root \class{Element} node.

1781There's also an \function{XML()} function that takes a string literal
1782and returns an \class{Element} node (not an \class{ElementTree}).
1783This function provides a tidy way to incorporate XML fragments,
1784approaching the convenience of an XML literal:
1785
1786\begin{verbatim}
Andrew M. Kuchlinge3c958c2006-05-01 12:45:02 +00001787svg = ET.XML("""<svg width="10px" version="1.0">
Andrew M. Kuchling16ed5212006-04-10 22:28:11 +00001788 </svg>""")
1789svg.set('height', '320px')
1790svg.append(elem1)
1791\end{verbatim}
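
The two entry points can be compared side by side without touching
the network or the filesystem. This is a sketch only: the XML content
is invented, and an in-memory \class{io.BytesIO} object (an
assumption that requires a Python version providing the \module{io}
module) stands in for a real file.

```python
import io
from xml.etree import ElementTree as ET

# parse() accepts a file-like object and returns an ElementTree.
source = io.BytesIO(b"<feed><title>Example</title><entry/><entry/></feed>")
tree = ET.parse(source)
root = tree.getroot()            # the root Element, here <feed>

# XML() takes a string and returns the root Element directly.
svg = ET.XML('<svg width="10px" version="1.0"></svg>')
svg.set("height", "320px")
svg.append(ET.Element("rect"))   # hypothetical child element
```

Note that \function{parse()} needs the extra \method{getroot()} call,
while \function{XML()} returns the element itself.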

Each XML element supports some dictionary-like and some list-like
access methods. Dictionary-like operations are used to access attribute
values, and list-like operations are used to access child nodes.

\begin{tableii}{c|l}{code}{Operation}{Result}
  \lineii{elem[n]}{Returns n'th child element.}
  \lineii{elem[m:n]}{Returns list of child elements from the m'th up to, but not including, the n'th.}
  \lineii{len(elem)}{Returns number of child elements.}
  \lineii{list(elem)}{Returns list of child elements.}
  \lineii{elem.append(elem2)}{Adds \var{elem2} as a child.}
  \lineii{elem.insert(index, elem2)}{Inserts \var{elem2} at the specified location.}
  \lineii{del elem[n]}{Deletes n'th child element.}
  \lineii{elem.keys()}{Returns list of attribute names.}
  \lineii{elem.get(name)}{Returns value of attribute \var{name}.}
  \lineii{elem.set(name, value)}{Sets new value for attribute \var{name}.}
  \lineii{elem.attrib}{Retrieves the dictionary containing attributes.}
  \lineii{del elem.attrib[name]}{Deletes attribute \var{name}.}
\end{tableii}
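
The operations in the table can be tried out on a small tree. The
document, tag names, and attribute values below are invented for
illustration.

```python
from xml.etree import ElementTree as ET

root = ET.XML('<list kind="todo">'
              '<item>a</item><item>b</item><item>c</item>'
              '</list>')

# List-like operations act on child elements.
first = root[0]
middle = root[1:3]                     # children at index 1 and 2
root.append(ET.Element("item"))        # add a child at the end
root.insert(0, ET.Element("header"))   # add a child at the front
del root[1]                            # remove the original first <item>
child_tags = [child.tag for child in root]

# Dictionary-like operations act on attributes.
root.set("owner", "amk")
names = sorted(root.keys())
kind = root.get("kind")
del root.attrib["kind"]
```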

Comments and processing instructions are also represented as
\class{Element} nodes. To check if a node is a comment or a processing
instruction:

\begin{verbatim}
if elem.tag is ET.Comment:
    ...
elif elem.tag is ET.ProcessingInstruction:
    ...
\end{verbatim}
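
A self-contained sketch of this check follows. It builds the nodes
with the \function{Comment()} and \function{ProcessingInstruction()}
factory functions rather than parsing them from a file; the helper
name \function{kind()} and the sample content are hypothetical.

```python
from xml.etree import ElementTree as ET

root = ET.Element("doc")
root.append(ET.Comment("generated file"))
root.append(ET.ProcessingInstruction("xml-stylesheet", 'href="style.css"'))
root.append(ET.Element("body"))

def kind(elem):
    # The tag of a comment or processing instruction is the factory
    # function itself, so an identity test distinguishes the node types.
    if elem.tag is ET.Comment:
        return "comment"
    elif elem.tag is ET.ProcessingInstruction:
        return "processing instruction"
    return "element"

kinds = [kind(child) for child in root]
```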

To generate XML output, you should call the
\method{ElementTree.write()} method. Like \function{parse()},
it can take either a string containing a filename or a file-like object:

\begin{verbatim}
# Encoding is US-ASCII
tree.write('output.xml')

# Encoding is UTF-8
f = open('output.xml', 'w')
tree.write(f, encoding='utf-8')
f.close()
\end{verbatim}

(Caution: the default encoding used for output is ASCII. For general
XML work, where an element's name may contain arbitrary Unicode
characters, ASCII isn't a very useful encoding because it will raise
an exception if an element's name contains any characters with values
greater than 127. Therefore, it's best to specify a different
encoding such as UTF-8 that can handle any Unicode character.)
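
The effect of choosing UTF-8 can be checked by writing to an
in-memory buffer and reading the bytes back. This is a sketch under
stated assumptions: the sample text is invented, and
\class{io.BytesIO} (not available in every Python version) stands in
for a file opened in binary mode.

```python
import io
from xml.etree import ElementTree as ET

# The text contains one character with a value above 127.
root = ET.Element("p")
root.text = u"caf\u00e9"
tree = ET.ElementTree(root)

# write() accepts any file-like object.
buf = io.BytesIO()
tree.write(buf, encoding="utf-8")
data = buf.getvalue()

# Parsing the written bytes recovers the original text.
roundtrip = ET.fromstring(data).text
```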

This section is only a partial description of the ElementTree interfaces.
Please read the package's official documentation for more details.

\begin{seealso}

\seeurl{http://effbot.org/zone/element-index.htm}
{Official documentation for ElementTree.}

\end{seealso}


%======================================================================
\subsection{The hashlib package\label{module-hashlib}}

A new \module{hashlib} module, written by Gregory P. Smith,
has been added to replace the
\module{md5} and \module{sha} modules. \module{hashlib} adds support
for additional secure hashes (SHA-224, SHA-256, SHA-384, and SHA-512).
When available, the module uses OpenSSL for fast, platform-optimized
implementations of algorithms.

The old \module{md5} and \module{sha} modules still exist as wrappers
around \module{hashlib} to preserve backwards compatibility. The new module's
interface is very close to that of the old modules, but not identical.
The most significant difference is that the constructor functions
for creating new hashing objects are named differently.

\begin{verbatim}
# Old versions
h = md5.md5()
h = md5.new()

# New version
h = hashlib.md5()

# Old versions
h = sha.sha()
h = sha.new()

# New version
h = hashlib.sha1()

# Hashes that weren't previously available
h = hashlib.sha224()
h = hashlib.sha256()
h = hashlib.sha384()
h = hashlib.sha512()

# Alternative form
h = hashlib.new('md5')          # Provide algorithm as a string
\end{verbatim}

Once a hash object has been created, its methods are the same as before:
\method{update(\var{string})} hashes the specified string into the
current digest state, \method{digest()} and \method{hexdigest()}
return the digest value as a binary string or a string of hex digits,
and \method{copy()} returns a new hashing object with the same digest state.
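
The four methods can be exercised together as follows. This is a
sketch rather than part of the module's documentation: the input
strings are invented, and the \code{b"..."} byte-string literals
assume a Python version that accepts them.

```python
import hashlib

# Feeding data in pieces gives the same digest as hashing it at once.
h = hashlib.sha256()
h.update(b"What's New ")
h.update(b"in Python 2.5")
one_shot = hashlib.sha256(b"What's New in Python 2.5")

# copy() snapshots the digest state, so the two objects diverge
# once one of them receives more data.
snapshot = h.copy()
h.update(b"!")

hex_digest = snapshot.hexdigest()   # string of hex digits
raw_digest = snapshot.digest()      # binary string
```

Copying is useful when several inputs share a common prefix, because
the prefix only has to be hashed once.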


%======================================================================
\subsection{The sqlite3 package\label{module-sqlite}}

The pysqlite module (\url{http://www.pysqlite.org}), a wrapper for the
SQLite embedded database, has been added to the standard library under
the package name \module{sqlite3}.

SQLite is a C library that provides a SQL-language database that
stores data in disk files without requiring a separate server process.
pysqlite was written by Gerhard H\"aring and provides a SQL interface
compliant with the DB-API 2.0 specification described by
\pep{249}. This means that it should be possible to write the first
version of your applications using SQLite for data storage. If
switching to a larger database such as PostgreSQL or Oracle is
later necessary, the switch should be relatively easy.

If you're compiling the Python source yourself, note that the source
tree doesn't include the SQLite code, only the wrapper module.
You'll need to have the SQLite libraries and headers installed before
compiling Python, and the build process will compile the module when
the necessary headers are available.

To use the module, you must first create a \class{Connection} object
that represents the database. Here the data will be stored in the
\file{/tmp/example} file:

\begin{verbatim}
import sqlite3

conn = sqlite3.connect('/tmp/example')
\end{verbatim}

You can also supply the special name \samp{:memory:} to create
a database in RAM.

Once you have a \class{Connection}, you can create a \class{Cursor}
object and call its \method{execute()} method to perform SQL commands:

\begin{verbatim}
c = conn.cursor()

# Create table
c.execute('''create table stocks
(date timestamp, trans varchar, symbol varchar,
 qty decimal, price decimal)''')

# Insert a row of data
c.execute("""insert into stocks
          values ('2006-01-05','BUY','RHAT',100,35.14)""")
\end{verbatim}

Usually your SQL operations will need to use values from Python
variables. You shouldn't assemble your query using Python's string
operations because doing so is insecure; it makes your program
vulnerable to an SQL injection attack.

Instead, use the DB-API's parameter substitution. Put \samp{?} as a
placeholder wherever you want to use a value, and then provide a tuple
of values as the second argument to the cursor's \method{execute()}
method. (Other database modules may use a different placeholder,
such as \samp{\%s} or \samp{:1}.) For example:

\begin{verbatim}
# Never do this -- insecure!
symbol = 'IBM'
c.execute("... where symbol = '%s'" % symbol)

# Do this instead
t = (symbol,)
c.execute('select * from stocks where symbol=?', t)

# Larger example
for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
          ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
          ('2006-04-06', 'SELL', 'IBM', 500, 53.00),
         ):
    c.execute('insert into stocks values (?,?,?,?,?)', t)
\end{verbatim}
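
To see why parameter substitution is safer, it helps to pass a
deliberately hostile value. The sketch below uses a throwaway
\samp{:memory:} database; the simplified column types and the hostile
string are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")      # throwaway in-RAM database
c = conn.cursor()
c.execute("create table stocks (date text, trans text, symbol text, "
          "qty real, price real)")

rows = [("2006-03-28", "BUY", "IBM", 1000, 45.00),
        ("2006-04-05", "BUY", "MSOFT", 1000, 72.00),
        ("2006-04-06", "SELL", "IBM", 500, 53.00)]
# executemany() runs one parameterized statement once per tuple.
c.executemany("insert into stocks values (?,?,?,?,?)", rows)
conn.commit()

# A hostile value is compared as plain data, not interpreted as SQL.
hostile = "IBM' or '1'='1"
c.execute("select * from stocks where symbol=?", (hostile,))
hostile_matches = c.fetchall()

c.execute("select * from stocks where symbol=?", ("IBM",))
ibm_matches = c.fetchall()
```

Had the hostile string been interpolated with \code{\%}, the
\code{or '1'='1'} clause would have become part of the query and
matched every row.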

To retrieve data after executing a SELECT statement, you can either
treat the cursor as an iterator, call the cursor's \method{fetchone()}
method to retrieve a single matching row,
or call \method{fetchall()} to get a list of the matching rows.

This example uses the iterator form:

\begin{verbatim}
>>> c = conn.cursor()
>>> c.execute('select * from stocks order by price')
>>> for row in c:
...    print row
...
(u'2006-01-05', u'BUY', u'RHAT', 100, 35.140000000000001)
(u'2006-03-28', u'BUY', u'IBM', 1000, 45.0)
(u'2006-04-06', u'SELL', u'IBM', 500, 53.0)
(u'2006-04-05', u'BUY', u'MSOFT', 1000, 72.0)
>>>
\end{verbatim}
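
The other two retrieval styles, \method{fetchone()} and
\method{fetchall()}, can be sketched the same way. The two-column
table here is a simplified stand-in invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("create table stocks (symbol text, price real)")
c.executemany("insert into stocks values (?,?)",
              [("RHAT", 35.14), ("IBM", 45.0), ("MSOFT", 72.0)])

c.execute("select symbol, price from stocks order by price")
cheapest = c.fetchone()     # first matching row, as a tuple
remaining = c.fetchall()    # list of the rows not yet fetched
exhausted = c.fetchone()    # None once no rows are left
conn.close()
```

\method{fetchall()} returns only the rows that have not already been
consumed, so mixing the two calls on one cursor splits the result set.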

For more information about the SQL dialect supported by SQLite, see
\url{http://www.sqlite.org}.

\begin{seealso}

\seeurl{http://www.pysqlite.org}
{The pysqlite web page.}

\seeurl{http://www.sqlite.org}
{The SQLite web page; the documentation describes the syntax and the
available data types for the supported SQL dialect.}

\seepep{249}{Database API Specification 2.0}{PEP written by
Marc-Andr\'e Lemburg.}

\end{seealso}


% ======================================================================
\section{Build and C API Changes\label{build-api}}

Changes to Python's build process and to the C API include:

\begin{itemize}

\item The largest change to the C API came from \pep{353},
which modifies the interpreter to use a \ctype{Py_ssize_t} type
definition instead of \ctype{int}. See the earlier
section~\ref{pep-353} for a discussion of this change.

\item The design of the bytecode compiler has changed a great deal; it
no longer generates bytecode by traversing the parse tree. Instead
the parse tree is converted to an abstract syntax tree (or AST), and it is
the abstract syntax tree that's traversed to produce the bytecode.

It's possible for Python code to obtain AST objects by using the
\function{compile()} built-in and specifying \code{_ast.PyCF_ONLY_AST}
as the value of the
\var{flags} parameter:

\begin{verbatim}
from _ast import PyCF_ONLY_AST
ast = compile("""a=0
for i in range(10):
    a += i
""", "<string>", 'exec', PyCF_ONLY_AST)

assignment = ast.body[0]
for_loop = ast.body[1]
\end{verbatim}

No documentation has been written for the AST code yet. To start
learning about it, read the definition of the various AST nodes in
\file{Parser/Python.asdl}. A Python script reads this file and
generates a set of C structure definitions in
\file{Include/Python-ast.h}. The \cfunction{PyParser_ASTFromString()}
and \cfunction{PyParser_ASTFromFile()} functions, declared in
\file{Include/pythonrun.h}, take Python source as input and return the
root of an AST representing the contents. This AST can then be turned
into a code object by \cfunction{PyAST_Compile()}. For more
information, read the source code, and then ask questions on
python-dev.

% List of names taken from Jeremy's python-dev post at
% http://mail.python.org/pipermail/python-dev/2005-October/057500.html
The AST code was developed under Jeremy Hylton's management, and
implemented by (in alphabetical order) Brett Cannon, Nick Coghlan,
Grant Edwards, John Ehresman, Kurt Kaiser, Neal Norwitz, Tim Peters,
Armin Rigo, and Neil Schemenauer, plus the participants in a number of
AST sprints at conferences such as PyCon.

\item The built-in set types now have an official C API. Call
\cfunction{PySet_New()} and \cfunction{PyFrozenSet_New()} to create a
new set, \cfunction{PySet_Add()} and \cfunction{PySet_Discard()} to
add and remove elements, and \cfunction{PySet_Contains()} and
\cfunction{PySet_Size()} to examine the set's state.
(Contributed by Raymond Hettinger.)

\item C code can now obtain information about the exact revision
of the Python interpreter by calling the
\cfunction{Py_GetBuildInfo()} function, which returns a
string of build information like this:
\code{"trunk:45355:45356M, Apr 13 2006, 07:42:19"}.
(Contributed by Barry Warsaw.)

\item Two new macros can be used to indicate C functions that are
local to the current file so that a faster calling convention can be
used. \cfunction{Py_LOCAL(\var{type})} declares the function as
returning a value of the specified \var{type} and uses a fast-calling
qualifier. \cfunction{Py_LOCAL_INLINE(\var{type})} does the same thing
and also requests the function be inlined. If
\cfunction{PY_LOCAL_AGGRESSIVE} is defined before \file{Python.h} is
included, a set of more aggressive optimizations is enabled for the
module; you should benchmark the results to find out if these
optimizations actually make the code faster. (Contributed by Fredrik
Lundh at the NeedForSpeed sprint.)

\item \cfunction{PyErr_NewException(\var{name}, \var{base},
\var{dict})} can now accept a tuple of base classes as its \var{base}
argument. (Contributed by Georg Brandl.)

\item The CPython interpreter is still written in C, but
the code can now be compiled with a {\Cpp} compiler without errors.
(Implemented by Anthony Baxter, Martin von~L\"owis, and Skip Montanaro.)

\item The \cfunction{PyRange_New()} function was removed. It was
never documented, never used in the core code, and had dangerously lax
error checking.

\end{itemize}
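
The \function{compile()} example in the list above can be extended
into a complete round trip. This is a sketch under assumptions not
covered by this article: it imports the flag from the \module{ast}
module and passes the tree back to \function{compile()}, both of
which require a later Python version than 2.5.

```python
import ast

source = """a=0
for i in range(10):
    a += i
"""

# Passing PyCF_ONLY_AST makes compile() return the tree instead of
# a code object.
tree = compile(source, "<string>", "exec", ast.PyCF_ONLY_AST)
node_types = [type(node).__name__ for node in tree.body]

# The tree can be handed back to compile() to obtain runnable code.
namespace = {}
exec(compile(tree, "<string>", "exec"), namespace)
```

Each entry in \code{tree.body} corresponds to one top-level
statement of the source string.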


%======================================================================
\subsection{Port-Specific Changes\label{ports}}

\begin{itemize}

\item MacOS X (10.3 and higher): dynamic loading of modules
now uses the \cfunction{dlopen()} function instead of MacOS-specific
functions.

\item MacOS X: a \longprogramopt{enable-universalsdk} switch was added
to the \program{configure} script that compiles the interpreter as a
universal binary able to run on both PowerPC and Intel processors.
(Contributed by Ronald Oussoren.)

\item Windows: \file{.dll} is no longer supported as a filename extension for
extension modules. \file{.pyd} is now the only filename extension that will
be searched for.

\end{itemize}


%======================================================================
\section{Other Changes and Fixes \label{section-other}}

As usual, there were a bunch of other improvements and bugfixes
scattered throughout the source tree. A search through the SVN change
logs finds there were XXX patches applied and YYY bugs fixed between
Python 2.4 and 2.5. Both figures are likely to be underestimates.

Some of the more notable changes are:

\begin{itemize}

\item Evan Jones's patch to obmalloc, first described in a talk
at PyCon DC 2005, was applied. Python 2.4 allocated small objects in
256K-sized arenas, but never freed arenas. With this patch, Python
will free arenas when they're empty. The net effect is that on some
platforms, when you allocate many objects, Python's memory usage may
actually drop when you delete them, and the memory may be returned to
the operating system. (Implemented by Evan Jones, and reworked by Tim
Peters.)

Note that this change means extension modules need to be more careful
with how they allocate memory. Python's API has many different
functions for allocating memory that are grouped into families. For
example, \cfunction{PyMem_Malloc()}, \cfunction{PyMem_Realloc()}, and
\cfunction{PyMem_Free()} are one family that allocates raw memory,
while \cfunction{PyObject_Malloc()}, \cfunction{PyObject_Realloc()},
and \cfunction{PyObject_Free()} are another family that's supposed to
be used for creating Python objects.

Previously these different families all reduced to the platform's
\cfunction{malloc()} and \cfunction{free()} functions. This meant
it didn't matter if you got things wrong and allocated memory with the
\cfunction{PyMem} function but freed it with the \cfunction{PyObject}
function. With the obmalloc change, these families now do different
things, and mismatches will probably result in a segfault. You should
carefully test your C extension modules with Python 2.5.

\item Coverity, a company that markets a source code analysis tool
  called Prevent, provided the results of their examination of the Python
  source code. The analysis found about 60 bugs that
  were quickly fixed. Many of the bugs were refcounting problems, often
  occurring in error-handling code. See
  \url{http://scan.coverity.com} for the statistics.

\end{itemize}


%======================================================================
\section{Porting to Python 2.5\label{porting}}

This section lists previously described changes that may require
changes to your code:

\begin{itemize}

\item ASCII is now the default encoding for modules. It's now
a syntax error if a module contains string literals with 8-bit
characters but doesn't have an encoding declaration. In Python 2.4
this triggered a warning, not a syntax error.

\item Previously, the \member{gi_frame} attribute of a generator
was always a frame object. Because of the \pep{342} changes
described in section~\ref{pep-342}, it's now possible
for \member{gi_frame} to be \code{None}.

\item Library: The \module{pickle} and \module{cPickle} modules no
longer accept a return value of \code{None} from the
\method{__reduce__()} method; the method must return a tuple of
arguments instead. The modules also no longer accept the deprecated
\var{bin} keyword parameter.

\item Library: The \module{SimpleXMLRPCServer} and \module{DocXMLRPCServer}
classes now have a \member{rpc_paths} attribute that constrains
XML-RPC operations to a limited set of URL paths; the default is
to allow only \code{'/'} and \code{'/RPC2'}. Setting
\member{rpc_paths} to \code{None} or an empty tuple disables
this path checking.

\item C API: Many functions now use \ctype{Py_ssize_t}
instead of \ctype{int} to allow processing more data on 64-bit
machines. Extension code may need to make the same change to avoid
warnings and to support 64-bit machines. See the earlier
section~\ref{pep-353} for a discussion of this change.

\item C API:
The obmalloc changes mean that
you must be careful to not mix usage
of the \cfunction{PyMem_*()} and \cfunction{PyObject_*()}
families of functions. Memory allocated with
one family's \cfunction{*_Malloc()} must be
freed with the corresponding family's \cfunction{*_Free()} function.

\end{itemize}
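
The \member{gi_frame} item in the list above is easy to check
interactively. The sketch below is illustrative only: the generator
and the helper \function{frame_name()} are hypothetical names, and
the helper shows one defensive way to read the attribute.

```python
def countdown(n):
    while n > 0:
        yield n
        n -= 1

gen = countdown(2)
frame_before = gen.gi_frame   # a frame object while the generator is live
values = list(gen)            # run the generator to completion
frame_after = gen.gi_frame    # None once the generator has finished

def frame_name(generator):
    # Check for None before touching attributes of the frame.
    frame = generator.gi_frame
    if frame is None:
        return None
    return frame.f_code.co_name
```

Code that inspects generators, such as debugging or tracing tools,
is the most likely place for this check to be needed.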


%======================================================================
\section{Acknowledgements \label{acks}}

The author would like to thank the following people for offering
suggestions, corrections and assistance with various drafts of this
article: Phillip J. Eby, Kent Johnson, Martin von~L\"owis, Fredrik Lundh,
Gustavo Niemeyer, James Pryor, Mike Rovner, Scott Weikart, and Thomas Wouters.

\end{document}