\documentclass{howto}
\usepackage{distutils}
% $Id$

% Writing context managers
% The easy_install stuff
% Stateful codec changes
% Fix XXX comments
% Count up the patches and bugs

\title{What's New in Python 2.5}
\release{0.1}
\author{A.M. Kuchling}
\authoraddress{\email{amk@amk.ca}}

\begin{document}
\maketitle
\tableofcontents

This article explains the new features in Python 2.5. No release date
for Python 2.5 has been set; it will probably be released in the
autumn of 2006. \pep{356} describes the planned release schedule.

(This is still an early draft, and some sections are still skeletal or
completely missing. Comments on the present material will still be
welcomed.)

% XXX Compare with previous release in 2 - 3 sentences here.

This article doesn't attempt to provide a complete specification of
the new features, but instead provides a convenient overview. For
full details, you should refer to the documentation for Python 2.5.
% XXX add hyperlink when the documentation becomes available online.
If you want to understand the complete implementation and design
rationale, refer to the PEP for a particular new feature.


%======================================================================
\section{PEP 243: Uploading Modules to PyPI}

PEP 243 describes an HTTP-based protocol for submitting software
packages to a central archive. The Python package index at
\url{http://cheeseshop.python.org} now supports package uploads, and
the new \command{upload} Distutils command will upload a package to the
repository.

Before a package can be uploaded, you must be able to build a
distribution using the \command{sdist} Distutils command. Once that
works, you can run \code{python setup.py upload} to add your package
to the PyPI archive. Optionally you can GPG-sign the package by
supplying the \programopt{--sign} and
\programopt{--identity} options.

\begin{seealso}

\seepep{243}{Module Repository Upload Mechanism}{PEP written by
Sean Reifschneider; implemented by Martin von~L\"owis
and Richard Jones. Note that the PEP doesn't exactly
describe what's implemented in PyPI.}

\end{seealso}


%======================================================================
\section{PEP 308: Conditional Expressions}

For a long time, people have been requesting a way to write
conditional expressions, expressions that return value A or value B
depending on whether a Boolean value is true or false. A conditional
expression lets you write a single assignment statement that has the
same effect as the following:

\begin{verbatim}
if condition:
    x = true_value
else:
    x = false_value
\end{verbatim}

There have been endless tedious discussions of syntax on both
python-dev and comp.lang.python. A vote was even held that found the
majority of voters wanted conditional expressions in some form,
but there was no syntax that was preferred by a clear majority.
Candidates included C's \code{cond ? true_v : false_v},
\code{if cond then true_v else false_v}, and 16 other variations.

GvR eventually chose a surprising syntax:

\begin{verbatim}
x = true_value if condition else false_value
\end{verbatim}

Evaluation is still lazy as in existing Boolean expressions, so the
order of evaluation jumps around a bit. The \var{condition}
expression in the middle is evaluated first, and the \var{true_value}
expression is evaluated only if the condition was true. Similarly,
the \var{false_value} expression is only evaluated when the condition
is false.
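The evaluation order described above can be observed directly. The
following is a minimal sketch, not taken from the PEP; the
\code{record()} helper and the \code{calls} list are hypothetical names
introduced only to log which operands actually get evaluated.

```python
calls = []

def record(name, value):
    """Hypothetical helper: note that an operand was evaluated, then return it."""
    calls.append(name)
    return value

# The condition in the middle runs first; only the selected branch runs after it.
x = (record('true_value', 1) if record('condition', True)
     else record('false_value', 0))
order_when_true = list(calls)

del calls[:]
y = (record('true_value', 1) if record('condition', False)
     else record('false_value', 0))
order_when_false = list(calls)
```

In both runs the unselected operand never appears in the log, which is
the lazy behaviour the paragraph describes.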

This syntax may seem strange and backwards; why does the condition go
in the \emph{middle} of the expression, and not in the front as in C's
\code{c ? x : y}? The decision was checked by applying the new syntax
to the modules in the standard library and seeing how the resulting
code read. In many cases where a conditional expression is used, one
value seems to be the 'common case' and one value is an 'exceptional
case', used only on rarer occasions when the condition isn't met. The
conditional syntax makes this pattern a bit more obvious:

\begin{verbatim}
contents = ((doc + '\n') if doc else '')
\end{verbatim}

I read the above statement as meaning ``here \var{contents} is
usually assigned a value of \code{doc+'\e n'}; sometimes
\var{doc} is empty, in which special case an empty string is returned.''
I doubt I will use conditional expressions very often where there
isn't a clear common and uncommon case.

There was some discussion of whether the language should require
surrounding conditional expressions with parentheses. The decision
was made to \emph{not} require parentheses in the Python language's
grammar, but as a matter of style I think you should always use them.
Consider these two statements:

\begin{verbatim}
# First version -- no parens
level = 1 if logging else 0

# Second version -- with parens
level = (1 if logging else 0)
\end{verbatim}

In the first version, I think a reader's eye might group the statement
into 'level = 1', 'if logging', 'else 0', and think that the condition
decides whether the assignment to \var{level} is performed. The
second version reads better, in my opinion, because it makes it clear
that the assignment is always performed and the choice is being made
between two values.

Another reason for including the parentheses: a few odd combinations of
list comprehensions and lambdas could look like incorrect conditional
expressions. See \pep{308} for some examples. If you put parentheses
around your conditional expressions, you won't run into this case.


\begin{seealso}

\seepep{308}{Conditional Expressions}{PEP written by
Guido van Rossum and Raymond D. Hettinger; implemented by Thomas
Wouters.}

\end{seealso}


%======================================================================
\section{PEP 309: Partial Function Application}

The \module{functional} module is intended to contain tools for
functional-style programming. Currently it contains only the
\class{partial} class, but new functions will probably be added
in future versions of Python.

For programs written in a functional style, it can be useful to
construct variants of existing functions that have some of the
parameters filled in. Consider a Python function \code{f(a, b, c)};
you could create a new function \code{g(b, c)} that was equivalent to
\code{f(1, b, c)}. This is called ``partial function application'',
and is provided by the \class{partial} class in the new
\module{functional} module.

The constructor for \class{partial} takes the arguments
\code{(\var{function}, \var{arg1}, \var{arg2}, ...
\var{kwarg1}=\var{value1}, \var{kwarg2}=\var{value2})}. The resulting
object is callable, so you can just call it to invoke \var{function}
with the filled-in arguments.
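To make the idea concrete, here is a rough pure-Python sketch of what
partial application does. It is an illustration only, not the module's
actual implementation; \code{simple_partial()} is a hypothetical name,
and the sketch assumes that keyword arguments given at call time
override the ones filled in earlier.

```python
def simple_partial(function, *args, **kwargs):
    """Hypothetical stand-in for partial: freeze some arguments of `function`."""
    def wrapper(*more_args, **more_kwargs):
        # Assumption: keywords supplied at call time override the frozen ones.
        merged = dict(kwargs)
        merged.update(more_kwargs)
        return function(*(args + more_args), **merged)
    return wrapper

def f(a, b, c):
    return (a, b, c)

g = simple_partial(f, 1)       # g(b, c) behaves like f(1, b, c)
result = g(2, 3)

h = simple_partial(f, c=30)    # keyword arguments can be filled in too
result_kw = h(10, 20)
```

Here \code{g(2, 3)} produces the same tuple as \code{f(1, 2, 3)}, and
\code{h(10, 20)} the same as \code{f(10, 20, c=30)}.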

Here's a small but realistic example:

\begin{verbatim}
import functional

def log (message, subsystem):
    "Write the contents of 'message' to the specified subsystem."
    print '%s: %s' % (subsystem, message)
    ...

server_log = functional.partial(log, subsystem='server')
server_log('Unable to open socket')
\end{verbatim}

Here's another example, from a program that uses PyGTK. Here a
context-sensitive pop-up menu is being constructed dynamically. The
callback provided for the menu option is a partially applied version
of the \method{open_item()} method, where the first argument has been
provided.

\begin{verbatim}
...
class Application:
    def open_item(self, path):
        ...
    def init (self):
        open_func = functional.partial(self.open_item, item_path)
        popup_menu.append( ("Open", open_func, 1) )
\end{verbatim}


\begin{seealso}

\seepep{309}{Partial Function Application}{PEP proposed and written by
Peter Harris; implemented by Hye-Shik Chang, with adaptations by
Raymond Hettinger.}

\end{seealso}


%======================================================================
\section{PEP 314: Metadata for Python Software Packages v1.1}

Some simple dependency support was added to Distutils. The
\function{setup()} function now has \code{requires}, \code{provides},
and \code{obsoletes} keyword parameters. When you build a source
distribution using the \code{sdist} command, the dependency
information will be recorded in the \file{PKG-INFO} file.

Another new keyword parameter is \code{download_url}, which should be
set to a URL for the package's source code. This means it's now
possible to look up an entry in the package index, determine the
dependencies for a package, and download the required packages.

\begin{verbatim}
from distutils.core import setup

VERSION = '1.0'
setup(name='PyPackage',
      version=VERSION,
      requires=['numarray', 'zlib (>=1.1.4)'],
      obsoletes=['OldPackage'],
      download_url=('http://www.example.com/pypackage/dist/pkg-%s.tar.gz'
                    % VERSION),
     )
\end{verbatim}

\begin{seealso}

\seepep{314}{Metadata for Python Software Packages v1.1}{PEP proposed
and written by A.M. Kuchling, Richard Jones, and Fred Drake;
implemented by Richard Jones and Fred Drake.}

\end{seealso}


%======================================================================
\section{PEP 328: Absolute and Relative Imports}

The simpler part of PEP 328 was implemented in Python 2.4: parentheses
could now be used to enclose the names imported from a module using
the \code{from ... import ...} statement, making it easier to import
many different names.

The more complicated part has been implemented in Python 2.5:
an import can now be specified as either absolute or
package-relative. The plan is to move toward making absolute
imports the default in future versions of Python.

Let's say you have a package directory like this:
\begin{verbatim}
pkg/
pkg/__init__.py
pkg/main.py
pkg/string.py
\end{verbatim}

This defines a package named \module{pkg} containing the
\module{pkg.main} and \module{pkg.string} submodules.

Consider the code in the \file{main.py} module. What happens if it
executes the statement \code{import string}? In Python 2.4 and
earlier, it will first look in the package's directory to perform a
relative import, find \file{pkg/string.py}, import the contents of
that file as the \module{pkg.string} module, and bind that module
to the name \samp{string} in the \module{pkg.main} module's namespace.

That's fine if \module{pkg.string} was what you wanted. But what if
you wanted Python's standard \module{string} module? There's no clean
way to ignore \module{pkg.string} and look for the standard module;
generally you had to look at the contents of \code{sys.modules}, which
is slightly unclean.
Holger Krekel's \module{py.std} package provides a tidier way to perform
imports from the standard library, \code{import py ; py.std.string.join()},
but that package isn't available on all Python installations.

Reading code which relies on relative imports is also less clear,
because a reader may be confused about which module, \module{string}
or \module{pkg.string}, is intended to be used. Python users soon
learned not to duplicate the names of standard library modules in the
names of their packages' submodules, but you can't protect against
having your submodule's name being used for a new module added in a
future version of Python.

In Python 2.5, you can switch \keyword{import}'s behaviour to
absolute imports using a \code{from __future__ import absolute_import}
directive. This absolute-import behaviour will become the default in
a future version (probably Python 2.7). Once absolute imports
are the default, \code{import string} will
always find the standard library's version.
It's suggested that users should begin using absolute imports as much
as possible, so it's preferable to begin writing \code{from pkg import
string} in your code.

Relative imports are still possible by adding a leading period
to the module name when using the \code{from ... import} form:

\begin{verbatim}
# Import names from pkg.string
from .string import name1, name2
# Import pkg.string
from . import string
\end{verbatim}

This imports the \module{string} module relative to the current
package, so in \module{pkg.main} this will import \var{name1} and
\var{name2} from \module{pkg.string}. Additional leading periods
perform the relative import starting from the parent of the current
package. For example, code in the \module{A.B.C} module can do:

\begin{verbatim}
from . import D                 # Imports A.B.D
from .. import E                # Imports A.E
from ..F import G               # Imports A.F.G
\end{verbatim}

Leading periods cannot be used with the \code{import \var{modname}}
form of the import statement, only the \code{from ... import} form.
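If you'd like to try this out, the sketch below builds the
\file{pkg/} layout from earlier in a temporary directory and imports
\module{pkg.main}. The file contents, the alias \code{pkg_string}, and
the variable names are all made up for the illustration; only the
directory layout comes from the text above.

```python
import os
import sys
import tempfile

# Build the pkg/ layout from the text in a throwaway directory.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, 'pkg')
os.mkdir(pkg_dir)

sources = {
    '__init__.py': '',
    'string.py': 'name1 = "pkg.string.name1"\n',
    'main.py': (
        'from __future__ import absolute_import\n'
        'import string                        # absolute: standard library\n'
        'from . import string as pkg_string   # relative: pkg.string\n'
        'from .string import name1            # a name from pkg.string\n'
    ),
}
for filename, text in sources.items():
    f = open(os.path.join(pkg_dir, filename), 'w')
    try:
        f.write(text)
    finally:
        f.close()

sys.path.insert(0, root)
__import__('pkg.main')
main = sys.modules['pkg.main']

std_name = main.string.__name__          # the standard library module
local_name = main.pkg_string.__name__    # the package's own submodule
```

With absolute imports switched on, the plain \code{import string} inside
\file{main.py} reaches the standard library, while the two dotted forms
reach \module{pkg.string}.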

\begin{seealso}

\seepep{328}{Imports: Multi-Line and Absolute/Relative}
{PEP written by Aahz; implemented by Thomas Wouters.}

\seeurl{http://codespeak.net/py/current/doc/index.html}
{The py library by Holger Krekel, which contains the \module{py.std} package.}

\end{seealso}


%======================================================================
\section{PEP 338: Executing Modules as Scripts}

The \programopt{-m} switch added in Python 2.4 to execute a module as
a script gained a few more abilities. Instead of being implemented in
C code inside the Python interpreter, the switch now uses an
implementation in a new module, \module{runpy}.

The \module{runpy} module implements a more sophisticated import
mechanism so that it's now possible to run modules in a package such
as \module{pychecker.checker}. The module also supports alternative
import mechanisms such as the \module{zipimport} module. This means
you can add a .zip archive's path to \code{sys.path} and then use the
\programopt{-m} switch to execute code from the archive.
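The same machinery can be driven from Python code. The sketch below
uses \function{runpy.run_module()}, which this article doesn't
otherwise cover, to run a small module roughly the way
\programopt{-m} would. The module name \code{hello_mod} and its
contents are hypothetical and exist only for this illustration.

```python
import os
import runpy
import sys
import tempfile

# Write a tiny module into a throwaway directory and put it on sys.path.
root = tempfile.mkdtemp()
f = open(os.path.join(root, 'hello_mod.py'), 'w')
try:
    f.write('greeting = "run as " + __name__\n')
finally:
    f.close()
sys.path.insert(0, root)

# Roughly what "python -m hello_mod" does: locate the module through the
# import system and execute it with __name__ set to '__main__'.
namespace = runpy.run_module('hello_mod', run_name='__main__')
greeting = namespace['greeting']
```

\function{run_module()} returns the globals of the executed module, so
the sketch can inspect \code{greeting} afterwards.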


\begin{seealso}

\seepep{338}{Executing modules as scripts}{PEP written and
implemented by Nick Coghlan.}

\end{seealso}


%======================================================================
\section{PEP 341: Unified try/except/finally}

Until Python 2.5, the \keyword{try} statement came in two
flavours. You could use a \keyword{finally} block to ensure that code
is always executed, or one or more \keyword{except} blocks to catch
specific exceptions. You couldn't combine both \keyword{except} blocks and a
\keyword{finally} block, because generating the right bytecode for the
combined version was complicated and it wasn't clear what the
semantics of the combined statement should be.

GvR spent some time working with Java, which does support the
equivalent of combining \keyword{except} blocks and a
\keyword{finally} block, and this clarified what the statement should
mean. In Python 2.5, you can now write:

\begin{verbatim}
try:
    block-1 ...
except Exception1:
    handler-1 ...
except Exception2:
    handler-2 ...
else:
    else-block
finally:
    final-block
\end{verbatim}

The code in \var{block-1} is executed. If the code raises an
exception, the handlers are tried in order: \var{handler-1},
\var{handler-2}, ... If no exception is raised, the \var{else-block}
is executed. No matter what happened previously, the
\var{final-block} is executed once the code block is complete and any
raised exceptions handled. Even if there's an error in an exception
handler or the \var{else-block} and a new exception is raised, the
\var{final-block} is still executed.
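Here is a small runnable version of the skeleton above that records
which clauses execute. The \code{run()} function and its
\code{action} argument are invented for the illustration;
\exception{KeyError} and \exception{ValueError} stand in for
\var{Exception1} and \var{Exception2}.

```python
def run(action):
    """Return the clauses executed, in order, for a given action."""
    trace = []
    try:
        trace.append('block-1')
        if action == 'key':
            {}['missing']            # raises KeyError
        elif action == 'value':
            int('not a number')      # raises ValueError
    except KeyError:
        trace.append('handler-1')
    except ValueError:
        trace.append('handler-2')
    else:
        trace.append('else-block')
    finally:
        trace.append('final-block')
    return trace

no_error = run('ok')
key_error = run('key')
value_error = run('value')
```

In every case \code{'final-block'} is the last entry, and
\code{'else-block'} shows up only when no exception was raised.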

\begin{seealso}

\seepep{341}{Unifying try-except and try-finally}{PEP written by Georg Brandl;
implementation by Thomas Lee.}

\end{seealso}


%======================================================================
\section{PEP 342: New Generator Features\label{section-generators}}

Python 2.5 adds a simple way to pass values \emph{into} a generator.
As introduced in Python 2.3, generators only produce output; once a
generator's code is invoked to create an iterator, there's no way to
pass any new information into the function when its execution is
resumed. Sometimes the ability to pass in some information would be
useful. Hackish solutions to this include making the generator's code
look at a global variable and then changing the global variable's
value, or passing in some mutable object that callers then modify.

To refresh your memory of basic generators, here's a simple example:

\begin{verbatim}
def counter (maximum):
    i = 0
    while i < maximum:
        yield i
        i += 1
\end{verbatim}

When you call \code{counter(10)}, the result is an iterator that
returns the values from 0 up to 9. On encountering the
\keyword{yield} statement, the iterator returns the provided value and
suspends the function's execution, preserving the local variables.
Execution resumes on the following call to the iterator's
\method{next()} method, picking up after the \keyword{yield} statement.

In Python 2.3, \keyword{yield} was a statement; it didn't return any
value. In 2.5, \keyword{yield} is now an expression, returning a
value that can be assigned to a variable or otherwise operated on:

\begin{verbatim}
val = (yield i)
\end{verbatim}

I recommend that you always put parentheses around a \keyword{yield}
expression when you're doing something with the returned value, as in
the above example. The parentheses aren't always necessary, but it's
easier to always add them instead of having to remember when they're
needed.\footnote{The exact rules are that a \keyword{yield}-expression must
always be parenthesized except when it occurs as the top-level
expression on the right-hand side of an assignment, meaning you can
write \code{val = yield i} but have to use parentheses when there's an
operation, as in \code{val = (yield i) + 12}.}

Values are sent into a generator by calling its
\method{send(\var{value})} method. The generator's code is then
resumed and the \keyword{yield} expression returns the specified
\var{value}. If the regular \method{next()} method is called, the
\keyword{yield} returns \constant{None}.

Here's the previous example, modified to allow changing the value of
the internal counter.

\begin{verbatim}
def counter (maximum):
    i = 0
    while i < maximum:
        val = (yield i)
        # If value provided, change counter
        if val is not None:
            i = val
        else:
            i += 1
\end{verbatim}

And here's an example of changing the counter:

\begin{verbatim}
>>> it = counter(10)
>>> print it.next()
0
>>> print it.next()
1
>>> print it.send(8)
8
>>> print it.next()
9
>>> print it.next()
Traceback (most recent call last):
  File "t.py", line 15, in ?
    print it.next()
StopIteration
\end{verbatim}

Because \keyword{yield} will often be returning \constant{None}, you
should always check for this case. Don't just use its value in
expressions unless you're sure that the \method{send()} method
will be the only method used to resume your generator function.

In addition to \method{send()}, there are two other new methods on
generators:

\begin{itemize}

  \item \method{throw(\var{type}, \var{value}=None,
  \var{traceback}=None)} is used to raise an exception inside the
  generator; the exception is raised by the \keyword{yield} expression
  where the generator's execution is paused.

  \item \method{close()} raises a new \exception{GeneratorExit}
  exception inside the generator to terminate the iteration.
  On receiving this
  exception, the generator's code must either raise
  \exception{GeneratorExit} or \exception{StopIteration}; catching the
  exception and doing anything else is illegal and will trigger
  a \exception{RuntimeError}. \method{close()} will also be called by
  Python's garbage collection when the generator is garbage-collected.

  If you need to run cleanup code in case of a \exception{GeneratorExit},
  I suggest using a \code{try: ... finally:} suite instead of
  catching \exception{GeneratorExit}.

\end{itemize}
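The two methods in the list above can be seen working together in the
following sketch. The \code{worker()} generator and the \code{events}
list are hypothetical names used only here; the cleanup follows the
\code{try: ... finally:} suggestion rather than catching
\exception{GeneratorExit}.

```python
events = []

def worker():
    try:
        while True:
            try:
                item = (yield)
                events.append('got %r' % (item,))
            except ValueError:
                # Raised at the paused yield by throw().
                events.append('recovered from ValueError')
    finally:
        # Runs when close() raises GeneratorExit at the paused yield.
        events.append('cleanup')

gen = worker()
gen.send(None)                        # advance to the first yield
gen.send('a')
gen.throw(ValueError('bad input'))    # the generator handles it and pauses again
gen.close()
```

After \method{close()} returns, the \keyword{finally} clause has run
and the generator cannot be resumed.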

The cumulative effect of these changes is to turn generators from
one-way producers of information into both producers and consumers.

Generators also become \emph{coroutines}, a more generalized form of
subroutines. Subroutines are entered at one point and exited at
another point (the top of the function, and a \keyword{return}
statement), but coroutines can be entered, exited, and resumed at
many different points (the \keyword{yield} statements). We'll have to
figure out patterns for using coroutines effectively in Python.

The addition of the \method{close()} method has one side effect that
isn't obvious. \method{close()} is called when a generator is
garbage-collected, so this means the generator's code gets one last
chance to run before the generator is destroyed. This last chance
means that \code{try...finally} statements in generators can now be
guaranteed to work; the \keyword{finally} clause will now always get a
chance to run. The syntactic restriction that you couldn't mix
\keyword{yield} statements with a \code{try...finally} suite has
therefore been removed. This seems like a minor bit of language
trivia, but using generators and \code{try...finally} is actually
necessary in order to implement the \keyword{with} statement
described by PEP 343. We'll look at this new statement in the following
section.

Another even more esoteric effect of this change: previously, the
\member{gi_frame} attribute of a generator was always a frame object.
It's now possible for \member{gi_frame} to be \code{None}
once the generator has been exhausted.
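A quick way to see the \member{gi_frame} change is sketched below;
\code{one_shot()} is a throwaway generator written for the
illustration.

```python
def one_shot():
    yield 1

gen = one_shot()
frame_before = gen.gi_frame    # a frame object while the generator is live
remaining = list(gen)          # run the generator to exhaustion
frame_after = gen.gi_frame     # None once the generator has finished
```

Code that inspects \member{gi_frame} should therefore check for
\code{None} before using it.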

\begin{seealso}

\seepep{342}{Coroutines via Enhanced Generators}{PEP written by
Guido van Rossum and Phillip J. Eby;
implemented by Phillip J. Eby. Includes examples of
some fancier uses of generators as coroutines.}

\seeurl{http://en.wikipedia.org/wiki/Coroutine}{The Wikipedia entry for
coroutines.}

\seeurl{http://www.sidhe.org/\~{}dan/blog/archives/000178.html}{An
explanation of coroutines from a Perl point of view, written by Dan
Sugalski.}

\end{seealso}


%======================================================================
\section{PEP 343: The 'with' statement}

The \keyword{with} statement allows a clearer
version of code that uses \code{try...finally} blocks.

First, I'll discuss the statement as it will commonly be used, and
then I'll discuss the detailed implementation and how to write objects
(called ``context managers'') that can be used with this statement.
Most people, who will only use \keyword{with} in company with an
existing object, don't need to know these details and can
just use objects that are documented to work as context managers.
Authors of new context managers will need to understand the details of
the underlying implementation.

The \keyword{with} statement is a new control-flow structure whose
basic structure is:

\begin{verbatim}
with expression as variable:
    with-block
\end{verbatim}

The expression is evaluated, and it should result in a type of object
that's called a context manager. The context manager can return a
value that will be bound to the name \var{variable}. (Note carefully:
\var{variable} is \emph{not} assigned the result of \var{expression}.)
One method of the context manager is run before \var{with-block} is
executed, and another method is run after the block is done, even if
the block raised an exception.
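As a rough sketch of that behaviour, the class below records when it
is entered and left. It assumes the \method{__enter__()} and
\method{__exit__()} method names from PEP 343, which this article
doesn't spell out; the \class{Tracker} class itself is hypothetical.
Under Python 2.5 the module would also need the
\code{from __future__ import with_statement} directive.

```python
class Tracker(object):
    """Hypothetical context manager that logs when it is entered and left."""
    def __init__(self):
        self.log = []

    def __enter__(self):
        self.log.append('enter')
        return 'value from __enter__'   # this is what "as variable" receives

    def __exit__(self, exc_type, exc_value, traceback):
        self.log.append('exit')
        return False                    # do not swallow exceptions

tracker = Tracker()
with tracker as variable:
    tracker.log.append('with-block')

# The exit method still runs when the block raises an exception.
try:
    with tracker:
        raise RuntimeError('boom')
except RuntimeError:
    tracker.log.append('exception propagated')
```

Note that \var{variable} ends up bound to the string returned by the
entry method, not to the \class{Tracker} instance.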
608
609To enable the statement in Python 2.5, you need
610to add the following directive to your module:
611
612\begin{verbatim}
613from __future__ import with_statement
614\end{verbatim}
615
616Some standard Python objects can now behave as context managers. For
617example, file objects:
618
619\begin{verbatim}
620with open('/etc/passwd', 'r') as f:
621 for line in f:
622 print line
623
624# f has been automatically closed at this point.
625\end{verbatim}
626
627The \module{threading} module's locks and condition variables
Andrew M. Kuchling9c67ee02006-04-04 19:07:27 +0000628also support the \keyword{with} statement:
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000629
630\begin{verbatim}
631lock = threading.Lock()
632with lock:
633 # Critical section of code
634 ...
635\end{verbatim}
636
637The lock is acquired before the block is executed, and released once
638the block is complete.
639
640The \module{decimal} module's contexts, which encapsulate the desired
641precision and rounding characteristics for computations, can also be
642used as context managers.
643
644\begin{verbatim}
645import decimal
646
647v1 = decimal.Decimal('578')
648
649# Displays with default precision of 28 digits
650print v1.sqrt()
651
652with decimal.Context(prec=16):
653 # All code in this block uses a precision of 16 digits.
654 # The original context is restored on exiting the block.
655 print v1.sqrt()
656\end{verbatim}

\subsection{Writing Context Managers}

% XXX write this

This section still needs to be written.

The new \module{contextlib} module provides some functions and a
decorator that are useful for writing context managers.
Future versions will go into more detail.
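
Until this section is filled in, here's a minimal sketch of what a
decorator-based context manager might look like, assuming the decorator
in question is \function{contextlib.contextmanager()} as described in
\pep{343}. The \function{tag()} function and the \code{log} list are
hypothetical names made up for the example.

```python
# On Python 2.5, first add: from __future__ import with_statement
from contextlib import contextmanager

log = []

@contextmanager
def tag(name):
    # Code before the yield runs on entry to the with-block.
    log.append('<%s>' % name)
    try:
        # The yielded value is bound to the name after 'as'.
        yield name
    finally:
        # Code after the yield runs on exit, even after an exception.
        log.append('</%s>' % name)

with tag('p') as current:
    log.append('text inside ' + current)
```

The generator is split at the \keyword{yield}: everything before it
plays the role of the method run before the block, and the
\keyword{finally} clause plays the role of the method run afterwards.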

% XXX describe further

\begin{seealso}

\seepep{343}{The ``with'' statement}{PEP written by
Guido van Rossum and Nick Coghlan.}

\end{seealso}


%======================================================================
\section{PEP 352: Exceptions as New-Style Classes}

Exception classes can now be new-style classes, not just classic
classes, and the built-in \exception{Exception} class and all the
standard built-in exceptions (\exception{NameError},
\exception{ValueError}, etc.) are now new-style classes.

The inheritance hierarchy for exceptions has been rearranged a bit.
In 2.5, the inheritance relationships are:

\begin{verbatim}
BaseException       # New in Python 2.5
|- KeyboardInterrupt
|- SystemExit
|- Exception
   |- (all other current built-in exceptions)
\end{verbatim}

This rearrangement was done because people often want to catch all
exceptions that indicate program errors. \exception{KeyboardInterrupt} and
\exception{SystemExit} aren't errors, though, and usually represent an explicit
action such as the user hitting Control-C or code calling
\function{sys.exit()}. A bare \code{except:} will catch all exceptions,
so you commonly need to list \exception{KeyboardInterrupt} and
\exception{SystemExit} in order to re-raise them. The usual pattern is:

\begin{verbatim}
try:
    ...
except (KeyboardInterrupt, SystemExit):
    raise
except:
    # Log error...
    # Continue running program...
\end{verbatim}

In Python 2.5, you can now write \code{except Exception} to achieve
the same result, catching all the exceptions that usually indicate errors
but leaving \exception{KeyboardInterrupt} and
\exception{SystemExit} alone. As in previous versions,
a bare \code{except:} still catches all exceptions.
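
The following sketch illustrates the difference. The \function{run()},
\function{fails()}, and \function{interrupted()} helpers are
hypothetical names used only here.

```python
# KeyboardInterrupt and SystemExit sit beside Exception, not below it.
outside = [issubclass(exc, Exception)
           for exc in (KeyboardInterrupt, SystemExit)]
inside = [issubclass(exc, Exception)
          for exc in (NameError, ValueError)]

def run(task):
    """Hypothetical helper: run task(), reporting ordinary errors."""
    try:
        task()
    except Exception:
        # Program errors end up here...
        return 'logged'
    return 'ok'

def fails():
    raise ValueError('bad value')

def interrupted():
    raise KeyboardInterrupt

result = run(fails)

# ...but KeyboardInterrupt passes straight through run().
try:
    run(interrupted)
    propagated = False
except KeyboardInterrupt:
    propagated = True
```

Here \code{result} is \code{'logged'}, while the
\exception{KeyboardInterrupt} escapes \function{run()} and is only
caught by the outer handler that names it explicitly.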

The goal for Python 3.0 is to require any class raised as an exception
to derive from \exception{BaseException} or some descendant of
\exception{BaseException}, and future releases in the
Python 2.x series may begin to enforce this constraint. Therefore, I
suggest you begin making all your exception classes derive from
\exception{Exception} now. It's been suggested that the bare
\code{except:} form should be removed in Python 3.0, but Guido van~Rossum
hasn't decided whether to do this or not.

Raising of strings as exceptions, as in the statement \code{raise
"Error occurred"}, is deprecated in Python 2.5 and will trigger a
warning. The aim is to be able to remove the string-exception feature
in a few releases.


\begin{seealso}

\seepep{352}{Required Superclass for Exceptions}{PEP written by
Brett Cannon and Guido van Rossum; implemented by Brett Cannon.}

\end{seealso}


%======================================================================
\section{PEP 353: Using ssize_t as the index type\label{section-353}}

A wide-ranging change to Python's C API, using a new
\ctype{Py_ssize_t} type definition instead of \ctype{int},
will permit the interpreter to handle more data on 64-bit platforms.
This change doesn't affect Python's capacity on 32-bit platforms.

Various pieces of the Python interpreter used C's \ctype{int} type to
store sizes or counts; for example, the number of items in a list or
tuple was stored in an \ctype{int}. The C compilers for most 64-bit
platforms still define \ctype{int} as a 32-bit type, so that meant
that lists could only hold up to \code{2**31 - 1} = 2147483647 items.
(There are actually a few different programming models that 64-bit C
compilers can use -- see
\url{http://www.unix.org/version2/whatsnew/lp64_wp.html} for a
discussion -- but the most commonly available model leaves \ctype{int}
as 32 bits.)

A limit of 2147483647 items doesn't really matter on a 32-bit platform
because you'll run out of memory before hitting the length limit.
Each list item requires space for a pointer, which is 4 bytes, plus
space for a \ctype{PyObject} representing the item. 2147483647*4 is
already more bytes than a 32-bit address space can contain.

It's possible to address that much memory on a 64-bit platform,
however. The pointers for a list that size would only require 16GiB
of space, so it's not unreasonable that Python programmers might
construct lists that large. Therefore, the Python interpreter had to
be changed to use some type other than \ctype{int}, and this will be a
64-bit type on 64-bit platforms. The change will cause
incompatibilities on 64-bit machines, so it was deemed worth making
the transition now, while the number of 64-bit users is still
relatively small. (In 5 or 10 years, we may \emph{all} be on 64-bit
machines, and the transition would be more painful then.)
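
The arithmetic behind these figures can be checked in a few lines.
This is only a back-of-the-envelope sketch: it assumes 4-byte pointers
on a 32-bit platform and 8-byte pointers on a 64-bit one, and it counts
only the pointers, ignoring the objects they point to.

```python
MAX_ITEMS = 2**31 - 1          # Largest value of a 32-bit signed int
GIB = 2**30                    # Bytes in one gibibyte

# 32-bit platform: 4-byte pointers, 2**32 bytes of address space.
pointers_32 = MAX_ITEMS * 4
address_space_32 = 2**32
fits_in_32_bits = pointers_32 <= address_space_32

# 64-bit platform: 8-byte pointers (an assumption of this sketch).
pointers_64 = MAX_ITEMS * 8
size_in_gib = pointers_64 / float(GIB)
```

The pointers alone need roughly twice the 32-bit address space, so
\code{fits_in_32_bits} is false, while on a 64-bit platform the same
list needs just under 16 GiB for its pointers.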

This change most strongly affects authors of C extension modules.
Python strings and container types such as lists and tuples
now use \ctype{Py_ssize_t} to store their size.
Functions such as \cfunction{PyList_Size()}
now return \ctype{Py_ssize_t}. Code in extension modules
may therefore need to have some variables changed to
\ctype{Py_ssize_t}.

The \cfunction{PyArg_ParseTuple()} and \cfunction{Py_BuildValue()} functions
have a new conversion code, \samp{n}, for \ctype{Py_ssize_t}.
\cfunction{PyArg_ParseTuple()}'s \samp{s\#} and \samp{t\#} still output
\ctype{int} by default, but you can define the macro
\csimplemacro{PY_SSIZE_T_CLEAN} before including \file{Python.h}
to make them return \ctype{Py_ssize_t}.

\pep{353} has a section on conversion guidelines that
extension authors should read to learn about supporting 64-bit
platforms.

\begin{seealso}

\seepep{353}{Using ssize_t as the index type}{PEP written and implemented by Martin von~L\"owis.}

\end{seealso}


%======================================================================
\section{PEP 357: The '__index__' method}

The NumPy developers had a problem that could only be solved by adding
a new special method, \method{__index__}. When using slice notation,
as in \code{[\var{start}:\var{stop}:\var{step}]}, the values of the
\var{start}, \var{stop}, and \var{step} indexes must all be either
integers or long integers. NumPy defines a variety of specialized
integer types corresponding to unsigned and signed integers of 8, 16,
32, and 64 bits, but there was no way to signal that these types could
be used as slice indexes.

Slicing can't just use the existing \method{__int__} method because
that method is also used to implement coercion to integers. If
slicing used \method{__int__}, floating-point numbers would also
become legal slice indexes and that's clearly an undesirable
behaviour.

Instead, a new special method called \method{__index__} was added. It
takes no arguments and returns an integer giving the slice index to
use. For example:

\begin{verbatim}
class C:
    def __index__ (self):
        return self.value
\end{verbatim}

The return value must be either a Python integer or long integer.
The interpreter will check that the type returned is correct, and
raises a \exception{TypeError} if this requirement isn't met.
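
Here's a slightly fuller sketch of the protocol in use. The
\class{Offset} and \class{Broken} classes are hypothetical;
\function{operator.index()} is the Python-level way of calling the
method directly.

```python
import operator

class Offset(object):
    """Hypothetical integer-like type usable as an index."""
    def __init__(self, value):
        self.value = value
    def __index__(self):
        return self.value

class Broken(object):
    """Hypothetical type whose __index__ breaks the rules."""
    def __index__(self):
        return 1.5              # Not an integer

letters = ['a', 'b', 'c', 'd', 'e']
piece = letters[Offset(1):Offset(4)]     # Same as letters[1:4]
item = letters[Offset(2)]                # Same as letters[2]
as_int = operator.index(Offset(3))

try:
    letters[Broken():]
    rejected = False
except TypeError:
    # The interpreter refuses the non-integer return value.
    rejected = True
```

Both slicing and plain indexing accept the \class{Offset} objects,
while the float returned by \class{Broken} triggers the
\exception{TypeError} described above.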

A corresponding \member{nb_index} slot was added to the C-level
\ctype{PyNumberMethods} structure to let C extensions implement this
protocol. \cfunction{PyNumber_Index(\var{obj})} can be used in
extension code to call the \method{__index__} function and retrieve
its result.

\begin{seealso}

\seepep{357}{Allowing Any Object to be Used for Slicing}{PEP written
and implemented by Travis Oliphant.}

\end{seealso}


%======================================================================
\section{Other Language Changes}

Here are all of the changes that Python 2.5 makes to the core Python
language.

\begin{itemize}

\item The \class{dict} type has a new hook for letting subclasses
provide a default value when a key isn't contained in the dictionary.
When a key isn't found, the dictionary's
\method{__missing__(\var{key})}
method will be called. This hook is used to implement
the new \class{defaultdict} class in the \module{collections}
module. The following example defines a dictionary
that returns zero for any missing key:

\begin{verbatim}
class zerodict (dict):
    def __missing__ (self, key):
        return 0

d = zerodict({1:1, 2:2})
print d[1], d[2]   # Prints 1 2
print d[3], d[4]   # Prints 0 0
\end{verbatim}

\item The \function{min()} and \function{max()} built-in functions
gained a \code{key} keyword argument analogous to the \code{key}
argument for \method{sort()}. This argument supplies a function that
takes a single argument and is called for every value in the list;
\function{min()}/\function{max()} will return the element with the
smallest/largest return value from this function.
For example, to find the longest string in a list, you can do:

\begin{verbatim}
L = ['medium', 'longest', 'short']
# Prints 'longest'
print max(L, key=len)
# Prints 'short', because lexicographically 'short' has the largest value
print max(L)
\end{verbatim}

(Contributed by Steven Bethard and Raymond Hettinger.)

\item Two new built-in functions, \function{any()} and
\function{all()}, evaluate whether an iterator contains any true or
false values. \function{any()} returns \constant{True} if any value
returned by the iterator is true; otherwise it will return
\constant{False}. \function{all()} returns \constant{True} only if
all of the values returned by the iterator evaluate as being true.
(Suggested by Guido van~Rossum, and implemented by Raymond Hettinger.)

\item ASCII is now the default encoding for modules. It's now
a syntax error if a module contains string literals with 8-bit
characters but doesn't have an encoding declaration. In Python 2.4
this triggered a warning, not a syntax error. See \pep{263}
for how to declare a module's encoding; for example, you might add
a line like this near the top of the source file:

\begin{verbatim}
# -*- coding: latin1 -*-
\end{verbatim}

\item The list of base classes in a class definition can now be empty.
As an example, this is now legal:

\begin{verbatim}
class C():
    pass
\end{verbatim}
(Implemented by Brett Cannon.)

\end{itemize}
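
As an illustration of the \function{any()} and \function{all()}
built-ins from the list above, here's a short sketch; the sample values
are arbitrary. It also shows two details not spelled out above: both
functions stop consuming the iterator as soon as the result is known,
and an empty sequence yields \constant{False} for \function{any()} and
\constant{True} for \function{all()}.

```python
values = [3, 0, 7]

has_true = any(values)                       # 3 and 7 are true
all_true = all(values)                       # 0 is false
all_non_negative = all(v >= 0 for v in values)

# Both functions stop reading the iterator once the answer is known.
it = iter([0, 1, 2, 3])
found = any(it)             # Stops after reaching the first true value, 1
remaining = list(it)        # Whatever any() didn't consume

# Edge cases for an empty sequence.
empty_any = any([])
empty_all = all([])
```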


%======================================================================
\subsection{Interactive Interpreter Changes}

In the interactive interpreter, \code{quit} and \code{exit}
have long been strings so that new users get a somewhat helpful message
when they try to quit:

\begin{verbatim}
>>> quit
'Use Ctrl-D (i.e. EOF) to exit.'
\end{verbatim}

In Python 2.5, \code{quit} and \code{exit} are now objects that still
produce string representations of themselves, but are also callable.
Newbies who try \code{quit()} or \code{exit()} will now exit the
interpreter as they expect. (Implemented by Georg Brandl.)


%======================================================================
\subsection{Optimizations}

\begin{itemize}

\item When they were introduced
in Python 2.4, the built-in \class{set} and \class{frozenset} types
were built on top of Python's dictionary type.
In 2.5 the internal data structure has been customized for implementing sets,
and as a result sets will use a third less memory and are somewhat faster.
(Implemented by Raymond Hettinger.)

\item The performance of some Unicode operations has been improved.
% XXX provide details?

\item The code generator's peephole optimizer now performs
simple constant folding in expressions. If you write something like
\code{a = 2+3}, the code generator will do the arithmetic and produce
code corresponding to \code{a = 5}.

\end{itemize}
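
One way to observe the constant folding described in the last item is
to compile a statement and look at the constants stored in the
resulting code object. This is only a sketch: \member{co_consts} is an
implementation detail of CPython's code objects, and exactly which
other constants appear alongside the folded value may vary between
versions.

```python
code = compile('a = 2+3', '<example>', 'exec')

# The folded result is stored among the code object's constants.
folded = 5 in code.co_consts

# Running the compiled code still assigns the expected value.
namespace = {}
exec(code, namespace)
```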

The net result of the 2.5 optimizations is that Python 2.5 runs the
pystone benchmark around XXX\% faster than Python 2.4.


%======================================================================
\section{New, Improved, and Deprecated Modules}

As usual, Python's standard library received many enhancements and
bug fixes. Here's a partial list of the most notable changes, sorted
alphabetically by module name. Consult the
\file{Misc/NEWS} file in the source tree for a more
complete list of changes, or look through the SVN logs for all the
details.

\begin{itemize}

% the cPickle module no longer accepts the deprecated None option in the
% args tuple returned by __reduce__().

% XXX csv module improvements

% XXX datetime.datetime() now has a strptime class method which can be used to
% create datetime object using a string and format.

% XXX fileinput: opening hook used to control how files are opened.
% .input() now has a mode parameter
% now has a fileno() function
% accepts Unicode filenames

\item The \module{audioop} module now supports the a-LAW encoding,
and the code for u-LAW encoding has been improved. (Contributed by
Lars Immisch.)

\item The \module{collections} module gained a new type,
\class{defaultdict}, that subclasses the standard \class{dict}
type. The new type mostly behaves like a dictionary but constructs a
default value when a key isn't present, automatically adding it to the
dictionary for the requested key value.

The first argument to \class{defaultdict}'s constructor is a factory
function that gets called whenever a key is requested but not found.
This factory function receives no arguments, so you can use built-in
type constructors such as \function{list()} or \function{int()}. For
example, you can make an index of words based on their initial letter
like this:

\begin{verbatim}
from collections import defaultdict

words = """Nel mezzo del cammin di nostra vita
mi ritrovai per una selva oscura
che la diritta via era smarrita""".lower().split()

index = defaultdict(list)

for w in words:
    init_letter = w[0]
    index[init_letter].append(w)
\end{verbatim}

Printing \code{index} results in the following output:

\begin{verbatim}
defaultdict(<type 'list'>, {'c': ['cammin', 'che'], 'e': ['era'],
        'd': ['del', 'di', 'diritta'], 'm': ['mezzo', 'mi'],
        'l': ['la'], 'o': ['oscura'], 'n': ['nel', 'nostra'],
        'p': ['per'], 's': ['selva', 'smarrita'],
        'r': ['ritrovai'], 'u': ['una'], 'v': ['vita', 'via']})
\end{verbatim}

The \class{deque} double-ended queue type supplied by the
\module{collections} module now has a \method{remove(\var{value})}
method that removes the first occurrence of \var{value} in the queue,
raising \exception{ValueError} if the value isn't found.

\item The \module{cProfile} module is a C implementation of
the existing \module{profile} module that has much lower overhead.
The module's interface is the same as \module{profile}: you run
\code{cProfile.run('main()')} to profile a function, can save profile
data to a file, etc. It's not yet known if the Hotshot profiler,
which is also written in C but doesn't match the \module{profile}
module's interface, will continue to be maintained in future versions
of Python. (Contributed by Armin Rigo.)

\item In the \module{gc} module, the new \function{get_count()} function
returns a 3-tuple containing the current collection counts for the
three GC generations. This is accounting information for the garbage
collector; when these counts reach a specified threshold, a garbage
collection sweep will be made. The existing \function{gc.collect()}
function now takes an optional \var{generation} argument of 0, 1, or 2
to specify which generation to collect.

\item The \function{nsmallest()} and
\function{nlargest()} functions in the \module{heapq} module
now support a \code{key} keyword argument similar to the one
provided by the \function{min()}/\function{max()} functions
and the \method{sort()} methods. For example:

\begin{verbatim}
>>> import heapq
>>> L = ["short", 'medium', 'longest', 'longer still']
>>> heapq.nsmallest(2, L)  # Return two lowest elements, lexicographically
['longer still', 'longest']
>>> heapq.nsmallest(2, L, key=len)   # Return two shortest elements
['short', 'medium']
\end{verbatim}

(Contributed by Raymond Hettinger.)

\item The \function{itertools.islice()} function now accepts
\code{None} for the start and step arguments. This makes it more
compatible with the attributes of slice objects, so that you can now write
the following:

\begin{verbatim}
s = slice(5)     # Create slice object
itertools.islice(iterable, s.start, s.stop, s.step)
\end{verbatim}

(Contributed by Raymond Hettinger.)

\item The \module{nis} module now supports accessing domains other
than the system default domain by supplying a \var{domain} argument to
the \function{nis.match()} and \function{nis.maps()} functions.
(Contributed by Ben Bell.)

\item The \module{operator} module's \function{itemgetter()}
and \function{attrgetter()} functions now support multiple fields.
A call such as \code{operator.attrgetter('a', 'b')}
will return a function
that retrieves the \member{a} and \member{b} attributes. Combining
this new feature with the \method{sort()} method's \code{key} parameter
lets you easily sort lists using multiple fields.
(Contributed by Raymond Hettinger.)


\item The \module{os} module underwent several changes. The
\member{stat_float_times} variable now defaults to true, meaning that
\function{os.stat()} will now return time values as floats. (This
doesn't necessarily mean that \function{os.stat()} will return times
that are precise to fractions of a second; not all systems support
such precision.)

Constants named \member{os.SEEK_SET}, \member{os.SEEK_CUR}, and
\member{os.SEEK_END} have been added; these are the parameters to the
\function{os.lseek()} function. Two new constants for locking are
\member{os.O_SHLOCK} and \member{os.O_EXLOCK}.

Two new functions, \function{wait3()} and \function{wait4()}, were
added. They're similar to the \function{waitpid()} function, which waits
for a child process to exit and returns a tuple of the process ID and
its exit status, but \function{wait3()} and \function{wait4()} return
additional information. \function{wait3()} doesn't take a process ID
as input, so it waits for any child process to exit and returns a
3-tuple of \var{process-id}, \var{exit-status}, \var{resource-usage}
as returned from the \function{resource.getrusage()} function.
\function{wait4(\var{pid})} does take a process ID.
(Contributed by Chad J. Schroeder.)

On FreeBSD, the \function{os.stat()} function now returns
times with nanosecond resolution, and the returned object
now has \member{st_gen} and \member{st_birthtime}.
The \member{st_flags} member is also available, if the platform supports it.
(Contributed by Antti Louko and Diego Petten\`o.)
% (Patch 1180695, 1212117)

\item The old \module{regex} and \module{regsub} modules, which have been
deprecated ever since Python 2.0, have finally been deleted.
Other deleted modules: \module{statcache}, \module{tzparse},
\module{whrandom}.

\item The \file{lib-old} directory,
which includes ancient modules such as \module{dircmp} and
\module{ni}, was also deleted. \file{lib-old} wasn't on the default
\code{sys.path}, so unless your programs explicitly added the directory to
\code{sys.path}, this removal shouldn't affect your code.

\item The \module{socket} module now supports \constant{AF_NETLINK}
sockets on Linux, thanks to a patch from Philippe Biondi.
Netlink sockets are a Linux-specific mechanism for communications
between a user-space process and kernel code; an introductory
article about them is at \url{http://www.linuxjournal.com/article/7356}.
In Python code, netlink addresses are represented as a tuple of 2 integers,
\code{(\var{pid}, \var{group_mask})}.

Socket objects also gained accessor methods \method{getfamily()},
\method{gettype()}, and \method{getproto()} to retrieve the
family, type, and protocol values for the socket.

\item New module: \module{spwd} provides functions for accessing the
shadow password database on systems that support it.
% XXX give example

\item The Python developers switched from CVS to Subversion during the 2.5
development process. Information about the exact build version is
available as the \code{sys.subversion} variable, a 3-tuple
of \code{(\var{interpreter-name}, \var{branch-name}, \var{revision-range})}.
For example, at the time of writing
my copy of 2.5 was reporting \code{('CPython', 'trunk', '45313:45315')}.

This information is also available to C extensions via the
\cfunction{Py_GetBuildInfo()} function that returns a
string of build information like this:
\code{"trunk:45355:45356M, Apr 13 2006, 07:42:19"}.
(Contributed by Barry Warsaw.)

\item The \class{TarFile} class in the \module{tarfile} module now has
an \method{extractall()} method that extracts all members from the
archive into the current working directory. It's also possible to set
a different directory as the extraction target, and to unpack only a
subset of the archive's members.

A tarfile's compression can be autodetected by
using the mode \code{'r|*'}.
% patch 918101
(Contributed by Lars Gust\"abel.)

\item The \module{unicodedata} module has been updated to use version 4.1.0
of the Unicode character database. Version 3.2.0 is required
by some specifications, so it's still available as
\member{unicodedata.db_3_2_0}.

% patch #754022: Greatly enhanced webbrowser.py (by Oleg Broytmann).


\item The \module{xmlrpclib} module now supports returning
\class{datetime} objects for the XML-RPC date type. Supply
\code{use_datetime=True} to the \function{loads()} function
or the \class{Unmarshaller} class to enable this feature.
(Contributed by Skip Montanaro.)
% Patch 1120353


\end{itemize}
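
Several of the additions in the list above have no example, so here's a
brief sketch exercising three of them: multi-field
\function{attrgetter()} and \function{itemgetter()},
\method{deque.remove()}, and \function{gc.get_count()}. The
\class{Employee} class and the sample data are hypothetical.

```python
import gc
import operator
from collections import deque

class Employee(object):
    """Hypothetical record type used only for this example."""
    def __init__(self, dept, name):
        self.dept = dept
        self.name = name

staff = [Employee('ops', 'Zed'), Employee('dev', 'Mia'),
         Employee('dev', 'Al')]

# attrgetter('dept', 'name') returns a (dept, name) tuple for each
# object, so the list is sorted by department and then by name.
staff.sort(key=operator.attrgetter('dept', 'name'))
ordered = [(e.dept, e.name) for e in staff]

# itemgetter() with several indexes works the same way on sequences.
pick = operator.itemgetter(0, 2)
picked = pick('abc')

# deque.remove() deletes only the first matching value...
queue = deque(['a', 'b', 'a', 'c'])
queue.remove('a')

# ...and raises ValueError when the value isn't present.
try:
    queue.remove('missing')
    missing_ok = True
except ValueError:
    missing_ok = False

# gc.get_count() reports one counter per GC generation.
counts = gc.get_count()
```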



%======================================================================
% whole new modules get described in subsections here

%======================================================================
\subsection{The ctypes package}

The \module{ctypes} package, written by Thomas Heller, has been added
to the standard library. \module{ctypes} lets you call arbitrary functions
in shared libraries or DLLs. Long-time users may remember the \module{dl}
module, which provides functions for loading shared libraries and calling
functions in them. The \module{ctypes} package is much fancier.

To load a shared library or DLL, you must create an instance of the
\class{CDLL} class and provide the name or path of the shared library
or DLL. Once that's done, you can call arbitrary functions
by accessing them as attributes of the \class{CDLL} object.

\begin{verbatim}
import ctypes

libc = ctypes.CDLL('libc.so.6')
result = libc.printf("Line of output\n")
\end{verbatim}

Type constructors for the various C types are provided: \function{c_int},
\function{c_float}, \function{c_double}, \function{c_char_p} (equivalent to
\ctype{char *}), and so forth. Unlike Python's types, the C versions are all
mutable; you can assign to their \member{value} attribute
to change the wrapped value. Python integers and strings will be automatically
converted to the corresponding C types, but for other types you
must call the correct type constructor. (And I mean \emph{must};
getting it wrong will often result in the interpreter crashing
with a segmentation fault.)
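
Here's a minimal sketch of that mutability, using only the type
constructors so that no shared library needs to be loaded; the variable
names are arbitrary.

```python
import ctypes

# Unlike a Python int, a c_int can be changed in place.
n = ctypes.c_int(42)
before = n.value
n.value = -7          # Same object, new wrapped value
after = n.value

# Floating-point values need an explicit constructor such as c_double.
x = ctypes.c_double(2.5)
x.value = x.value * 2
```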

You shouldn't use \function{c_char_p} with a Python string when the C
function will be modifying the memory area, because Python strings are
supposed to be immutable; breaking this rule will cause puzzling bugs.
When you need a modifiable memory area,
use \function{create_string_buffer()}:

\begin{verbatim}
s = "this is a string"
buf = ctypes.create_string_buffer(s)
libc.strfry(buf)
\end{verbatim}

C functions are assumed to return integers, but you can set
the \member{restype} attribute of the function object to
change this:

\begin{verbatim}
>>> libc.atof('2.71828')
-1783957616
>>> libc.atof.restype = ctypes.c_double
>>> libc.atof('2.71828')
2.71828
\end{verbatim}

\module{ctypes} also provides a wrapper for Python's C API
as the \code{ctypes.pythonapi} object. This object does \emph{not}
release the global interpreter lock before calling a function, because
the lock must be held when calling into the interpreter's code.
There's a \class{py_object()} type constructor that will create a
\ctype{PyObject *} pointer. A simple usage:

\begin{verbatim}
import ctypes

d = {}
ctypes.pythonapi.PyObject_SetItem(ctypes.py_object(d),
          ctypes.py_object("abc"), ctypes.py_object(1))
# d is now {'abc': 1}.
\end{verbatim}

Don't forget to use \class{py_object()}; if it's omitted you end
up with a segmentation fault.
1274
1275\module{ctypes} has been around for a while, but people still write
1276and distribution hand-coded extension modules because you can't rely on \module{ctypes} being present.
1277Perhaps developers will begin to write
1278Python wrappers atop a library accessed through \module{ctypes} instead
1279of extension modules, now that \module{ctypes} is included with core Python.
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001280
Andrew M. Kuchling28c5f1f2006-04-13 02:04:42 +00001281\begin{seealso}
1282
1283\seeurl{http://starship.python.net/crew/theller/ctypes/}
1284{The ctypes web page, with a tutorial, reference, and FAQ.}
1285
1286\end{seealso}
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001287
Andrew M. Kuchling61434b62006-04-13 11:51:07 +00001288
1289%======================================================================
\subsection{The ElementTree package}

A subset of Fredrik Lundh's ElementTree library for processing XML has
been added to the standard library as \module{xmlcore.etree}.  The
available modules are
\module{ElementTree}, \module{ElementPath}, and
\module{ElementInclude} from ElementTree 1.2.6.
The \module{cElementTree} accelerator module is also included.

The rest of this section will provide a brief overview of using
ElementTree.  Full documentation for ElementTree is available at
\url{http://effbot.org/zone/element-index.htm}.

ElementTree represents an XML document as a tree of element nodes.
The text content of the document is stored as the \member{.text}
and \member{.tail} attributes of each element node.
(This is one of the major differences between ElementTree and
the Document Object Model; in the DOM there are many different
types of node, including \class{TextNode}.)

The most commonly used parsing function is \function{parse()}, which
takes either a string (assumed to contain a filename) or a file-like
object and returns an \class{ElementTree} instance:

\begin{verbatim}
import urllib
from xmlcore.etree import ElementTree as ET

tree = ET.parse('ex-1.xml')

feed = urllib.urlopen(
              'http://planet.python.org/rss10.xml')
tree = ET.parse(feed)
\end{verbatim}

Once you have an \class{ElementTree} instance, you
can call its \method{getroot()} method to get the root \class{Element} node.

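To make the relationship between the two classes concrete, here is a minimal sketch that wraps a parsed fragment in an \class{ElementTree} and works down from the root.  It assumes the package is importable as \module{xml.etree} (substitute \module{xmlcore.etree} for the layout described above), and the tiny feed document is invented for the example.

```python
import xml.etree.ElementTree as ET

# An invented document; ET.XML() parses a string into an Element.
root_elem = ET.XML("<feed><title>News</title><item>one</item></feed>")

# Wrap the element so there is an ElementTree to call getroot() on.
tree = ET.ElementTree(root_elem)
root = tree.getroot()

# .tag is the element name; .text is the text before the first child.
root_tag = root.tag
title_text = root[0].text
```

\method{getroot()} hands back the very same \class{Element} that was wrapped, so changes made through \code{root} are visible through \code{tree}.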
There's also an \function{XML()} function that takes a string literal
and returns an \class{Element} node (not an \class{ElementTree}).
This function provides a tidy way to incorporate XML fragments,
approaching the convenience of an XML literal:

\begin{verbatim}
svg = ET.XML("""<svg width="10px" version="1.0">
             </svg>""")
svg.set('height', '320px')
svg.append(elem1)    # elem1: an Element created elsewhere
\end{verbatim}

Each XML element supports some dictionary-like and some list-like
access methods.  Dictionary-like operations are used to access attribute
values, and list-like operations are used to access child nodes.

\begin{tableii}{c|l}{code}{Operation}{Result}
  \lineii{elem[n]}{Returns n'th child element.}
  \lineii{elem[m:n]}{Returns list of m'th through (n-1)'th child elements.}
  \lineii{len(elem)}{Returns number of child elements.}
  \lineii{elem.getchildren()}{Returns list of child elements.}
  \lineii{elem.append(elem2)}{Adds \var{elem2} as a child.}
  \lineii{elem.insert(index, elem2)}{Inserts \var{elem2} at the specified location.}
  \lineii{del elem[n]}{Deletes n'th child element.}
  \lineii{elem.keys()}{Returns list of attribute names.}
  \lineii{elem.get(name)}{Returns value of attribute \var{name}.}
  \lineii{elem.set(name, value)}{Sets new value for attribute \var{name}.}
  \lineii{elem.attrib}{Retrieves the dictionary containing attributes.}
  \lineii{del elem.attrib[name]}{Deletes attribute \var{name}.}
\end{tableii}

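The operations in the table can be exercised together in a short session.  This is only a sketch: the element names are invented, and it again assumes the package is importable as \module{xml.etree}.

```python
import xml.etree.ElementTree as ET

elem = ET.XML('<svg width="10px"><rect/><circle/></svg>')

# List-like operations work on the child elements.
child_count = len(elem)              # two children: rect and circle
first_tag = elem[0].tag
elem.append(ET.Element('line'))      # add a third child at the end
elem.insert(0, ET.Element('title'))  # put a new child in front
del elem[1]                          # drop the original <rect/>
tags = [child.tag for child in elem]

# Dictionary-like operations work on the attributes.
elem.set('height', '320px')
width = elem.get('width')
names = sorted(elem.keys())
del elem.attrib['width']
```

Note that the two families never collide: indexing and \function{len()} only ever see child elements, while \method{get()}, \method{set()} and \method{keys()} only ever see attributes.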
Comments and processing instructions are also represented as
\class{Element} nodes.  To check if a node is a comment or a processing
instruction:

\begin{verbatim}
if elem.tag is ET.Comment:
    ...
elif elem.tag is ET.ProcessingInstruction:
    ...
\end{verbatim}

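The check works because \function{Comment()} and \function{ProcessingInstruction()} build nodes whose \member{tag} is the factory function itself.  The sketch below wraps the test in a helper; \function{describe()} and the sample nodes are hypothetical names used only for this illustration, and the import assumes the \module{xml.etree} package name.

```python
import xml.etree.ElementTree as ET

def describe(elem):
    """Classify a node by comparing its tag against the factory functions."""
    if elem.tag is ET.Comment:
        return 'comment'
    elif elem.tag is ET.ProcessingInstruction:
        return 'processing instruction'
    return 'element'

kinds = [describe(node) for node in (
    ET.Comment('generated file'),
    ET.ProcessingInstruction('xml-stylesheet', 'href="style.css"'),
    ET.Element('para'),
)]
```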
To generate XML output, you should call the
\method{ElementTree.write()} method.  Like \function{parse()},
it can take either a string or a file-like object:

\begin{verbatim}
# Encoding is US-ASCII
tree.write('output.xml')

# Encoding is UTF-8
f = open('output.xml', 'w')
tree.write(f, 'utf-8')
f.close()
\end{verbatim}

(Caution: the default encoding used for output is ASCII, which isn't
very useful for general XML work, raising an exception if there are
any characters with values greater than 127.  You should always
specify a different encoding such as UTF-8 that can handle any Unicode
character.)

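Here is a hedged sketch of writing a document that contains a non-ASCII character.  An in-memory \class{io.BytesIO} buffer stands in for a real file opened in binary mode; the buffer, the sample document, and the \module{xml.etree} import name are all assumptions made for the example.

```python
import io
import xml.etree.ElementTree as ET

# \u00e9 is e-acute, a character outside the ASCII range.
tree = ET.ElementTree(ET.XML('<doc><title>caf\u00e9</title></doc>'))

# A binary in-memory buffer stands in for a file opened for writing.
out = io.BytesIO()
tree.write(out, 'utf-8')
data = out.getvalue()
```

The bytes in \code{data} hold the character in its two-byte UTF-8 form rather than as a numeric character reference.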
This section is only a partial description of the ElementTree interfaces.
Please read the package's official documentation for more details.

\begin{seealso}

\seeurl{http://effbot.org/zone/element-index.htm}
{Official documentation for ElementTree.}

\end{seealso}


%======================================================================
\subsection{The hashlib package}

A new \module{hashlib} module, written by Gregory P. Smith,
has been added to replace the
\module{md5} and \module{sha} modules.  \module{hashlib} adds support
for additional secure hashes (SHA-224, SHA-256, SHA-384, and SHA-512).
When available, the module uses OpenSSL for fast, platform-optimized
implementations of algorithms.

The old \module{md5} and \module{sha} modules still exist as wrappers
around \module{hashlib} to preserve backwards compatibility.  The new module's
interface is very close to that of the old modules, but not identical.
The most significant difference is that the constructor functions
for creating new hashing objects are named differently.

\begin{verbatim}
# Old versions
h = md5.md5()
h = md5.new()

# New version
h = hashlib.md5()

# Old versions
h = sha.sha()
h = sha.new()

# New version
h = hashlib.sha1()

# Hashes that weren't previously available
h = hashlib.sha224()
h = hashlib.sha256()
h = hashlib.sha384()
h = hashlib.sha512()

# Alternative form
h = hashlib.new('md5')          # Provide algorithm as a string
\end{verbatim}

Once a hash object has been created, its methods are the same as before:
\method{update(\var{string})} hashes the specified string into the
current digest state, \method{digest()} and \method{hexdigest()}
return the digest value as a binary string or a string of hex digits,
and \method{copy()} returns a new hashing object with the same digest state.


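The four methods can be seen working together in the following sketch.  The sample text is arbitrary, and the strings are passed through \method{encode()} so that the hash always receives a byte string; that detail is an assumption made so the example runs on interpreters that distinguish text from bytes.

```python
import hashlib

h = hashlib.sha256()
h.update('Nobody inspects'.encode('ascii'))

# copy() snapshots the state reached so far.
prefix = h.copy()

# Further updates affect h but not the snapshot.
h.update(' the spammish repetition'.encode('ascii'))

raw = h.digest()                  # binary string, 32 bytes for SHA-256
text = h.hexdigest()              # the same value as 64 hex digits
prefix_text = prefix.hexdigest()  # digest of the first chunk only
```

Feeding the data in two \method{update()} calls gives the same digest as hashing the concatenated string in one go, which is what makes \method{copy()} useful for hashing several strings that share a common prefix.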
%======================================================================
\subsection{The sqlite3 package}

The pysqlite module (\url{http://www.pysqlite.org}), a wrapper for the
SQLite embedded database, has been added to the standard library under
the package name \module{sqlite3}.

SQLite is a C library that provides a SQL-language database that
stores data in disk files without requiring a separate server process.
pysqlite was written by Gerhard H\"aring and provides a SQL interface
compliant with the DB-API 2.0 specification described by
\pep{249}. This means that it should be possible to write the first
version of your applications using SQLite for data storage.  If
switching to a larger database such as PostgreSQL or Oracle is
later necessary, the switch should be relatively easy.

If you're compiling the Python source yourself, note that the source
tree doesn't include the SQLite code, only the wrapper module.
You'll need to have the SQLite libraries and headers installed before
compiling Python, and the build process will compile the module when
the necessary headers are available.

To use the module, you must first create a \class{Connection} object
that represents the database.  Here the data will be stored in the
\file{/tmp/example} file:

\begin{verbatim}
import sqlite3

conn = sqlite3.connect('/tmp/example')
\end{verbatim}

You can also supply the special name \samp{:memory:} to create
a database in RAM.

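A quick way to convince yourself that an in-memory database really is private and temporary is sketched below.  The \code{notes} table is invented for the example, \code{sqlite_master} is SQLite's own catalogue table, and the cursor calls used here are the ones covered in the paragraphs that follow.

```python
import sqlite3

# ':memory:' asks SQLite for a private database held in RAM.
conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute('create table notes (body varchar)')
c.execute("insert into notes values ('scratch data')")
c.execute('select count(*) from notes')
row_count = c.fetchone()[0]
conn.close()

# A second connection gets a fresh, empty database: no tables at all.
other = sqlite3.connect(':memory:')
oc = other.cursor()
oc.execute('select count(*) from sqlite_master')
table_count = oc.fetchone()[0]
other.close()
```

This makes \samp{:memory:} handy for tests and throwaway experiments, since nothing is left behind on disk.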
Once you have a \class{Connection}, you can create a \class{Cursor}
object and call its \method{execute()} method to perform SQL commands:

\begin{verbatim}
c = conn.cursor()

# Create table
c.execute('''create table stocks
(date timestamp, trans varchar, symbol varchar,
 qty decimal, price decimal)''')

# Insert a row of data
c.execute("""insert into stocks
          values ('2006-01-05','BUY','RHAT',100,35.14)""")
\end{verbatim}

Usually your SQL operations will need to use values from Python
variables.  You shouldn't assemble your query using Python's string
operations because doing so is insecure; it makes your program
vulnerable to an SQL injection attack.

Instead, use SQLite's parameter substitution.  Put \samp{?} as a
placeholder wherever you want to use a value, and then provide a tuple
of values as the second argument to the cursor's \method{execute()}
method.  For example:

\begin{verbatim}
# Never do this -- insecure!
symbol = 'IBM'
c.execute("... where symbol = '%s'" % symbol)

# Do this instead
t = (symbol,)
c.execute('select * from stocks where symbol=?', t)

# Larger example
for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
          ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
          ('2006-04-06', 'SELL', 'IBM', 500, 53.00),
         ):
    c.execute('insert into stocks values (?,?,?,?,?)', t)
\end{verbatim}

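The difference between the two forms is easiest to see with a hostile value.  The sketch below uses an in-memory database and a cut-down \code{stocks} table; the table layout and the \code{hostile} string are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute('create table stocks (symbol varchar, qty decimal)')
c.execute("insert into stocks values ('IBM', 1000)")
c.execute("insert into stocks values ('RHAT', 100)")

# A value crafted to close the quote and add an always-true condition.
hostile = "x' or '1'='1"

# String formatting splices the value into the SQL text itself.
c.execute("select * from stocks where symbol = '%s'" % hostile)
leaked = c.fetchall()

# Parameter substitution treats the whole value as a plain string.
c.execute('select * from stocks where symbol=?', (hostile,))
safe = c.fetchall()
conn.close()
```

With string formatting the query matches every row in the table; with the \samp{?} placeholder it matches none, because no stock has that odd-looking symbol.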
To retrieve data after executing a SELECT statement, you can either
treat the cursor as an iterator, call the cursor's \method{fetchone()}
method to retrieve a single matching row,
or call \method{fetchall()} to get a list of the matching rows.

This example uses the iterator form:

\begin{verbatim}
>>> c = conn.cursor()
>>> c.execute('select * from stocks order by price')
>>> for row in c:
...    print row
...
(u'2006-01-05', u'BUY', u'RHAT', 100, 35.140000000000001)
(u'2006-03-28', u'BUY', u'IBM', 1000, 45.0)
(u'2006-04-06', u'SELL', u'IBM', 500, 53.0)
(u'2006-04-05', u'BUY', u'MSOFT', 1000, 72.0)
>>>
\end{verbatim}

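The other two retrieval styles can be sketched the same way.  This version builds its own in-memory copy of the \code{stocks} table so that it can be run on its own; the three rows are taken from the examples above.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute('''create table stocks
(date timestamp, trans varchar, symbol varchar,
 qty decimal, price decimal)''')
for t in (('2006-01-05', 'BUY', 'RHAT', 100, 35.14),
          ('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
          ('2006-04-06', 'SELL', 'IBM', 500, 53.00)):
    c.execute('insert into stocks values (?,?,?,?,?)', t)

# fetchone() hands back the next row, or None when nothing is left.
c.execute('select symbol, qty from stocks order by price')
cheapest = c.fetchone()

# fetchall() collects every remaining row into a list.
remaining = c.fetchall()
exhausted = c.fetchone()
conn.close()
```

Note that \method{fetchall()} only returns rows that haven't been consumed yet, so calling it after \method{fetchone()} skips the first row.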
For more information about the SQL dialect supported by SQLite, see
\url{http://www.sqlite.org}.

\begin{seealso}

\seeurl{http://www.pysqlite.org}
{The pysqlite web page.}

\seeurl{http://www.sqlite.org}
{The SQLite web page; the documentation describes the syntax and the
available data types for the supported SQL dialect.}

\seepep{249}{Database API Specification 2.0}{PEP written by
Marc-Andr\'e Lemburg.}

\end{seealso}


% ======================================================================
\section{Build and C API Changes}

Changes to Python's build process and to the C API include:

\begin{itemize}

\item The largest change to the C API came from \pep{353},
which modifies the interpreter to use a \ctype{Py_ssize_t} type
definition instead of \ctype{int}.  See the earlier
section~\ref{section-353} for a discussion of this change.

\item The design of the bytecode compiler has changed a great deal, to
no longer generate bytecode by traversing the parse tree.  Instead
the parse tree is converted to an abstract syntax tree (or AST), and it is
the abstract syntax tree that's traversed to produce the bytecode.

It's possible for Python code to obtain AST objects by using the
\function{compile()} built-in and specifying \code{_ast.PyCF_ONLY_AST}
as the value of the
\var{flags} parameter:

\begin{verbatim}
from _ast import PyCF_ONLY_AST
ast = compile("""a=0
for i in range(10):
    a += i
""", "<string>", 'exec', PyCF_ONLY_AST)

assignment = ast.body[0]
for_loop = ast.body[1]
\end{verbatim}

No documentation has been written for the AST code yet.  To start
learning about it, read the definition of the various AST nodes in
\file{Parser/Python.asdl}.  A Python script reads this file and
generates a set of C structure definitions in
\file{Include/Python-ast.h}.  The \cfunction{PyParser_ASTFromString()}
and \cfunction{PyParser_ASTFromFile()} functions, defined in
\file{Include/pythonrun.h}, take Python source as input and return the
root of an AST representing the contents.  This AST can then be turned
into a code object by \cfunction{PyAST_Compile()}.  For more
information, read the source code, and then ask questions on
python-dev.

% List of names taken from Jeremy's python-dev post at
% http://mail.python.org/pipermail/python-dev/2005-October/057500.html
The AST code was developed under Jeremy Hylton's management, and
implemented by (in alphabetical order) Brett Cannon, Nick Coghlan,
Grant Edwards, John Ehresman, Kurt Kaiser, Neal Norwitz, Tim Peters,
Armin Rigo, and Neil Schemenauer, plus the participants in a number of
AST sprints at conferences such as PyCon.

\item The built-in set types now have an official C API.  Call
\cfunction{PySet_New()} and \cfunction{PyFrozenSet_New()} to create a
new set, \cfunction{PySet_Add()} and \cfunction{PySet_Discard()} to
add and remove elements, and \cfunction{PySet_Contains()} and
\cfunction{PySet_Size()} to examine the set's state.
(Contributed by Raymond Hettinger.)

\item C code can now obtain information about the exact revision
of the Python interpreter by calling the
\cfunction{Py_GetBuildInfo()} function, which returns a
string of build information like this:
\code{"trunk:45355:45356M, Apr 13 2006, 07:42:19"}.
(Contributed by Barry Warsaw.)

\item The CPython interpreter is still written in C, but
the code can now be compiled with a {\Cpp} compiler without errors.
(Implemented by Anthony Baxter, Martin von~L\"owis, Skip Montanaro.)

\item The \cfunction{PyRange_New()} function was removed.  It was
never documented, never used in the core code, and had dangerously lax
error checking.

\end{itemize}


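Returning to the AST item in the list above: once you have the tree, you can poke at it from Python.  The sketch below compiles the same three-line program and records the class name of each node; the variable names are made up, and the node names it finds come from the interpreter rather than from anything documented here.

```python
from _ast import PyCF_ONLY_AST

source = """a=0
for i in range(10):
    a += i
"""
tree = compile(source, "<string>", 'exec', PyCF_ONLY_AST)

# Each statement in the module body is an instance of a node class.
kinds = [type(node).__name__ for node in tree.body]

# Nested statements hang off the parent's own body attribute.
loop_body_kind = type(tree.body[1].body[0]).__name__
```

The class names correspond to the constructors listed in \file{Parser/Python.asdl}, so printing them is a reasonable way to find your place in that file.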
%======================================================================
\subsection{Port-Specific Changes}

\begin{itemize}

\item MacOS X (10.3 and higher): dynamic loading of modules
now uses the \cfunction{dlopen()} function instead of MacOS-specific
functions.

\end{itemize}


%======================================================================
\section{Other Changes and Fixes \label{section-other}}

As usual, there were a bunch of other improvements and bugfixes
scattered throughout the source tree.  A search through the SVN change
logs finds there were XXX patches applied and YYY bugs fixed between
Python 2.4 and 2.5.  Both figures are likely to be underestimates.

Some of the more notable changes are:

\begin{itemize}

\item Evan Jones's patch to obmalloc, first described in a talk
at PyCon DC 2005, was applied.  Python 2.4 allocated small objects in
256K-sized arenas, but never freed arenas.  With this patch, Python
will free arenas when they're empty.  The net effect is that on some
platforms, when you allocate many objects, Python's memory usage may
actually drop when you delete them, and the memory may be returned to
the operating system.  (Implemented by Evan Jones, and reworked by Tim
Peters.)

Note that this change means extension modules need to be more careful
with how they allocate memory.  Python's API has many different
functions for allocating memory that are grouped into families.  For
example, \cfunction{PyMem_Malloc()}, \cfunction{PyMem_Realloc()}, and
\cfunction{PyMem_Free()} are one family that allocates raw memory,
while \cfunction{PyObject_Malloc()}, \cfunction{PyObject_Realloc()},
and \cfunction{PyObject_Free()} are another family that's supposed to
be used for creating Python objects.

Previously these different families all reduced to the platform's
\cfunction{malloc()} and \cfunction{free()} functions.  This meant
it didn't matter if you got things wrong and allocated memory with the
\cfunction{PyMem} function but freed it with the \cfunction{PyObject}
function.  With the obmalloc change, these families now do different
things, and mismatches will probably result in a segfault.  You should
carefully test your C extension modules with Python 2.5.

\item Coverity, a company that markets a source code analysis tool
  called Prevent, provided the results of their examination of the Python
  source code.  The analysis found about 60 bugs that
  were quickly fixed.  Many of the bugs were refcounting problems, often
  occurring in error-handling code.  See
  \url{http://scan.coverity.com} for the statistics.

\end{itemize}


%======================================================================
\section{Porting to Python 2.5}

This section lists previously described changes that may require
changes to your code:

\begin{itemize}

\item ASCII is now the default encoding for modules.  It's now
a syntax error if a module contains string literals with 8-bit
characters but doesn't have an encoding declaration.  In Python 2.4
this triggered a warning, not a syntax error.

\item The \module{pickle} module no longer uses the deprecated \var{bin} parameter.

\item Previously, the \member{gi_frame} attribute of a generator
was always a frame object.  Because of the \pep{342} changes
described in section~\ref{section-generators}, it's now possible
for \member{gi_frame} to be \code{None}.

\item C API: Many functions now use \ctype{Py_ssize_t}
instead of \ctype{int} to allow processing more data
on 64-bit machines.  Extension code may need to make
the same change to avoid warnings and to support 64-bit machines.
See the earlier
section~\ref{section-353} for a discussion of this change.

\item C API:
The obmalloc changes mean that
you must be careful to not mix usage
of the \cfunction{PyMem_*()} and \cfunction{PyObject_*()}
families of functions. Memory allocated with
one family's \cfunction{*_Malloc()} must be
freed with the corresponding family's \cfunction{*_Free()} function.

\end{itemize}


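The \member{gi_frame} item in the list above is the one most likely to bite debugging or introspection code.  The sketch below shows the attribute before and after a generator finishes, along with one defensive way to read it; \function{countdown()} and \function{frame_lineno()} are hypothetical helpers written for this example.

```python
def countdown(n):
    while n > 0:
        yield n
        n -= 1

gen = countdown(2)
frame_before = gen.gi_frame      # a frame object while the generator is live

values = list(gen)               # run the generator to exhaustion
frame_after = gen.gi_frame       # the frame is gone once it has finished

def frame_lineno(g):
    """Defensive access: returns None instead of raising AttributeError."""
    frame = g.gi_frame
    if frame is None:
        return None
    return frame.f_lineno
```

Code that reads attributes of \member{gi_frame} unconditionally should gain a check like the one in \function{frame_lineno()}.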
%======================================================================
\section{Acknowledgements \label{acks}}

The author would like to thank the following people for offering
suggestions, corrections and assistance with various drafts of this
article: Martin von~L\"owis, Mike Rovner, Thomas Wouters.

\end{document}