blob: d509bc01c64dcf7605829851da94753c0646891b [file] [log] [blame]
Fred Drake2db76802004-12-01 05:05:47 +00001\documentclass{howto}
2\usepackage{distutils}
3% $Id$
4
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00005% Fix XXX comments
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00006% The easy_install stuff
Andrew M. Kuchling075e0232006-04-11 13:14:56 +00007% Stateful codec changes
Andrew M. Kuchling5f445bf2006-04-12 18:54:00 +00008% Count up the patches and bugs
Fred Drake2db76802004-12-01 05:05:47 +00009
10\title{What's New in Python 2.5}
Andrew M. Kuchling2cdb23e2006-04-05 13:59:01 +000011\release{0.1}
Andrew M. Kuchling92e24952004-12-03 13:54:09 +000012\author{A.M. Kuchling}
13\authoraddress{\email{amk@amk.ca}}
Fred Drake2db76802004-12-01 05:05:47 +000014
15\begin{document}
16\maketitle
17\tableofcontents
18
19This article explains the new features in Python 2.5. No release date
Andrew M. Kuchling5eefdca2006-02-08 11:36:09 +000020for Python 2.5 has been set; it will probably be released in the
Andrew M. Kuchlingd96a6ac2006-04-04 19:17:34 +000021autumn of 2006. \pep{356} describes the planned release schedule.
Fred Drake2db76802004-12-01 05:05:47 +000022
Andrew M. Kuchling9c67ee02006-04-04 19:07:27 +000023(This is still an early draft, and some sections are still skeletal or
24completely missing. Comments on the present material will still be
25welcomed.)
26
Andrew M. Kuchling437567c2006-03-07 20:48:55 +000027% XXX Compare with previous release in 2 - 3 sentences here.
Fred Drake2db76802004-12-01 05:05:47 +000028
29This article doesn't attempt to provide a complete specification of
30the new features, but instead provides a convenient overview. For
31full details, you should refer to the documentation for Python 2.5.
Andrew M. Kuchling437567c2006-03-07 20:48:55 +000032% XXX add hyperlink when the documentation becomes available online.
Fred Drake2db76802004-12-01 05:05:47 +000033If you want to understand the complete implementation and design
34rationale, refer to the PEP for a particular new feature.
35
36
37%======================================================================
Andrew M. Kuchling6a67e4e2006-04-12 13:03:35 +000038\section{PEP 243: Uploading Modules to PyPI}
39
40PEP 243 describes an HTTP-based protocol for submitting software
41packages to a central archive. The Python package index at
42\url{http://cheeseshop.python.org} now supports package uploads, and
43the new \command{upload} Distutils command will upload a package to the
44repository.
45
46Before a package can be uploaded, you must be able to build a
47distribution using the \command{sdist} Distutils command. Once that
48works, you can run \code{python setup.py upload} to add your package
49to the PyPI archive. Optionally you can GPG-sign the package by
50supplying the \programopt{--sign} and
51\programopt{--identity} options.
52
53\begin{seealso}
54
55\seepep{243}{Module Repository Upload Mechanism}{PEP written by
Andrew M. Kuchling5f445bf2006-04-12 18:54:00 +000056Sean Reifschneider; implemented by Martin von~L\"owis
Andrew M. Kuchling6a67e4e2006-04-12 13:03:35 +000057and Richard Jones. Note that the PEP doesn't exactly
58describe what's implemented in PyPI.}
59
60\end{seealso}
61
62
63%======================================================================
Andrew M. Kuchling437567c2006-03-07 20:48:55 +000064\section{PEP 308: Conditional Expressions}
65
Andrew M. Kuchlinge362d932006-03-09 13:56:25 +000066For a long time, people have been requesting a way to write
67conditional expressions, expressions that return value A or value B
68depending on whether a Boolean value is true or false. A conditional
69expression lets you write a single assignment statement that has the
70same effect as the following:
71
72\begin{verbatim}
73if condition:
74 x = true_value
75else:
76 x = false_value
77\end{verbatim}
78
79There have been endless tedious discussions of syntax on both
Andrew M. Kuchling9c67ee02006-04-04 19:07:27 +000080python-dev and comp.lang.python. A vote was even held that found the
81majority of voters wanted conditional expressions in some form,
82but there was no syntax that was preferred by a clear majority.
83Candidates included C's \code{cond ? true_v : false_v},
Andrew M. Kuchlinge362d932006-03-09 13:56:25 +000084\code{if cond then true_v else false_v}, and 16 other variations.
85
86GvR eventually chose a surprising syntax:
87
88\begin{verbatim}
89x = true_value if condition else false_value
90\end{verbatim}
91
Andrew M. Kuchling38f85072006-04-02 01:46:32 +000092Evaluation is still lazy as in existing Boolean expressions, so the
93order of evaluation jumps around a bit. The \var{condition}
94expression in the middle is evaluated first, and the \var{true_value}
95expression is evaluated only if the condition was true. Similarly,
96the \var{false_value} expression is only evaluated when the condition
97is false.
Andrew M. Kuchlinge362d932006-03-09 13:56:25 +000098
99This syntax may seem strange and backwards; why does the condition go
100in the \emph{middle} of the expression, and not in the front as in C's
101\code{c ? x : y}? The decision was checked by applying the new syntax
102to the modules in the standard library and seeing how the resulting
103code read. In many cases where a conditional expression is used, one
104value seems to be the 'common case' and one value is an 'exceptional
105case', used only on rarer occasions when the condition isn't met. The
106conditional syntax makes this pattern a bit more obvious:
107
108\begin{verbatim}
109contents = ((doc + '\n') if doc else '')
110\end{verbatim}
111
112I read the above statement as meaning ``here \var{contents} is
Andrew M. Kuchlingd0fcc022006-03-09 13:57:28 +0000113usually assigned a value of \code{doc+'\e n'}; sometimes
Andrew M. Kuchlinge362d932006-03-09 13:56:25 +0000114\var{doc} is empty, in which special case an empty string is returned.''
115I doubt I will use conditional expressions very often where there
116isn't a clear common and uncommon case.
117
118There was some discussion of whether the language should require
119surrounding conditional expressions with parentheses. The decision
120was made to \emph{not} require parentheses in the Python language's
121grammar, but as a matter of style I think you should always use them.
122Consider these two statements:
123
124\begin{verbatim}
125# First version -- no parens
126level = 1 if logging else 0
127
128# Second version -- with parens
129level = (1 if logging else 0)
130\end{verbatim}
131
132In the first version, I think a reader's eye might group the statement
133into 'level = 1', 'if logging', 'else 0', and think that the condition
134decides whether the assignment to \var{level} is performed. The
135second version reads better, in my opinion, because it makes it clear
136that the assignment is always performed and the choice is being made
137between two values.
138
139Another reason for including the brackets: a few odd combinations of
140list comprehensions and lambdas could look like incorrect conditional
141expressions. See \pep{308} for some examples. If you put parentheses
142around your conditional expressions, you won't run into this case.
143
144
145\begin{seealso}
146
147\seepep{308}{Conditional Expressions}{PEP written by
148Guido van Rossum and Raymond D. Hettinger; implemented by Thomas
149Wouters.}
150
151\end{seealso}
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000152
153
154%======================================================================
Andrew M. Kuchling3e41b052005-03-01 00:53:46 +0000155\section{PEP 309: Partial Function Application}
Fred Drake2db76802004-12-01 05:05:47 +0000156
Andrew M. Kuchlingb1c96fd2005-03-20 21:42:04 +0000157The \module{functional} module is intended to contain tools for
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000158functional-style programming. Currently it only contains a
159\class{partial()} function, but new functions will probably be added
160in future versions of Python.
Andrew M. Kuchlingb1c96fd2005-03-20 21:42:04 +0000161
Andrew M. Kuchling4b000cd2005-04-09 15:51:44 +0000162For programs written in a functional style, it can be useful to
163construct variants of existing functions that have some of the
164parameters filled in. Consider a Python function \code{f(a, b, c)};
165you could create a new function \code{g(b, c)} that was equivalent to
166\code{f(1, b, c)}. This is called ``partial function application'',
167and is provided by the \class{partial} class in the new
168\module{functional} module.
169
170The constructor for \class{partial} takes the arguments
171\code{(\var{function}, \var{arg1}, \var{arg2}, ...
172\var{kwarg1}=\var{value1}, \var{kwarg2}=\var{value2})}. The resulting
173object is callable, so you can just call it to invoke \var{function}
174with the filled-in arguments.
175
176Here's a small but realistic example:
177
178\begin{verbatim}
179import functional
180
181def log (message, subsystem):
182 "Write the contents of 'message' to the specified subsystem."
183 print '%s: %s' % (subsystem, message)
184 ...
185
186server_log = functional.partial(log, subsystem='server')
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000187server_log('Unable to open socket')
Andrew M. Kuchling4b000cd2005-04-09 15:51:44 +0000188\end{verbatim}
189
Andrew M. Kuchling6af7fe02005-08-02 17:20:36 +0000190Here's another example, from a program that uses PyGTk. Here a
191context-sensitive pop-up menu is being constructed dynamically. The
192callback provided for the menu option is a partially applied version
193of the \method{open_item()} method, where the first argument has been
194provided.
Andrew M. Kuchling4b000cd2005-04-09 15:51:44 +0000195
Andrew M. Kuchling6af7fe02005-08-02 17:20:36 +0000196\begin{verbatim}
197...
198class Application:
199 def open_item(self, path):
200 ...
201 def init (self):
202 open_func = functional.partial(self.open_item, item_path)
203 popup_menu.append( ("Open", open_func, 1) )
204\end{verbatim}
Andrew M. Kuchlingb1c96fd2005-03-20 21:42:04 +0000205
206
207\begin{seealso}
208
209\seepep{309}{Partial Function Application}{PEP proposed and written by
210Peter Harris; implemented by Hye-Shik Chang, with adaptations by
211Raymond Hettinger.}
212
213\end{seealso}
Fred Drake2db76802004-12-01 05:05:47 +0000214
215
216%======================================================================
Fred Drakedb7b0022005-03-20 22:19:47 +0000217\section{PEP 314: Metadata for Python Software Packages v1.1}
218
Andrew M. Kuchlingd8d732e2005-04-09 23:59:41 +0000219Some simple dependency support was added to Distutils. The
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000220\function{setup()} function now has \code{requires}, \code{provides},
221and \code{obsoletes} keyword parameters. When you build a source
222distribution using the \code{sdist} command, the dependency
223information will be recorded in the \file{PKG-INFO} file.
Andrew M. Kuchlingd8d732e2005-04-09 23:59:41 +0000224
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000225Another new keyword parameter is \code{download_url}, which should be
226set to a URL for the package's source code. This means it's now
227possible to look up an entry in the package index, determine the
228dependencies for a package, and download the required packages.
Andrew M. Kuchlingd8d732e2005-04-09 23:59:41 +0000229
230% XXX put example here
231
232\begin{seealso}
233
234\seepep{314}{Metadata for Python Software Packages v1.1}{PEP proposed
235and written by A.M. Kuchling, Richard Jones, and Fred Drake;
236implemented by Richard Jones and Fred Drake.}
237
238\end{seealso}
Fred Drakedb7b0022005-03-20 22:19:47 +0000239
240
241%======================================================================
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000242\section{PEP 328: Absolute and Relative Imports}
243
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000244The simpler part of PEP 328 was implemented in Python 2.4: parentheses
245could now be used to enclose the names imported from a module using
246the \code{from ... import ...} statement, making it easier to import
247many different names.
248
249The more complicated part has been implemented in Python 2.5:
250importing a module can be specified to use absolute or
251package-relative imports. The plan is to move toward making absolute
252imports the default in future versions of Python.
253
254Let's say you have a package directory like this:
255\begin{verbatim}
256pkg/
257pkg/__init__.py
258pkg/main.py
259pkg/string.py
260\end{verbatim}
261
262This defines a package named \module{pkg} containing the
263\module{pkg.main} and \module{pkg.string} submodules.
264
265Consider the code in the \file{main.py} module. What happens if it
266executes the statement \code{import string}? In Python 2.4 and
267earlier, it will first look in the package's directory to perform a
268relative import, finds \file{pkg/string.py}, imports the contents of
269that file as the \module{pkg.string} module, and that module is bound
270to the name \samp{string} in the \module{pkg.main} module's namespace.
271
272That's fine if \module{pkg.string} was what you wanted. But what if
273you wanted Python's standard \module{string} module? There's no clean
274way to ignore \module{pkg.string} and look for the standard module;
275generally you had to look at the contents of \code{sys.modules}, which
276is slightly unclean.
Andrew M. Kuchling4d8cd892006-04-06 13:03:04 +0000277Holger Krekel's \module{py.std} package provides a tidier way to perform
278imports from the standard library, \code{import py ; py.std.string.join()},
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000279but that package isn't available on all Python installations.
280
281Reading code which relies on relative imports is also less clear,
282because a reader may be confused about which module, \module{string}
283or \module{pkg.string}, is intended to be used. Python users soon
284learned not to duplicate the names of standard library modules in the
285names of their packages' submodules, but you can't protect against
286having your submodule's name being used for a new module added in a
287future version of Python.
288
289In Python 2.5, you can switch \keyword{import}'s behaviour to
290absolute imports using a \code{from __future__ import absolute_import}
291directive. This absolute-import behaviour will become the default in
Andrew M. Kuchling4d8cd892006-04-06 13:03:04 +0000292a future version (probably Python 2.7). Once absolute imports
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000293are the default, \code{import string} will
294always find the standard library's version.
295It's suggested that users should begin using absolute imports as much
296as possible, so it's preferable to begin writing \code{from pkg import
297string} in your code.
298
299Relative imports are still possible by adding a leading period
300to the module name when using the \code{from ... import} form:
301
302\begin{verbatim}
303# Import names from pkg.string
304from .string import name1, name2
305# Import pkg.string
306from . import string
307\end{verbatim}
308
309This imports the \module{string} module relative to the current
310package, so in \module{pkg.main} this will import \var{name1} and
311\var{name2} from \module{pkg.string}. Additional leading periods
312perform the relative import starting from the parent of the current
313package. For example, code in the \module{A.B.C} module can do:
314
315\begin{verbatim}
316from . import D # Imports A.B.D
317from .. import E # Imports A.E
318from ..F import G # Imports A.F.G
319\end{verbatim}
320
321Leading periods cannot be used with the \code{import \var{modname}}
322form of the import statement, only the \code{from ... import} form.
323
324\begin{seealso}
325
Andrew M. Kuchling4d8cd892006-04-06 13:03:04 +0000326\seepep{328}{Imports: Multi-Line and Absolute/Relative}
327{PEP written by Aahz; implemented by Thomas Wouters.}
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000328
Andrew M. Kuchling4d8cd892006-04-06 13:03:04 +0000329\seeurl{http://codespeak.net/py/current/doc/index.html}
330{The py library by Holger Krekel, which contains the \module{py.std} package.}
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000331
332\end{seealso}
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000333
334
335%======================================================================
Andrew M. Kuchling21d3a7c2006-03-15 11:53:09 +0000336\section{PEP 338: Executing Modules as Scripts}
337
Andrew M. Kuchlingb182db42006-03-17 21:48:46 +0000338The \programopt{-m} switch added in Python 2.4 to execute a module as
339a script gained a few more abilities. Instead of being implemented in
340C code inside the Python interpreter, the switch now uses an
341implementation in a new module, \module{runpy}.
342
343The \module{runpy} module implements a more sophisticated import
344mechanism so that it's now possible to run modules in a package such
345as \module{pychecker.checker}. The module also supports alternative
346import mechanisms such as the \module{zipimport} module. (This means
347you can add a .zip archive's path to \code{sys.path} and then use the
348\programopt{-m} switch to execute code from the archive.
349
350
351\begin{seealso}
352
353\seepep{338}{Executing modules as scripts}{PEP written and
354implemented by Nick Coghlan.}
355
356\end{seealso}
Andrew M. Kuchling21d3a7c2006-03-15 11:53:09 +0000357
358
359%======================================================================
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000360\section{PEP 341: Unified try/except/finally}
361
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000362Until Python 2.5, the \keyword{try} statement came in two
363flavours. You could use a \keyword{finally} block to ensure that code
364is always executed, or a number of \keyword{except} blocks to catch an
365exception. You couldn't combine both \keyword{except} blocks and a
366\keyword{finally} block, because generating the right bytecode for the
367combined version was complicated and it wasn't clear what the
368semantics of the combined should be.
369
370GvR spent some time working with Java, which does support the
371equivalent of combining \keyword{except} blocks and a
372\keyword{finally} block, and this clarified what the statement should
373mean. In Python 2.5, you can now write:
374
375\begin{verbatim}
376try:
377 block-1 ...
378except Exception1:
379 handler-1 ...
380except Exception2:
381 handler-2 ...
382else:
383 else-block
384finally:
385 final-block
386\end{verbatim}
387
388The code in \var{block-1} is executed. If the code raises an
389exception, the handlers are tried in order: \var{handler-1},
390\var{handler-2}, ... If no exception is raised, the \var{else-block}
391is executed. No matter what happened previously, the
392\var{final-block} is executed once the code block is complete and any
393raised exceptions handled. Even if there's an error in an exception
394handler or the \var{else-block} and a new exception is raised, the
395\var{final-block} is still executed.
396
397\begin{seealso}
398
399\seepep{341}{Unifying try-except and try-finally}{PEP written by Georg Brandl;
Andrew M. Kuchling9c67ee02006-04-04 19:07:27 +0000400implementation by Thomas Lee.}
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000401
402\end{seealso}
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000403
404
405%======================================================================
Andrew M. Kuchlinga2e21cb2005-08-02 17:13:21 +0000406\section{PEP 342: New Generator Features}
407
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000408Python 2.5 adds a simple way to pass values \emph{into} a generator.
Andrew M. Kuchling150e3492005-08-23 00:56:06 +0000409As introduced in Python 2.3, generators only produce output; once a
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000410generator's code is invoked to create an iterator, there's no way to
411pass any new information into the function when its execution is
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000412resumed. Sometimes the ability to pass in some information would be
413useful. Hackish solutions to this include making the generator's code
414look at a global variable and then changing the global variable's
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000415value, or passing in some mutable object that callers then modify.
Andrew M. Kuchling150e3492005-08-23 00:56:06 +0000416
417To refresh your memory of basic generators, here's a simple example:
418
419\begin{verbatim}
420def counter (maximum):
421 i = 0
422 while i < maximum:
423 yield i
424 i += 1
425\end{verbatim}
426
Andrew M. Kuchling07382062005-08-27 18:45:47 +0000427When you call \code{counter(10)}, the result is an iterator that
428returns the values from 0 up to 9. On encountering the
429\keyword{yield} statement, the iterator returns the provided value and
430suspends the function's execution, preserving the local variables.
431Execution resumes on the following call to the iterator's
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000432\method{next()} method, picking up after the \keyword{yield} statement.
Andrew M. Kuchling150e3492005-08-23 00:56:06 +0000433
Andrew M. Kuchling07382062005-08-27 18:45:47 +0000434In Python 2.3, \keyword{yield} was a statement; it didn't return any
435value. In 2.5, \keyword{yield} is now an expression, returning a
436value that can be assigned to a variable or otherwise operated on:
Andrew M. Kuchlinga2e21cb2005-08-02 17:13:21 +0000437
Andrew M. Kuchling07382062005-08-27 18:45:47 +0000438\begin{verbatim}
439val = (yield i)
440\end{verbatim}
441
442I recommend that you always put parentheses around a \keyword{yield}
443expression when you're doing something with the returned value, as in
444the above example. The parentheses aren't always necessary, but it's
445easier to always add them instead of having to remember when they're
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000446needed.\footnote{The exact rules are that a \keyword{yield}-expression must
Andrew M. Kuchling07382062005-08-27 18:45:47 +0000447always be parenthesized except when it occurs at the top-level
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000448expression on the right-hand side of an assignment, meaning you can
449write \code{val = yield i} but have to use parentheses when there's an
450operation, as in \code{val = (yield i) + 12}.}
Andrew M. Kuchling07382062005-08-27 18:45:47 +0000451
452Values are sent into a generator by calling its
453\method{send(\var{value})} method. The generator's code is then
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000454resumed and the \keyword{yield} expression returns the specified
455\var{value}. If the regular \method{next()} method is called, the
456\keyword{yield} returns \constant{None}.
Andrew M. Kuchling07382062005-08-27 18:45:47 +0000457
458Here's the previous example, modified to allow changing the value of
459the internal counter.
460
461\begin{verbatim}
462def counter (maximum):
463 i = 0
464 while i < maximum:
465 val = (yield i)
466 # If value provided, change counter
467 if val is not None:
468 i = val
469 else:
470 i += 1
471\end{verbatim}
472
473And here's an example of changing the counter:
474
475\begin{verbatim}
476>>> it = counter(10)
477>>> print it.next()
4780
479>>> print it.next()
4801
481>>> print it.send(8)
4828
483>>> print it.next()
4849
485>>> print it.next()
486Traceback (most recent call last):
487 File ``t.py'', line 15, in ?
488 print it.next()
489StopIteration
Andrew M. Kuchlingc2033702005-08-29 13:30:12 +0000490\end{verbatim}
Andrew M. Kuchling07382062005-08-27 18:45:47 +0000491
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000492Because \keyword{yield} will often be returning \constant{None}, you
493should always check for this case. Don't just use its value in
494expressions unless you're sure that the \method{send()} method
495will be the only method used resume your generator function.
Andrew M. Kuchling07382062005-08-27 18:45:47 +0000496
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000497In addition to \method{send()}, there are two other new methods on
498generators:
Andrew M. Kuchling07382062005-08-27 18:45:47 +0000499
500\begin{itemize}
501
502 \item \method{throw(\var{type}, \var{value}=None,
503 \var{traceback}=None)} is used to raise an exception inside the
504 generator; the exception is raised by the \keyword{yield} expression
505 where the generator's execution is paused.
506
507 \item \method{close()} raises a new \exception{GeneratorExit}
508 exception inside the generator to terminate the iteration.
509 On receiving this
510 exception, the generator's code must either raise
511 \exception{GeneratorExit} or \exception{StopIteration}; catching the
512 exception and doing anything else is illegal and will trigger
513 a \exception{RuntimeError}. \method{close()} will also be called by
514 Python's garbage collection when the generator is garbage-collected.
515
516 If you need to run cleanup code in case of a \exception{GeneratorExit},
517 I suggest using a \code{try: ... finally:} suite instead of
518 catching \exception{GeneratorExit}.
519
520\end{itemize}
521
522The cumulative effect of these changes is to turn generators from
523one-way producers of information into both producers and consumers.
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000524
Andrew M. Kuchling07382062005-08-27 18:45:47 +0000525Generators also become \emph{coroutines}, a more generalized form of
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000526subroutines. Subroutines are entered at one point and exited at
Andrew M. Kuchling07382062005-08-27 18:45:47 +0000527another point (the top of the function, and a \keyword{return
528statement}), but coroutines can be entered, exited, and resumed at
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000529many different points (the \keyword{yield} statements). We'll have to
530figure out patterns for using coroutines effectively in Python.
Andrew M. Kuchling07382062005-08-27 18:45:47 +0000531
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000532The addition of the \method{close()} method has one side effect that
533isn't obvious. \method{close()} is called when a generator is
534garbage-collected, so this means the generator's code gets one last
535chance to run before the generator is destroyed, and this last chance
536means that \code{try...finally} statements in generators can now be
537guaranteed to work; the \keyword{finally} clause will now always get a
538chance to run. The syntactic restriction that you couldn't mix
539\keyword{yield} statements with a \code{try...finally} suite has
540therefore been removed. This seems like a minor bit of language
541trivia, but using generators and \code{try...finally} is actually
542necessary in order to implement the \keyword{with} statement
543described by PEP 343. We'll look at this new statement in the following
544section.
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000545
Andrew M. Kuchlinga2e21cb2005-08-02 17:13:21 +0000546\begin{seealso}
547
548\seepep{342}{Coroutines via Enhanced Generators}{PEP written by
549Guido van Rossum and Phillip J. Eby;
Andrew M. Kuchling07382062005-08-27 18:45:47 +0000550implemented by Phillip J. Eby. Includes examples of
551some fancier uses of generators as coroutines.}
552
553\seeurl{http://en.wikipedia.org/wiki/Coroutine}{The Wikipedia entry for
554coroutines.}
555
Neal Norwitz09179882006-03-04 23:31:45 +0000556\seeurl{http://www.sidhe.org/\~{}dan/blog/archives/000178.html}{An
Andrew M. Kuchling07382062005-08-27 18:45:47 +0000557explanation of coroutines from a Perl point of view, written by Dan
558Sugalski.}
Andrew M. Kuchlinga2e21cb2005-08-02 17:13:21 +0000559
560\end{seealso}
561
562
563%======================================================================
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000564\section{PEP 343: The 'with' statement}
565
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000566The \keyword{with} statement allows a clearer
567version of code that uses \code{try...finally} blocks
568
569First, I'll discuss the statement as it will commonly be used, and
570then I'll discuss the detailed implementation and how to write objects
571(called ``context managers'') that can be used with this statement.
572Most people, who will only use \keyword{with} in company with an
Andrew M. Kuchlinga4d651f2006-04-06 13:24:58 +0000573existing object, don't need to know these details and can
574just use objects that are documented to work as context managers.
575Authors of new context managers will need to understand the details of
576the underlying implementation.
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000577
578The \keyword{with} statement is a new control-flow structure whose
579basic structure is:
580
581\begin{verbatim}
582with expression as variable:
583 with-block
584\end{verbatim}
585
586The expression is evaluated, and it should result in a type of object
587that's called a context manager. The context manager can return a
588value that will be bound to the name \var{variable}. (Note carefully:
589\var{variable} is \emph{not} assigned the result of \var{expression}.
590One method of the context manager is run before \var{with-block} is
591executed, and another method is run after the block is done, even if
592the block raised an exception.
593
594To enable the statement in Python 2.5, you need
595to add the following directive to your module:
596
597\begin{verbatim}
598from __future__ import with_statement
599\end{verbatim}
600
601Some standard Python objects can now behave as context managers. For
602example, file objects:
603
604\begin{verbatim}
605with open('/etc/passwd', 'r') as f:
606 for line in f:
607 print line
608
609# f has been automatically closed at this point.
610\end{verbatim}
611
612The \module{threading} module's locks and condition variables
Andrew M. Kuchling9c67ee02006-04-04 19:07:27 +0000613also support the \keyword{with} statement:
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000614
615\begin{verbatim}
616lock = threading.Lock()
617with lock:
618 # Critical section of code
619 ...
620\end{verbatim}
621
622The lock is acquired before the block is executed, and released once
623the block is complete.
624
625The \module{decimal} module's contexts, which encapsulate the desired
626precision and rounding characteristics for computations, can also be
627used as context managers.
628
629\begin{verbatim}
630import decimal
631
632v1 = decimal.Decimal('578')
633
634# Displays with default precision of 28 digits
635print v1.sqrt()
636
637with decimal.Context(prec=16):
638 # All code in this block uses a precision of 16 digits.
639 # The original context is restored on exiting the block.
640 print v1.sqrt()
641\end{verbatim}
642
643\subsection{Writing Context Managers}
644
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000645% XXX write this
646
Andrew M. Kuchling9c67ee02006-04-04 19:07:27 +0000647This section still needs to be written.
648
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000649The new \module{contextlib} module provides some functions and a
Andrew M. Kuchling9c67ee02006-04-04 19:07:27 +0000650decorator that are useful for writing context managers.
651Future versions will go into more detail.
652
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000653% XXX describe further
654
655\begin{seealso}
656
657\seepep{343}{The ``with'' statement}{PEP written by
658Guido van Rossum and Nick Coghlan. }
659
660\end{seealso}
661
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000662
663%======================================================================
Andrew M. Kuchling8f4d2552006-03-08 01:50:20 +0000664\section{PEP 352: Exceptions as New-Style Classes}
665
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000666Exception classes can now be new-style classes, not just classic
667classes, and the built-in \exception{Exception} class and all the
668standard built-in exceptions (\exception{NameError},
669\exception{ValueError}, etc.) are now new-style classes.
Andrew M. Kuchlingaeadf952006-03-09 19:06:05 +0000670
671The inheritance hierarchy for exceptions has been rearranged a bit.
672In 2.5, the inheritance relationships are:
673
674\begin{verbatim}
675BaseException # New in Python 2.5
676|- KeyboardInterrupt
677|- SystemExit
678|- Exception
679 |- (all other current built-in exceptions)
680\end{verbatim}
681
682This rearrangement was done because people often want to catch all
683exceptions that indicate program errors. \exception{KeyboardInterrupt} and
684\exception{SystemExit} aren't errors, though, and usually represent an explicit
685action such as the user hitting Control-C or code calling
686\function{sys.exit()}. A bare \code{except:} will catch all exceptions,
687so you commonly need to list \exception{KeyboardInterrupt} and
688\exception{SystemExit} in order to re-raise them. The usual pattern is:
689
690\begin{verbatim}
691try:
692 ...
693except (KeyboardInterrupt, SystemExit):
694 raise
695except:
696 # Log error...
697 # Continue running program...
698\end{verbatim}
699
700In Python 2.5, you can now write \code{except Exception} to achieve
701the same result, catching all the exceptions that usually indicate errors
702but leaving \exception{KeyboardInterrupt} and
703\exception{SystemExit} alone. As in previous versions,
704a bare \code{except:} still catches all exceptions.
705
706The goal for Python 3.0 is to require any class raised as an exception
707to derive from \exception{BaseException} or some descendant of
708\exception{BaseException}, and future releases in the
709Python 2.x series may begin to enforce this constraint. Therefore, I
710suggest you begin making all your exception classes derive from
711\exception{Exception} now. It's been suggested that the bare
712\code{except:} form should be removed in Python 3.0, but Guido van~Rossum
713hasn't decided whether to do this or not.
714
715Raising of strings as exceptions, as in the statement \code{raise
716"Error occurred"}, is deprecated in Python 2.5 and will trigger a
717warning. The aim is to be able to remove the string-exception feature
718in a few releases.
719
720
721\begin{seealso}
722
Andrew M. Kuchlingc3749a92006-04-04 19:14:41 +0000723\seepep{352}{Required Superclass for Exceptions}{PEP written by
Andrew M. Kuchlingaeadf952006-03-09 19:06:05 +0000724Brett Cannon and Guido van Rossum; implemented by Brett Cannon.}
725
726\end{seealso}
Andrew M. Kuchling8f4d2552006-03-08 01:50:20 +0000727
728
729%======================================================================
Andrew M. Kuchling4d8cd892006-04-06 13:03:04 +0000730\section{PEP 353: Using ssize_t as the index type\label{section-353}}
Andrew M. Kuchlingc3749a92006-04-04 19:14:41 +0000731
732A wide-ranging change to Python's C API, using a new
733\ctype{Py_ssize_t} type definition instead of \ctype{int},
734will permit the interpreter to handle more data on 64-bit platforms.
735This change doesn't affect Python's capacity on 32-bit platforms.
736
Andrew M. Kuchling4d8cd892006-04-06 13:03:04 +0000737Various pieces of the Python interpreter used C's \ctype{int} type to
738store sizes or counts; for example, the number of items in a list or
739tuple were stored in an \ctype{int}. The C compilers for most 64-bit
740platforms still define \ctype{int} as a 32-bit type, so that meant
741that lists could only hold up to \code{2**31 - 1} = 2147483647 items.
742(There are actually a few different programming models that 64-bit C
743compilers can use -- see
744\url{http://www.unix.org/version2/whatsnew/lp64_wp.html} for a
745discussion -- but the most commonly available model leaves \ctype{int}
746as 32 bits.)
747
748A limit of 2147483647 items doesn't really matter on a 32-bit platform
749because you'll run out of memory before hitting the length limit.
750Each list item requires space for a pointer, which is 4 bytes, plus
751space for a \ctype{PyObject} representing the item. 2147483647*4 is
752already more bytes than a 32-bit address space can contain.
753
754It's possible to address that much memory on a 64-bit platform,
755however. The pointers for a list that size would only require 16GiB
756of space, so it's not unreasonable that Python programmers might
757construct lists that large. Therefore, the Python interpreter had to
758be changed to use some type other than \ctype{int}, and this will be a
75964-bit type on 64-bit platforms. The change will cause
760incompatibilities on 64-bit machines, so it was deemed worth making
761the transition now, while the number of 64-bit users is still
762relatively small. (In 5 or 10 years, we may \emph{all} be on 64-bit
763machines, and the transition would be more painful then.)
764
765This change most strongly affects authors of C extension modules.
766Python strings and container types such as lists and tuples
767now use \ctype{Py_ssize_t} to store their size.
768Functions such as \cfunction{PyList_Size()}
769now return \ctype{Py_ssize_t}. Code in extension modules
770may therefore need to have some variables changed to
771\ctype{Py_ssize_t}.
772
773The \cfunction{PyArg_ParseTuple()} and \cfunction{Py_BuildValue()} functions
774have a new conversion code, \samp{n}, for \ctype{Py_ssize_t}.
Andrew M. Kuchlinga4d651f2006-04-06 13:24:58 +0000775\cfunction{PyArg_ParseTuple()}'s \samp{s\#} and \samp{t\#} still output
Andrew M. Kuchling4d8cd892006-04-06 13:03:04 +0000776\ctype{int} by default, but you can define the macro
777\csimplemacro{PY_SSIZE_T_CLEAN} before including \file{Python.h}
778to make them return \ctype{Py_ssize_t}.
779
780\pep{353} has a section on conversion guidelines that
781extension authors should read to learn about supporting 64-bit
782platforms.
Andrew M. Kuchlingc3749a92006-04-04 19:14:41 +0000783
784\begin{seealso}
785
Andrew M. Kuchling5f445bf2006-04-12 18:54:00 +0000786\seepep{353}{Using ssize_t as the index type}{PEP written and implemented by Martin von~L\"owis.}
Andrew M. Kuchlingc3749a92006-04-04 19:14:41 +0000787
788\end{seealso}
789
Andrew M. Kuchling4d8cd892006-04-06 13:03:04 +0000790
Andrew M. Kuchlingc3749a92006-04-04 19:14:41 +0000791%======================================================================
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000792\section{PEP 357: The '__index__' method}
793
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000794The NumPy developers had a problem that could only be solved by adding
795a new special method, \method{__index__}. When using slice notation,
Fred Drake1c0e3282006-04-02 03:30:06 +0000796as in \code{[\var{start}:\var{stop}:\var{step}]}, the values of the
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000797\var{start}, \var{stop}, and \var{step} indexes must all be either
798integers or long integers. NumPy defines a variety of specialized
799integer types corresponding to unsigned and signed integers of 8, 16,
80032, and 64 bits, but there was no way to signal that these types could
801be used as slice indexes.
802
803Slicing can't just use the existing \method{__int__} method because
804that method is also used to implement coercion to integers. If
805slicing used \method{__int__}, floating-point numbers would also
806become legal slice indexes and that's clearly an undesirable
807behaviour.
808
809Instead, a new special method called \method{__index__} was added. It
810takes no arguments and returns an integer giving the slice index to
811use. For example:
812
813\begin{verbatim}
814class C:
815 def __index__ (self):
816 return self.value
817\end{verbatim}
818
819The return value must be either a Python integer or long integer.
820The interpreter will check that the type returned is correct, and
821raises a \exception{TypeError} if this requirement isn't met.
822
823A corresponding \member{nb_index} slot was added to the C-level
824\ctype{PyNumberMethods} structure to let C extensions implement this
825protocol. \cfunction{PyNumber_Index(\var{obj})} can be used in
826extension code to call the \method{__index__} function and retrieve
827its result.
828
829\begin{seealso}
830
831\seepep{357}{Allowing Any Object to be Used for Slicing}{PEP written
Andrew M. Kuchling9c67ee02006-04-04 19:07:27 +0000832and implemented by Travis Oliphant.}
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000833
834\end{seealso}
Andrew M. Kuchling437567c2006-03-07 20:48:55 +0000835
836
837%======================================================================
Fred Drake2db76802004-12-01 05:05:47 +0000838\section{Other Language Changes}
839
840Here are all of the changes that Python 2.5 makes to the core Python
841language.
842
843\begin{itemize}
Andrew M. Kuchling1cae3f52004-12-03 14:57:21 +0000844
845\item The \function{min()} and \function{max()} built-in functions
846gained a \code{key} keyword argument analogous to the \code{key}
Andrew M. Kuchlinge9b1bf42005-03-20 19:26:30 +0000847argument for \method{sort()}. This argument supplies a function
Andrew M. Kuchling1cae3f52004-12-03 14:57:21 +0000848that takes a single argument and is called for every value in the list;
849\function{min()}/\function{max()} will return the element with the
850smallest/largest return value from this function.
851For example, to find the longest string in a list, you can do:
852
853\begin{verbatim}
854L = ['medium', 'longest', 'short']
855# Prints 'longest'
856print max(L, key=len)
857# Prints 'short', because lexicographically 'short' has the largest value
858print max(L)
859\end{verbatim}
860
861(Contributed by Steven Bethard and Raymond Hettinger.)
Fred Drake2db76802004-12-01 05:05:47 +0000862
Andrew M. Kuchling150e3492005-08-23 00:56:06 +0000863\item Two new built-in functions, \function{any()} and
864\function{all()}, evaluate whether an iterator contains any true or
865false values. \function{any()} returns \constant{True} if any value
866returned by the iterator is true; otherwise it will return
867\constant{False}. \function{all()} returns \constant{True} only if
868all of the values returned by the iterator evaluate as being true.
Andrew M. Kuchling6e3a66d2006-04-07 12:46:06 +0000869(Suggested by GvR, and implemented by Raymond Hettinger.)
Andrew M. Kuchling150e3492005-08-23 00:56:06 +0000870
Andrew M. Kuchling5f445bf2006-04-12 18:54:00 +0000871\item ASCII is now the default encoding for modules. It's now
872a syntax error if a module contains string literals with 8-bit
873characters but doesn't have an encoding declaration. In Python 2.4
874this triggered a warning, not a syntax error. See \pep{263}
875for how to declare a module's encoding; for example, you might add
876a line like this near the top of the source file:
877
878\begin{verbatim}
879# -*- coding: latin1 -*-
880\end{verbatim}
881
Andrew M. Kuchlinge9b1bf42005-03-20 19:26:30 +0000882\item The list of base classes in a class definition can now be empty.
883As an example, this is now legal:
884
885\begin{verbatim}
886class C():
887 pass
888\end{verbatim}
889(Implemented by Brett Cannon.)
890
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000891% XXX __missing__ hook in dictionaries
892
Fred Drake2db76802004-12-01 05:05:47 +0000893\end{itemize}
894
895
896%======================================================================
Andrew M. Kuchlingda376042006-03-17 15:56:41 +0000897\subsection{Interactive Interpreter Changes}
898
899In the interactive interpreter, \code{quit} and \code{exit}
900have long been strings so that new users get a somewhat helpful message
901when they try to quit:
902
903\begin{verbatim}
904>>> quit
905'Use Ctrl-D (i.e. EOF) to exit.'
906\end{verbatim}
907
908In Python 2.5, \code{quit} and \code{exit} are now objects that still
909produce string representations of themselves, but are also callable.
910Newbies who try \code{quit()} or \code{exit()} will now exit the
911interpreter as they expect. (Implemented by Georg Brandl.)
912
913
914%======================================================================
Fred Drake2db76802004-12-01 05:05:47 +0000915\subsection{Optimizations}
916
917\begin{itemize}
918
Andrew M. Kuchling150e3492005-08-23 00:56:06 +0000919\item When they were introduced
920in Python 2.4, the built-in \class{set} and \class{frozenset} types
921were built on top of Python's dictionary type.
922In 2.5 the internal data structure has been customized for implementing sets,
923and as a result sets will use a third less memory and are somewhat faster.
924(Implemented by Raymond Hettinger.)
Fred Drake2db76802004-12-01 05:05:47 +0000925
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000926\item The performance of some Unicode operations has been improved.
927% XXX provide details?
928
929\item The code generator's peephole optimizer now performs
930simple constant folding in expressions. If you write something like
931\code{a = 2+3}, the code generator will do the arithmetic and produce
932code corresponding to \code{a = 5}.
933
Fred Drake2db76802004-12-01 05:05:47 +0000934\end{itemize}
935
936The net result of the 2.5 optimizations is that Python 2.5 runs the
Andrew M. Kuchling9c67ee02006-04-04 19:07:27 +0000937pystone benchmark around XXX\% faster than Python 2.4.
Fred Drake2db76802004-12-01 05:05:47 +0000938
939
940%======================================================================
941\section{New, Improved, and Deprecated Modules}
942
943As usual, Python's standard library received a number of enhancements and
944bug fixes. Here's a partial list of the most notable changes, sorted
945alphabetically by module name. Consult the
946\file{Misc/NEWS} file in the source tree for a more
Andrew M. Kuchlingf688cc52006-03-10 18:50:08 +0000947complete list of changes, or look through the SVN logs for all the
Fred Drake2db76802004-12-01 05:05:47 +0000948details.
949
950\begin{itemize}
951
Andrew M. Kuchling150e3492005-08-23 00:56:06 +0000952% collections.deque now has .remove()
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000953% collections.defaultdict
Andrew M. Kuchling150e3492005-08-23 00:56:06 +0000954
Andrew M. Kuchling3e41b052005-03-01 00:53:46 +0000955% the cPickle module no longer accepts the deprecated None option in the
956% args tuple returned by __reduce__().
957
958% csv module improvements
959
960% datetime.datetime() now has a strptime class method which can be used to
961% create datetime object using a string and format.
962
Andrew M. Kuchling38f85072006-04-02 01:46:32 +0000963% fileinput: opening hook used to control how files are opened.
964% .input() now has a mode parameter
965% now has a fileno() function
966% accepts Unicode filenames
967
Andrew M. Kuchlingda376042006-03-17 15:56:41 +0000968\item In the \module{gc} module, the new \function{get_count()} function
969returns a 3-tuple containing the current collection counts for the
970three GC generations. This is accounting information for the garbage
971collector; when these counts reach a specified threshold, a garbage
972collection sweep will be made. The existing \function{gc.collect()}
973function now takes an optional \var{generation} argument of 0, 1, or 2
974to specify which generation to collect.
975
Andrew M. Kuchlinge9b1bf42005-03-20 19:26:30 +0000976\item The \function{nsmallest()} and
977\function{nlargest()} functions in the \module{heapq} module
978now support a \code{key} keyword argument similar to the one
979provided by the \function{min()}/\function{max()} functions
980and the \method{sort()} methods. For example:
981Example:
982
983\begin{verbatim}
984>>> import heapq
985>>> L = ["short", 'medium', 'longest', 'longer still']
986>>> heapq.nsmallest(2, L) # Return two lowest elements, lexicographically
987['longer still', 'longest']
988>>> heapq.nsmallest(2, L, key=len) # Return two shortest elements
989['short', 'medium']
990\end{verbatim}
991
992(Contributed by Raymond Hettinger.)
993
Andrew M. Kuchling511a3a82005-03-20 19:52:18 +0000994\item The \function{itertools.islice()} function now accepts
995\code{None} for the start and step arguments. This makes it more
996compatible with the attributes of slice objects, so that you can now write
997the following:
998
999\begin{verbatim}
1000s = slice(5) # Create slice object
1001itertools.islice(iterable, s.start, s.stop, s.step)
1002\end{verbatim}
1003
1004(Contributed by Raymond Hettinger.)
Andrew M. Kuchling3e41b052005-03-01 00:53:46 +00001005
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001006\item The \module{operator} module's \function{itemgetter()}
1007and \function{attrgetter()} functions now support multiple fields.
1008A call such as \code{operator.attrgetter('a', 'b')}
1009will return a function
1010that retrieves the \member{a} and \member{b} attributes. Combining
1011this new feature with the \method{sort()} method's \code{key} parameter
1012lets you easily sort lists using multiple fields.
Andrew M. Kuchling6e3a66d2006-04-07 12:46:06 +00001013(Contributed by Raymond Hettinger.)
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001014
Andrew M. Kuchling3e41b052005-03-01 00:53:46 +00001015
Andrew M. Kuchlinge9b1bf42005-03-20 19:26:30 +00001016\item The \module{os} module underwent a number of changes. The
1017\member{stat_float_times} variable now defaults to true, meaning that
1018\function{os.stat()} will now return time values as floats. (This
1019doesn't necessarily mean that \function{os.stat()} will return times
1020that are precise to fractions of a second; not all systems support
1021such precision.)
Andrew M. Kuchling3e41b052005-03-01 00:53:46 +00001022
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001023Constants named \member{os.SEEK_SET}, \member{os.SEEK_CUR}, and
Andrew M. Kuchlinge9b1bf42005-03-20 19:26:30 +00001024\member{os.SEEK_END} have been added; these are the parameters to the
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001025\function{os.lseek()} function. Two new constants for locking are
1026\member{os.O_SHLOCK} and \member{os.O_EXLOCK}.
1027
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00001028Two new functions, \function{wait3()} and \function{wait4()}, were
1029added. They're similar the \function{waitpid()} function which waits
1030for a child process to exit and returns a tuple of the process ID and
1031its exit status, but \function{wait3()} and \function{wait4()} return
1032additional information. \function{wait3()} doesn't take a process ID
1033as input, so it waits for any child process to exit and returns a
10343-tuple of \var{process-id}, \var{exit-status}, \var{resource-usage}
1035as returned from the \function{resource.getrusage()} function.
1036\function{wait4(\var{pid})} does take a process ID.
Andrew M. Kuchling6e3a66d2006-04-07 12:46:06 +00001037(Contributed by Chad J. Schroeder.)
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00001038
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001039On FreeBSD, the \function{os.stat()} function now returns
1040times with nanosecond resolution, and the returned object
1041now has \member{st_gen} and \member{st_birthtime}.
1042The \member{st_flags} member is also available, if the platform supports it.
Andrew M. Kuchling6e3a66d2006-04-07 12:46:06 +00001043(Contributed by Antti Louko and Diego Petten\`o.)
1044% (Patch 1180695, 1212117)
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001045
Andrew M. Kuchling01e3d262006-03-17 15:38:39 +00001046\item The old \module{regex} and \module{regsub} modules, which have been
1047deprecated ever since Python 2.0, have finally been deleted.
Andrew M. Kuchlingf4b06602006-03-17 15:39:52 +00001048Other deleted modules: \module{statcache}, \module{tzparse},
1049\module{whrandom}.
Andrew M. Kuchling01e3d262006-03-17 15:38:39 +00001050
1051\item The \file{lib-old} directory,
1052which includes ancient modules such as \module{dircmp} and
1053\module{ni}, was also deleted. \file{lib-old} wasn't on the default
1054\code{sys.path}, so unless your programs explicitly added the directory to
1055\code{sys.path}, this removal shouldn't affect your code.
1056
Andrew M. Kuchling4678dc82006-01-15 16:11:28 +00001057\item The \module{socket} module now supports \constant{AF_NETLINK}
1058sockets on Linux, thanks to a patch from Philippe Biondi.
1059Netlink sockets are a Linux-specific mechanism for communications
1060between a user-space process and kernel code; an introductory
1061article about them is at \url{http://www.linuxjournal.com/article/7356}.
1062In Python code, netlink addresses are represented as a tuple of 2 integers,
1063\code{(\var{pid}, \var{group_mask})}.
1064
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00001065Socket objects also gained accessor methods \method{getfamily()},
1066\method{gettype()}, and \method{getproto()} methods to retrieve the
1067family, type, and protocol values for the socket.
1068
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001069\item New module: \module{spwd} provides functions for accessing the
Andrew M. Kuchling5f445bf2006-04-12 18:54:00 +00001070shadow password database on systems that support it.
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001071% XXX give example
Fred Drake2db76802004-12-01 05:05:47 +00001072
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00001073% XXX patch #1382163: sys.subversion, Py_GetBuildNumber()
1074
Andrew M. Kuchlinge9b1bf42005-03-20 19:26:30 +00001075\item The \class{TarFile} class in the \module{tarfile} module now has
Georg Brandl08c02db2005-07-22 18:39:19 +00001076an \method{extractall()} method that extracts all members from the
Andrew M. Kuchlinge9b1bf42005-03-20 19:26:30 +00001077archive into the current working directory. It's also possible to set
1078a different directory as the extraction target, and to unpack only a
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001079subset of the archive's members.
Andrew M. Kuchlinge9b1bf42005-03-20 19:26:30 +00001080
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001081A tarfile's compression can be autodetected by
1082using the mode \code{'r|*'}.
1083% patch 918101
1084(Contributed by Lars Gust\"abel.)
Gregory P. Smithf21a5f72005-08-21 18:45:59 +00001085
Andrew M. Kuchlingf688cc52006-03-10 18:50:08 +00001086\item The \module{unicodedata} module has been updated to use version 4.1.0
1087of the Unicode character database. Version 3.2.0 is required
1088by some specifications, so it's still available as
1089\member{unicodedata.db_3_2_0}.
1090
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00001091% patch #754022: Greatly enhanced webbrowser.py (by Oleg Broytmann).
1092
Fredrik Lundh7e0aef02005-12-12 18:54:55 +00001093
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001094\item The \module{xmlrpclib} module now supports returning
1095 \class{datetime} objects for the XML-RPC date type. Supply
1096 \code{use_datetime=True} to the \function{loads()} function
1097 or the \class{Unmarshaller} class to enable this feature.
Andrew M. Kuchling6e3a66d2006-04-07 12:46:06 +00001098 (Contributed by Skip Montanaro.)
1099% Patch 1120353
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001100
Gregory P. Smithf21a5f72005-08-21 18:45:59 +00001101
Fred Drake114b8ca2005-03-21 05:47:11 +00001102\end{itemize}
Andrew M. Kuchlinge9b1bf42005-03-20 19:26:30 +00001103
Fred Drake2db76802004-12-01 05:05:47 +00001104
1105
1106%======================================================================
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00001107% whole new modules get described in subsections here
Fred Drake2db76802004-12-01 05:05:47 +00001108
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001109\subsection{The ctypes package}
1110
1111The \module{ctypes} package, written by Thomas Heller, has been added
1112to the standard library. \module{ctypes} lets you call arbitrary functions
Andrew M. Kuchling28c5f1f2006-04-13 02:04:42 +00001113in shared libraries or DLLs. Long-time users may remember the \module{dl} module, which
1114provides functions for loading shared libraries and calling functions in them. The \module{ctypes} package is much fancier.
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001115
Andrew M. Kuchling28c5f1f2006-04-13 02:04:42 +00001116To load a shared library or DLL, you must create an instance of the
1117\class{CDLL} class and provide the name or path of the shared library
1118or DLL. Once that's done, you can call arbitrary functions
1119by accessing them as attributes of the \class{CDLL} object.
1120
1121\begin{verbatim}
1122import ctypes
1123
1124libc = ctypes.CDLL('libc.so.6')
1125result = libc.printf("Line of output\n")
1126\end{verbatim}
1127
1128Type constructors for the various C types are provided: \function{c_int},
1129\function{c_float}, \function{c_double}, \function{c_char_p} (equivalent to \ctype{char *}), and so forth. Unlike Python's types, the C versions are all mutable; you can assign to their \member{value} attribute
1130to change the wrapped value. Python integers and strings will be automatically
1131converted to the corresponding C types, but for other types you
1132must call the correct type constructor. (And I mean \emph{must};
1133getting it wrong will often result in the interpreter crashing
1134with a segmentation fault.)
1135
1136You shouldn't use \function{c_char_p} with a Python string when the C function will be modifying the memory area, because Python strings are
1137supposed to be immutable; breaking this rule will cause puzzling bugs. When you need a modifiable memory area,
1138use \function{create_string_buffer():
1139
1140\begin{verbatim}
1141s = "this is a string"
1142buf = ctypes.create_string_buffer(s)
1143libc.strfry(buf)
1144\end{verbatim}
1145
1146C functions are assumed to return integers, but you can set
1147the \member{restype} attribute of the function object to
1148change this:
1149
1150\begin{verbatim}
1151>>> libc.atof('2.71828')
1152-1783957616
1153>>> libc.atof.restype = ctypes.c_double
1154>>> libc.atof('2.71828')
11552.71828
1156\end{verbatim}
1157
1158\module{ctypes} also provides a wrapper for Python's C API
1159as the \code{ctypes.pythonapi} object. This object does \emph{not}
1160release the global interpreter lock before calling a function, because the lock must be held when calling into the interpreter's code.
1161There's a \class{py_object()} type constructor that will create a
1162\ctype{PyObject *} pointer. A simple usage:
1163
1164\begin{verbatim}
1165import ctypes
1166
1167d = {}
1168ctypes.pythonapi.PyObject_SetItem(ctypes.py_object(d),
1169 ctypes.py_object("abc"), ctypes.py_object(1))
1170# d is now {'abc', 1}.
1171\end{verbatim}
1172
1173Don't forget to use \class{py_object()}; if it's omitted you end
1174up with a segmentation fault.
1175
1176\module{ctypes} has been around for a while, but people still write
1177and distribution hand-coded extension modules because you can't rely on \module{ctypes} being present.
1178Perhaps developers will begin to write
1179Python wrappers atop a library accessed through \module{ctypes} instead
1180of extension modules, now that \module{ctypes} is included with core Python.
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001181
1182% XXX write introduction
1183
Andrew M. Kuchling28c5f1f2006-04-13 02:04:42 +00001184\begin{seealso}
1185
1186\seeurl{http://starship.python.net/crew/theller/ctypes/}
1187{The ctypes web page, with a tutorial, reference, and FAQ.}
1188
1189\end{seealso}
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001190
1191\subsection{The ElementTree package}
1192
1193A subset of Fredrik Lundh's ElementTree library for processing XML has
Andrew M. Kuchling16ed5212006-04-10 22:28:11 +00001194been added to the standard library as \module{xmlcore.etree}. The
Georg Brandlce27a062006-04-11 06:27:12 +00001195available modules are
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001196\module{ElementTree}, \module{ElementPath}, and
Andrew M. Kuchling4d8cd892006-04-06 13:03:04 +00001197\module{ElementInclude} from ElementTree 1.2.6.
1198The \module{cElementTree} accelerator module is also included.
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001199
Andrew M. Kuchling16ed5212006-04-10 22:28:11 +00001200The rest of this section will provide a brief overview of using
1201ElementTree. Full documentation for ElementTree is available at
1202\url{http://effbot.org/zone/element-index.htm}.
1203
1204ElementTree represents an XML document as a tree of element nodes.
1205The text content of the document is stored as the \member{.text}
1206and \member{.tail} attributes of
1207(This is one of the major differences between ElementTree and
1208the Document Object Model; in the DOM there are many different
1209types of node, including \class{TextNode}.)
1210
1211The most commonly used parsing function is \function{parse()}, that
1212takes either a string (assumed to contain a filename) or a file-like
1213object and returns an \class{ElementTree} instance:
1214
1215\begin{verbatim}
1216from xmlcore.etree import ElementTree as ET
1217
1218tree = ET.parse('ex-1.xml')
1219
1220feed = urllib.urlopen(
1221 'http://planet.python.org/rss10.xml')
1222tree = ET.parse(feed)
1223\end{verbatim}
1224
1225Once you have an \class{ElementTree} instance, you
1226can call its \method{getroot()} method to get the root \class{Element} node.
1227
1228There's also an \function{XML()} function that takes a string literal
1229and returns an \class{Element} node (not an \class{ElementTree}).
1230This function provides a tidy way to incorporate XML fragments,
1231approaching the convenience of an XML literal:
1232
1233\begin{verbatim}
1234svg = et.XML("""<svg width="10px" version="1.0">
1235 </svg>""")
1236svg.set('height', '320px')
1237svg.append(elem1)
1238\end{verbatim}
1239
1240Each XML element supports some dictionary-like and some list-like
Andrew M. Kuchling075e0232006-04-11 13:14:56 +00001241access methods. Dictionary-like operations are used to access attribute
1242values, and list-like operations are used to access child nodes.
Andrew M. Kuchling16ed5212006-04-10 22:28:11 +00001243
Andrew M. Kuchling075e0232006-04-11 13:14:56 +00001244\begin{tableii}{c|l}{code}{Operation}{Result}
1245 \lineii{elem[n]}{Returns n'th child element.}
1246 \lineii{elem[m:n]}{Returns list of m'th through n'th child elements.}
1247 \lineii{len(elem)}{Returns number of child elements.}
1248 \lineii{elem.getchildren()}{Returns list of child elements.}
1249 \lineii{elem.append(elem2)}{Adds \var{elem2} as a child.}
1250 \lineii{elem.insert(index, elem2)}{Inserts \var{elem2} at the specified location.}
1251 \lineii{del elem[n]}{Deletes n'th child element.}
1252 \lineii{elem.keys()}{Returns list of attribute names.}
1253 \lineii{elem.get(name)}{Returns value of attribute \var{name}.}
1254 \lineii{elem.set(name, value)}{Sets new value for attribute \var{name}.}
1255 \lineii{elem.attrib}{Retrieves the dictionary containing attributes.}
1256 \lineii{del elem.attrib[name]}{Deletes attribute \var{name}.}
1257\end{tableii}
1258
1259Comments and processing instructions are also represented as
1260\class{Element} nodes. To check if a node is a comment or processing
1261instructions:
1262
1263\begin{verbatim}
1264if elem.tag is ET.Comment:
1265 ...
1266elif elem.tag is ET.ProcessingInstruction:
1267 ...
1268\end{verbatim}
Andrew M. Kuchling16ed5212006-04-10 22:28:11 +00001269
1270To generate XML output, you should call the
1271\method{ElementTree.write()} method. Like \function{parse()},
1272it can take either a string or a file-like object:
1273
1274\begin{verbatim}
1275# Encoding is US-ASCII
1276tree.write('output.xml')
1277
1278# Encoding is UTF-8
1279f = open('output.xml', 'w')
1280tree.write(f, 'utf-8')
1281\end{verbatim}
1282
1283(Caution: the default encoding used for output is ASCII, which isn't
1284very useful for general XML work, raising an exception if there are
1285any characters with values greater than 127. You should always
1286specify a different encoding such as UTF-8 that can handle any Unicode
1287character.)
1288
Andrew M. Kuchling075e0232006-04-11 13:14:56 +00001289This section is only a partial description of the ElementTree interfaces.
1290Please read the package's official documentation for more details.
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001291
Andrew M. Kuchling16ed5212006-04-10 22:28:11 +00001292\begin{seealso}
1293
1294\seeurl{http://effbot.org/zone/element-index.htm}
1295{Official documentation for ElementTree.}
1296
1297
1298\end{seealso}
1299
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00001300
1301\subsection{The hashlib package}
1302
1303A new \module{hashlib} module has been added to replace the
1304\module{md5} and \module{sha} modules. \module{hashlib} adds support
1305for additional secure hashes (SHA-224, SHA-256, SHA-384, and SHA-512).
1306When available, the module uses OpenSSL for fast platform optimized
1307implementations of algorithms.
1308
1309The old \module{md5} and \module{sha} modules still exist as wrappers
1310around hashlib to preserve backwards compatibility. The new module's
1311interface is very close to that of the old modules, but not identical.
1312The most significant difference is that the constructor functions
1313for creating new hashing objects are named differently.
1314
1315\begin{verbatim}
1316# Old versions
1317h = md5.md5()
1318h = md5.new()
1319
1320# New version
1321h = hashlib.md5()
1322
1323# Old versions
1324h = sha.sha()
1325h = sha.new()
1326
1327# New version
1328h = hashlib.sha1()
1329
1330# Hash that weren't previously available
1331h = hashlib.sha224()
1332h = hashlib.sha256()
1333h = hashlib.sha384()
1334h = hashlib.sha512()
1335
1336# Alternative form
1337h = hashlib.new('md5') # Provide algorithm as a string
1338\end{verbatim}
1339
1340Once a hash object has been created, its methods are the same as before:
1341\method{update(\var{string})} hashes the specified string into the
1342current digest state, \method{digest()} and \method{hexdigest()}
1343return the digest value as a binary string or a string of hex digits,
1344and \method{copy()} returns a new hashing object with the same digest state.
1345
1346This module was contributed by Gregory P. Smith.
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001347
1348
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001349\subsection{The sqlite3 package}
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00001350
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001351The pysqlite module (\url{http://www.pysqlite.org}), a wrapper for the
1352SQLite embedded database, has been added to the standard library under
1353the package name \module{sqlite3}. SQLite is a C library that
1354provides a SQL-language database that stores data in disk files
1355without requiring a separate server process. pysqlite was written by
1356Gerhard H\"aring, and provides a SQL interface that complies with the
Andrew M. Kuchlingd58baf82006-04-10 21:40:16 +00001357DB-API 2.0 specification described by \pep{249}. This means that it
1358should be possible to write the first version of your applications
1359using SQLite for data storage and, if switching to a larger database
1360such as PostgreSQL or Oracle is necessary, the switch should be
1361relatively easy.
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001362
1363If you're compiling the Python source yourself, note that the source
1364tree doesn't include the SQLite code itself, only the wrapper module.
1365You'll need to have the SQLite libraries and headers installed before
1366compiling Python, and the build process will compile the module when
1367the necessary headers are available.
1368
Andrew M. Kuchlingd58baf82006-04-10 21:40:16 +00001369To use the module, you must first create a \class{Connection} object
1370that represents the database. Here the data will be stored in the
1371\file{/tmp/example} file:
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001372
Andrew M. Kuchlingd58baf82006-04-10 21:40:16 +00001373\begin{verbatim}
1374conn = sqlite3.connect('/tmp/example')
1375\end{verbatim}
1376
1377You can also supply the special name \samp{:memory:} to create
1378a database in RAM.
1379
1380Once you have a \class{Connection}, you can create a \class{Cursor}
1381object and call its \method{execute()} method to perform SQL commands:
1382
1383\begin{verbatim}
1384c = conn.cursor()
1385
1386# Create table
1387c.execute('''create table stocks
1388(date timestamp, trans varchar, symbol varchar,
1389 qty decimal, price decimal)''')
1390
1391# Insert a row of data
1392c.execute("""insert into stocks
1393 values ('2006-01-05','BUY','RHAT',100, 35.14)""")
1394\end{verbatim}
1395
1396Usually your SQL queries will need to reflect the value of Python
1397variables. You shouldn't assemble your query using Python's string
1398operations because doing so is insecure; it makes your program
1399vulnerable to what's called an SQL injection attack. Instead, use
1400SQLite's parameter substitution, putting \samp{?} as a placeholder
1401wherever you want to use a value, and then provide a tuple of values
1402as the second argument to the cursor's \method{execute()} method. For
1403example:
1404
1405\begin{verbatim}
1406# Never do this -- insecure!
1407symbol = 'IBM'
1408c.execute("... where symbol = '%s'" % symbol)
1409
1410# Do this instead
1411t = (symbol,)
1412c.execute("... where symbol = '?'", t)
1413
1414# Larger example
1415for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
1416 ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
1417 ('2006-04-06', 'SELL', 'IBM', 500, 53.00),
1418 ):
1419 c.execute('insert into stocks values (?,?,?,?,?)', t)
1420\end{verbatim}
1421
1422To retrieve data after executing a SELECT statement, you can either
1423treat the cursor as an iterator, call the cursor's \method{fetchone()}
1424method to retrieve a single matching row,
1425or call \method{fetchall()} to get a list of the matching rows.
1426
1427This example uses the iterator form:
1428
1429\begin{verbatim}
1430>>> c = conn.cursor()
1431>>> c.execute('select * from stocks order by price')
1432>>> for row in c:
1433... print row
1434...
1435(u'2006-01-05', u'BUY', u'RHAT', 100, 35.140000000000001)
1436(u'2006-03-28', u'BUY', u'IBM', 1000, 45.0)
1437(u'2006-04-06', u'SELL', u'IBM', 500, 53.0)
1438(u'2006-04-05', u'BUY', u'MSOFT', 1000, 72.0)
1439>>>
1440\end{verbatim}
1441
1442You should also use parameter substitution with SELECT statements:
1443
1444\begin{verbatim}
1445>>> c.execute('select * from stocks where symbol=?', ('IBM',))
1446>>> print c.fetchall()
1447[(u'2006-03-28', u'BUY', u'IBM', 1000, 45.0),
1448 (u'2006-04-06', u'SELL', u'IBM', 500, 53.0)]
1449\end{verbatim}
1450
1451For more information about the SQL dialect supported by SQLite, see
1452\url{http://www.sqlite.org}.
1453
1454\begin{seealso}
1455
1456\seeurl{http://www.pysqlite.org}
1457{The pysqlite web page.}
1458
1459\seeurl{http://www.sqlite.org}
1460{The SQLite web page; the documentation describes the syntax and the
1461available data types for the supported SQL dialect.}
1462
1463\seepep{249}{Database API Specification 2.0}{PEP written by
1464Marc-Andr\'e Lemburg.}
1465
1466\end{seealso}
Andrew M. Kuchlingaf7ee992006-04-03 12:41:37 +00001467
Fred Drake2db76802004-12-01 05:05:47 +00001468
1469% ======================================================================
1470\section{Build and C API Changes}
1471
1472Changes to Python's build process and to the C API include:
1473
1474\begin{itemize}
1475
Andrew M. Kuchling4d8cd892006-04-06 13:03:04 +00001476\item The largest change to the C API came from \pep{353},
1477which modifies the interpreter to use a \ctype{Py_ssize_t} type
1478definition instead of \ctype{int}. See the earlier
1479section~ref{section-353} for a discussion of this change.
1480
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00001481\item The design of the bytecode compiler has changed a great deal, to
1482no longer generate bytecode by traversing the parse tree. Instead
Andrew M. Kuchlingdb85ed52005-10-23 21:52:59 +00001483the parse tree is converted to an abstract syntax tree (or AST), and it is
1484the abstract syntax tree that's traversed to produce the bytecode.
1485
Andrew M. Kuchling4e861952006-04-12 12:16:31 +00001486It's possible for Python code to obtain AST objects by using the
Andrew M. Kuchling5f445bf2006-04-12 18:54:00 +00001487\function{compile()} built-in and specifying \code{_ast.PyCF_ONLY_AST}
1488as the value of the
Andrew M. Kuchling4e861952006-04-12 12:16:31 +00001489\var{flags} parameter:
1490
1491\begin{verbatim}
Andrew M. Kuchling5f445bf2006-04-12 18:54:00 +00001492from _ast import PyCF_ONLY_AST
Andrew M. Kuchling4e861952006-04-12 12:16:31 +00001493ast = compile("""a=0
1494for i in range(10):
1495 a += i
Andrew M. Kuchling5f445bf2006-04-12 18:54:00 +00001496""", "<string>", 'exec', PyCF_ONLY_AST)
Andrew M. Kuchling4e861952006-04-12 12:16:31 +00001497
1498assignment = ast.body[0]
1499for_loop = ast.body[1]
1500\end{verbatim}
1501
Andrew M. Kuchlingdb85ed52005-10-23 21:52:59 +00001502No documentation has been written for the AST code yet. To start
1503learning about it, read the definition of the various AST nodes in
1504\file{Parser/Python.asdl}. A Python script reads this file and
1505generates a set of C structure definitions in
1506\file{Include/Python-ast.h}. The \cfunction{PyParser_ASTFromString()}
1507and \cfunction{PyParser_ASTFromFile()}, defined in
1508\file{Include/pythonrun.h}, take Python source as input and return the
1509root of an AST representing the contents. This AST can then be turned
1510into a code object by \cfunction{PyAST_Compile()}. For more
1511information, read the source code, and then ask questions on
1512python-dev.
1513
1514% List of names taken from Jeremy's python-dev post at
1515% http://mail.python.org/pipermail/python-dev/2005-October/057500.html
1516The AST code was developed under Jeremy Hylton's management, and
1517implemented by (in alphabetical order) Brett Cannon, Nick Coghlan,
1518Grant Edwards, John Ehresman, Kurt Kaiser, Neal Norwitz, Tim Peters,
1519Armin Rigo, and Neil Schemenauer, plus the participants in a number of
1520AST sprints at conferences such as PyCon.
1521
Andrew M. Kuchling150e3492005-08-23 00:56:06 +00001522\item The built-in set types now have an official C API. Call
1523\cfunction{PySet_New()} and \cfunction{PyFrozenSet_New()} to create a
1524new set, \cfunction{PySet_Add()} and \cfunction{PySet_Discard()} to
1525add and remove elements, and \cfunction{PySet_Contains} and
1526\cfunction{PySet_Size} to examine the set's state.
1527
1528\item The \cfunction{PyRange_New()} function was removed. It was
1529never documented, never used in the core code, and had dangerously lax
1530error checking.
Fred Drake2db76802004-12-01 05:05:47 +00001531
1532\end{itemize}
1533
1534
1535%======================================================================
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00001536%\subsection{Port-Specific Changes}
Fred Drake2db76802004-12-01 05:05:47 +00001537
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00001538%Platform-specific changes go here.
Fred Drake2db76802004-12-01 05:05:47 +00001539
1540
1541%======================================================================
1542\section{Other Changes and Fixes \label{section-other}}
1543
1544As usual, there were a bunch of other improvements and bugfixes
Andrew M. Kuchlingf688cc52006-03-10 18:50:08 +00001545scattered throughout the source tree. A search through the SVN change
Fred Drake2db76802004-12-01 05:05:47 +00001546logs finds there were XXX patches applied and YYY bugs fixed between
Andrew M. Kuchling92e24952004-12-03 13:54:09 +00001547Python 2.4 and 2.5. Both figures are likely to be underestimates.
Fred Drake2db76802004-12-01 05:05:47 +00001548
1549Some of the more notable changes are:
1550
1551\begin{itemize}
1552
Andrew M. Kuchling01e3d262006-03-17 15:38:39 +00001553\item Evan Jones's patch to obmalloc, first described in a talk
1554at PyCon DC 2005, was applied. Python 2.4 allocated small objects in
1555256K-sized arenas, but never freed arenas. With this patch, Python
1556will free arenas when they're empty. The net effect is that on some
1557platforms, when you allocate many objects, Python's memory usage may
1558actually drop when you delete them, and the memory may be returned to
1559the operating system. (Implemented by Evan Jones, and reworked by Tim
1560Peters.)
Fred Drake2db76802004-12-01 05:05:47 +00001561
Andrew M. Kuchlingf7c62902006-04-12 12:27:50 +00001562Note that this change means extension modules need to be more careful
1563with how they allocate memory. Python's API has a number of different
1564functions for allocating memory that are grouped into families. For
1565example, \cfunction{PyMem_Malloc()}, \cfunction{PyMem_Realloc()}, and
1566\cfunction{PyMem_Free()} are one family that allocates raw memory,
1567while \cfunction{PyObject_Malloc()}, \cfunction{PyObject_Realloc()},
1568and \cfunction{PyObject_Free()} are another family that's supposed to
1569be used for creating Python objects.
1570
1571Previously these different families all reduced to the platform's
1572\cfunction{malloc()} and \cfunction{free()} functions. This meant
1573it didn't matter if you got things wrong and allocated memory with the
1574\cfunction{PyMem} function but freed it with the \cfunction{PyObject}
1575function. With the obmalloc change, these families now do different
1576things, and mismatches will probably result in a segfault. You should
1577carefully test your C extension modules with Python 2.5.
1578
Andrew M. Kuchling38f85072006-04-02 01:46:32 +00001579\item Coverity, a company that markets a source code analysis tool
1580 called Prevent, provided the results of their examination of the Python
1581 source code. The analysis found a number of refcounting bugs, often
1582 in error-handling code. These bugs have been fixed.
1583 % XXX provide reference?
1584
Fred Drake2db76802004-12-01 05:05:47 +00001585\end{itemize}
1586
1587
1588%======================================================================
1589\section{Porting to Python 2.5}
1590
1591This section lists previously described changes that may require
1592changes to your code:
1593
1594\begin{itemize}
1595
Andrew M. Kuchling5f445bf2006-04-12 18:54:00 +00001596\item ASCII is now the default encoding for modules. It's now
1597a syntax error if a module contains string literals with 8-bit
1598characters but doesn't have an encoding declaration. In Python 2.4
1599this triggered a warning, not a syntax error.
1600
Andrew M. Kuchlingc3749a92006-04-04 19:14:41 +00001601\item The \module{pickle} module no longer uses the deprecated \var{bin} parameter.
Fred Drake2db76802004-12-01 05:05:47 +00001602
Andrew M. Kuchlingf7c62902006-04-12 12:27:50 +00001603\item C API: Many functions now use \ctype{Py_ssize_t}
1604instead of \ctype{int} to allow processing more data
1605on 64-bit machines. Extension code may need to make
1606the same change to avoid warnings and to support 64-bit machines.
1607See the earlier
1608section~ref{section-353} for a discussion of this change.
1609
1610\item C API:
1611The obmalloc changes mean that
1612you must be careful to not mix usage
1613of the \cfunction{PyMem_*()} and \cfunction{PyObject_*()}
1614families of functions. Memory allocated with
1615one family's \cfunction{*_Malloc()} must be
1616freed with the corresponding family's \cfunction{*_Free()} function.
1617
Fred Drake2db76802004-12-01 05:05:47 +00001618\end{itemize}
1619
1620
1621%======================================================================
1622\section{Acknowledgements \label{acks}}
1623
1624The author would like to thank the following people for offering
1625suggestions, corrections and assistance with various drafts of this
Andrew M. Kuchling5f445bf2006-04-12 18:54:00 +00001626article: Martin von~L\"owis, Mike Rovner, Thomas Wouters.
Fred Drake2db76802004-12-01 05:05:47 +00001627
1628\end{document}