****************************
  What's New in Python 2.5
****************************

:Author: A.M. Kuchling

.. |release| replace:: 1.01

.. $Id: whatsnew25.tex 56611 2007-07-29 08:26:10Z georg.brandl $
.. Fix XXX comments

This article explains the new features in Python 2.5. The final release of
Python 2.5 is scheduled for August 2006; :pep:`356` describes the planned
release schedule.

The changes in Python 2.5 are an interesting mix of language and library
improvements. The library enhancements will be more important to Python's user
community, I think, because several widely useful packages were added. New
modules include ElementTree for XML processing (:mod:`xml.etree`),
the SQLite database module (:mod:`sqlite3`), and the :mod:`ctypes`
module for calling C functions.

The language changes are of middling significance. Some pleasant new features
were added, but most of them aren't features that you'll use every day.
Conditional expressions were finally added to the language using a novel syntax;
see section :ref:`pep-308`. The new ':keyword:`with`' statement will make
writing cleanup code easier (section :ref:`pep-343`). Values can now be passed
into generators (section :ref:`pep-342`). Imports are now visible as either
absolute or relative (section :ref:`pep-328`). Some corner cases of exception
handling are handled better (section :ref:`pep-341`). All these improvements
are worthwhile, but they're improvements to one specific language feature or
another; none of them are broad modifications to Python's semantics.

As well as the language and library additions, other improvements and bugfixes
were made throughout the source tree. A search through the SVN change logs
finds there were 353 patches applied and 458 bugs fixed between Python 2.4 and
2.5. (Both figures are likely to be underestimates.)

This article doesn't try to be a complete specification of the new features;
instead changes are briefly introduced using helpful examples. For full
details, you should always refer to the documentation for Python 2.5 at
http://docs.python.org. If you want to understand the complete implementation
and design rationale, refer to the PEP for a particular new feature.

Comments, suggestions, and error reports for this document are welcome; please
e-mail them to the author or open a bug in the Python bug tracker.

.. ======================================================================


.. _pep-308:

PEP 308: Conditional Expressions
================================

For a long time, people have been requesting a way to write conditional
expressions, which are expressions that return value A or value B depending on
whether a Boolean value is true or false. A conditional expression lets you
write a single assignment statement that has the same effect as the following::

   if condition:
       x = true_value
   else:
       x = false_value

There have been endless tedious discussions of syntax on both python-dev and
comp.lang.python. A vote was even held that found the majority of voters wanted
conditional expressions in some form, but there was no syntax that was preferred
by a clear majority. Candidates included C's ``cond ? true_v : false_v``, ``if
cond then true_v else false_v``, and 16 other variations.

Guido van Rossum eventually chose a surprising syntax::

   x = true_value if condition else false_value

Evaluation is still lazy as in existing Boolean expressions, so the order of
evaluation jumps around a bit. The *condition* expression in the middle is
evaluated first, and the *true_value* expression is evaluated only if the
condition was true. Similarly, the *false_value* expression is only evaluated
when the condition is false.

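The evaluation order can be observed directly with a minimal sketch. The
``pick`` helper and the ``calls`` list are hypothetical names introduced only
for this illustration; they record which operands were actually evaluated.

```python
calls = []

def pick(label, value):
    """Record that this operand was evaluated, then return its value."""
    calls.append(label)
    return value

# The condition in the middle runs first; only one of the two outer
# operands is evaluated afterwards.
result = (pick("true_value", "yes")
          if pick("condition", False)
          else pick("false_value", "no"))
```

After this runs, ``calls`` lists the condition followed by only the operand
that was selected; the other operand was never evaluated.
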
This syntax may seem strange and backwards; why does the condition go in the
*middle* of the expression, and not in the front as in C's ``c ? x : y``? The
decision was checked by applying the new syntax to the modules in the standard
library and seeing how the resulting code read. In many cases where a
conditional expression is used, one value seems to be the 'common case' and one
value is an 'exceptional case', used only on rarer occasions when the condition
isn't met. The conditional syntax makes this pattern a bit more obvious::

   contents = ((doc + '\n') if doc else '')

I read the above statement as meaning "here *contents* is usually assigned a
value of ``doc+'\n'``; sometimes *doc* is empty, in which special case an empty
string is returned." I doubt I will use conditional expressions very often
where there isn't a clear common and uncommon case.

There was some discussion of whether the language should require surrounding
conditional expressions with parentheses. The decision was made to *not*
require parentheses in the Python language's grammar, but as a matter of style I
think you should always use them. Consider these two statements::

   # First version -- no parens
   level = 1 if logging else 0

   # Second version -- with parens
   level = (1 if logging else 0)

In the first version, I think a reader's eye might group the statement into
'level = 1', 'if logging', 'else 0', and think that the condition decides
whether the assignment to *level* is performed. The second version reads
better, in my opinion, because it makes it clear that the assignment is always
performed and the choice is being made between two values.

Another reason for including the parentheses: a few odd combinations of list
comprehensions and lambdas could look like incorrect conditional expressions.
See :pep:`308` for some examples. If you put parentheses around your
conditional expressions, you won't run into this case.


.. seealso::

   :pep:`308` - Conditional Expressions
      PEP written by Guido van Rossum and Raymond D. Hettinger; implemented by Thomas
      Wouters.

.. ======================================================================


.. _pep-309:

PEP 309: Partial Function Application
=====================================

The :mod:`functools` module is intended to contain tools for functional-style
programming.

One useful tool in this module is the :func:`partial` function. For programs
written in a functional style, you'll sometimes want to construct variants of
existing functions that have some of the parameters filled in. Consider a
Python function ``f(a, b, c)``; you could create a new function ``g(b, c)`` that
was equivalent to ``f(1, b, c)``. This is called "partial function
application".

:func:`partial` takes the arguments ``(function, arg1, arg2, ... kwarg1=value1,
kwarg2=value2)``. The resulting object is callable, so you can just call it to
invoke *function* with the filled-in arguments.

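The ``f(a, b, c)`` to ``g(b, c)`` case described above can be sketched as
follows. The body of ``f`` and the name ``h`` are placeholders made up for this
illustration; only :func:`functools.partial` and its ``func``, ``args`` and
``keywords`` attributes come from the library.

```python
import functools

def f(a, b, c):
    """Stand-in function that just reports the arguments it received."""
    return (a, b, c)

# Fill in the first positional argument: g(b, c) behaves like f(1, b, c).
g = functools.partial(f, 1)

# Positional and keyword arguments can be filled in together.
h = functools.partial(f, 1, c=3)

from_g = g(2, 3)
from_h = h(2)
```

The partial object also remembers what it was built from, which can help when
debugging: ``g.func`` is the original function and ``g.args`` holds the
filled-in positional arguments.
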
Here's a small but realistic example::

   import functools

   def log(message, subsystem):
       "Write the contents of 'message' to the specified subsystem."
       print '%s: %s' % (subsystem, message)
       ...

   server_log = functools.partial(log, subsystem='server')
   server_log('Unable to open socket')

Here's another example, from a program that uses PyGTK. Here a
context-sensitive pop-up menu is being constructed dynamically. The callback
provided for the menu option is a partially applied version of the
:meth:`open_item` method, where the first argument has been provided. ::

   ...
   class Application:
       def open_item(self, path):
           ...
       def init(self):
           open_func = functools.partial(self.open_item, item_path)
           popup_menu.append( ("Open", open_func, 1) )

Another function in the :mod:`functools` module is the
``update_wrapper(wrapper, wrapped)`` function that helps you write
well-behaved decorators. :func:`update_wrapper` copies the name, module, and
docstring attribute to a wrapper function so that tracebacks inside the wrapped
function are easier to understand. For example, you might write::

   def my_decorator(f):
       def wrapper(*args, **kwds):
           print 'Calling decorated function'
           return f(*args, **kwds)
       functools.update_wrapper(wrapper, f)
       return wrapper

:func:`wraps` is a decorator that can be used inside your own decorators to copy
the wrapped function's information. An alternate version of the previous
example would be::

   def my_decorator(f):
       @functools.wraps(f)
       def wrapper(*args, **kwds):
           print 'Calling decorated function'
           return f(*args, **kwds)
       return wrapper


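To make the effect of :func:`wraps` visible, the sketch below compares a
decorator that omits it with one that uses it. The decorator and function names
(``plain_decorator``, ``careful_decorator``, ``area``, ``volume``) are
hypothetical and exist only for this comparison.

```python
import functools

def plain_decorator(f):
    # No metadata is copied: the result looks like a function named "wrapper".
    def wrapper(*args, **kwds):
        return f(*args, **kwds)
    return wrapper

def careful_decorator(f):
    # functools.wraps copies the name, module and docstring from f.
    @functools.wraps(f)
    def wrapper(*args, **kwds):
        return f(*args, **kwds)
    return wrapper

@plain_decorator
def area(w, h):
    "Return the area of a rectangle."
    return w * h

@careful_decorator
def volume(w, h, d):
    "Return the volume of a box."
    return w * h * d
```

Here ``area.__name__`` is ``'wrapper'`` and its docstring is lost, while
``volume`` keeps both its own name and its docstring.
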
.. seealso::

   :pep:`309` - Partial Function Application
      PEP proposed and written by Peter Harris; implemented by Hye-Shik Chang and Nick
      Coghlan, with adaptations by Raymond Hettinger.

.. ======================================================================


.. _pep-314:

PEP 314: Metadata for Python Software Packages v1.1
===================================================

Some simple dependency support was added to Distutils. The :func:`setup`
function now has ``requires``, ``provides``, and ``obsoletes`` keyword
parameters. When you build a source distribution using the ``sdist`` command,
the dependency information will be recorded in the :file:`PKG-INFO` file.

Another new keyword parameter is ``download_url``, which should be set to a URL
for the package's source code. This means it's now possible to look up an entry
in the package index, determine the dependencies for a package, and download the
required packages. ::

   from distutils.core import setup

   VERSION = '1.0'
   setup(name='PyPackage',
         version=VERSION,
         requires=['numarray', 'zlib (>=1.1.4)'],
         obsoletes=['OldPackage'],
         download_url=('http://www.example.com/pypackage/dist/pkg-%s.tar.gz'
                       % VERSION),
        )

Another new enhancement to the Python package index at
http://cheeseshop.python.org is storing source and binary archives for a
package. The new :command:`upload` Distutils command will upload a package to
the repository.

Before a package can be uploaded, you must be able to build a distribution using
the :command:`sdist` Distutils command. Once that works, you can run ``python
setup.py upload`` to add your package to the PyPI archive. Optionally you can
GPG-sign the package by supplying the :option:`--sign` and :option:`--identity`
options.

Package uploading was implemented by Martin von Löwis and Richard Jones.


.. seealso::

   :pep:`314` - Metadata for Python Software Packages v1.1
      PEP proposed and written by A.M. Kuchling, Richard Jones, and Fred Drake;
      implemented by Richard Jones and Fred Drake.

.. ======================================================================


.. _pep-328:

PEP 328: Absolute and Relative Imports
======================================

The simpler part of PEP 328 was implemented in Python 2.4: parentheses could now
be used to enclose the names imported from a module using the ``from ... import
...`` statement, making it easier to import many different names.

The more complicated part has been implemented in Python 2.5: importing a module
can be specified to use absolute or package-relative imports. The plan is to
move toward making absolute imports the default in future versions of Python.

Let's say you have a package directory like this::

   pkg/
   pkg/__init__.py
   pkg/main.py
   pkg/string.py

This defines a package named :mod:`pkg` containing the :mod:`pkg.main` and
:mod:`pkg.string` submodules.

Consider the code in the :file:`main.py` module. What happens if it executes
the statement ``import string``? In Python 2.4 and earlier, the interpreter
first looks in the package's directory to perform a relative import, finds
:file:`pkg/string.py`, imports the contents of that file as the
:mod:`pkg.string` module, and binds that module to the name ``string`` in the
:mod:`pkg.main` module's namespace.

That's fine if :mod:`pkg.string` was what you wanted. But what if you wanted
Python's standard :mod:`string` module? There's no clean way to ignore
:mod:`pkg.string` and look for the standard module; generally you had to look at
the contents of ``sys.modules``, which is slightly unclean. Holger Krekel's
:mod:`py.std` package provides a tidier way to perform imports from the standard
library, ``import py; py.std.string.join()``, but that package isn't available
on all Python installations.

Reading code which relies on relative imports is also less clear, because a
reader may be confused about which module, :mod:`string` or :mod:`pkg.string`,
is intended to be used. Python users soon learned not to duplicate the names of
standard library modules in the names of their packages' submodules, but you
can't protect against having your submodule's name being used for a new module
added in a future version of Python.

In Python 2.5, you can switch :keyword:`import`'s behaviour to absolute imports
using a ``from __future__ import absolute_import`` directive. This
absolute-import behaviour will become the default in a future version (probably
Python 2.7). Once absolute imports are the default, ``import string`` will always
find the standard library's version. It's suggested that users should begin
using absolute imports as much as possible, so it's preferable to begin writing
``from pkg import string`` in your code.

Relative imports are still possible by adding a leading period to the module
name when using the ``from ... import`` form::

   # Import names from pkg.string
   from .string import name1, name2
   # Import pkg.string
   from . import string

This imports the :mod:`string` module relative to the current package, so in
:mod:`pkg.main` this will import *name1* and *name2* from :mod:`pkg.string`.
Additional leading periods perform the relative import starting from the parent
of the current package. For example, code in the :mod:`A.B.C` module can do::

   from . import D                 # Imports A.B.D
   from .. import E                # Imports A.E
   from ..F import G               # Imports A.F.G

Leading periods cannot be used with the ``import modname`` form of the import
statement, only the ``from ... import`` form.


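The difference between the two spellings can be checked with a small
experiment. The sketch below builds a throwaway package in a temporary
directory and imports it. It is written for a current Python 3 interpreter,
where absolute imports are already the default (the behaviour Python 2.5
enables with ``from __future__ import absolute_import``). The package name
``demo_pkg`` and the ``WHO`` marker are hypothetical, chosen only for this
illustration.

```python
import importlib
import os
import sys
import tempfile

# Build demo_pkg/{__init__,main,string}.py in a temporary directory.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)

sources = {
    "__init__.py": "",
    "string.py": "WHO = 'demo_pkg.string'\n",
    "main.py": (
        "import string as absolute_string\n"         # absolute import
        "from . import string as relative_string\n"  # explicit relative import
    ),
}
for name, text in sources.items():
    with open(os.path.join(pkg_dir, name), "w") as f:
        f.write(text)

sys.path.insert(0, root)
importlib.invalidate_caches()
main = importlib.import_module("demo_pkg.main")

absolute_name = main.absolute_string.__name__   # standard library module
relative_name = main.relative_string.__name__   # the package's own submodule
```

With absolute imports in effect, the plain ``import string`` inside
``demo_pkg.main`` finds the standard library module, and only the form with a
leading period reaches the sibling submodule.
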
.. seealso::

   :pep:`328` - Imports: Multi-Line and Absolute/Relative
      PEP written by Aahz; implemented by Thomas Wouters.

   http://codespeak.net/py/current/doc/index.html
      The py library by Holger Krekel, which contains the :mod:`py.std` package.

.. ======================================================================


.. _pep-338:

PEP 338: Executing Modules as Scripts
=====================================

The :option:`-m` switch added in Python 2.4 to execute a module as a script
gained a few more abilities. Instead of being implemented in C code inside the
Python interpreter, the switch now uses an implementation in a new module,
:mod:`runpy`.

The :mod:`runpy` module implements a more sophisticated import mechanism so that
it's now possible to run modules in a package such as :mod:`pychecker.checker`.
The module also supports alternative import mechanisms such as the
:mod:`zipimport` module. This means you can add a .zip archive's path to
``sys.path`` and then use the :option:`-m` switch to execute code from the
archive.


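The same machinery can be driven from Python code through
:func:`runpy.run_module`. The sketch below is an illustration under stated
assumptions, not a description of how the :option:`-m` switch is wired up
internally: it writes a one-module zip archive, puts the archive on
``sys.path``, and runs the module as if it were a script. The archive name
``demo_archive.zip`` and module name ``demo_tool`` are made up for the example.

```python
import os
import runpy
import sys
import tempfile
import zipfile

# Create a zip archive containing a single tiny module.
archive = os.path.join(tempfile.mkdtemp(), "demo_archive.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("demo_tool.py", "RESULT = 6 * 7\nRAN_AS = __name__\n")

# Put the archive itself on sys.path so zipimport can find the module.
sys.path.insert(0, archive)

# run_module() executes the module and returns its resulting globals.
module_globals = runpy.run_module("demo_tool", run_name="__main__")
```

The returned dictionary holds the names the module defined, so
``module_globals["RAN_AS"]`` shows that the code ran under the name
``__main__``, as it would when launched as a script.
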
.. seealso::

   :pep:`338` - Executing modules as scripts
      PEP written and implemented by Nick Coghlan.

.. ======================================================================


.. _pep-341:

PEP 341: Unified try/except/finally
===================================

Until Python 2.5, the :keyword:`try` statement came in two flavours. You could
use a :keyword:`finally` block to ensure that code is always executed, or one or
more :keyword:`except` blocks to catch specific exceptions. You couldn't
combine both :keyword:`except` blocks and a :keyword:`finally` block, because
generating the right bytecode for the combined version was complicated and it
wasn't clear what the semantics of the combined statement should be.

Guido van Rossum spent some time working with Java, which does support the
equivalent of combining :keyword:`except` blocks and a :keyword:`finally` block,
and this clarified what the statement should mean. In Python 2.5, you can now
write::

   try:
       block-1 ...
   except Exception1:
       handler-1 ...
   except Exception2:
       handler-2 ...
   else:
       else-block
   finally:
       final-block

The code in *block-1* is executed. If the code raises an exception, the various
:keyword:`except` blocks are tested: if the exception is of class
:class:`Exception1`, *handler-1* is executed; otherwise if it's of class
:class:`Exception2`, *handler-2* is executed, and so forth. If no exception is
raised, the *else-block* is executed.

No matter what happened previously, the *final-block* is executed once the code
block is complete and any raised exceptions handled. Even if there's an error in
an exception handler or the *else-block* and a new exception is raised, the code
in the *final-block* is still run.


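A concrete version of the template above may make the flow easier to follow.
In this sketch the hypothetical ``run`` function records which clauses executed
for a given action; :exc:`ZeroDivisionError` and :exc:`KeyError` stand in for
*Exception1* and *Exception2*.

```python
def run(action):
    """Return the order in which the clauses of the combined statement ran."""
    trace = []
    try:
        trace.append("block-1")
        action()
    except ZeroDivisionError:
        trace.append("handler-1")
    except KeyError:
        trace.append("handler-2")
    else:
        trace.append("else-block")
    finally:
        # Runs in every case, after the handler or the else-block.
        trace.append("final-block")
    return trace

no_error = run(lambda: None)
division = run(lambda: 1 / 0)
missing_key = run(lambda: {}["missing"])
```

In all three cases the trace ends with ``"final-block"``; only the middle entry
changes depending on whether an exception was raised and which handler
matched it.
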
.. seealso::

   :pep:`341` - Unifying try-except and try-finally
      PEP written by Georg Brandl; implementation by Thomas Lee.

.. ======================================================================


.. _pep-342:

PEP 342: New Generator Features
===============================

Python 2.5 adds a simple way to pass values *into* a generator. As introduced in
Python 2.3, generators only produced output; once a generator's code was invoked
to create an iterator, there was no way to pass any new information into the
function when its execution was resumed. Sometimes the ability to pass in some
information would be useful. Hackish solutions to this include making the
generator's code look at a global variable and then changing the global
variable's value, or passing in some mutable object that callers then modify.

To refresh your memory of basic generators, here's a simple example::

   def counter(maximum):
       i = 0
       while i < maximum:
           yield i
           i += 1

When you call ``counter(10)``, the result is an iterator that returns the values
from 0 up to 9. On encountering the :keyword:`yield` statement, the iterator
returns the provided value and suspends the function's execution, preserving the
local variables. Execution resumes on the following call to the iterator's
:meth:`next` method, picking up after the :keyword:`yield` statement.

In Python 2.3, :keyword:`yield` was a statement; it didn't return any value. In
2.5, :keyword:`yield` is now an expression, returning a value that can be
assigned to a variable or otherwise operated on::

   val = (yield i)

I recommend that you always put parentheses around a :keyword:`yield` expression
when you're doing something with the returned value, as in the above example.
The parentheses aren't always necessary, but it's easier to always add them
instead of having to remember when they're needed.

(:pep:`342` explains the exact rules, which are that a :keyword:`yield`\
-expression must always be parenthesized except when it occurs at the top-level
expression on the right-hand side of an assignment. This means you can write
``val = yield i`` but have to use parentheses when there's an operation, as in
``val = (yield i) + 12``.)

Values are sent into a generator by calling its ``send(value)`` method. The
generator's code is then resumed and the :keyword:`yield` expression returns the
specified *value*. If the regular :meth:`next` method is called, the
:keyword:`yield` returns :const:`None`.

Here's the previous example, modified to allow changing the value of the
internal counter. ::

   def counter(maximum):
       i = 0
       while i < maximum:
           val = (yield i)
           # If value provided, change counter
           if val is not None:
               i = val
           else:
               i += 1

And here's an example of changing the counter::

   >>> it = counter(10)
   >>> print it.next()
   0
   >>> print it.next()
   1
   >>> print it.send(8)
   8
   >>> print it.next()
   9
   >>> print it.next()
   Traceback (most recent call last):
     File "t.py", line 15, in ?
       print it.next()
   StopIteration

:keyword:`yield` will usually return :const:`None`, so you should always check
for this case. Don't just use its value in expressions unless you're sure that
the :meth:`send` method will be the only method used to resume your generator
function.

In addition to :meth:`send`, there are two other new methods on generators:

* ``throw(type, value=None, traceback=None)`` is used to raise an exception
  inside the generator; the exception is raised by the :keyword:`yield` expression
  where the generator's execution is paused.

* :meth:`close` raises a new :exc:`GeneratorExit` exception inside the generator
  to terminate the iteration. On receiving this exception, the generator's code
  must either raise :exc:`GeneratorExit` or :exc:`StopIteration`. Catching the
  :exc:`GeneratorExit` exception and yielding another value is illegal and will
  trigger a :exc:`RuntimeError`; if the function raises some other exception, that
  exception is propagated to the caller. :meth:`close` will also be called by
  Python's garbage collector when the generator is garbage-collected.

  If you need to run cleanup code when a :exc:`GeneratorExit` occurs, I suggest
  using a ``try: ... finally:`` suite instead of catching :exc:`GeneratorExit`.

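The two methods can be exercised with a short sketch. It is written for a
current Python 3 interpreter, so it calls the built-in ``next(gen)`` where
Python 2.5 code would call ``gen.next()``. The ``reader`` generator and its
``log`` list are hypothetical names used only for this illustration.

```python
def reader(log):
    """Yield increasing numbers, recording in `log` what happens to us."""
    n = 0
    try:
        while True:
            try:
                yield n
                n += 1
            except ValueError:
                # throw() raises the exception at the paused yield above.
                log.append("reset")
                n = 0
    finally:
        # close() raises GeneratorExit at the paused yield; it is not
        # caught above, so the finally clause runs on the way out.
        log.append("cleanup")

log = []
gen = reader(log)
first = next(gen)
second = next(gen)
after_throw = gen.throw(ValueError("start over"))
gen.close()
```

The call to ``throw()`` returns the next value the generator yields after
handling the exception, and ``close()`` returns normally because the generator
lets :exc:`GeneratorExit` propagate after its cleanup code has run.
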
The cumulative effect of these changes is to turn generators from one-way
producers of information into both producers and consumers.

Generators also become *coroutines*, a more generalized form of subroutines.
Subroutines are entered at one point and exited at another point (the top of the
function, and a :keyword:`return` statement), but coroutines can be entered,
exited, and resumed at many different points (the :keyword:`yield` statements).
We'll have to figure out patterns for using coroutines effectively in Python.

The addition of the :meth:`close` method has one side effect that isn't obvious.
:meth:`close` is called when a generator is garbage-collected, so this means the
generator's code gets one last chance to run before the generator is destroyed.
This last chance means that ``try...finally`` statements in generators can now
be guaranteed to work; the :keyword:`finally` clause will now always get a
chance to run. The syntactic restriction that you couldn't mix :keyword:`yield`
statements with a ``try...finally`` suite has therefore been removed. This
seems like a minor bit of language trivia, but using generators and
``try...finally`` is actually necessary in order to implement the
:keyword:`with` statement described by PEP 343. I'll look at this new statement
in the following section.

Another even more esoteric effect of this change: previously, the
:attr:`gi_frame` attribute of a generator was always a frame object. It's now
possible for :attr:`gi_frame` to be ``None`` once the generator has been
exhausted.


.. seealso::

   :pep:`342` - Coroutines via Enhanced Generators
      PEP written by Guido van Rossum and Phillip J. Eby; implemented by Phillip J.
      Eby. Includes examples of some fancier uses of generators as coroutines.

      Earlier versions of these features were proposed in :pep:`288` by Raymond
      Hettinger and :pep:`325` by Samuele Pedroni.

   http://en.wikipedia.org/wiki/Coroutine
      The Wikipedia entry for coroutines.

   http://www.sidhe.org/~dan/blog/archives/000178.html
      An explanation of coroutines from a Perl point of view, written by Dan Sugalski.

.. ======================================================================


.. _pep-343:

PEP 343: The 'with' statement
=============================

The ':keyword:`with`' statement clarifies code that previously would use
``try...finally`` blocks to ensure that clean-up code is executed. In this
section, I'll discuss the statement as it will commonly be used. In the next
section, I'll examine the implementation details and show how to write objects
for use with this statement.

The ':keyword:`with`' statement is a new control-flow structure whose basic
structure is::

   with expression [as variable]:
       with-block

The expression is evaluated, and it should result in an object that supports the
context management protocol (that is, has :meth:`__enter__` and :meth:`__exit__`
methods).

The object's :meth:`__enter__` is called before *with-block* is executed and
therefore can run set-up code. It also may return a value that is bound to the
name *variable*, if given. (Note carefully that *variable* is *not* assigned
the result of *expression*.)

After execution of the *with-block* is finished, the object's :meth:`__exit__`
method is called, even if the block raised an exception, and can therefore run
clean-up code.

To enable the statement in Python 2.5, you need to add the following directive
to your module::

   from __future__ import with_statement

The statement will always be enabled in Python 2.6.

Some standard Python objects now support the context management protocol and can
be used with the ':keyword:`with`' statement. File objects are one example::

   with open('/etc/passwd', 'r') as f:
       for line in f:
           print line
           ... more processing code ...

After this statement has executed, the file object in *f* will have been
automatically closed, even if the :keyword:`for` loop raised an exception
part-way through the block.

.. note::

   In this case, *f* is the same object created by :func:`open`, because
   :meth:`file.__enter__` returns *self*.

The :mod:`threading` module's locks and condition variables also support the
':keyword:`with`' statement::

   lock = threading.Lock()
   with lock:
       # Critical section of code
       ...

The lock is acquired before the block is executed and always released once the
block is complete.

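The "always released" part can be checked with a small sketch that raises an
exception inside the critical section. The exception and variable names are
made up for the illustration. On Python 2.5 itself, the
``from __future__ import with_statement`` directive shown earlier would also be
needed at the top of the module.

```python
import threading

lock = threading.Lock()

try:
    with lock:
        # The lock is held while the block runs.
        held_inside = lock.locked()
        raise RuntimeError("simulated failure in the critical section")
except RuntimeError:
    pass

# The exception propagated, but the lock was released on the way out.
held_after = lock.locked()
```
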
The new :func:`localcontext` function in the :mod:`decimal` module makes it easy
to save and restore the current decimal context, which encapsulates the desired
precision and rounding characteristics for computations::

   from decimal import Decimal, Context, localcontext

   # Displays with default precision of 28 digits
   v = Decimal('578')
   print v.sqrt()

   with localcontext(Context(prec=16)):
       # All code in this block uses a precision of 16 digits.
       # The original context is restored on exiting the block.
       print v.sqrt()


.. _new-25-context-managers:

Writing Context Managers
------------------------

Under the hood, the ':keyword:`with`' statement is fairly complicated. Most
people will only use ':keyword:`with`' in company with existing objects and
don't need to know these details, so you can skip the rest of this section if
you like. Authors of new objects will need to understand the details of the
underlying implementation and should keep reading.

A high-level explanation of the context management protocol is:

* The expression is evaluated and should result in an object called a "context
  manager". The context manager must have :meth:`__enter__` and :meth:`__exit__`
  methods.

* The context manager's :meth:`__enter__` method is called. The value returned
  is assigned to *VAR*. If no ``'as VAR'`` clause is present, the value is simply
  discarded.

* The code in *BLOCK* is executed.

* If *BLOCK* raises an exception, the ``__exit__(type, value, traceback)``
  is called with the exception details, the same values returned by
  :func:`sys.exc_info`. The method's return value controls whether the exception
  is re-raised: any false value re-raises the exception, and ``True`` will result
  in suppressing it. You'll only rarely want to suppress the exception, because
  if you do the author of the code containing the ':keyword:`with`' statement will
  never realize anything went wrong.

* If *BLOCK* didn't raise an exception, the :meth:`__exit__` method is still
  called, but *type*, *value*, and *traceback* are all ``None``.

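The steps above can be traced with a tiny context manager that does nothing
except record how it was called. The ``Tracker`` class and its ``suppress``
flag are hypothetical, written only to make the protocol visible; the sketch
uses Python 3 syntax but relies only on the behaviour described in the list.

```python
class Tracker:
    """Hypothetical context manager that records the protocol calls."""

    def __init__(self, suppress):
        self.suppress = suppress
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return "value bound by 'as'"

    def __exit__(self, exc_type, exc_value, traceback):
        # exc_type is None when the block finished without an exception.
        self.events.append(("exit", exc_type))
        return self.suppress   # a true value suppresses the exception

# A true return value from __exit__ swallows the exception.
quiet = Tracker(suppress=True)
with quiet as bound:
    raise KeyError("swallowed")

# No exception: __exit__ is still called, with None arguments.
clean = Tracker(suppress=False)
with clean:
    pass

# A false return value lets the exception propagate to the caller.
loud = Tracker(suppress=False)
try:
    with loud:
        raise KeyError("re-raised")
except KeyError:
    propagated = True
else:
    propagated = False
```

Note that ``bound`` receives the return value of :meth:`__enter__`, not the
``Tracker`` object itself.
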
Let's think through an example. I won't present detailed code but will only
sketch the methods necessary for a database that supports transactions.

(For people unfamiliar with database terminology: a set of changes to the
database are grouped into a transaction. Transactions can be either committed,
meaning that all the changes are written into the database, or rolled back,
meaning that the changes are all discarded and the database is unchanged. See
any database textbook for more information.)

Let's assume there's an object representing a database connection. Our goal will
be to let the user write code like this::

   db_connection = DatabaseConnection()
   with db_connection as cursor:
       cursor.execute('insert into ...')
       cursor.execute('delete from ...')
       # ... more operations ...

The transaction should be committed if the code in the block runs flawlessly or
rolled back if there's an exception. Here's the basic interface for
:class:`DatabaseConnection` that I'll assume::

   class DatabaseConnection:
       # Database interface
       def cursor(self):
           "Returns a cursor object and starts a new transaction"
       def commit(self):
           "Commits current transaction"
       def rollback(self):
           "Rolls back current transaction"

The :meth:`__enter__` method is pretty easy, having only to start a new
transaction. For this application the resulting cursor object would be a useful
result, so the method will return it. The user can then add ``as cursor`` to
their ':keyword:`with`' statement to bind the cursor to a variable name. ::

   class DatabaseConnection:
       ...
       def __enter__(self):
           # Code to start a new transaction
           cursor = self.cursor()
           return cursor

The :meth:`__exit__` method is the most complicated because it's where most of
the work has to be done. The method has to check if an exception occurred. If
there was no exception, the transaction is committed. The transaction is rolled
back if there was an exception.

In the code below, execution will just fall off the end of the function,
returning the default value of ``None``. ``None`` is false, so the exception
will be re-raised automatically. If you wished, you could be more explicit and
add a :keyword:`return` statement at the marked location. ::

   class DatabaseConnection:
       ...
       def __exit__(self, type, value, tb):
           if tb is None:
               # No exception, so commit
               self.commit()
           else:
               # Exception occurred, so rollback.
               self.rollback()
               # return False

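To see the pieces working together, here is a minimal end-to-end sketch. It is
not a real database driver: this stand-in ``DatabaseConnection`` simply appends
to a ``log`` list (an attribute assumed for the illustration) and acts as its
own cursor, so the commit-or-rollback decision can be observed directly.

```python
class DatabaseConnection(object):
    """Stand-in connection that only logs what a real one would do."""

    def __init__(self):
        self.log = []

    # Database interface (stubbed out for the illustration)
    def cursor(self):
        self.log.append('begin')
        return self                  # the stub acts as its own cursor

    def execute(self, statement):
        self.log.append(statement)

    def commit(self):
        self.log.append('commit')

    def rollback(self):
        self.log.append('rollback')

    # Context management protocol
    def __enter__(self):
        return self.cursor()

    def __exit__(self, type, value, tb):
        if tb is None:
            self.commit()            # block finished normally
        else:
            self.rollback()          # block raised an exception
        return False                 # never suppress the exception


ok = DatabaseConnection()
with ok as cursor:
    cursor.execute('insert into ...')

failed = DatabaseConnection()
try:
    with failed as cursor:
        cursor.execute('delete from ...')
        raise RuntimeError('simulated failure')
except RuntimeError:
    pass
```

After this runs, ``ok.log`` ends with ``'commit'`` and ``failed.log`` ends with
``'rollback'``, and the :exc:`RuntimeError` still reaches the caller.
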

.. _contextlibmod:

The contextlib module
---------------------

The new :mod:`contextlib` module provides some functions and a decorator that
are useful for writing objects for use with the ':keyword:`with`' statement.

The decorator is called :func:`contextmanager`, and lets you write a single
generator function instead of defining a new class. The generator should yield
exactly one value. The code up to the :keyword:`yield` will be executed as the
:meth:`__enter__` method, and the value yielded will be the method's return
value that will get bound to the variable in the ':keyword:`with`' statement's
:keyword:`as` clause, if any. The code after the :keyword:`yield` will be
executed in the :meth:`__exit__` method. Any exception raised in the block will
be raised by the :keyword:`yield` statement.

Our database example from the previous section could be written using this
decorator as::

   from contextlib import contextmanager

   @contextmanager
   def db_transaction(connection):
       cursor = connection.cursor()
       try:
           yield cursor
       except:
           connection.rollback()
           raise
       else:
           connection.commit()

   db = DatabaseConnection()
   with db_transaction(db) as cursor:
       ...

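The generator version can be exercised the same way. In the sketch below,
``StubConnection`` is a hypothetical stand-in that only logs calls; the
``db_transaction`` generator is the one shown above, unchanged apart from
spacing.

```python
from contextlib import contextmanager


class StubConnection(object):
    """Hypothetical connection that records calls instead of doing work."""

    def __init__(self):
        self.log = []

    def cursor(self):
        self.log.append('begin')
        return self

    def commit(self):
        self.log.append('commit')

    def rollback(self):
        self.log.append('rollback')


@contextmanager
def db_transaction(connection):
    cursor = connection.cursor()
    try:
        yield cursor                 # the with-block runs here
    except:
        connection.rollback()        # an exception surfaced at the yield
        raise
    else:
        connection.commit()


good = StubConnection()
with db_transaction(good):
    pass

bad = StubConnection()
try:
    with db_transaction(bad):
        raise ValueError('simulated failure')
except ValueError:
    pass
```

The :exc:`ValueError` raised inside the block reappears at the
:keyword:`yield`, which is why an ordinary ``try``/``except`` around the
:keyword:`yield` is enough to choose between committing and rolling back.
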
The :mod:`contextlib` module also has a ``nested(mgr1, mgr2, ...)`` function
that combines a number of context managers so you don't need to write nested
':keyword:`with`' statements. In this example, the single ':keyword:`with`'
statement both starts a database transaction and acquires a thread lock::

   import threading
   from contextlib import nested

   lock = threading.Lock()
   with nested(db_transaction(db), lock) as (cursor, locked):
       ...

Finally, the ``closing(object)`` function returns *object* so that it can be
bound to a variable, and calls ``object.close()`` at the end of the block. ::

   import urllib, sys
   from contextlib import closing

   with closing(urllib.urlopen('http://www.yahoo.com')) as f:
       for line in f:
           sys.stdout.write(line)

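The same behaviour can be checked without touching the network. This sketch
assumes a made-up ``Resource`` class whose only feature is a :meth:`close`
method; it has no :meth:`__enter__` or :meth:`__exit__` of its own.

```python
from contextlib import closing


class Resource(object):
    """Hypothetical object with close() but no 'with' support of its own."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


with closing(Resource()) as res:
    # Inside the block the object is still open.
    open_inside_block = not res.closed

# By this point closing() has called res.close().
```
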

.. seealso::

   :pep:`343` - The "with" statement
      PEP written by Guido van Rossum and Nick Coghlan; implemented by Mike Bland,
      Guido van Rossum, and Neal Norwitz. The PEP shows the code generated for a
      ':keyword:`with`' statement, which can be helpful in learning how the statement
      works.

   The documentation for the :mod:`contextlib` module.

.. ======================================================================


.. _pep-352:

PEP 352: Exceptions as New-Style Classes
========================================

Exception classes can now be new-style classes, not just classic classes, and
the built-in :exc:`Exception` class and all the standard built-in exceptions
(:exc:`NameError`, :exc:`ValueError`, etc.) are now new-style classes.

The inheritance hierarchy for exceptions has been rearranged a bit. In 2.5, the
inheritance relationships are::

   BaseException       # New in Python 2.5
   |- KeyboardInterrupt
   |- SystemExit
   |- Exception
      |- (all other current built-in exceptions)

This rearrangement was done because people often want to catch all exceptions
that indicate program errors. :exc:`KeyboardInterrupt` and :exc:`SystemExit`
aren't errors, though, and usually represent an explicit action such as the user
hitting Control-C or code calling :func:`sys.exit`. A bare ``except:`` will
catch all exceptions, so you commonly need to list :exc:`KeyboardInterrupt` and
:exc:`SystemExit` in order to re-raise them. The usual pattern is::

   try:
       ...
   except (KeyboardInterrupt, SystemExit):
       raise
   except:
       # Log error...
       # Continue running program...

In Python 2.5, you can now write ``except Exception`` to achieve the same
result, catching all the exceptions that usually indicate errors but leaving
:exc:`KeyboardInterrupt` and :exc:`SystemExit` alone. As in previous versions,
a bare ``except:`` still catches all exceptions.

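A short sketch makes the difference visible. It raises three built-in
exceptions in turn and records which handler ends up catching each one; the
``caught`` list and its labels are just names chosen for this illustration.

```python
caught = []

for exc_class in (ValueError, KeyboardInterrupt, SystemExit):
    try:
        try:
            raise exc_class()
        except Exception:
            # Only exceptions that derive from Exception stop here.
            caught.append((exc_class.__name__, 'except Exception'))
    except BaseException:
        # KeyboardInterrupt and SystemExit pass through to this handler.
        caught.append((exc_class.__name__, 'outer handler'))
```

Only :exc:`ValueError` is stopped by ``except Exception``; the other two travel
on to the outer handler, which stands in for whatever code would normally let
the program exit.
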
The goal for Python 3.0 is to require any class raised as an exception to derive
from :exc:`BaseException` or some descendant of :exc:`BaseException`, and future
releases in the Python 2.x series may begin to enforce this constraint.
Therefore, I suggest you begin making all your exception classes derive from
:exc:`Exception` now. It's been suggested that the bare ``except:`` form should
be removed in Python 3.0, but Guido van Rossum hasn't decided whether to do this
or not.

Raising of strings as exceptions, as in the statement ``raise "Error
occurred"``, is deprecated in Python 2.5 and will trigger a warning. The aim is
to be able to remove the string-exception feature in a few releases.


.. seealso::

   :pep:`352` - Required Superclass for Exceptions
      PEP written by Brett Cannon and Guido van Rossum; implemented by Brett Cannon.

.. ======================================================================


.. _pep-353:

PEP 353: Using ssize_t as the index type
========================================

A wide-ranging change to Python's C API, using a new :c:type:`Py_ssize_t` type
definition instead of :c:type:`int`, will permit the interpreter to handle more
data on 64-bit platforms. This change doesn't affect Python's capacity on 32-bit
platforms.

Various pieces of the Python interpreter used C's :c:type:`int` type to store
sizes or counts; for example, the number of items in a list or tuple was stored
in an :c:type:`int`. The C compilers for most 64-bit platforms still define
:c:type:`int` as a 32-bit type, so that meant that lists could only hold up to
``2**31 - 1`` = 2147483647 items. (There are actually a few different
programming models that 64-bit C compilers can use -- see
http://www.unix.org/version2/whatsnew/lp64_wp.html for a discussion -- but the
most commonly available model leaves :c:type:`int` as 32 bits.)

A limit of 2147483647 items doesn't really matter on a 32-bit platform because
you'll run out of memory before hitting the length limit. Each list item
requires space for a pointer, which is 4 bytes, plus space for a
:c:type:`PyObject` representing the item. 2147483647\*4 is already more bytes
than a 32-bit address space can contain.

It's possible to address that much memory on a 64-bit platform, however. The
pointers for a list that size would only require 16 GiB of space, so it's not
unreasonable that Python programmers might construct lists that large.
Therefore, the Python interpreter had to be changed to use some type other than
:c:type:`int`, and this will be a 64-bit type on 64-bit platforms. The change
will cause incompatibilities on 64-bit machines, so it was deemed worth making
the transition now, while the number of 64-bit users is still relatively small.
(In 5 or 10 years, we may *all* be on 64-bit machines, and the transition would
be more painful then.)

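The arithmetic behind those two paragraphs can be checked in a few lines. This
is only a worked calculation; the 8-byte pointer size for 64-bit platforms is
an assumption (it is the usual size, but the text above doesn't state it), and
the variable names are invented for the example.

```python
max_int_items = 2**31 - 1            # largest count a 32-bit signed int holds

# 32-bit platform: 4-byte pointers and 2**32 bytes of address space.
pointer_bytes_32 = max_int_items * 4
fits_in_32_bit_space = pointer_bytes_32 < 2**32

# 64-bit platform, assuming 8-byte pointers: space needed for 2**31 pointers,
# expressed in GiB (2**30 bytes).
pointer_gib_64 = (max_int_items + 1) * 8 // 2**30
```

Here ``pointer_bytes_32`` comes to 8589934588, about twice the 4294967296 bytes
a 32-bit address space can hold, while ``pointer_gib_64`` works out to 16.
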
This change most strongly affects authors of C extension modules. Python
strings and container types such as lists and tuples now use
:c:type:`Py_ssize_t` to store their size. Functions such as
:c:func:`PyList_Size` now return :c:type:`Py_ssize_t`. Code in extension modules
may therefore need to have some variables changed to :c:type:`Py_ssize_t`.

The :c:func:`PyArg_ParseTuple` and :c:func:`Py_BuildValue` functions have a new
conversion code, ``n``, for :c:type:`Py_ssize_t`. :c:func:`PyArg_ParseTuple`'s
``s#`` and ``t#`` still output :c:type:`int` by default, but you can define the
macro :c:macro:`PY_SSIZE_T_CLEAN` before including :file:`Python.h` to make
them return :c:type:`Py_ssize_t`.

:pep:`353` has a section on conversion guidelines that extension authors should
read to learn about supporting 64-bit platforms.


.. seealso::

   :pep:`353` - Using ssize_t as the index type
      PEP written and implemented by Martin von Löwis.

.. ======================================================================


.. _pep-357:

PEP 357: The '__index__' method
===============================

The NumPy developers had a problem that could only be solved by adding a new
special method, :meth:`__index__`. When using slice notation, as in
``[start:stop:step]``, the values of the *start*, *stop*, and *step* indexes
must all be either integers or long integers. NumPy defines a variety of
specialized integer types corresponding to unsigned and signed integers of 8,
16, 32, and 64 bits, but there was no way to signal that these types could be
used as slice indexes.

Slicing can't just use the existing :meth:`__int__` method because that method
is also used to implement coercion to integers. If slicing used
:meth:`__int__`, floating-point numbers would also become legal slice indexes
and that's clearly an undesirable behaviour.

Instead, a new special method called :meth:`__index__` was added. It takes no
arguments and returns an integer giving the slice index to use. For example::

   class C:
       def __index__(self):
           return self.value

The return value must be either a Python integer or long integer. The
interpreter will check that the type returned is correct, and raises a
:exc:`TypeError` if this requirement isn't met.

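Here is a slightly fuller sketch of the protocol in use. The ``Offset`` class
is hypothetical, standing in for an integer-like type such as a NumPy scalar;
:func:`operator.index`, also new in 2.5, calls :meth:`__index__` from Python
code.

```python
import operator


class Offset(object):
    """Hypothetical integer-like type that can be used as an index."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value


letters = ['a', 'b', 'c', 'd', 'e']

middle = letters[Offset(1):Offset(4)]    # slice bounds go through __index__
third = letters[Offset(2)]               # so does plain indexing
as_int = operator.index(Offset(3))

# Floats have __int__ but not __index__, so they are still rejected.
try:
    letters[1.0:4.0]
    float_slice_allowed = True
except TypeError:
    float_slice_allowed = False
```
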
A corresponding :attr:`nb_index` slot was added to the C-level
:c:type:`PyNumberMethods` structure to let C extensions implement this protocol.
``PyNumber_Index(obj)`` can be used in extension code to call the
:meth:`__index__` function and retrieve its result.


.. seealso::

   :pep:`357` - Allowing Any Object to be Used for Slicing
      PEP written and implemented by Travis Oliphant.

.. ======================================================================


.. _other-lang:

Other Language Changes
======================

Here are all of the changes that Python 2.5 makes to the core Python language.

* The :class:`dict` type has a new hook for letting subclasses provide a default
  value when a key isn't contained in the dictionary. When a key isn't found, the
  dictionary's ``__missing__(key)`` method will be called. This hook is used
  to implement the new :class:`defaultdict` class in the :mod:`collections`
  module. The following example defines a dictionary that returns zero for any
  missing key::

     class zerodict(dict):
         def __missing__(self, key):
             return 0

     d = zerodict({1:1, 2:2})
     print d[1], d[2]   # Prints "1 2"
     print d[3], d[4]   # Prints "0 0"

* Both 8-bit and Unicode strings have new ``partition(sep)`` and
  ``rpartition(sep)`` methods that simplify a common use case.

  The ``find(S)`` method is often used to get an index which is then used to
  slice the string and obtain the pieces that are before and after the separator.
  ``partition(sep)`` condenses this pattern into a single method call that
  returns a 3-tuple containing the substring before the separator, the separator
  itself, and the substring after the separator. If the separator isn't found,
  the first element of the tuple is the entire string and the other two elements
  are empty. ``rpartition(sep)`` also returns a 3-tuple but starts searching
  from the end of the string; the ``r`` stands for 'reverse'.

  Some examples::

     >>> ('http://www.python.org').partition('://')
     ('http', '://', 'www.python.org')
     >>> ('file:/usr/share/doc/index.html').partition('://')
     ('file:/usr/share/doc/index.html', '', '')
     >>> (u'Subject: a quick question').partition(':')
     (u'Subject', u':', u' a quick question')
     >>> 'www.python.org'.rpartition('.')
     ('www.python', '.', 'org')
     >>> 'www.python.org'.rpartition(':')
     ('', '', 'www.python.org')

  (Implemented by Fredrik Lundh following a suggestion by Raymond Hettinger.)

* The :meth:`startswith` and :meth:`endswith` methods of string types now accept
  tuples of strings to check for. ::

     def is_image_file(filename):
         return filename.endswith(('.gif', '.jpg', '.tiff'))

  (Implemented by Georg Brandl following a suggestion by Tom Lynn.)

  .. RFE #1491485

* The :func:`min` and :func:`max` built-in functions gained a ``key`` keyword
  parameter analogous to the ``key`` argument for :meth:`sort`. This parameter
  supplies a function that takes a single argument and is called for every value
  in the list; :func:`min`/:func:`max` will return the element with the
  smallest/largest return value from this function. For example, to find the
  longest string in a list, you can do::

     L = ['medium', 'longest', 'short']
     # Prints 'longest'
     print max(L, key=len)
     # Prints 'short', because lexicographically 'short' has the largest value
     print max(L)

  (Contributed by Steven Bethard and Raymond Hettinger.)

* Two new built-in functions, :func:`any` and :func:`all`, evaluate whether an
  iterator contains any true or false values. :func:`any` returns :const:`True`
  if any value returned by the iterator is true; otherwise it will return
  :const:`False`. :func:`all` returns :const:`True` only if all of the values
  returned by the iterator evaluate as true. (Suggested by Guido van Rossum, and
  implemented by Raymond Hettinger.)

* The result of a class's :meth:`__hash__` method can now be either a long
  integer or a regular integer. If a long integer is returned, the hash of that
  value is taken. In earlier versions the hash value was required to be a
  regular integer, but in 2.5 the :func:`id` built-in was changed to always
  return non-negative numbers, and users often seem to use ``id(self)`` in
  :meth:`__hash__` methods (though this is discouraged).

  .. Bug #1536021

* ASCII is now the default encoding for modules. It's now a syntax error if a
  module contains string literals with 8-bit characters but doesn't have an
  encoding declaration. In Python 2.4 this triggered a warning, not a syntax
  error. See :pep:`263` for how to declare a module's encoding; for example, you
  might add a line like this near the top of the source file::

     # -*- coding: latin1 -*-

* A new warning, :class:`UnicodeWarning`, is triggered when you attempt to
  compare a Unicode string and an 8-bit string that can't be converted to Unicode
  using the default ASCII encoding. The result of the comparison is false::

     >>> chr(128) == unichr(128)   # Can't convert chr(128) to Unicode
     __main__:1: UnicodeWarning: Unicode equal comparison failed
       to convert both arguments to Unicode - interpreting them
       as being unequal
     False
     >>> chr(127) == unichr(127)   # chr(127) can be converted
     True

  Previously such a comparison raised a :class:`UnicodeDecodeError` exception,
  but in 2.5 that behaviour could result in puzzling problems when accessing a
  dictionary. If you looked up ``unichr(128)`` and ``chr(128)`` was being used
  as a key, you'd get a :class:`UnicodeDecodeError` exception. Other changes in
  2.5 resulted in this exception being raised instead of suppressed by the code
  in :file:`dictobject.c` that implements dictionaries.

  Raising an exception for such a comparison is strictly correct, but the change
  might have broken code, so instead :class:`UnicodeWarning` was introduced.

  (Implemented by Marc-André Lemburg.)

* One error that Python programmers sometimes make is forgetting to include an
  :file:`__init__.py` module in a package directory. Debugging this mistake can be
  confusing, and usually requires running Python with the :option:`-v` switch to
  log all the paths searched. In Python 2.5, a new :exc:`ImportWarning` warning is
  triggered when an import would have picked up a directory as a package but no
  :file:`__init__.py` was found. This warning is silently ignored by default;
  provide the :option:`-Wd` option when running the Python executable to display
  the warning message. (Implemented by Thomas Wouters.)

* The list of base classes in a class definition can now be empty. As an
  example, this is now legal::

     class C():
         pass

  (Implemented by Brett Cannon.)

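Two of the additions in the list above, :meth:`partition` and the
:func:`any`/:func:`all` built-ins, combine naturally. The sketch below compares
:meth:`partition` with the older find-and-slice idiom it replaces;
``split_once`` is a hypothetical helper written only for this comparison, and
the sample strings are borrowed from the examples above.

```python
def split_once(text, sep):
    """find()-and-slice equivalent of text.partition(sep), for comparison."""
    i = text.find(sep)
    if i == -1:
        return (text, '', '')
    return (text[:i], sep, text[i + len(sep):])


samples = ['http://www.python.org', 'file:/usr/share/doc/index.html']

# Both approaches give identical 3-tuples for every sample.
same = all(split_once(s, '://') == s.partition('://') for s in samples)

# The middle element is empty when the separator is missing, so it doubles
# as a "was it found?" flag.
has_scheme = any(s.partition('://')[1] for s in samples)
every_scheme = all(s.partition('://')[1] for s in samples)
```

Only the first sample contains ``'://'``, so ``has_scheme`` is true while
``every_scheme`` is false.
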
.. ======================================================================


.. _25interactive:

Interactive Interpreter Changes
-------------------------------

In the interactive interpreter, ``quit`` and ``exit`` have long been strings so
that new users get a somewhat helpful message when they try to quit::

   >>> quit
   'Use Ctrl-D (i.e. EOF) to exit.'

In Python 2.5, ``quit`` and ``exit`` are now objects that still produce string
representations of themselves, but are also callable. Newbies who try ``quit()``
or ``exit()`` will now exit the interpreter as they expect. (Implemented by
Georg Brandl.)

The Python executable now accepts the standard long options :option:`--help`
and :option:`--version`; on Windows, it also accepts the :option:`/?` option
for displaying a help message. (Implemented by Georg Brandl.)

.. ======================================================================


.. _opts:

Optimizations
-------------

Several of the optimizations were developed at the NeedForSpeed sprint, an event
held in Reykjavik, Iceland, from May 21--28 2006. The sprint focused on speed
enhancements to the CPython implementation and was funded by EWT LLC with local
support from CCP Games. Those optimizations added at this sprint are specially
marked in the following list.

* When they were introduced in Python 2.4, the built-in :class:`set` and
  :class:`frozenset` types were built on top of Python's dictionary type. In 2.5
  the internal data structure has been customized for implementing sets, and as a
  result sets will use a third less memory and are somewhat faster. (Implemented
  by Raymond Hettinger.)

* The speed of some Unicode operations, such as finding substrings, string
  splitting, and character map encoding and decoding, has been improved.
  (Substring search and splitting improvements were added by Fredrik Lundh and
  Andrew Dalke at the NeedForSpeed sprint. Character maps were improved by Walter
  Dörwald and Martin von Löwis.)

  .. Patch 1313939, 1359618

* The ``long(str, base)`` function is now faster on long digit strings
  because fewer intermediate results are calculated. The peak is for strings of
  around 800--1000 digits where the function is 6 times faster. (Contributed by
  Alan McIntyre and committed at the NeedForSpeed sprint.)

  .. Patch 1442927

* It's now illegal to mix iterating over a file with ``for line in file`` and
  calling the file object's :meth:`read`/:meth:`readline`/:meth:`readlines`
  methods. Iteration uses an internal buffer and the :meth:`read\*` methods
  don't use that buffer. Instead they would return the data following the
  buffer, causing the data to appear out of order. Mixing iteration and these
  methods will now trigger a :exc:`ValueError` from the :meth:`read\*` method.
  (Implemented by Thomas Wouters.)

  .. Patch 1397960

* The :mod:`struct` module now compiles structure format strings into an
  internal representation and caches this representation, yielding a 20% speedup.
  (Contributed by Bob Ippolito at the NeedForSpeed sprint.)

* The :mod:`re` module got a 1 or 2% speedup by switching to Python's allocator
  functions instead of the system's :c:func:`malloc` and :c:func:`free`.
  (Contributed by Jack Diederich at the NeedForSpeed sprint.)

* The code generator's peephole optimizer now performs simple constant folding
  in expressions. If you write something like ``a = 2+3``, the code generator
  will do the arithmetic and produce code corresponding to ``a = 5``. (Proposed
  and implemented by Raymond Hettinger.)

* Function calls are now faster because code objects now keep the most recently
  finished frame (a "zombie frame") in an internal field of the code object,
  reusing it the next time the code object is invoked. (Original patch by Michael
  Hudson, modified by Armin Rigo and Richard Jones; committed at the NeedForSpeed
  sprint.) Frame objects are also slightly smaller, which may improve cache
  locality and reduce memory usage a bit. (Contributed by Neal Norwitz.)

  .. Patch 876206
  .. Patch 1337051

* Python's built-in exceptions are now new-style classes, a change that speeds
  up instantiation considerably. Exception handling in Python 2.5 is therefore
  about 30% faster than in 2.4. (Contributed by Richard Jones, Georg Brandl and
  Sean Reifschneider at the NeedForSpeed sprint.)

* Importing now caches the paths tried, recording whether they exist or not so
  that the interpreter makes fewer :c:func:`open` and :c:func:`stat` calls on
  startup. (Contributed by Martin von Löwis and Georg Brandl.)

  .. Patch 921466

.. ======================================================================


.. _25modules:

New, Improved, and Removed Modules
==================================

The standard library received many enhancements and bug fixes in Python 2.5.
Here's a partial list of the most notable changes, sorted alphabetically by
module name. Consult the :file:`Misc/NEWS` file in the source tree for a more
complete list of changes, or look through the SVN logs for all the details.

* The :mod:`audioop` module now supports the a-LAW encoding, and the code for
  u-LAW encoding has been improved. (Contributed by Lars Immisch.)

* The :mod:`codecs` module gained support for incremental codecs. The
  :func:`codecs.lookup` function now returns a :class:`CodecInfo` instance instead
  of a tuple. :class:`CodecInfo` instances behave like a 4-tuple to preserve
  backward compatibility but also have the attributes :attr:`encode`,
  :attr:`decode`, :attr:`incrementalencoder`, :attr:`incrementaldecoder`,
  :attr:`streamwriter`, and :attr:`streamreader`. Incremental codecs can receive
  input and produce output in multiple chunks; the output is the same as if the
  entire input was fed to the non-incremental codec. See the :mod:`codecs` module
  documentation for details. (Designed and implemented by Walter Dörwald.)

  .. Patch 1436130

* The :mod:`collections` module gained a new type, :class:`defaultdict`, that
  subclasses the standard :class:`dict` type. The new type mostly behaves like a
  dictionary but constructs a default value when a key isn't present,
  automatically adding it to the dictionary for the requested key value.

  The first argument to :class:`defaultdict`'s constructor is a factory function
  that gets called whenever a key is requested but not found. This factory
  function receives no arguments, so you can use built-in type constructors such
  as :func:`list` or :func:`int`. For example, you can make an index of words
  based on their initial letter like this::

     from collections import defaultdict

     words = """Nel mezzo del cammin di nostra vita
     mi ritrovai per una selva oscura
     che la diritta via era smarrita""".lower().split()

     index = defaultdict(list)

     for w in words:
         init_letter = w[0]
         index[init_letter].append(w)

  Printing ``index`` results in the following output::

     defaultdict(<type 'list'>, {'c': ['cammin', 'che'], 'e': ['era'],
             'd': ['del', 'di', 'diritta'], 'm': ['mezzo', 'mi'],
             'l': ['la'], 'o': ['oscura'], 'n': ['nel', 'nostra'],
             'p': ['per'], 's': ['selva', 'smarrita'],
             'r': ['ritrovai'], 'u': ['una'], 'v': ['vita', 'via']})

  (Contributed by Guido van Rossum.)

* The :class:`deque` double-ended queue type supplied by the :mod:`collections`
  module now has a ``remove(value)`` method that removes the first occurrence
  of *value* in the queue, raising :exc:`ValueError` if the value isn't found.
  (Contributed by Raymond Hettinger.)

* New module: The :mod:`contextlib` module contains helper functions for use
  with the new ':keyword:`with`' statement. See section :ref:`contextlibmod`
  for more about this module.

* New module: The :mod:`cProfile` module is a C implementation of the existing
  :mod:`profile` module that has much lower overhead. The module's interface is
  the same as :mod:`profile`: you run ``cProfile.run('main()')`` to profile a
  function, can save profile data to a file, etc. It's not yet known if the
  Hotshot profiler, which is also written in C but doesn't match the
  :mod:`profile` module's interface, will continue to be maintained in future
  versions of Python. (Contributed by Armin Rigo.)

  Also, the :mod:`pstats` module for analyzing the data measured by the profiler
  now supports directing the output to any file object by supplying a *stream*
  argument to the :class:`Stats` constructor. (Contributed by Skip Montanaro.)

* The :mod:`csv` module, which parses files in comma-separated value format,
  received several enhancements and a number of bugfixes. You can now set the
  maximum size in bytes of a field by calling the
  ``csv.field_size_limit(new_limit)`` function; omitting the *new_limit*
  argument will return the currently-set limit. The :class:`reader` class now has
  a :attr:`line_num` attribute that counts the number of physical lines read from
  the source; records can span multiple physical lines, so :attr:`line_num` is not
  the same as the number of records read.

  The CSV parser is now stricter about multi-line quoted fields. Previously, if a
  line ended within a quoted field without a terminating newline character, a
  newline would be inserted into the returned field. This behavior caused problems
  when reading files that contained carriage return characters within fields, so
  the code was changed to return the field without inserting newlines. As a
  consequence, if newlines embedded within fields are important, the input should
  be split into lines in a manner that preserves the newline characters.

  (Contributed by Skip Montanaro and Andrew McNamara.)

1310* The :class:`datetime` class in the :mod:`datetime` module now has a
Andrew Svetlova2fe3342012-08-11 21:14:08 +03001311 ``strptime(string, format)`` method for parsing date strings, contributed
Georg Brandl116aa622007-08-15 14:28:22 +00001312 by Josh Spoerri. It uses the same format characters as :func:`time.strptime` and
1313 :func:`time.strftime`::
1314
1315 from datetime import datetime
1316
1317 ts = datetime.strptime('10:13:15 2006-03-07',
1318 '%H:%M:%S %Y-%m-%d')
1319
1320* The :meth:`SequenceMatcher.get_matching_blocks` method in the :mod:`difflib`
1321 module now guarantees to return a minimal list of blocks describing matching
1322 subsequences. Previously, the algorithm would occasionally break a block of
1323 matching elements into two list entries. (Enhancement by Tim Peters.)
1324
1325* The :mod:`doctest` module gained a ``SKIP`` option that keeps an example from
1326 being executed at all. This is intended for code snippets that are usage
1327 examples intended for the reader and aren't actually test cases.
1328
1329 An *encoding* parameter was added to the :func:`testfile` function and the
1330 :class:`DocFileSuite` class to specify the file's encoding. This makes it
1331 easier to use non-ASCII characters in tests contained within a docstring.
1332 (Contributed by Bjorn Tillenius.)
1333
Christian Heimes5b5e81c2007-12-31 16:14:33 +00001334 .. Patch 1080727
Georg Brandl116aa622007-08-15 14:28:22 +00001335
1336* The :mod:`email` package has been updated to version 4.0. (Contributed by
1337 Barry Warsaw.)
1338
Christian Heimes5b5e81c2007-12-31 16:14:33 +00001339 .. XXX need to provide some more detail here
Georg Brandl116aa622007-08-15 14:28:22 +00001340
R David Murray1b00f252012-08-15 10:43:58 -04001341 .. index::
1342 single: universal newlines; What's new
1343
* The :mod:`fileinput` module was made more flexible. Unicode filenames are now
  supported, and a *mode* parameter that defaults to ``"r"`` was added to the
  :func:`input` function to allow opening files in binary or
  :term:`universal newlines` mode. Another new parameter, *openhook*, lets you
  use a function other than :func:`open` to open the input files. Once you're
  iterating over the set of files, the :class:`FileInput` object's new
  :meth:`fileno` method returns the file descriptor for the currently opened
  file. (Contributed by Georg Brandl.)
Georg Brandl116aa622007-08-15 14:28:22 +00001352
1353* In the :mod:`gc` module, the new :func:`get_count` function returns a 3-tuple
1354 containing the current collection counts for the three GC generations. This is
1355 accounting information for the garbage collector; when these counts reach a
1356 specified threshold, a garbage collection sweep will be made. The existing
1357 :func:`gc.collect` function now takes an optional *generation* argument of 0, 1,
1358 or 2 to specify which generation to collect. (Contributed by Barry Warsaw.)
1359
1360* The :func:`nsmallest` and :func:`nlargest` functions in the :mod:`heapq`
1361 module now support a ``key`` keyword parameter similar to the one provided by
1362 the :func:`min`/:func:`max` functions and the :meth:`sort` methods. For
1363 example::
1364
1365 >>> import heapq
1366 >>> L = ["short", 'medium', 'longest', 'longer still']
1367 >>> heapq.nsmallest(2, L) # Return two lowest elements, lexicographically
1368 ['longer still', 'longest']
1369 >>> heapq.nsmallest(2, L, key=len) # Return two shortest elements
1370 ['short', 'medium']
1371
1372 (Contributed by Raymond Hettinger.)
1373
1374* The :func:`itertools.islice` function now accepts ``None`` for the start and
1375 step arguments. This makes it more compatible with the attributes of slice
1376 objects, so that you can now write the following::
1377
1378 s = slice(5) # Create slice object
1379 itertools.islice(iterable, s.start, s.stop, s.step)
1380
1381 (Contributed by Raymond Hettinger.)
1382
1383* The :func:`format` function in the :mod:`locale` module has been modified and
1384 two new functions were added, :func:`format_string` and :func:`currency`.
1385
  The :func:`format` function's *format* parameter could previously be a string
  as long as no more than one %char specifier appeared; now the parameter must be
  exactly one %char specifier with no surrounding text. An optional *monetary*
  parameter was also added which, if ``True``, will use the locale's rules for
  formatting currency in placing a separator between groups of three digits.
1391
1392 To format strings with multiple %char specifiers, use the new
1393 :func:`format_string` function that works like :func:`format` but also supports
1394 mixing %char specifiers with arbitrary text.
1395
1396 A new :func:`currency` function was also added that formats a number according
1397 to the current locale's settings.
1398
1399 (Contributed by Georg Brandl.)
1400
Christian Heimes5b5e81c2007-12-31 16:14:33 +00001401 .. Patch 1180296
Georg Brandl116aa622007-08-15 14:28:22 +00001402
1403* The :mod:`mailbox` module underwent a massive rewrite to add the capability to
1404 modify mailboxes in addition to reading them. A new set of classes that include
1405 :class:`mbox`, :class:`MH`, and :class:`Maildir` are used to read mailboxes, and
Andrew Svetlova2fe3342012-08-11 21:14:08 +03001406 have an ``add(message)`` method to add messages, ``remove(key)`` to
Georg Brandl116aa622007-08-15 14:28:22 +00001407 remove messages, and :meth:`lock`/:meth:`unlock` to lock/unlock the mailbox.
1408 The following example converts a maildir-format mailbox into an mbox-format
1409 one::
1410
1411 import mailbox
1412
1413 # 'factory=None' uses email.Message.Message as the class representing
1414 # individual messages.
1415 src = mailbox.Maildir('maildir', factory=None)
1416 dest = mailbox.mbox('/tmp/mbox')
1417
1418 for msg in src:
1419 dest.add(msg)
1420
1421 (Contributed by Gregory K. Johnson. Funding was provided by Google's 2005
1422 Summer of Code.)
1423
1424* New module: the :mod:`msilib` module allows creating Microsoft Installer
1425 :file:`.msi` files and CAB files. Some support for reading the :file:`.msi`
1426 database is also included. (Contributed by Martin von Löwis.)
1427
1428* The :mod:`nis` module now supports accessing domains other than the system
1429 default domain by supplying a *domain* argument to the :func:`nis.match` and
1430 :func:`nis.maps` functions. (Contributed by Ben Bell.)
1431
1432* The :mod:`operator` module's :func:`itemgetter` and :func:`attrgetter`
1433 functions now support multiple fields. A call such as
1434 ``operator.attrgetter('a', 'b')`` will return a function that retrieves the
1435 :attr:`a` and :attr:`b` attributes. Combining this new feature with the
1436 :meth:`sort` method's ``key`` parameter lets you easily sort lists using
1437 multiple fields. (Contributed by Raymond Hettinger.)
1438
1439* The :mod:`optparse` module was updated to version 1.5.1 of the Optik library.
1440 The :class:`OptionParser` class gained an :attr:`epilog` attribute, a string
1441 that will be printed after the help message, and a :meth:`destroy` method to
1442 break reference cycles created by the object. (Contributed by Greg Ward.)
1443
1444* The :mod:`os` module underwent several changes. The :attr:`stat_float_times`
1445 variable now defaults to true, meaning that :func:`os.stat` will now return time
1446 values as floats. (This doesn't necessarily mean that :func:`os.stat` will
1447 return times that are precise to fractions of a second; not all systems support
1448 such precision.)
1449
1450 Constants named :attr:`os.SEEK_SET`, :attr:`os.SEEK_CUR`, and
1451 :attr:`os.SEEK_END` have been added; these are the parameters to the
1452 :func:`os.lseek` function. Two new constants for locking are
1453 :attr:`os.O_SHLOCK` and :attr:`os.O_EXLOCK`.
1454
  Two new functions, :func:`wait3` and :func:`wait4`, were added. They're similar
  to the :func:`waitpid` function, which waits for a child process to exit and
  returns a tuple of the process ID and its exit status, but :func:`wait3` and
  :func:`wait4` return additional information. :func:`wait3` doesn't take a
  process ID as input, so it waits for any child process to exit and returns a
  3-tuple of *process-id*, *exit-status*, *resource-usage* as returned from the
  :func:`resource.getrusage` function. ``wait4(pid)`` does take a process ID.
  (Contributed by Chad J. Schroeder.)
1463
1464 On FreeBSD, the :func:`os.stat` function now returns times with nanosecond
1465 resolution, and the returned object now has :attr:`st_gen` and
Senthil Kumarana6bac952011-07-04 11:28:30 -07001466 :attr:`st_birthtime`. The :attr:`st_flags` attribute is also available, if the
Georg Brandl116aa622007-08-15 14:28:22 +00001467 platform supports it. (Contributed by Antti Louko and Diego Pettenò.)
1468
Christian Heimes5b5e81c2007-12-31 16:14:33 +00001469 .. (Patch 1180695, 1212117)
Georg Brandl116aa622007-08-15 14:28:22 +00001470
1471* The Python debugger provided by the :mod:`pdb` module can now store lists of
1472 commands to execute when a breakpoint is reached and execution stops. Once
1473 breakpoint #1 has been created, enter ``commands 1`` and enter a series of
1474 commands to be executed, finishing the list with ``end``. The command list can
1475 include commands that resume execution, such as ``continue`` or ``next``.
1476 (Contributed by Grégoire Dooms.)
1477
Christian Heimes5b5e81c2007-12-31 16:14:33 +00001478 .. Patch 790710
Georg Brandl116aa622007-08-15 14:28:22 +00001479
1480* The :mod:`pickle` and :mod:`cPickle` modules no longer accept a return value
1481 of ``None`` from the :meth:`__reduce__` method; the method must return a tuple
1482 of arguments instead. The ability to return ``None`` was deprecated in Python
1483 2.4, so this completes the removal of the feature.
1484
1485* The :mod:`pkgutil` module, containing various utility functions for finding
1486 packages, was enhanced to support PEP 302's import hooks and now also works for
1487 packages stored in ZIP-format archives. (Contributed by Phillip J. Eby.)
1488
1489* The pybench benchmark suite by Marc-André Lemburg is now included in the
1490 :file:`Tools/pybench` directory. The pybench suite is an improvement on the
1491 commonly used :file:`pystone.py` program because pybench provides a more
1492 detailed measurement of the interpreter's speed. It times particular operations
1493 such as function calls, tuple slicing, method lookups, and numeric operations,
1494 instead of performing many different operations and reducing the result to a
1495 single number as :file:`pystone.py` does.
1496
1497* The :mod:`pyexpat` module now uses version 2.0 of the Expat parser.
1498 (Contributed by Trent Mick.)
1499
1500* The :class:`Queue` class provided by the :mod:`Queue` module gained two new
1501 methods. :meth:`join` blocks until all items in the queue have been retrieved
  and all processing work on the items has been completed. Worker threads call
1503 the other new method, :meth:`task_done`, to signal that processing for an item
1504 has been completed. (Contributed by Raymond Hettinger.)
1505
1506* The old :mod:`regex` and :mod:`regsub` modules, which have been deprecated
1507 ever since Python 2.0, have finally been deleted. Other deleted modules:
1508 :mod:`statcache`, :mod:`tzparse`, :mod:`whrandom`.
1509
* Also deleted: the :file:`lib-old` directory, which included ancient modules
  such as :mod:`dircmp` and :mod:`ni`. :file:`lib-old` wasn't on the
1512 default ``sys.path``, so unless your programs explicitly added the directory to
1513 ``sys.path``, this removal shouldn't affect your code.
1514
1515* The :mod:`rlcompleter` module is no longer dependent on importing the
1516 :mod:`readline` module and therefore now works on non-Unix platforms. (Patch
1517 from Robert Kiendl.)
1518
Christian Heimes5b5e81c2007-12-31 16:14:33 +00001519 .. Patch #1472854
Georg Brandl116aa622007-08-15 14:28:22 +00001520
1521* The :mod:`SimpleXMLRPCServer` and :mod:`DocXMLRPCServer` classes now have a
1522 :attr:`rpc_paths` attribute that constrains XML-RPC operations to a limited set
1523 of URL paths; the default is to allow only ``'/'`` and ``'/RPC2'``. Setting
1524 :attr:`rpc_paths` to ``None`` or an empty tuple disables this path checking.
1525
Christian Heimes5b5e81c2007-12-31 16:14:33 +00001526 .. Bug #1473048
Georg Brandl116aa622007-08-15 14:28:22 +00001527
1528* The :mod:`socket` module now supports :const:`AF_NETLINK` sockets on Linux,
1529 thanks to a patch from Philippe Biondi. Netlink sockets are a Linux-specific
1530 mechanism for communications between a user-space process and kernel code; an
1531 introductory article about them is at http://www.linuxjournal.com/article/7356.
1532 In Python code, netlink addresses are represented as a tuple of 2 integers,
1533 ``(pid, group_mask)``.
1534
Andrew Svetlova2fe3342012-08-11 21:14:08 +03001535 Two new methods on socket objects, ``recv_into(buffer)`` and
1536 ``recvfrom_into(buffer)``, store the received data in an object that
Georg Brandl116aa622007-08-15 14:28:22 +00001537 supports the buffer protocol instead of returning the data as a string. This
1538 means you can put the data directly into an array or a memory-mapped file.
1539
1540 Socket objects also gained :meth:`getfamily`, :meth:`gettype`, and
1541 :meth:`getproto` accessor methods to retrieve the family, type, and protocol
1542 values for the socket.
1543
1544* New module: the :mod:`spwd` module provides functions for accessing the shadow
1545 password database on systems that support shadow passwords.
1546
* The :mod:`struct` module is now faster because it compiles format strings into
1548 :class:`Struct` objects with :meth:`pack` and :meth:`unpack` methods. This is
1549 similar to how the :mod:`re` module lets you create compiled regular expression
1550 objects. You can still use the module-level :func:`pack` and :func:`unpack`
1551 functions; they'll create :class:`Struct` objects and cache them. Or you can
1552 use :class:`Struct` instances directly::
1553
1554 s = struct.Struct('ih3s')
1555
1556 data = s.pack(1972, 187, 'abc')
1557 year, number, name = s.unpack(data)
1558
  You can also pack and unpack data to and from buffer objects directly using the
  ``pack_into(buffer, offset, v1, v2, ...)`` and ``unpack_from(buffer, offset)``
  methods. This lets you store data directly into an array or a memory-mapped
  file.
1563
1564 (:class:`Struct` objects were implemented by Bob Ippolito at the NeedForSpeed
1565 sprint. Support for buffer objects was added by Martin Blais, also at the
1566 NeedForSpeed sprint.)
1567
1568* The Python developers switched from CVS to Subversion during the 2.5
1569 development process. Information about the exact build version is available as
  the ``sys.subversion`` variable, a 3-tuple of
  ``(interpreter-name, branch-name, revision-range)``. For example, at the time
  of writing my copy of 2.5 was reporting ``('CPython', 'trunk', '45313:45315')``.
1573
1574 This information is also available to C extensions via the
Georg Brandl60203b42010-10-06 10:11:56 +00001575 :c:func:`Py_GetBuildInfo` function that returns a string of build information
Georg Brandl116aa622007-08-15 14:28:22 +00001576 like this: ``"trunk:45355:45356M, Apr 13 2006, 07:42:19"``. (Contributed by
1577 Barry Warsaw.)
1578
1579* Another new function, :func:`sys._current_frames`, returns the current stack
1580 frames for all running threads as a dictionary mapping thread identifiers to the
1581 topmost stack frame currently active in that thread at the time the function is
1582 called. (Contributed by Tim Peters.)
1583
1584* The :class:`TarFile` class in the :mod:`tarfile` module now has an
1585 :meth:`extractall` method that extracts all members from the archive into the
1586 current working directory. It's also possible to set a different directory as
1587 the extraction target, and to unpack only a subset of the archive's members.
1588
1589 The compression used for a tarfile opened in stream mode can now be autodetected
1590 using the mode ``'r|*'``. (Contributed by Lars Gustäbel.)
1591
Christian Heimes5b5e81c2007-12-31 16:14:33 +00001592 .. patch 918101
Georg Brandl116aa622007-08-15 14:28:22 +00001593
1594* The :mod:`threading` module now lets you set the stack size used when new
Andrew Svetlova2fe3342012-08-11 21:14:08 +03001595 threads are created. The ``stack_size([*size*])`` function returns the
Georg Brandl116aa622007-08-15 14:28:22 +00001596 currently configured stack size, and supplying the optional *size* parameter
1597 sets a new value. Not all platforms support changing the stack size, but
1598 Windows, POSIX threading, and OS/2 all do. (Contributed by Andrew MacIntyre.)
1599
Christian Heimes5b5e81c2007-12-31 16:14:33 +00001600 .. Patch 1454481
Georg Brandl116aa622007-08-15 14:28:22 +00001601
1602* The :mod:`unicodedata` module has been updated to use version 4.1.0 of the
1603 Unicode character database. Version 3.2.0 is required by some specifications,
1604 so it's still available as :attr:`unicodedata.ucd_3_2_0`.
1605
1606* New module: the :mod:`uuid` module generates universally unique identifiers
1607 (UUIDs) according to :rfc:`4122`. The RFC defines several different UUID
1608 versions that are generated from a starting string, from system properties, or
1609 purely randomly. This module contains a :class:`UUID` class and functions
1610 named :func:`uuid1`, :func:`uuid3`, :func:`uuid4`, and :func:`uuid5` to
1611 generate different versions of UUID. (Version 2 UUIDs are not specified in
1612 :rfc:`4122` and are not supported by this module.) ::
1613
1614 >>> import uuid
1615 >>> # make a UUID based on the host ID and current time
1616 >>> uuid.uuid1()
1617 UUID('a8098c1a-f86e-11da-bd1a-00112444be1e')
1618
1619 >>> # make a UUID using an MD5 hash of a namespace UUID and a name
1620 >>> uuid.uuid3(uuid.NAMESPACE_DNS, 'python.org')
1621 UUID('6fa459ea-ee8a-3ca4-894e-db77e160355e')
1622
1623 >>> # make a random UUID
1624 >>> uuid.uuid4()
1625 UUID('16fd2706-8baf-433b-82eb-8c7fada847da')
1626
1627 >>> # make a UUID using a SHA-1 hash of a namespace UUID and a name
1628 >>> uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org')
1629 UUID('886313e1-3b8a-5372-9b90-0c9aee199e5d')
1630
1631 (Contributed by Ka-Ping Yee.)
1632
1633* The :mod:`weakref` module's :class:`WeakKeyDictionary` and
1634 :class:`WeakValueDictionary` types gained new methods for iterating over the
1635 weak references contained in the dictionary. :meth:`iterkeyrefs` and
1636 :meth:`keyrefs` methods were added to :class:`WeakKeyDictionary`, and
1637 :meth:`itervaluerefs` and :meth:`valuerefs` were added to
1638 :class:`WeakValueDictionary`. (Contributed by Fred L. Drake, Jr.)
1639
1640* The :mod:`webbrowser` module received a number of enhancements. It's now
1641 usable as a script with ``python -m webbrowser``, taking a URL as the argument;
1642 there are a number of switches to control the behaviour (:option:`-n` for a new
1643 browser window, :option:`-t` for a new tab). New module-level functions,
1644 :func:`open_new` and :func:`open_new_tab`, were added to support this. The
1645 module's :func:`open` function supports an additional feature, an *autoraise*
1646 parameter that signals whether to raise the open window when possible. A number
1647 of additional browsers were added to the supported list such as Firefox, Opera,
1648 Konqueror, and elinks. (Contributed by Oleg Broytmann and Georg Brandl.)
1649
Christian Heimes5b5e81c2007-12-31 16:14:33 +00001650 .. Patch #754022
Georg Brandl116aa622007-08-15 14:28:22 +00001651
1652* The :mod:`xmlrpclib` module now supports returning :class:`datetime` objects
1653 for the XML-RPC date type. Supply ``use_datetime=True`` to the :func:`loads`
1654 function or the :class:`Unmarshaller` class to enable this feature. (Contributed
1655 by Skip Montanaro.)
1656
Christian Heimes5b5e81c2007-12-31 16:14:33 +00001657 .. Patch 1120353
Georg Brandl116aa622007-08-15 14:28:22 +00001658
1659* The :mod:`zipfile` module now supports the ZIP64 version of the format,
1660 meaning that a .zip archive can now be larger than 4 GiB and can contain
1661 individual files larger than 4 GiB. (Contributed by Ronald Oussoren.)
1662
Christian Heimes5b5e81c2007-12-31 16:14:33 +00001663 .. Patch 1446489
Georg Brandl116aa622007-08-15 14:28:22 +00001664
1665* The :mod:`zlib` module's :class:`Compress` and :class:`Decompress` objects now
1666 support a :meth:`copy` method that makes a copy of the object's internal state
1667 and returns a new :class:`Compress` or :class:`Decompress` object.
1668 (Contributed by Chris AtLee.)
1669
Christian Heimes5b5e81c2007-12-31 16:14:33 +00001670 .. Patch 1435422
Georg Brandl116aa622007-08-15 14:28:22 +00001671
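To illustrate the :meth:`copy` method on :mod:`zlib` compression objects from
the last item above, here is a minimal sketch. It assumes a current Python 3
interpreter (hence the ``b'...'`` byte strings), and the sample data is made up.
A shared prefix is compressed once, and the copied state is then used to
produce two different complete streams:

```python
import zlib

# Compress a common prefix once.
comp = zlib.compressobj()
head = comp.compress(b"common prefix ")

# Copy the internal state, then finish the two streams differently.
comp2 = comp.copy()
first = head + comp.compress(b"first") + comp.flush()
second = head + comp2.compress(b"second") + comp2.flush()

print(zlib.decompress(first))
print(zlib.decompress(second))
```
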
Christian Heimes5b5e81c2007-12-31 16:14:33 +00001672.. ======================================================================
Georg Brandl116aa622007-08-15 14:28:22 +00001673
1674
1675.. _module-ctypes:
1676
1677The ctypes package
1678------------------
1679
1680The :mod:`ctypes` package, written by Thomas Heller, has been added to the
1681standard library. :mod:`ctypes` lets you call arbitrary functions in shared
1682libraries or DLLs. Long-time users may remember the :mod:`dl` module, which
1683provides functions for loading shared libraries and calling functions in them.
1684The :mod:`ctypes` package is much fancier.
1685
1686To load a shared library or DLL, you must create an instance of the
1687:class:`CDLL` class and provide the name or path of the shared library or DLL.
1688Once that's done, you can call arbitrary functions by accessing them as
1689attributes of the :class:`CDLL` object. ::
1690
1691 import ctypes
1692
1693 libc = ctypes.CDLL('libc.so.6')
1694 result = libc.printf("Line of output\n")
1695
Type constructors for the various C types are provided: :func:`c_int`,
:func:`c_float`, :func:`c_double`, :func:`c_char_p` (equivalent to
:c:type:`char \*`), and so forth. Unlike Python's types, the C versions are all
mutable; you can assign to their :attr:`value` attribute to change the wrapped
value. Python integers and strings will be automatically converted to the
corresponding C types, but for other types you must call the correct type
constructor. (And I mean *must*; getting it wrong will often result in the
interpreter crashing with a segmentation fault.)
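
As a small sketch of that mutability, the following wraps a few values and
changes one of them in place. It is written for a current Python 3 interpreter,
which is an assumption on my part; only the ``b'...'`` spelling of the byte
string depends on it:

```python
import ctypes

# Wrap a Python integer in a C int; the wrapper object is mutable.
n = ctypes.c_int(42)
n.value = 7                      # change the wrapped value in place

# The other constructors work the same way.
x = ctypes.c_double(1.5)
s = ctypes.c_char_p(b"hello")    # a char * pointing at a byte string

print(n.value, x.value, s.value)
```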
1704
1705You shouldn't use :func:`c_char_p` with a Python string when the C function will
1706be modifying the memory area, because Python strings are supposed to be
1707immutable; breaking this rule will cause puzzling bugs. When you need a
1708modifiable memory area, use :func:`create_string_buffer`::
1709
1710 s = "this is a string"
1711 buf = ctypes.create_string_buffer(s)
1712 libc.strfry(buf)
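
The ``strfry`` call above needs the GNU C library. A variant that should run
wherever :mod:`ctypes` is available is sketched below; it uses
``ctypes.memset`` as a stand-in for a C function that modifies its argument,
and it assumes a current Python 3 interpreter, where the buffer is created from
a byte string:

```python
import ctypes

# A writable, NUL-terminated buffer initialised from a byte string.
buf = ctypes.create_string_buffer(b"this is a string")

# Overwrite the first four bytes in place.
ctypes.memset(buf, ord("x"), 4)

print(buf.value)             # contents up to the terminating NUL
print(ctypes.sizeof(buf))    # buffer size, including the NUL byte
```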
1713
1714C functions are assumed to return integers, but you can set the :attr:`restype`
1715attribute of the function object to change this::
1716
1717 >>> libc.atof('2.71828')
1718 -1783957616
1719 >>> libc.atof.restype = ctypes.c_double
1720 >>> libc.atof('2.71828')
1721 2.71828
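
The same idea can be tried without knowing the name of the platform's C
library. This sketch is my own choice of example, not something from the
ctypes documentation: it calls ``PyFloat_AsDouble`` from Python's C API through
``ctypes.pythonapi`` and sets :attr:`restype` so the result comes back as a
double. It also sets :attr:`argtypes`, a ctypes attribute not otherwise
discussed in this article, so that the argument is converted automatically:

```python
import ctypes

# PyFloat_AsDouble() takes a PyObject * and returns a C double.
as_double = ctypes.pythonapi.PyFloat_AsDouble
as_double.argtypes = [ctypes.py_object]   # pass the Python object as-is
as_double.restype = ctypes.c_double       # would otherwise be read as an int

value = as_double(2.71828)
print(value)
```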
1722
:mod:`ctypes` also provides a wrapper for Python's C API as the
``ctypes.pythonapi`` object. This object does *not* release the global
interpreter lock before calling a function, because the lock must be held when
calling into the interpreter's code. There's a :class:`py_object()` type
constructor that will create a :c:type:`PyObject \*` pointer. A simple usage::
Georg Brandl116aa622007-08-15 14:28:22 +00001728
1729 import ctypes
1730
1731 d = {}
1732 ctypes.pythonapi.PyObject_SetItem(ctypes.py_object(d),
1733 ctypes.py_object("abc"), ctypes.py_object(1))
    # d is now {'abc': 1}.
1735
1736Don't forget to use :class:`py_object()`; if it's omitted you end up with a
1737segmentation fault.
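
One way to make that mistake harder, sketched here as a suggestion rather than
as the recommended practice, is to declare the argument types once with the
function's :attr:`argtypes` attribute (a real ctypes attribute, though not one
covered in this article). Plain Python objects can then be passed directly:

```python
import ctypes

set_item = ctypes.pythonapi.PyObject_SetItem
# Every argument is a PyObject *; ctypes will wrap the values for us.
set_item.argtypes = [ctypes.py_object, ctypes.py_object, ctypes.py_object]
set_item.restype = ctypes.c_int   # 0 on success, -1 on failure

d = {}
status = set_item(d, "abc", 1)
print(status, d)
```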
1738
:mod:`ctypes` has been around for a while, but people still write and
distribute hand-coded extension modules because you can't rely on
1741:mod:`ctypes` being present. Perhaps developers will begin to write Python
1742wrappers atop a library accessed through :mod:`ctypes` instead of extension
1743modules, now that :mod:`ctypes` is included with core Python.
1744
1745
1746.. seealso::
1747
1748 http://starship.python.net/crew/theller/ctypes/
1749 The ctypes web page, with a tutorial, reference, and FAQ.
1750
1751 The documentation for the :mod:`ctypes` module.
1752
Christian Heimes5b5e81c2007-12-31 16:14:33 +00001753.. ======================================================================
Georg Brandl116aa622007-08-15 14:28:22 +00001754
1755
1756.. _module-etree:
1757
1758The ElementTree package
1759-----------------------
1760
1761A subset of Fredrik Lundh's ElementTree library for processing XML has been
1762added to the standard library as :mod:`xml.etree`. The available modules are
1763:mod:`ElementTree`, :mod:`ElementPath`, and :mod:`ElementInclude` from
1764ElementTree 1.2.6. The :mod:`cElementTree` accelerator module is also
1765included.
1766
1767The rest of this section will provide a brief overview of using ElementTree.
Full documentation for ElementTree is available at
http://effbot.org/zone/element-index.htm.
Georg Brandl116aa622007-08-15 14:28:22 +00001770
ElementTree represents an XML document as a tree of element nodes. The text
content of the document is stored as the :attr:`text` and :attr:`tail`
attributes of each element node. (This is one of the major differences between
ElementTree and the Document Object Model; in the DOM there are many different
types of node, including :class:`TextNode`.)
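
A short sketch may make the two attributes clearer. The XML fragment is made
up for illustration: :attr:`text` holds the text before an element's first
child, and :attr:`tail` holds the text that follows an element's closing tag:

```python
from xml.etree import ElementTree as ET

p = ET.XML("<p>Hello <b>bold</b> world</p>")
b = p[0]

print(repr(p.text))   # text before the first child of <p>
print(repr(b.text))   # text inside <b>
print(repr(b.tail))   # text after </b>, still inside <p>
```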
1776
The most commonly used parsing function is :func:`parse`, which takes either a
1778string (assumed to contain a filename) or a file-like object and returns an
1779:class:`ElementTree` instance::
1780
    import urllib
    from xml.etree import ElementTree as ET

    # Parse from a filename
    tree = ET.parse('ex-1.xml')

    # Parse from a file-like object
    feed = urllib.urlopen(
        'http://planet.python.org/rss10.xml')
    tree = ET.parse(feed)
1788
1789Once you have an :class:`ElementTree` instance, you can call its :meth:`getroot`
1790method to get the root :class:`Element` node.
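
To try this without a file on disk or a network connection, any file-like
object will do. The sketch below assumes a current Python 3 interpreter and
uses ``io.BytesIO`` with a made-up document:

```python
import io
from xml.etree import ElementTree as ET

xml_bytes = b"<feed><title>Example</title><entry/><entry/></feed>"

# parse() accepts a file-like object as well as a filename.
tree = ET.parse(io.BytesIO(xml_bytes))
root = tree.getroot()

print(root.tag, len(root), root[0].text)
```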
1791
1792There's also an :func:`XML` function that takes a string literal and returns an
1793:class:`Element` node (not an :class:`ElementTree`). This function provides a
1794tidy way to incorporate XML fragments, approaching the convenience of an XML
1795literal::
1796
1797 svg = ET.XML("""<svg width="10px" version="1.0">
1798 </svg>""")
1799 svg.set('height', '320px')
1800 svg.append(elem1)
1801
1802Each XML element supports some dictionary-like and some list-like access
1803methods. Dictionary-like operations are used to access attribute values, and
1804list-like operations are used to access child nodes.
1805
1806+-------------------------------+--------------------------------------------+
1807| Operation | Result |
1808+===============================+============================================+
1809| ``elem[n]`` | Returns n'th child element. |
1810+-------------------------------+--------------------------------------------+
1811| ``elem[m:n]`` | Returns list of m'th through n'th child |
1812| | elements. |
1813+-------------------------------+--------------------------------------------+
1814| ``len(elem)`` | Returns number of child elements. |
1815+-------------------------------+--------------------------------------------+
1816| ``list(elem)`` | Returns list of child elements. |
1817+-------------------------------+--------------------------------------------+
1818| ``elem.append(elem2)`` | Adds *elem2* as a child. |
1819+-------------------------------+--------------------------------------------+
1820| ``elem.insert(index, elem2)`` | Inserts *elem2* at the specified location. |
1821+-------------------------------+--------------------------------------------+
1822| ``del elem[n]`` | Deletes n'th child element. |
1823+-------------------------------+--------------------------------------------+
1824| ``elem.keys()`` | Returns list of attribute names. |
1825+-------------------------------+--------------------------------------------+
1826| ``elem.get(name)`` | Returns value of attribute *name*. |
1827+-------------------------------+--------------------------------------------+
1828| ``elem.set(name, value)`` | Sets new value for attribute *name*. |
1829+-------------------------------+--------------------------------------------+
1830| ``elem.attrib`` | Retrieves the dictionary containing |
1831| | attributes. |
1832+-------------------------------+--------------------------------------------+
1833| ``del elem.attrib[name]`` | Deletes attribute *name*. |
1834+-------------------------------+--------------------------------------------+
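
The operations in the table can be tried out on a small document. This is only
a sketch; the element and attribute names are invented for the example:

```python
from xml.etree import ElementTree as ET

svg = ET.XML('<svg width="10px" version="1.0"><g/><rect/></svg>')

# Dictionary-like access to attributes.
svg.set('height', '320px')
del svg.attrib['version']
names = sorted(svg.keys())

# List-like access to child elements.
svg.append(ET.Element('circle'))      # children: g, rect, circle
svg.insert(0, ET.Element('title'))    # children: title, g, rect, circle
del svg[1]                            # removes <g>
tags = [child.tag for child in svg]

print(names, svg.get('width'), len(svg), tags)
```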
1835
1836Comments and processing instructions are also represented as :class:`Element`
nodes. To check if a node is a comment or a processing instruction::
1838
1839 if elem.tag is ET.Comment:
1840 ...
1841 elif elem.tag is ET.ProcessingInstruction:
1842 ...
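
A self-contained version of that check is sketched below. The tree is built by
hand with the ``ET.Comment`` and ``ET.ProcessingInstruction`` factory functions
and uses made-up content; building it by hand avoids depending on how a
particular parser treats comments in its input:

```python
from xml.etree import ElementTree as ET

root = ET.Element("doc")
root.append(ET.Comment("generated file"))
root.append(ET.ProcessingInstruction("xml-stylesheet", 'href="style.css"'))
root.append(ET.Element("body"))

kinds = []
for elem in root:
    if elem.tag is ET.Comment:
        kinds.append("comment")
    elif elem.tag is ET.ProcessingInstruction:
        kinds.append("processing instruction")
    else:
        kinds.append(elem.tag)      # an ordinary element

print(kinds)
```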
1843
1844To generate XML output, you should call the :meth:`ElementTree.write` method.
1845Like :func:`parse`, it can take either a string or a file-like object::
1846
1847 # Encoding is US-ASCII
1848 tree.write('output.xml')
1849
1850 # Encoding is UTF-8
1851 f = open('output.xml', 'w')
1852 tree.write(f, encoding='utf-8')
1853
1854(Caution: the default encoding used for output is ASCII. For general XML work,
1855where an element's name may contain arbitrary Unicode characters, ASCII isn't a
1856very useful encoding because it will raise an exception if an element's name
1857contains any characters with values greater than 127. Therefore, it's best to
1858specify a different encoding such as UTF-8 that can handle any Unicode
1859character.)
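
Here is a small sketch of writing with an explicit encoding. It assumes a
current Python 3 interpreter and writes to an in-memory ``io.BytesIO`` object
instead of a real file; the document content is invented:

```python
import io
from xml.etree import ElementTree as ET

root = ET.Element("greeting")
root.text = "caf\u00e9"              # contains a non-ASCII character
tree = ET.ElementTree(root)

# Write UTF-8 encoded bytes to a file-like object.
buf = io.BytesIO()
tree.write(buf, encoding="utf-8")
data = buf.getvalue()

# Reading the bytes back recovers the original text.
print(ET.fromstring(data).text == root.text)
```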
1860
1861This section is only a partial description of the ElementTree interfaces. Please
1862read the package's official documentation for more details.
1863
1864
1865.. seealso::
1866
1867 http://effbot.org/zone/element-index.htm
1868 Official documentation for ElementTree.
1869
Christian Heimes5b5e81c2007-12-31 16:14:33 +00001870.. ======================================================================
Georg Brandl116aa622007-08-15 14:28:22 +00001871
1872
1873.. _module-hashlib:
1874
1875The hashlib package
1876-------------------
1877
1878A new :mod:`hashlib` module, written by Gregory P. Smith, has been added to
1879replace the :mod:`md5` and :mod:`sha` modules. :mod:`hashlib` adds support for
1880additional secure hashes (SHA-224, SHA-256, SHA-384, and SHA-512). When
available, the module uses OpenSSL for fast, platform-optimized implementations
1882of algorithms.
1883
1884The old :mod:`md5` and :mod:`sha` modules still exist as wrappers around hashlib
1885to preserve backwards compatibility. The new module's interface is very close
1886to that of the old modules, but not identical. The most significant difference
1887is that the constructor functions for creating new hashing objects are named
1888differently. ::
1889
1890 # Old versions
Georg Brandl48310cd2009-01-03 21:18:54 +00001891 h = md5.md5()
1892 h = md5.new()
Georg Brandl116aa622007-08-15 14:28:22 +00001893
Georg Brandl48310cd2009-01-03 21:18:54 +00001894 # New version
Georg Brandl116aa622007-08-15 14:28:22 +00001895 h = hashlib.md5()
1896
1897 # Old versions
Georg Brandl48310cd2009-01-03 21:18:54 +00001898 h = sha.sha()
1899 h = sha.new()
Georg Brandl116aa622007-08-15 14:28:22 +00001900
Georg Brandl48310cd2009-01-03 21:18:54 +00001901 # New version
Georg Brandl116aa622007-08-15 14:28:22 +00001902 h = hashlib.sha1()
1903
1904 # Hash that weren't previously available
1905 h = hashlib.sha224()
1906 h = hashlib.sha256()
1907 h = hashlib.sha384()
1908 h = hashlib.sha512()
1909
1910 # Alternative form
1911 h = hashlib.new('md5') # Provide algorithm as a string
1912
1913Once a hash object has been created, its methods are the same as before:
Andrew Svetlova2fe3342012-08-11 21:14:08 +03001914``update(string)`` hashes the specified string into the current digest
Georg Brandl116aa622007-08-15 14:28:22 +00001915state, :meth:`digest` and :meth:`hexdigest` return the digest value as a binary
1916string or a string of hex digits, and :meth:`copy` returns a new hashing object
1917with the same digest state.
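
The methods described above can be sketched as follows. This is only an illustration: the input data and variable names are made up for the example, and byte-string literals are used so the snippet also runs on later Python versions.

```python
import hashlib

h = hashlib.sha256()
h.update(b'a')
h.update(b'bc')            # same digest state as hashing b'abc' in one call

snapshot = h.copy()        # independent object with the same digest state
h.update(b'more data')     # further updates do not affect the snapshot

digest_hex = snapshot.hexdigest()   # string of hex digits
digest_raw = snapshot.digest()      # binary string
```

Because :meth:`copy` preserves the state, ``snapshot`` still holds the digest of ``b'abc'`` even after ``h`` has been fed more data.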


.. seealso::

   The documentation for the :mod:`hashlib` module.

.. ======================================================================

.. _module-sqlite:

The sqlite3 package
-------------------

The pysqlite module (http://www.pysqlite.org), a wrapper for the SQLite embedded
database, has been added to the standard library under the package name
:mod:`sqlite3`.

SQLite is a C library that provides a lightweight disk-based database that
doesn't require a separate server process and allows accessing the database
using a nonstandard variant of the SQL query language. Some applications can use
SQLite for internal data storage. It's also possible to prototype an
application using SQLite and then port the code to a larger database such as
PostgreSQL or Oracle.

pysqlite was written by Gerhard Häring and provides a SQL interface compliant
with the DB-API 2.0 specification described by :pep:`249`.

If you're compiling the Python source yourself, note that the source tree
doesn't include the SQLite code, only the wrapper module. You'll need to have
the SQLite libraries and headers installed before compiling Python, and the
build process will compile the module when the necessary headers are available.

To use the module, you must first create a :class:`Connection` object that
represents the database. Here the data will be stored in the
:file:`/tmp/example` file::

   conn = sqlite3.connect('/tmp/example')

You can also supply the special name ``:memory:`` to create a database in RAM.

Once you have a :class:`Connection`, you can create a :class:`Cursor` object
and call its :meth:`execute` method to perform SQL commands::

   c = conn.cursor()

   # Create table
   c.execute('''create table stocks
   (date text, trans text, symbol text,
    qty real, price real)''')

   # Insert a row of data
   c.execute("""insert into stocks
             values ('2006-01-05','BUY','RHAT',100,35.14)""")

Usually your SQL operations will need to use values from Python variables. You
shouldn't assemble your query using Python's string operations because doing so
is insecure; it makes your program vulnerable to an SQL injection attack.

Instead, use the DB-API's parameter substitution. Put ``?`` as a placeholder
wherever you want to use a value, and then provide a tuple of values as the
second argument to the cursor's :meth:`execute` method. (Other database modules
may use a different placeholder, such as ``%s`` or ``:1``.) For example::

   # Never do this -- insecure!
   symbol = 'IBM'
   c.execute("... where symbol = '%s'" % symbol)

   # Do this instead
   t = (symbol,)
   c.execute('select * from stocks where symbol=?', t)

   # Larger example
   for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
             ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
             ('2006-04-06', 'SELL', 'IBM', 500, 53.00),
            ):
       c.execute('insert into stocks values (?,?,?,?,?)', t)

To retrieve data after executing a SELECT statement, you can either treat the
cursor as an iterator, call the cursor's :meth:`fetchone` method to retrieve a
single matching row, or call :meth:`fetchall` to get a list of the matching
rows.

This example uses the iterator form::

   >>> c = conn.cursor()
   >>> c.execute('select * from stocks order by price')
   >>> for row in c:
   ...    print row
   ...
   (u'2006-01-05', u'BUY', u'RHAT', 100, 35.140000000000001)
   (u'2006-03-28', u'BUY', u'IBM', 1000, 45.0)
   (u'2006-04-06', u'SELL', u'IBM', 500, 53.0)
   (u'2006-04-05', u'BUY', u'MSOFT', 1000, 72.0)
   >>>

For more information about the SQL dialect supported by SQLite, see
http://www.sqlite.org.
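
The other two retrieval styles mentioned earlier, :meth:`fetchone` and :meth:`fetchall`, can be sketched like this. The example is self-contained and uses an in-memory database; the selected columns and the query itself are chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')     # throwaway database in RAM
c = conn.cursor()
c.execute('create table stocks '
          '(date text, trans text, symbol text, qty real, price real)')

for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
          ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
          ('2006-04-06', 'SELL', 'IBM', 500, 53.00)):
    c.execute('insert into stocks values (?,?,?,?,?)', t)

# Parameter substitution again: the symbol is passed as a one-element tuple.
c.execute('select symbol, price from stocks where symbol=? order by price',
          ('IBM',))
first = c.fetchone()    # a single row as a tuple (None when no rows are left)
rest = c.fetchall()     # a list holding all remaining rows
conn.close()
```

Note that :meth:`fetchall` returns only the rows that haven't been consumed yet, so here ``rest`` holds the second IBM transaction and not the first.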


.. seealso::

   http://www.pysqlite.org
      The pysqlite web page.

   http://www.sqlite.org
      The SQLite web page; the documentation describes the syntax and the available
      data types for the supported SQL dialect.

   The documentation for the :mod:`sqlite3` module.

   :pep:`249` - Database API Specification 2.0
      PEP written by Marc-André Lemburg.

.. ======================================================================

.. _module-wsgiref:

The wsgiref package
-------------------

The Web Server Gateway Interface (WSGI) v1.0 defines a standard interface
between web servers and Python web applications and is described in :pep:`333`.
The :mod:`wsgiref` package is a reference implementation of the WSGI
specification.

.. XXX should this be in a PEP 333 section instead?

The package includes a basic HTTP server that will run a WSGI application; this
server is useful for debugging but isn't intended for production use. Setting
up a server takes only a few lines of code::

   from wsgiref import simple_server

   wsgi_app = ...

   host = ''
   port = 8000
   httpd = simple_server.make_server(host, port, wsgi_app)
   httpd.serve_forever()
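
The ``wsgi_app`` object is left unspecified above. As a rough idea of what such an application might look like, here is a minimal sketch of a WSGI callable; the name ``hello_app``, the response text, and the ``fake_start_response`` helper are all hypothetical and exist only for this illustration. The application is called directly rather than through a server, so the snippet doesn't block.

```python
from wsgiref.util import setup_testing_defaults

def hello_app(environ, start_response):
    """Hypothetical WSGI application returning a plain-text greeting."""
    body = b'Hello, WSGI!\n'
    headers = [('Content-Type', 'text/plain'),
               ('Content-Length', str(len(body)))]
    start_response('200 OK', headers)
    return [body]            # an iterable of byte strings

# Exercise the application without starting a server.
environ = {}
setup_testing_defaults(environ)      # fill in a plausible request environment

captured = {}
def fake_start_response(status, headers):
    # Hypothetical stand-in for the callable a real server would supply.
    captured['status'] = status
    captured['headers'] = headers

result = hello_app(environ, fake_start_response)
```

A callable like ``hello_app`` could then be passed to :func:`make_server` in place of ``wsgi_app``.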

.. XXX discuss structure of WSGI applications?
.. XXX provide an example using Django or some other framework?


.. seealso::

   http://www.wsgi.org
      A central web site for WSGI-related resources.

   :pep:`333` - Python Web Server Gateway Interface v1.0
      PEP written by Phillip J. Eby.

.. ======================================================================

.. _build-api:

Build and C API Changes
=======================

Changes to Python's build process and to the C API include:

* The Python source tree was converted from CVS to Subversion, in a complex
  migration procedure that was supervised and flawlessly carried out by Martin von
  Löwis. The procedure was developed as :pep:`347`.

* Coverity, a company that markets a source code analysis tool called Prevent,
  provided the results of their examination of the Python source code. The
  analysis found about 60 bugs that were quickly fixed. Many of the bugs were
  refcounting problems, often occurring in error-handling code. See
  http://scan.coverity.com for the statistics.

* The largest change to the C API came from :pep:`353`, which modifies the
  interpreter to use a :c:type:`Py_ssize_t` type definition instead of
  :c:type:`int`. See the earlier section :ref:`pep-353` for a discussion of this
  change.

* The design of the bytecode compiler has changed a great deal, no longer
  generating bytecode by traversing the parse tree. Instead the parse tree is
  converted to an abstract syntax tree (or AST), and it is the abstract syntax
  tree that's traversed to produce the bytecode.

  It's possible for Python code to obtain AST objects by using the
  :func:`compile` built-in and specifying ``_ast.PyCF_ONLY_AST`` as the value of
  the *flags* parameter::

     from _ast import PyCF_ONLY_AST
     ast = compile("""a=0
     for i in range(10):
         a += i
     """, "<string>", 'exec', PyCF_ONLY_AST)

     assignment = ast.body[0]
     for_loop = ast.body[1]

  No official documentation has been written for the AST code yet, but :pep:`339`
  discusses the design. To start learning about the code, read the definition of
  the various AST nodes in :file:`Parser/Python.asdl`. A Python script reads this
  file and generates a set of C structure definitions in
  :file:`Include/Python-ast.h`. The :c:func:`PyParser_ASTFromString` and
  :c:func:`PyParser_ASTFromFile` functions, defined in
  :file:`Include/pythonrun.h`, take Python source as input and return the root
  of an AST representing the contents. This AST can then be turned into a code
  object by :c:func:`PyAST_Compile`. For more information, read the source
  code, and then ask questions on python-dev.

  The AST code was developed under Jeremy Hylton's management, and implemented by
  (in alphabetical order) Brett Cannon, Nick Coghlan, Grant Edwards, John
  Ehresman, Kurt Kaiser, Neal Norwitz, Tim Peters, Armin Rigo, and Neil
  Schemenauer, plus the participants in a number of AST sprints at conferences
  such as PyCon.

  .. List of names taken from Jeremy's python-dev post at
  .. http://mail.python.org/pipermail/python-dev/2005-October/057500.html

* Evan Jones's patch to obmalloc, first described in a talk at PyCon DC 2005,
  was applied. Python 2.4 allocated small objects in 256K-sized arenas, but never
  freed arenas. With this patch, Python will free arenas when they're empty. The
  net effect is that on some platforms, when you allocate many objects, Python's
  memory usage may actually drop when you delete them and the memory may be
  returned to the operating system. (Implemented by Evan Jones, and reworked by
  Tim Peters.)

  Note that this change means extension modules must be more careful when
  allocating memory. Python's API has many different functions for allocating
  memory that are grouped into families. For example, :c:func:`PyMem_Malloc`,
  :c:func:`PyMem_Realloc`, and :c:func:`PyMem_Free` are one family that allocates
  raw memory, while :c:func:`PyObject_Malloc`, :c:func:`PyObject_Realloc`, and
  :c:func:`PyObject_Free` are another family that's supposed to be used for
  creating Python objects.

  Previously these different families all reduced to the platform's
  :c:func:`malloc` and :c:func:`free` functions. This meant it didn't matter if
  you got things wrong and allocated memory with the :c:func:`PyMem` function but
  freed it with the :c:func:`PyObject` function. With 2.5's changes to obmalloc,
  these families now do different things and mismatches will probably result in a
  segfault. You should carefully test your C extension modules with Python 2.5.

* The built-in set types now have an official C API. Call :c:func:`PySet_New`
  and :c:func:`PyFrozenSet_New` to create a new set, :c:func:`PySet_Add` and
  :c:func:`PySet_Discard` to add and remove elements, and :c:func:`PySet_Contains`
  and :c:func:`PySet_Size` to examine the set's state. (Contributed by Raymond
  Hettinger.)

* C code can now obtain information about the exact revision of the Python
  interpreter by calling the :c:func:`Py_GetBuildInfo` function that returns a
  string of build information like this: ``"trunk:45355:45356M, Apr 13 2006,
  07:42:19"``. (Contributed by Barry Warsaw.)

* Two new macros can be used to indicate C functions that are local to the
  current file so that a faster calling convention can be used.
  ``Py_LOCAL(type)`` declares the function as returning a value of the
  specified *type* and uses a fast-calling qualifier.
  ``Py_LOCAL_INLINE(type)`` does the same thing and also requests the
  function be inlined. If :c:func:`PY_LOCAL_AGGRESSIVE` is defined before
  :file:`Python.h` is included, a set of more aggressive optimizations is enabled
  for the module; you should benchmark the results to find out if these
  optimizations actually make the code faster. (Contributed by Fredrik Lundh at
  the NeedForSpeed sprint.)

* ``PyErr_NewException(name, base, dict)`` can now accept a tuple of base
  classes as its *base* argument. (Contributed by Georg Brandl.)

* The :c:func:`PyErr_Warn` function for issuing warnings is now deprecated in
  favour of ``PyErr_WarnEx(category, message, stacklevel)`` which lets you
  specify the number of stack frames separating this function and the caller. A
  *stacklevel* of 1 is the function calling :c:func:`PyErr_WarnEx`, 2 is the
  function above that, and so forth. (Added by Neal Norwitz.)

* The CPython interpreter is still written in C, but the code can now be
  compiled with a C++ compiler without errors. (Implemented by Anthony Baxter,
  Martin von Löwis, Skip Montanaro.)

* The :c:func:`PyRange_New` function was removed. It was never documented, never
  used in the core code, and had dangerously lax error checking. In the unlikely
  case that your extensions were using it, you can replace it by something like
  the following::

     range = PyObject_CallFunction((PyObject*) &PyRange_Type, "lll",
                                   start, stop, step);
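
Returning to the AST item in the list above: once :func:`compile` has returned a tree, its top-level statements can be examined from Python. The following is only a sketch; the names ``source``, ``tree``, and ``node_names`` are made up for the example, and the node class names shown are what I'd expect for an assignment followed by a ``for`` loop.

```python
from _ast import PyCF_ONLY_AST

source = """a = 0
for i in range(10):
    a += i
"""

# Ask compile() for the abstract syntax tree instead of a code object.
tree = compile(source, '<string>', 'exec', PyCF_ONLY_AST)

# Each entry in tree.body is a node object for one top-level statement.
node_names = [type(node).__name__ for node in tree.body]
```

Here ``node_names`` lists one class name per statement, which is a convenient first step when exploring the node definitions in :file:`Parser/Python.asdl`.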

.. ======================================================================


.. _ports:

Port-Specific Changes
---------------------

* MacOS X (10.3 and higher): dynamic loading of modules now uses the
  :c:func:`dlopen` function instead of MacOS-specific functions.

* MacOS X: an :option:`--enable-universalsdk` switch was added to the
  :program:`configure` script that compiles the interpreter as a universal binary
  able to run on both PowerPC and Intel processors. (Contributed by Ronald
  Oussoren; :issue:`2573`.)

* Windows: :file:`.dll` is no longer supported as a filename extension for
  extension modules. :file:`.pyd` is now the only filename extension that will be
  searched for.

.. ======================================================================

.. _porting:

Porting to Python 2.5
=====================

This section lists previously described changes that may require changes to your
code:

* ASCII is now the default encoding for modules. It's now a syntax error if a
  module contains string literals with 8-bit characters but doesn't have an
  encoding declaration. In Python 2.4 this triggered a warning, not a syntax
  error.

* Previously, the :attr:`gi_frame` attribute of a generator was always a frame
  object. Because of the :pep:`342` changes described in section :ref:`pep-342`,
  it's now possible for :attr:`gi_frame` to be ``None``.

* A new warning, :class:`UnicodeWarning`, is triggered when you attempt to
  compare a Unicode string and an 8-bit string that can't be converted to Unicode
  using the default ASCII encoding. Previously such comparisons would raise a
  :class:`UnicodeDecodeError` exception.

* Library: the :mod:`csv` module is now stricter about multi-line quoted fields.
  If your files contain newlines embedded within fields, the input should be split
  into lines in a manner which preserves the newline characters.

* Library: the :mod:`locale` module's :func:`format` function would
  previously accept any string as long as no more than one %char specifier
  appeared. In Python 2.5, the argument must be exactly one %char specifier with
  no surrounding text.

* Library: The :mod:`pickle` and :mod:`cPickle` modules no longer accept a
  return value of ``None`` from the :meth:`__reduce__` method; the method must
  return a tuple of arguments instead. The modules also no longer accept the
  deprecated *bin* keyword parameter.

* Library: The :mod:`SimpleXMLRPCServer` and :mod:`DocXMLRPCServer` classes now
  have a :attr:`rpc_paths` attribute that constrains XML-RPC operations to a
  limited set of URL paths; the default is to allow only ``'/'`` and ``'/RPC2'``.
  Setting :attr:`rpc_paths` to ``None`` or an empty tuple disables this path
  checking.

* C API: Many functions now use :c:type:`Py_ssize_t` instead of :c:type:`int` to
  allow processing more data on 64-bit machines. Extension code may need to make
  the same change to avoid warnings and to support 64-bit machines. See the
  earlier section :ref:`pep-353` for a discussion of this change.

* C API: The obmalloc changes mean that you must be careful to not mix usage
  of the :c:func:`PyMem_\*` and :c:func:`PyObject_\*` families of functions. Memory
  allocated with one family's :c:func:`\*_Malloc` must be freed with the
  corresponding family's :c:func:`\*_Free` function.

.. ======================================================================


Acknowledgements
================

The author would like to thank the following people for offering suggestions,
corrections and assistance with various drafts of this article: Georg Brandl,
Nick Coghlan, Phillip J. Eby, Lars Gustäbel, Raymond Hettinger, Ralf W.
Grosse-Kunstleve, Kent Johnson, Iain Lowe, Martin von Löwis, Fredrik Lundh,
Andrew McNamara, Skip Montanaro, Gustavo Niemeyer, Paul Prescod, James Pryor,
Mike Rovner, Scott Weikart, Barry Warsaw, Thomas Wouters.