****************************
  What's New in Python 2.5
****************************

:Author: A.M. Kuchling

.. |release| replace:: 1.01

.. $Id: whatsnew25.tex 56611 2007-07-29 08:26:10Z georg.brandl $
.. Fix XXX comments

This article explains the new features in Python 2.5. The final release of
Python 2.5 is scheduled for August 2006; :pep:`356` describes the planned
release schedule.

The changes in Python 2.5 are an interesting mix of language and library
improvements. The library enhancements will be more important to Python's user
community, I think, because several widely useful packages were added. New
modules include ElementTree for XML processing (:mod:`xml.etree`),
the SQLite database module (:mod:`sqlite3`), and the :mod:`ctypes`
module for calling C functions.

The language changes are of middling significance. Some pleasant new features
were added, but most of them aren't features that you'll use every day.
Conditional expressions were finally added to the language using a novel syntax;
see section :ref:`pep-308`. The new ':keyword:`with`' statement will make
writing cleanup code easier (section :ref:`pep-343`). Values can now be passed
into generators (section :ref:`pep-342`). Imports are now visible as either
absolute or relative (section :ref:`pep-328`). Some corner cases of exception
handling are handled better (section :ref:`pep-341`). All these improvements
are worthwhile, but they're improvements to one specific language feature or
another; none of them are broad modifications to Python's semantics.

As well as the language and library additions, other improvements and bugfixes
were made throughout the source tree. A search through the SVN change logs
finds there were 353 patches applied and 458 bugs fixed between Python 2.4 and
2.5. (Both figures are likely to be underestimates.)

This article doesn't try to be a complete specification of the new features;
instead, changes are briefly introduced using helpful examples. For full
details, you should always refer to the documentation for Python 2.5 at
http://docs.python.org. If you want to understand the complete implementation
and design rationale, refer to the PEP for a particular new feature.

Comments, suggestions, and error reports for this document are welcome; please
e-mail them to the author or open a bug in the Python bug tracker.

.. ======================================================================


.. _pep-308:

PEP 308: Conditional Expressions
================================

For a long time, people have been requesting a way to write conditional
expressions, which are expressions that return value A or value B depending on
whether a Boolean value is true or false. A conditional expression lets you
write a single assignment statement that has the same effect as the following::

   if condition:
       x = true_value
   else:
       x = false_value

There have been endless tedious discussions of syntax on both python-dev and
comp.lang.python. A vote was even held that found the majority of voters wanted
conditional expressions in some form, but there was no syntax that was preferred
by a clear majority. Candidates included C's ``cond ? true_v : false_v``, ``if
cond then true_v else false_v``, and 16 other variations.

Guido van Rossum eventually chose a surprising syntax::

   x = true_value if condition else false_value

Evaluation is still lazy as in existing Boolean expressions, so the order of
evaluation jumps around a bit. The *condition* expression in the middle is
evaluated first, and the *true_value* expression is evaluated only if the
condition was true. Similarly, the *false_value* expression is only evaluated
when the condition is false.
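The evaluation order described above can be checked with a small sketch. The helper ``traced`` and the ``calls`` list are hypothetical names introduced only for this illustration, and the code is written so that it also runs on a current Python 3 interpreter:

```python
calls = []

def traced(label, value):
    """Record that an operand was evaluated, then return its value."""
    calls.append(label)
    return value

# The condition is true: the false branch is never evaluated.
x = traced('true', 1) if traced('cond', True) else traced('false', 0)
order_when_true = list(calls)

del calls[:]

# The condition is false: the true branch is never evaluated.
y = traced('true', 1) if traced('cond', False) else traced('false', 0)
order_when_false = list(calls)
```

In both runs the label ``'cond'`` is recorded first, even though the condition is written in the middle of the expression, and only one of the two value operands is ever recorded.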
| 81 | |
This syntax may seem strange and backwards; why does the condition go in the
*middle* of the expression, and not in the front as in C's ``c ? x : y``? The
decision was checked by applying the new syntax to the modules in the standard
library and seeing how the resulting code read. In many cases where a
conditional expression is used, one value seems to be the 'common case' and one
value is an 'exceptional case', used only on rarer occasions when the condition
isn't met. The conditional syntax makes this pattern a bit more obvious::

   contents = ((doc + '\n') if doc else '')

I read the above statement as meaning "here *contents* is usually assigned a
value of ``doc+'\n'``; sometimes *doc* is empty, in which special case an empty
string is returned." I doubt I will use conditional expressions very often
where there isn't a clear common and uncommon case.

There was some discussion of whether the language should require surrounding
conditional expressions with parentheses. The decision was made to *not*
require parentheses in the Python language's grammar, but as a matter of style I
think you should always use them. Consider these two statements::

   # First version -- no parens
   level = 1 if logging else 0

   # Second version -- with parens
   level = (1 if logging else 0)

In the first version, I think a reader's eye might group the statement into
'level = 1', 'if logging', 'else 0', and think that the condition decides
whether the assignment to *level* is performed. The second version reads
better, in my opinion, because it makes it clear that the assignment is always
performed and the choice is being made between two values.

Another reason for including the parentheses: a few odd combinations of list
comprehensions and lambdas could look like incorrect conditional expressions.
See :pep:`308` for some examples. If you put parentheses around your
conditional expressions, you won't run into this case.


.. seealso::

   :pep:`308` - Conditional Expressions
      PEP written by Guido van Rossum and Raymond D. Hettinger; implemented by Thomas
      Wouters.

.. ======================================================================


.. _pep-309:

PEP 309: Partial Function Application
=====================================

The :mod:`functools` module is intended to contain tools for functional-style
programming.

One useful tool in this module is the :func:`partial` function. For programs
written in a functional style, you'll sometimes want to construct variants of
existing functions that have some of the parameters filled in. Consider a
Python function ``f(a, b, c)``; you could create a new function ``g(b, c)`` that
was equivalent to ``f(1, b, c)``. This is called "partial function
application".

:func:`partial` takes the arguments ``(function, arg1, arg2, ... kwarg1=value1,
kwarg2=value2)``. The resulting object is callable, so you can just call it to
invoke *function* with the filled-in arguments.
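The ``f``/``g`` case described above can be sketched directly. The function bodies here are placeholders chosen only so that the binding is easy to inspect; nothing in this block beyond :func:`functools.partial` and its ``func``, ``args`` and ``keywords`` attributes comes from the library:

```python
import functools

def f(a, b, c):
    """Return the arguments unchanged so the binding is easy to inspect."""
    return (a, b, c)

# g behaves like f with a fixed to 1; b and c are supplied at call time.
g = functools.partial(f, 1)
result_positional = g(2, 3)

# Keyword arguments can be filled in ahead of time as well.
h = functools.partial(f, c=30)
result_keyword = h(10, 20)

# The partial object remembers what it was built from.
stored = (g.func is f, g.args, h.keywords)
```

Positional arguments given to :func:`partial` are placed before the ones supplied at call time, which is why ``g(2, 3)`` ends up calling ``f(1, 2, 3)``.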
| 147 | |
Here's a small but realistic example::

   import functools

   def log(message, subsystem):
       "Write the contents of 'message' to the specified subsystem."
       print '%s: %s' % (subsystem, message)
       ...

   # Fix the 'subsystem' argument; only 'message' remains to be supplied.
   server_log = functools.partial(log, subsystem='server')
   server_log('Unable to open socket')

Here's another example, from a program that uses PyGTK. Here a
context-sensitive pop-up menu is being constructed dynamically. The callback
provided for the menu option is a partially applied version of the
:meth:`open_item` method, where the first argument has been provided. ::

   ...
   class Application:
       def open_item(self, path):
           ...
       def init(self):
           open_func = functools.partial(self.open_item, item_path)
           popup_menu.append( ("Open", open_func, 1) )

Another function in the :mod:`functools` module is the
:func:`update_wrapper(wrapper, wrapped)` function that helps you write
well-behaved decorators. :func:`update_wrapper` copies the name, module, and
docstring attributes to a wrapper function so that tracebacks inside the wrapped
function are easier to understand. For example, you might write::

   def my_decorator(f):
       def wrapper(*args, **kwds):
           print 'Calling decorated function'
           return f(*args, **kwds)
       functools.update_wrapper(wrapper, f)
       return wrapper

:func:`wraps` is a decorator that can be used inside your own decorators to copy
the wrapped function's information. An alternate version of the previous
example would be::

   def my_decorator(f):
       @functools.wraps(f)
       def wrapper(*args, **kwds):
           print 'Calling decorated function'
           return f(*args, **kwds)
       return wrapper
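To see what :func:`wraps` actually changes, the following sketch compares a decorator that omits it with one that uses it. The decorator and function names are hypothetical and exist only for the comparison:

```python
import functools

def plain_decorator(f):
    # No metadata is copied, so the result looks like 'wrapper'.
    def wrapper(*args, **kwds):
        return f(*args, **kwds)
    return wrapper

def wrapping_decorator(f):
    # functools.wraps copies the name and docstring from f.
    @functools.wraps(f)
    def wrapper(*args, **kwds):
        return f(*args, **kwds)
    return wrapper

@plain_decorator
def greet_plain():
    "Return a greeting."
    return 'hello'

@wrapping_decorator
def greet_wrapped():
    "Return a greeting."
    return 'hello'

names = (greet_plain.__name__, greet_wrapped.__name__)
docs = (greet_plain.__doc__, greet_wrapped.__doc__)
```

Without :func:`wraps`, the decorated function reports the inner function's name and has no docstring; with it, both attributes match the original function.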
| 196 | |
| 197 | |
.. seealso::

   :pep:`309` - Partial Function Application
      PEP proposed and written by Peter Harris; implemented by Hye-Shik Chang and Nick
      Coghlan, with adaptations by Raymond Hettinger.

.. ======================================================================


.. _pep-314:

PEP 314: Metadata for Python Software Packages v1.1
===================================================

Some simple dependency support was added to Distutils. The :func:`setup`
function now has ``requires``, ``provides``, and ``obsoletes`` keyword
parameters. When you build a source distribution using the ``sdist`` command,
the dependency information will be recorded in the :file:`PKG-INFO` file.

Another new keyword parameter is ``download_url``, which should be set to a URL
for the package's source code. This means it's now possible to look up an entry
in the package index, determine the dependencies for a package, and download the
required packages. ::

   from distutils.core import setup

   VERSION = '1.0'
   setup(name='PyPackage',
         version=VERSION,
         requires=['numarray', 'zlib (>=1.1.4)'],
         obsoletes=['OldPackage'],
         download_url=('http://www.example.com/pypackage/dist/pkg-%s.tar.gz'
                       % VERSION),
        )

Another new enhancement to the Python package index at
http://cheeseshop.python.org is storing source and binary archives for a
package. The new :command:`upload` Distutils command will upload a package to
the repository.

Before a package can be uploaded, you must be able to build a distribution using
the :command:`sdist` Distutils command. Once that works, you can run ``python
setup.py upload`` to add your package to the PyPI archive. Optionally you can
GPG-sign the package by supplying the :option:`--sign` and :option:`--identity`
options.

Package uploading was implemented by Martin von Löwis and Richard Jones.


.. seealso::

   :pep:`314` - Metadata for Python Software Packages v1.1
      PEP proposed and written by A.M. Kuchling, Richard Jones, and Fred Drake;
      implemented by Richard Jones and Fred Drake.

.. ======================================================================


.. _pep-328:

PEP 328: Absolute and Relative Imports
======================================

The simpler part of PEP 328 was implemented in Python 2.4: parentheses could now
be used to enclose the names imported from a module using the ``from ... import
...`` statement, making it easier to import many different names.

The more complicated part has been implemented in Python 2.5: importing a module
can be specified to use absolute or package-relative imports. The plan is to
move toward making absolute imports the default in future versions of Python.

Let's say you have a package directory like this::

   pkg/
   pkg/__init__.py
   pkg/main.py
   pkg/string.py

This defines a package named :mod:`pkg` containing the :mod:`pkg.main` and
:mod:`pkg.string` submodules.

Consider the code in the :file:`main.py` module. What happens if it executes
the statement ``import string``? In Python 2.4 and earlier, it will first look
in the package's directory to perform a relative import, find
:file:`pkg/string.py`, import the contents of that file as the
:mod:`pkg.string` module, and bind that module to the name ``string`` in the
:mod:`pkg.main` module's namespace.

That's fine if :mod:`pkg.string` was what you wanted. But what if you wanted
Python's standard :mod:`string` module? There was no clean way to ignore
:mod:`pkg.string` and look for the standard module; generally you had to look at
the contents of ``sys.modules``, which is slightly unclean. Holger Krekel's
:mod:`py.std` package provides a tidier way to perform imports from the standard
library, ``import py ; py.std.string.join()``, but that package isn't available
on all Python installations.

Reading code which relies on relative imports is also less clear, because a
reader may be confused about which module, :mod:`string` or :mod:`pkg.string`,
is intended to be used. Python users soon learned not to duplicate the names of
standard library modules in the names of their packages' submodules, but you
can't protect against having your submodule's name used for a new module
added in a future version of Python.

In Python 2.5, you can switch :keyword:`import`'s behaviour to absolute imports
using a ``from __future__ import absolute_import`` directive. This
absolute-import behaviour will become the default in a future version (probably
Python 2.7). Once absolute imports are the default, ``import string`` will
always find the standard library's version. It's suggested that users should
begin using absolute imports as much as possible, so it's preferable to begin
writing ``from pkg import string`` in your code.

Relative imports are still possible by adding a leading period to the module
name when using the ``from ... import`` form::

   # Import names from pkg.string
   from .string import name1, name2
   # Import pkg.string
   from . import string

This imports the :mod:`string` module relative to the current package, so in
:mod:`pkg.main` this will import *name1* and *name2* from :mod:`pkg.string`.
Additional leading periods perform the relative import starting from the parent
of the current package. For example, code in the :mod:`A.B.C` module can do::

   from . import D                 # Imports A.B.D
   from .. import E                # Imports A.E
   from ..F import G               # Imports A.F.G

Leading periods cannot be used with the ``import modname`` form of the import
statement, only the ``from ... import`` form.
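The difference between the two import forms can be observed by building a throwaway package on disk. This is only a sketch under stated assumptions: the package name ``whatsnew25_demo_pkg`` and the file contents are invented for the demonstration, and the script is written for a current Python 3 interpreter (it uses :mod:`importlib`, which Python 2.5 does not have):

```python
import importlib
import os
import shutil
import sys
import tempfile

# Hypothetical package name, chosen to avoid clashing with anything installed.
PKG = 'whatsnew25_demo_pkg'

FILES = {
    '__init__.py': '',
    # A submodule that shadows the name of the standard 'string' module.
    'string.py': 'WHO = "package-local string module"\n',
    'main.py': (
        'from __future__ import absolute_import\n'
        'import string as std_string         # absolute: standard library\n'
        'from . import string as pkg_string  # relative: sibling submodule\n'
    ),
}

root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, PKG))
for filename, source in FILES.items():
    with open(os.path.join(root, PKG, filename), 'w') as fh:
        fh.write(source)

sys.path.insert(0, root)
try:
    main = importlib.import_module(PKG + '.main')
finally:
    sys.path.remove(root)
    shutil.rmtree(root, ignore_errors=True)

absolute_name = main.std_string.__name__
relative_name = main.pkg_string.__name__
has_std_attr = hasattr(main.std_string, 'ascii_letters')
```

With absolute imports in effect, ``import string`` inside ``main.py`` reaches the standard library even though a sibling :file:`string.py` exists, while ``from . import string`` explicitly selects the sibling.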
| 326 | |
| 327 | |
.. seealso::

   :pep:`328` - Imports: Multi-Line and Absolute/Relative
      PEP written by Aahz; implemented by Thomas Wouters.

   http://codespeak.net/py/current/doc/index.html
      The py library by Holger Krekel, which contains the :mod:`py.std` package.

.. ======================================================================


.. _pep-338:

PEP 338: Executing Modules as Scripts
=====================================

The :option:`-m` switch added in Python 2.4 to execute a module as a script
gained a few more abilities. Instead of being implemented in C code inside the
Python interpreter, the switch now uses an implementation in a new module,
:mod:`runpy`.

The :mod:`runpy` module implements a more sophisticated import mechanism so that
it's now possible to run modules in a package such as :mod:`pychecker.checker`.
The module also supports alternative import mechanisms such as the
:mod:`zipimport` module. This means you can add a .zip archive's path to
``sys.path`` and then use the :option:`-m` switch to execute code from the
archive.
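The archive case can be tried from within Python using :func:`runpy.run_module`, which executes a module in a way similar to the :option:`-m` switch and returns the resulting module globals. In this sketch the module name ``whatsnew25_zip_demo``, its contents, and the archive name are all invented for the demonstration:

```python
import os
import runpy
import sys
import tempfile
import zipfile

# Hypothetical module name and contents, used only for this demonstration.
MODULE_NAME = 'whatsnew25_zip_demo'
SOURCE = (
    'GREETING = "hello from inside the archive"\n'
    'RAN_AS_SCRIPT = (__name__ == "__main__")\n'
)

workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, 'bundle.zip')
zf = zipfile.ZipFile(archive, 'w')
try:
    zf.writestr(MODULE_NAME + '.py', SOURCE)
finally:
    zf.close()

# Putting the archive on sys.path lets zipimport locate the module;
# run_module() then executes it with __name__ set to '__main__'.
sys.path.insert(0, archive)
try:
    module_globals = runpy.run_module(MODULE_NAME, run_name='__main__')
finally:
    sys.path.remove(archive)
```

The module is never unpacked to disk; it is located and executed straight from the .zip file.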
| 355 | |
| 356 | |
.. seealso::

   :pep:`338` - Executing modules as scripts
      PEP written and implemented by Nick Coghlan.

.. ======================================================================


.. _pep-341:

PEP 341: Unified try/except/finally
===================================

Until Python 2.5, the :keyword:`try` statement came in two flavours. You could
use a :keyword:`finally` block to ensure that code is always executed, or one or
more :keyword:`except` blocks to catch specific exceptions. You couldn't
combine both :keyword:`except` blocks and a :keyword:`finally` block, because
generating the right bytecode for the combined version was complicated and it
wasn't clear what the semantics of the combined statement should be.

Guido van Rossum spent some time working with Java, which does support the
equivalent of combining :keyword:`except` blocks and a :keyword:`finally` block,
and this clarified what the statement should mean. In Python 2.5, you can now
write::

   try:
       block-1 ...
   except Exception1:
       handler-1 ...
   except Exception2:
       handler-2 ...
   else:
       else-block
   finally:
       final-block

The code in *block-1* is executed. If the code raises an exception, the various
:keyword:`except` blocks are tested: if the exception is of class
:class:`Exception1`, *handler-1* is executed; otherwise if it's of class
:class:`Exception2`, *handler-2* is executed, and so forth. If no exception is
raised, the *else-block* is executed.

No matter what happened previously, the *final-block* is executed once the code
block is complete and any raised exceptions handled. Even if there's an error in
an exception handler or the *else-block* and a new exception is raised, the code
in the *final-block* is still run.
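The order in which the clauses run can be traced with a short sketch. The function names and the ``trace`` lists are hypothetical and serve only to record which clauses execute:

```python
def run(action):
    """Return the list of clauses executed while calling *action*."""
    trace = []
    try:
        trace.append('try')
        action()
    except ZeroDivisionError:
        trace.append('except ZeroDivisionError')
    except KeyError:
        trace.append('except KeyError')
    else:
        trace.append('else')
    finally:
        trace.append('finally')
    return trace

no_error = run(lambda: None)
division = run(lambda: 1 // 0)
missing_key = run(lambda: {}['missing'])

def failing_handler(trace):
    """Raise a new exception from inside an except block."""
    try:
        raise KeyError('first')
    except KeyError:
        trace.append('handler')
        raise ValueError('raised inside the handler')
    finally:
        trace.append('finally')

handler_trace = []
try:
    failing_handler(handler_trace)
except ValueError:
    handler_trace.append('caller saw ValueError')
```

The last case shows that the :keyword:`finally` clause runs even when the handler itself raises, and that it runs before the new exception reaches the caller.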
| 403 | |
| 404 | |
.. seealso::

   :pep:`341` - Unifying try-except and try-finally
      PEP written by Georg Brandl; implementation by Thomas Lee.

.. ======================================================================


.. _pep-342:

PEP 342: New Generator Features
===============================

Python 2.5 adds a simple way to pass values *into* a generator. As introduced in
Python 2.3, generators only produce output; once a generator's code was invoked
to create an iterator, there was no way to pass any new information into the
function when its execution was resumed. Sometimes the ability to pass in some
information would be useful. Hackish solutions to this include making the
generator's code look at a global variable and then changing the global
variable's value, or passing in some mutable object that callers then modify.

To refresh your memory of basic generators, here's a simple example::

   def counter(maximum):
       i = 0
       while i < maximum:
           yield i
           i += 1

When you call ``counter(10)``, the result is an iterator that returns the values
from 0 up to 9. On encountering the :keyword:`yield` statement, the iterator
returns the provided value and suspends the function's execution, preserving the
local variables. Execution resumes on the following call to the iterator's
:meth:`next` method, picking up after the :keyword:`yield` statement.

In Python 2.3, :keyword:`yield` was a statement; it didn't return any value. In
2.5, :keyword:`yield` is now an expression, returning a value that can be
assigned to a variable or otherwise operated on::

   val = (yield i)

I recommend that you always put parentheses around a :keyword:`yield` expression
when you're doing something with the returned value, as in the above example.
The parentheses aren't always necessary, but it's easier to always add them
instead of having to remember when they're needed.

(:pep:`342` explains the exact rules, which are that a :keyword:`yield`\
-expression must always be parenthesized except when it occurs at the top-level
expression on the right-hand side of an assignment. This means you can write
``val = yield i`` but have to use parentheses when there's an operation, as in
``val = (yield i) + 12``.)

Values are sent into a generator by calling its :meth:`send(value)` method. The
generator's code is then resumed and the :keyword:`yield` expression returns the
specified *value*. If the regular :meth:`next` method is called, the
:keyword:`yield` returns :const:`None`.

Here's the previous example, modified to allow changing the value of the
internal counter. ::

   def counter(maximum):
       i = 0
       while i < maximum:
           val = (yield i)
           # If value provided, change counter
           if val is not None:
               i = val
           else:
               i += 1

And here's an example of changing the counter::

   >>> it = counter(10)
   >>> print it.next()
   0
   >>> print it.next()
   1
   >>> print it.send(8)
   8
   >>> print it.next()
   9
   >>> print it.next()
   Traceback (most recent call last):
     File "t.py", line 15, in ?
       print it.next()
   StopIteration

:keyword:`yield` will usually return :const:`None`, so you should always check
for this case. Don't just use its value in expressions unless you're sure that
the :meth:`send` method will be the only method used to resume your generator
function.

In addition to :meth:`send`, there are two other new methods on generators:

* :meth:`throw(type, value=None, traceback=None)` is used to raise an exception
  inside the generator; the exception is raised by the :keyword:`yield` expression
  where the generator's execution is paused.

* :meth:`close` raises a new :exc:`GeneratorExit` exception inside the generator
  to terminate the iteration. On receiving this exception, the generator's code
  must raise either :exc:`GeneratorExit` or :exc:`StopIteration`. Catching the
  :exc:`GeneratorExit` exception and returning a value is illegal and will trigger
  a :exc:`RuntimeError`; if the function raises some other exception, that
  exception is propagated to the caller. :meth:`close` will also be called by
  Python's garbage collector when the generator is garbage-collected.

  If you need to run cleanup code when a :exc:`GeneratorExit` occurs, I suggest
  using a ``try: ... finally:`` suite instead of catching :exc:`GeneratorExit`.
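A minimal sketch of :meth:`throw` and :meth:`close` working together with a ``try: ... finally:`` suite follows. The generator ``resource_user`` and its ``log`` argument are hypothetical; the code uses the built-in ``next()`` so that it runs on a current Python 3 interpreter, where Python 2.5 would call ``gen.next()`` instead:

```python
def resource_user(log):
    """Yield values forever, recording set-up and cleanup in *log*."""
    log.append('acquired')
    try:
        while True:
            try:
                yield 'working'
            except ValueError:
                # throw() delivered this exception at the paused yield.
                log.append('handled ValueError')
    finally:
        # Runs when close() raises GeneratorExit at the paused yield.
        log.append('released')

log = []
gen = resource_user(log)
first = next(gen)                 # gen.next() in Python 2.5
after_throw = gen.throw(ValueError('bad input'))
gen.close()

# A closed generator is finished: resuming it raises StopIteration.
try:
    next(gen)
    finished = False
except StopIteration:
    finished = True
```

The generator never catches :exc:`GeneratorExit` itself; the :keyword:`finally` clause is enough to guarantee that the cleanup step is recorded when :meth:`close` is called.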
| 513 | |
The cumulative effect of these changes is to turn generators from one-way
producers of information into both producers and consumers.

Generators also become *coroutines*, a more generalized form of subroutines.
Subroutines are entered at one point and exited at another point (the top of the
function, and a :keyword:`return` statement), but coroutines can be entered,
exited, and resumed at many different points (the :keyword:`yield` statements).
We'll have to figure out patterns for using coroutines effectively in Python.
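One possible pattern, offered only as an illustration and not as an established idiom from the PEP, is a coroutine that consumes values through :meth:`send` and yields a running result. The name ``running_average`` is hypothetical, and the built-in ``next()`` is used so the sketch runs on a current Python 3 interpreter:

```python
def running_average():
    """Coroutine: receive numbers via send(), yield the mean so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = (yield average)
        if value is None:
            # Resumed by next() rather than send(); nothing to add.
            continue
        total += value
        count += 1
        average = total / count

averager = running_average()
next(averager)                    # advance to the first yield
results = [averager.send(n) for n in (10, 20, 60)]
averager.close()
```

The initial ``next()`` call is needed because a value can only be sent to a generator that is already paused at a :keyword:`yield` expression.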
| 522 | |
The addition of the :meth:`close` method has one side effect that isn't obvious.
:meth:`close` is called when a generator is garbage-collected, so this means the
generator's code gets one last chance to run before the generator is destroyed.
This last chance means that ``try...finally`` statements in generators can now
be guaranteed to work; the :keyword:`finally` clause will now always get a
chance to run. The syntactic restriction that you couldn't mix :keyword:`yield`
statements with a ``try...finally`` suite has therefore been removed. This
seems like a minor bit of language trivia, but using generators and
``try...finally`` is actually necessary in order to implement the
:keyword:`with` statement described by PEP 343. I'll look at this new statement
in the following section.

Another even more esoteric effect of this change: previously, the
:attr:`gi_frame` attribute of a generator was always a frame object. It's now
possible for :attr:`gi_frame` to be ``None`` once the generator has been
exhausted.


.. seealso::

   :pep:`342` - Coroutines via Enhanced Generators
      PEP written by Guido van Rossum and Phillip J. Eby; implemented by Phillip J.
      Eby. Includes examples of some fancier uses of generators as coroutines.

      Earlier versions of these features were proposed in :pep:`288` by Raymond
      Hettinger and :pep:`325` by Samuele Pedroni.

   http://en.wikipedia.org/wiki/Coroutine
      The Wikipedia entry for coroutines.

   http://www.sidhe.org/~dan/blog/archives/000178.html
      An explanation of coroutines from a Perl point of view, written by Dan Sugalski.

.. ======================================================================


.. _pep-343:

PEP 343: The 'with' statement
=============================

The ':keyword:`with`' statement clarifies code that previously would use
``try...finally`` blocks to ensure that clean-up code is executed. In this
section, I'll discuss the statement as it will commonly be used. In the next
section, I'll examine the implementation details and show how to write objects
for use with this statement.

The ':keyword:`with`' statement is a new control-flow structure whose basic
structure is::

   with expression [as variable]:
       with-block

The expression is evaluated, and it should result in an object that supports the
context management protocol (that is, has :meth:`__enter__` and :meth:`__exit__`
methods).

The object's :meth:`__enter__` is called before *with-block* is executed and
therefore can run set-up code. It also may return a value that is bound to the
name *variable*, if given. (Note carefully that *variable* is *not* assigned
the result of *expression*.)

After execution of the *with-block* is finished, the object's :meth:`__exit__`
method is called, even if the block raised an exception, and can therefore run
clean-up code.
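The protocol can be observed with a small class that records when its two methods run. ``Tracker`` is a hypothetical class written for this sketch; on Python 2.5 the module would also need the ``from __future__ import with_statement`` directive before this code compiles:

```python
class Tracker(object):
    """Hypothetical context manager that records when its hooks run."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append('enter')
        # Whatever is returned here is what "as" binds, not the Tracker.
        return 'value from __enter__'

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append('exit')
        # Returning a false value lets any exception propagate.
        return False

tracker = Tracker()
with tracker as bound:
    tracker.events.append('body')

failing = Tracker()
try:
    with failing:
        raise RuntimeError('boom')
except RuntimeError:
    failing.events.append('exception propagated')
```

Note that ``bound`` holds the string returned by :meth:`__enter__` rather than the ``Tracker`` instance, and that :meth:`__exit__` runs in the second case even though the block raised.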
| 588 | |
| 589 | To enable the statement in Python 2.5, you need to add the following directive |
| 590 | to your module:: |
| 591 | |
| 592 | from __future__ import with_statement |
| 593 | |
| 594 | The statement will always be enabled in Python 2.6. |
| 595 | |
| 596 | Some standard Python objects now support the context management protocol and can |
| 597 | be used with the ':keyword:`with`' statement. File objects are one example:: |
| 598 | |
| 599 | with open('/etc/passwd', 'r') as f: |
| 600 | for line in f: |
| 601 | print line |
| 602 | ... more processing code ... |
| 603 | |
| 604 | After this statement has executed, the file object in *f* will have been |
| 605 | automatically closed, even if the :keyword:`for` loop raised an exception |
| 606 | part-way through the block. |
| 607 | |
| 608 | .. note:: |
| 609 | |
| 610 | In this case, *f* is the same object created by :func:`open`, because |
| 611 | :meth:`file.__enter__` returns *self*. |
| 612 | |
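A quick way to convince yourself of this is to check the file object's ``closed`` attribute after the block. The sketch below uses a scratch file from :mod:`tempfile` rather than ``/etc/passwd`` so that it runs anywhere; the ``RuntimeError`` is raised deliberately and simply stands in for any failure inside the block.

```python
import os
import tempfile

# Create a scratch file so the example does not depend on /etc/passwd.
fd, path = tempfile.mkstemp()
os.close(fd)

with open(path, 'w') as out:
    out.write('first line\n')

with open(path, 'r') as f:
    lines = f.readlines()
closed_after_block = f.closed      # the block finished normally

try:
    with open(path, 'r') as g:
        raise RuntimeError('simulated failure')
except RuntimeError:
    pass
closed_after_error = g.closed      # the block was left by an exception

os.remove(path)
```

Both ``closed_after_block`` and ``closed_after_error`` end up true.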
| 613 | The :mod:`threading` module's locks and condition variables also support the |
| 614 | ':keyword:`with`' statement:: |
| 615 | |
| 616 |    lock = threading.Lock() |
| 617 |    with lock: |
| 618 |        # Critical section of code |
| 619 |        ... |
| 620 | |
| 621 | The lock is acquired before the block is executed and always released once the |
| 622 | block is complete. |
| 623 | |
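The "always released" part is the interesting one, so here is a small single-threaded check. It uses the lock's :meth:`locked` method to look at the state inside and after the block; the ``ValueError`` is raised on purpose to stand in for a failure in the critical section.

```python
import threading

lock = threading.Lock()

try:
    with lock:
        held_inside = lock.locked()    # the lock is held in the block
        raise ValueError('simulated failure')
except ValueError:
    pass

held_after = lock.locked()             # released despite the exception
```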
| 624 | The new :func:`localcontext` function in the :mod:`decimal` module makes it easy |
| 625 | to save and restore the current decimal context, which encapsulates the desired |
| 626 | precision and rounding characteristics for computations:: |
| 627 | |
| 628 |    from decimal import Decimal, Context, localcontext |
| 629 | |
| 630 |    # Displays with default precision of 28 digits |
| 631 |    v = Decimal('578') |
| 632 |    print v.sqrt() |
| 633 | |
| 634 |    with localcontext(Context(prec=16)): |
| 635 |        # All code in this block uses a precision of 16 digits. |
| 636 |        # The original context is restored on exiting the block. |
| 637 |        print v.sqrt() |
| 638 | |
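To see the save-and-restore behaviour directly, you can read the active precision with :func:`decimal.getcontext` before, inside, and after the block. This is only a sketch; the variable names are mine.

```python
from decimal import Decimal, Context, getcontext, localcontext

before = getcontext().prec             # whatever the current context uses

with localcontext(Context(prec=16)):
    inside = getcontext().prec         # 16 while the block runs
    root = Decimal('578').sqrt()       # computed with 16 digits

after = getcontext().prec              # back to the original value
```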
| 639 | |
| 640 | .. _new-25-context-managers: |
| 641 | |
| 642 | Writing Context Managers |
| 643 | ------------------------ |
| 644 | |
| 645 | Under the hood, the ':keyword:`with`' statement is fairly complicated. Most |
| 646 | people will only use ':keyword:`with`' in company with existing objects and |
| 647 | don't need to know these details, so you can skip the rest of this section if |
| 648 | you like. Authors of new objects will need to understand the details of the |
| 649 | underlying implementation and should keep reading. |
| 650 | |
| 651 | A high-level explanation of the context management protocol is: |
| 652 | |
| 653 | * The expression is evaluated and should result in an object called a "context |
| 654 | manager". The context manager must have :meth:`__enter__` and :meth:`__exit__` |
| 655 | methods. |
| 656 | |
| 657 | * The context manager's :meth:`__enter__` method is called. The value returned |
| 658 | is assigned to *VAR*. If no ``'as VAR'`` clause is present, the value is simply |
| 659 | discarded. |
| 660 | |
| 661 | * The code in *BLOCK* is executed. |
| 662 | |
| 663 | * If *BLOCK* raises an exception, the context manager's |
| 664 | :meth:`__exit__(type, value, traceback)` method is called with the exception |
| 665 | details, the same values returned by :func:`sys.exc_info`. The method's return |
| 666 | value controls whether the exception is re-raised: any false value re-raises the |
| 667 | exception, and any true value suppresses it. You'll only rarely want to suppress |
| 668 | the exception, because if you do, the author of the code containing the |
| 669 | ':keyword:`with`' statement will never realize anything went wrong. |
| 670 | |
| 671 | * If *BLOCK* didn't raise an exception, the :meth:`__exit__` method is still |
| 672 | called, but *type*, *value*, and *traceback* are all ``None``. |
| 673 | |
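The steps in the list above can be written out as an ordinary function. This is a simplified sketch, not the exact code the compiler generates (:pep:`343` has the precise translation); ``run_with``, ``Suppress`` and ``failing_block`` are hypothetical names made up for the illustration.

```python
import sys


def run_with(manager, block):
    """Roughly what 'with manager as var: block(var)' does."""
    var = manager.__enter__()
    try:
        block(var)
    except BaseException:
        # Hand the exception details to __exit__; a true return value
        # suppresses the exception, a false one re-raises it.
        if not manager.__exit__(*sys.exc_info()):
            raise
    else:
        manager.__exit__(None, None, None)


class Suppress(object):
    """Hypothetical manager whose __exit__ swallows ValueError only."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        self.seen = exc_type
        return exc_type is not None and issubclass(exc_type, ValueError)


def failing_block(var):
    raise ValueError('simulated failure')


manager = Suppress()
run_with(manager, failing_block)       # the ValueError is suppressed
```

A block that raises something other than :exc:`ValueError` would propagate out of ``run_with``, because ``__exit__`` returns a false value in that case.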
| 674 | Let's think through an example. I won't present detailed code but will only |
| 675 | sketch the methods necessary for a database that supports transactions. |
| 676 | |
| 677 | (For people unfamiliar with database terminology: a set of changes to the |
| 678 | database is grouped into a transaction. Transactions can be either committed, |
| 679 | meaning that all the changes are written into the database, or rolled back, |
| 680 | meaning that the changes are all discarded and the database is unchanged. See |
| 681 | any database textbook for more information.) |
| 682 | |
| 683 | Let's assume there's an object representing a database connection. Our goal will |
| 684 | be to let the user write code like this:: |
| 685 | |
| 686 |    db_connection = DatabaseConnection() |
| 687 |    with db_connection as cursor: |
| 688 |        cursor.execute('insert into ...') |
| 689 |        cursor.execute('delete from ...') |
| 690 |        # ... more operations ... |
| 691 | |
| 692 | The transaction should be committed if the code in the block runs flawlessly or |
| 693 | rolled back if there's an exception. Here's the basic interface for |
| 694 | :class:`DatabaseConnection` that I'll assume:: |
| 695 | |
| 696 |    class DatabaseConnection: |
| 697 |        # Database interface |
| 698 |        def cursor (self): |
| 699 |            "Returns a cursor object and starts a new transaction" |
| 700 |        def commit (self): |
| 701 |            "Commits current transaction" |
| 702 |        def rollback (self): |
| 703 |            "Rolls back current transaction" |
| 704 | |
| 705 | The :meth:`__enter__` method is pretty easy, having only to start a new |
| 706 | transaction. For this application the resulting cursor object would be a useful |
| 707 | result, so the method will return it. The user can then add ``as cursor`` to |
| 708 | their ':keyword:`with`' statement to bind the cursor to a variable name. :: |
| 709 | |
| 710 |    class DatabaseConnection: |
| 711 |        ... |
| 712 |        def __enter__ (self): |
| 713 |            # Code to start a new transaction |
| 714 |            cursor = self.cursor() |
| 715 |            return cursor |
| 716 | |
| 717 | The :meth:`__exit__` method is the most complicated because it's where most of |
| 718 | the work has to be done. The method has to check if an exception occurred. If |
| 719 | there was no exception, the transaction is committed. The transaction is rolled |
| 720 | back if there was an exception. |
| 721 | |
| 722 | In the code below, execution will just fall off the end of the function, |
| 723 | returning the default value of ``None``. ``None`` is false, so the exception |
| 724 | will be re-raised automatically. If you wished, you could be more explicit and |
| 725 | add a :keyword:`return` statement at the marked location. :: |
| 726 | |
| 727 |    class DatabaseConnection: |
| 728 |        ... |
| 729 |        def __exit__ (self, type, value, tb): |
| 730 |            if tb is None: |
| 731 |                # No exception, so commit |
| 732 |                self.commit() |
| 733 |            else: |
| 734 |                # Exception occurred, so roll back. |
| 735 |                self.rollback() |
| 736 |            # return False |
| 737 | |
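Putting the pieces together, here is a toy version you can actually run. It is only a sketch: this ``DatabaseConnection`` talks to no database and just appends to a list (the ``log`` attribute and the ``execute`` method are my additions), and it tests the exception type rather than the traceback, which amounts to the same thing here.

```python
class DatabaseConnection(object):
    """Toy stand-in that records calls instead of using a database."""

    def __init__(self):
        self.log = []

    def cursor(self):
        self.log.append('begin')
        return self            # the toy connection doubles as its cursor

    def execute(self, statement):
        self.log.append(statement)

    def commit(self):
        self.log.append('commit')

    def rollback(self):
        self.log.append('rollback')

    def __enter__(self):
        return self.cursor()

    def __exit__(self, exc_type, exc_value, tb):
        if exc_type is None:
            self.commit()
        else:
            self.rollback()
        return False           # never suppress the exception


ok = DatabaseConnection()
with ok as cursor:
    cursor.execute('insert into ...')

failed = DatabaseConnection()
try:
    with failed as cursor:
        cursor.execute('delete from ...')
        raise RuntimeError('simulated failure')
except RuntimeError:
    pass

print(ok.log)        # ['begin', 'insert into ...', 'commit']
print(failed.log)    # ['begin', 'delete from ...', 'rollback']
```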
| 738 | |
| 739 | .. _contextlibmod: |
| 740 | |
| 741 | The contextlib module |
| 742 | --------------------- |
| 743 | |
| 744 | The new :mod:`contextlib` module provides some functions and a decorator that |
| 745 | are useful for writing objects for use with the ':keyword:`with`' statement. |
| 746 | |
| 747 | The decorator is called :func:`contextmanager`, and lets you write a single |
| 748 | generator function instead of defining a new class. The generator should yield |
| 749 | exactly one value. The code up to the :keyword:`yield` will be executed as the |
| 750 | :meth:`__enter__` method, and the value yielded will be the method's return |
| 751 | value that will get bound to the variable in the ':keyword:`with`' statement's |
| 752 | :keyword:`as` clause, if any. The code after the :keyword:`yield` will be |
| 753 | executed in the :meth:`__exit__` method. Any exception raised in the block will |
| 754 | be raised by the :keyword:`yield` statement. |
| 755 | |
| 756 | Our database example from the previous section could be written using this |
| 757 | decorator as:: |
| 758 | |
| 759 |    from contextlib import contextmanager |
| 760 | |
| 761 |    @contextmanager |
| 762 |    def db_transaction (connection): |
| 763 |        cursor = connection.cursor() |
| 764 |        try: |
| 765 |            yield cursor |
| 766 |        except: |
| 767 |            connection.rollback() |
| 768 |            raise |
| 769 |        else: |
| 770 |            connection.commit() |
| 771 | |
| 772 |    db = DatabaseConnection() |
| 773 |    with db_transaction(db) as cursor: |
| 774 |        ... |
| 775 | |
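The same generator can be exercised without a real database. In this sketch ``FakeConnection`` is a hypothetical class that only records which methods were called; it shows that an exception raised in the block really does surface at the :keyword:`yield`, so the ``except`` branch runs.

```python
from contextlib import contextmanager


class FakeConnection(object):
    """Hypothetical connection that only records what was called."""

    def __init__(self):
        self.log = []

    def cursor(self):
        self.log.append('begin')
        return self

    def commit(self):
        self.log.append('commit')

    def rollback(self):
        self.log.append('rollback')


@contextmanager
def db_transaction(connection):
    cursor = connection.cursor()
    try:
        yield cursor            # the with-block runs while suspended here
    except:
        connection.rollback()   # the block's exception arrived at the yield
        raise
    else:
        connection.commit()


good = FakeConnection()
with db_transaction(good):
    pass

bad = FakeConnection()
try:
    with db_transaction(bad):
        raise KeyError('simulated failure')
except KeyError:
    pass
```

Afterwards ``good.log`` is ``['begin', 'commit']`` and ``bad.log`` is ``['begin', 'rollback']``.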
| 776 | The :mod:`contextlib` module also has a :func:`nested(mgr1, mgr2, ...)` function |
| 777 | that combines a number of context managers so you don't need to write nested |
| 778 | ':keyword:`with`' statements. In this example, the single ':keyword:`with`' |
| 779 | statement both starts a database transaction and acquires a thread lock:: |
| 780 | |
| 781 |    lock = threading.Lock() |
| 782 |    with nested (db_transaction(db), lock) as (cursor, locked): |
| 783 |        ... |
| 784 | |
| 785 | Finally, the :func:`closing(object)` function returns *object* so that it can be |
| 786 | bound to a variable, and calls ``object.close()`` at the end of the block. :: |
| 787 | |
| 788 |    import urllib, sys |
| 789 |    from contextlib import closing |
| 790 | |
| 791 |    with closing(urllib.urlopen('http://www.yahoo.com')) as f: |
| 792 |        for line in f: |
| 793 |            sys.stdout.write(line) |
| 794 | |
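:func:`closing` is handy for any object that has a :meth:`close` method but no :meth:`__enter__` and :meth:`__exit__` of its own. The sketch below avoids the network by using a hypothetical ``Resource`` class that merely remembers whether it has been closed.

```python
from contextlib import closing


class Resource(object):
    """Hypothetical object with close() but no __enter__/__exit__."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


with closing(Resource()) as res:
    open_inside = not res.closed   # still open while the block runs

closed_after = res.closed          # close() was called on exit
```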
| 795 | |
| 796 | .. seealso:: |
| 797 | |
| 798 | :pep:`343` - The "with" statement |
| 799 | PEP written by Guido van Rossum and Nick Coghlan; implemented by Mike Bland, |
| 800 | Guido van Rossum, and Neal Norwitz. The PEP shows the code generated for a |
| 801 | ':keyword:`with`' statement, which can be helpful in learning how the statement |
| 802 | works. |
| 803 | |
| 804 | The documentation for the :mod:`contextlib` module. |
| 805 | |
| 806 | .. ====================================================================== |
| 807 | |
| 808 | |
| 809 | .. _pep-352: |
| 810 | |
| 811 | PEP 352: Exceptions as New-Style Classes |
| 812 | ======================================== |
| 813 | |
| 814 | Exception classes can now be new-style classes, not just classic classes, and |
| 815 | the built-in :exc:`Exception` class and all the standard built-in exceptions |
| 816 | (:exc:`NameError`, :exc:`ValueError`, etc.) are now new-style classes. |
| 817 | |
| 818 | The inheritance hierarchy for exceptions has been rearranged a bit. In 2.5, the |
| 819 | inheritance relationships are:: |
| 820 | |
| 821 |    BaseException       # New in Python 2.5 |
| 822 |     |- KeyboardInterrupt |
| 823 |     |- SystemExit |
| 824 |     |- Exception |
| 825 |        |- (all other current built-in exceptions) |
| 826 | |
| 827 | This rearrangement was done because people often want to catch all exceptions |
| 828 | that indicate program errors. :exc:`KeyboardInterrupt` and :exc:`SystemExit` |
| 829 | aren't errors, though, and usually represent an explicit action such as the user |
| 830 | hitting Control-C or code calling :func:`sys.exit`. A bare ``except:`` will |
| 831 | catch all exceptions, so you commonly need to list :exc:`KeyboardInterrupt` and |
| 832 | :exc:`SystemExit` in order to re-raise them. The usual pattern is:: |
| 833 | |
| 834 |    try: |
| 835 |        ... |
| 836 |    except (KeyboardInterrupt, SystemExit): |
| 837 |        raise |
| 838 |    except: |
| 839 |        # Log error... |
| 840 |        # Continue running program... |
| 841 | |
| 842 | In Python 2.5, you can now write ``except Exception`` to achieve the same |
| 843 | result, catching all the exceptions that usually indicate errors but leaving |
| 844 | :exc:`KeyboardInterrupt` and :exc:`SystemExit` alone. As in previous versions, |
| 845 | a bare ``except:`` still catches all exceptions. |
| 846 | |
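The difference between the two handlers is easy to check. In this sketch ``classify`` is a hypothetical helper that raises whatever exception instance it is given and reports which ``except`` clause caught it.

```python
def classify(exc):
    """Report which handler catches the exception instance *exc*."""
    try:
        raise exc
    except Exception:
        return 'caught by except Exception'
    except BaseException:
        return 'only caught by except BaseException'


results = [(type(e).__name__, classify(e))
           for e in (ValueError('bad value'),
                     KeyboardInterrupt(),
                     SystemExit())]

for name, outcome in results:
    print('%s: %s' % (name, outcome))
```

Only the :exc:`ValueError` is reported as caught by ``except Exception``; the other two fall through to the :exc:`BaseException` handler.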
| 847 | The goal for Python 3.0 is to require any class raised as an exception to derive |
| 848 | from :exc:`BaseException` or some descendant of :exc:`BaseException`, and future |
| 849 | releases in the Python 2.x series may begin to enforce this constraint. |
| 850 | Therefore, I suggest you begin making all your exception classes derive from |
| 851 | :exc:`Exception` now. It's been suggested that the bare ``except:`` form should |
| 852 | be removed in Python 3.0, but Guido van Rossum hasn't decided whether to do this |
| 853 | or not. |
| 854 | |
| 855 | Raising of strings as exceptions, as in the statement ``raise "Error |
| 856 | occurred"``, is deprecated in Python 2.5 and will trigger a warning. The aim is |
| 857 | to be able to remove the string-exception feature in a few releases. |
| 858 | |
| 859 | |
| 860 | .. seealso:: |
| 861 | |
| 862 | :pep:`352` - Required Superclass for Exceptions |
| 863 | PEP written by Brett Cannon and Guido van Rossum; implemented by Brett Cannon. |
| 864 | |
| 865 | .. ====================================================================== |
| 866 | |
| 867 | |
| 868 | .. _pep-353: |
| 869 | |
| 870 | PEP 353: Using ssize_t as the index type |
| 871 | ======================================== |
| 872 | |
| 873 | A wide-ranging change to Python's C API, using a new :ctype:`Py_ssize_t` type |
| 874 | definition instead of :ctype:`int`, will permit the interpreter to handle more |
| 875 | data on 64-bit platforms. This change doesn't affect Python's capacity on 32-bit |
| 876 | platforms. |
| 877 | |
| 878 | Various pieces of the Python interpreter used C's :ctype:`int` type to store |
| 879 | sizes or counts; for example, the number of items in a list or tuple was stored |
| 880 | in an :ctype:`int`. The C compilers for most 64-bit platforms still define |
| 881 | :ctype:`int` as a 32-bit type, so that meant that lists could only hold up to |
| 882 | ``2**31 - 1`` = 2147483647 items. (There are actually a few different |
| 883 | programming models that 64-bit C compilers can use -- see |
| 884 | http://www.unix.org/version2/whatsnew/lp64_wp.html for a discussion -- but the |
| 885 | most commonly available model leaves :ctype:`int` as 32 bits.) |
| 886 | |
| 887 | A limit of 2147483647 items doesn't really matter on a 32-bit platform because |
| 888 | you'll run out of memory before hitting the length limit. Each list item |
| 889 | requires space for a pointer, which is 4 bytes, plus space for a |
| 890 | :ctype:`PyObject` representing the item. 2147483647\*4 is already more bytes |
| 891 | than a 32-bit address space can contain. |
| 892 | |
| 893 | It's possible to address that much memory on a 64-bit platform, however. The |
| 894 | pointers for a list that size would only require 16 GiB of space, so it's not |
| 895 | unreasonable that Python programmers might construct lists that large. |
| 896 | Therefore, the Python interpreter had to be changed to use some type other than |
| 897 | :ctype:`int`, and this will be a 64-bit type on 64-bit platforms. The change |
| 898 | will cause incompatibilities on 64-bit machines, so it was deemed worth making |
| 899 | the transition now, while the number of 64-bit users is still relatively small. |
| 900 | (In 5 or 10 years, we may *all* be on 64-bit machines, and the transition would |
| 901 | be more painful then.) |
| 902 | |
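The arithmetic behind the last two paragraphs is easy to reproduce. The sketch assumes 4-byte pointers on a 32-bit platform and 8-byte pointers on a 64-bit one, and rounds the item count up to ``2**31`` in the second calculation to get a round figure.

```python
MAX_INT32_ITEMS = 2**31 - 1            # largest count a signed 32-bit int holds

# 32-bit platform: 4-byte pointers and 2**32 bytes of address space.
pointer_bytes_32 = MAX_INT32_ITEMS * 4
fits_in_32_bit = pointer_bytes_32 < 2**32

# 64-bit platform: 8-byte pointers (an assumption of this sketch).
pointer_bytes_64 = 2**31 * 8
gib = pointer_bytes_64 // 2**30

print(pointer_bytes_32)    # 8589934588
print(fits_in_32_bit)      # False
print(gib)                 # 16
```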
| 903 | This change most strongly affects authors of C extension modules. Python |
| 904 | strings and container types such as lists and tuples now use |
| 905 | :ctype:`Py_ssize_t` to store their size. Functions such as |
| 906 | :cfunc:`PyList_Size` now return :ctype:`Py_ssize_t`. Code in extension modules |
| 907 | may therefore need to have some variables changed to :ctype:`Py_ssize_t`. |
| 908 | |
| 909 | The :cfunc:`PyArg_ParseTuple` and :cfunc:`Py_BuildValue` functions have a new |
| 910 | conversion code, ``n``, for :ctype:`Py_ssize_t`. :cfunc:`PyArg_ParseTuple`'s |
| 911 | ``s#`` and ``t#`` still output :ctype:`int` by default, but you can define the |
| 912 | macro :cmacro:`PY_SSIZE_T_CLEAN` before including :file:`Python.h` to make |
| 913 | them return :ctype:`Py_ssize_t`. |
| 914 | |
| 915 | :pep:`353` has a section on conversion guidelines that extension authors should |
| 916 | read to learn about supporting 64-bit platforms. |
| 917 | |
| 918 | |
| 919 | .. seealso:: |
| 920 | |
| 921 | :pep:`353` - Using ssize_t as the index type |
| 922 | PEP written and implemented by Martin von Löwis. |
| 923 | |
| 924 | .. ====================================================================== |
| 925 | |
| 926 | |
| 927 | .. _pep-357: |
| 928 | |
| 929 | PEP 357: The '__index__' method |
| 930 | =============================== |
| 931 | |
| 932 | The NumPy developers had a problem that could only be solved by adding a new |
| 933 | special method, :meth:`__index__`. When using slice notation, as in |
| 934 | ``[start:stop:step]``, the values of the *start*, *stop*, and *step* indexes |
| 935 | must all be either integers or long integers. NumPy defines a variety of |
| 936 | specialized integer types corresponding to unsigned and signed integers of 8, |
| 937 | 16, 32, and 64 bits, but there was no way to signal that these types could be |
| 938 | used as slice indexes. |
| 939 | |
| 940 | Slicing can't just use the existing :meth:`__int__` method because that method |
| 941 | is also used to implement coercion to integers. If slicing used |
| 942 | :meth:`__int__`, floating-point numbers would also become legal slice indexes |
| 943 | and that's clearly an undesirable behaviour. |
| 944 | |
| 945 | Instead, a new special method called :meth:`__index__` was added. It takes no |
| 946 | arguments and returns an integer giving the slice index to use. For example:: |
| 947 | |
| 948 |    class C: |
| 949 |        def __index__ (self): |
| 950 |            return self.value |
| 951 | |
| 952 | The return value must be either a Python integer or long integer. The |
| 953 | interpreter will check that the type returned is correct, and raises a |
| 954 | :exc:`TypeError` if this requirement isn't met. |
| 955 | |
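Here is a slightly fuller sketch of the same idea. The constructor and the ``Bad`` class are my additions for the sake of a runnable example: ``C`` wraps an integer and can be used wherever a slice or sequence index is expected, while ``Bad`` returns something that isn't an integer and is rejected.

```python
class C(object):
    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value


class Bad(object):
    def __index__(self):
        return 'not an integer'


letters = 'abcdefgh'
piece = letters[C(2):C(5)]         # same as letters[2:5]
item = list(letters)[C(1)]         # same as list(letters)[1]

try:
    letters[Bad():]
    rejected = False
except TypeError:
    rejected = True

print(piece)       # cde
print(item)        # b
print(rejected)    # True
```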
| 956 | A corresponding :attr:`nb_index` slot was added to the C-level |
| 957 | :ctype:`PyNumberMethods` structure to let C extensions implement this protocol. |
| 958 | :cfunc:`PyNumber_Index(obj)` can be used in extension code to call the |
| 959 | :meth:`__index__` function and retrieve its result. |
| 960 | |
| 961 | |
| 962 | .. seealso:: |
| 963 | |
| 964 | :pep:`357` - Allowing Any Object to be Used for Slicing |
| 965 | PEP written and implemented by Travis Oliphant. |
| 966 | |
| 967 | .. ====================================================================== |
| 968 | |
| 969 | |
| 970 | .. _other-lang: |
| 971 | |
| 972 | Other Language Changes |
| 973 | ====================== |
| 974 | |
| 975 | Here are all of the changes that Python 2.5 makes to the core Python language. |
| 976 | |
| 977 | * The :class:`dict` type has a new hook for letting subclasses provide a default |
| 978 | value when a key isn't contained in the dictionary. When a key isn't found, the |
| 979 | dictionary's :meth:`__missing__(key)` method will be called. This hook is used |
| 980 | to implement the new :class:`defaultdict` class in the :mod:`collections` |
| 981 | module. The following example defines a dictionary that returns zero for any |
| 982 | missing key:: |
| 983 | |
| 984 |      class zerodict (dict): |
| 985 |          def __missing__ (self, key): |
| 986 |              return 0 |
| 987 | |
| 988 |      d = zerodict({1:1, 2:2}) |
| 989 |      print d[1], d[2]   # Prints 1 2 |
| 990 |      print d[3], d[4]   # Prints 0 0 |
| 991 | |
| 992 | * Both 8-bit and Unicode strings have new :meth:`partition(sep)` and |
| 993 | :meth:`rpartition(sep)` methods that simplify a common use case. |
| 994 | |
| 995 | The :meth:`find(S)` method is often used to get an index which is then used to |
| 996 | slice the string and obtain the pieces that are before and after the separator. |
| 997 | :meth:`partition(sep)` condenses this pattern into a single method call that |
| 998 | returns a 3-tuple containing the substring before the separator, the separator |
| 999 | itself, and the substring after the separator. If the separator isn't found, |
| 1000 | the first element of the tuple is the entire string and the other two elements |
| 1001 | are empty. :meth:`rpartition(sep)` also returns a 3-tuple but starts searching |
| 1002 | from the end of the string; the ``r`` stands for 'reverse'. |
| 1003 | |
| 1004 | Some examples:: |
| 1005 | |
| 1006 | >>> ('http://www.python.org').partition('://') |
| 1007 | ('http', '://', 'www.python.org') |
| 1008 | >>> ('file:/usr/share/doc/index.html').partition('://') |
| 1009 | ('file:/usr/share/doc/index.html', '', '') |
| 1010 | >>> (u'Subject: a quick question').partition(':') |
| 1011 | (u'Subject', u':', u' a quick question') |
| 1012 | >>> 'www.python.org'.rpartition('.') |
| 1013 | ('www.python', '.', 'org') |
| 1014 | >>> 'www.python.org'.rpartition(':') |
| 1015 | ('', '', 'www.python.org') |
| 1016 | |
| 1017 | (Implemented by Fredrik Lundh following a suggestion by Raymond Hettinger.) |
| 1018 | |
| 1019 | * The :meth:`startswith` and :meth:`endswith` methods of string types now accept |
| 1020 | tuples of strings to check for. :: |
| 1021 | |
| 1022 |      def is_image_file (filename): |
| 1023 |          return filename.endswith(('.gif', '.jpg', '.tiff')) |
| 1024 | |
| 1025 | (Implemented by Georg Brandl following a suggestion by Tom Lynn.) |
| 1026 | |
| 1027 | .. RFE #1491485 |
| 1028 | |
| 1029 | * The :func:`min` and :func:`max` built-in functions gained a ``key`` keyword |
| 1030 | parameter analogous to the ``key`` argument for :meth:`sort`. This parameter |
| 1031 | supplies a function that takes a single argument and is called for every value |
| 1032 | in the list; :func:`min`/:func:`max` will return the element with the |
| 1033 | smallest/largest return value from this function. For example, to find the |
| 1034 | longest string in a list, you can do:: |
| 1035 | |
| 1036 |      L = ['medium', 'longest', 'short'] |
| 1037 |      # Prints 'longest' |
| 1038 |      print max(L, key=len) |
| 1039 |      # Prints 'short', because lexicographically 'short' has the largest value |
| 1040 |      print max(L) |
| 1041 | |
| 1042 | (Contributed by Steven Bethard and Raymond Hettinger.) |
| 1043 | |
| 1044 | * Two new built-in functions, :func:`any` and :func:`all`, evaluate whether an |
| 1045 | iterator contains any true or false values. :func:`any` returns :const:`True` |
| 1046 | if any value returned by the iterator is true; otherwise it will return |
| 1047 | :const:`False`. :func:`all` returns :const:`True` only if all of the values |
| 1048 | returned by the iterator evaluate as true. (Suggested by Guido van Rossum, and |
| 1049 | implemented by Raymond Hettinger.) |
| 1050 | |
| 1051 | * The result of a class's :meth:`__hash__` method can now be either a long |
| 1052 | integer or a regular integer. If a long integer is returned, the hash of that |
| 1053 | value is taken. In earlier versions the hash value was required to be a |
| 1054 | regular integer, but in 2.5 the :func:`id` built-in was changed to always |
| 1055 | return non-negative numbers, and users often seem to use ``id(self)`` in |
| 1056 | :meth:`__hash__` methods (though this is discouraged). |
| 1057 | |
| 1058 | .. Bug #1536021 |
| 1059 | |
| 1060 | * ASCII is now the default encoding for modules. It's now a syntax error if a |
| 1061 | module contains string literals with 8-bit characters but doesn't have an |
| 1062 | encoding declaration. In Python 2.4 this triggered a warning, not a syntax |
| 1063 | error. See :pep:`263` for how to declare a module's encoding; for example, you |
| 1064 | might add a line like this near the top of the source file:: |
| 1065 | |
| 1066 | # -*- coding: latin1 -*- |
| 1067 | |
| 1068 | * A new warning, :class:`UnicodeWarning`, is triggered when you attempt to |
| 1069 | compare a Unicode string and an 8-bit string that can't be converted to Unicode |
| 1070 | using the default ASCII encoding. The result of the comparison is false:: |
| 1071 | |
| 1072 |      >>> chr(128) == unichr(128)   # Can't convert chr(128) to Unicode |
| 1073 |      __main__:1: UnicodeWarning: Unicode equal comparison failed |
| 1074 |        to convert both arguments to Unicode - interpreting them |
| 1075 |        as being unequal |
| 1076 |      False |
| 1077 |      >>> chr(127) == unichr(127)   # chr(127) can be converted |
| 1078 |      True |
| 1079 | |
| 1080 | Previously this would raise a :class:`UnicodeDecodeError` exception, but in 2.5 |
| 1081 | this could result in puzzling problems when accessing a dictionary. If you |
| 1082 | looked up ``unichr(128)`` and ``chr(128)`` was being used as a key, you'd get a |
| 1083 | :class:`UnicodeDecodeError` exception. Other changes in 2.5 resulted in this |
| 1084 | exception being raised instead of suppressed by the code in :file:`dictobject.c` |
| 1085 | that implements dictionaries. |
| 1086 | |
| 1087 | Raising an exception for such a comparison is strictly correct, but the change |
| 1088 | might have broken code, so instead :class:`UnicodeWarning` was introduced. |
| 1089 | |
| 1090 | (Implemented by Marc-André Lemburg.) |
| 1091 | |
| 1092 | * One error that Python programmers sometimes make is forgetting to include an |
| 1093 | :file:`__init__.py` module in a package directory. Debugging this mistake can be |
| 1094 | confusing, and usually requires running Python with the :option:`-v` switch to |
| 1095 | log all the paths searched. In Python 2.5, a new :exc:`ImportWarning` warning is |
| 1096 | triggered when an import would have picked up a directory as a package but no |
| 1097 | :file:`__init__.py` was found. This warning is silently ignored by default; |
| 1098 | provide the :option:`-Wd` option when running the Python executable to display |
| 1099 | the warning message. (Implemented by Thomas Wouters.) |
| 1100 | |
| 1101 | * The list of base classes in a class definition can now be empty. As an |
| 1102 | example, this is now legal:: |
| 1103 | |
| 1104 |      class C(): |
| 1105 |          pass |
| 1106 | |
| 1107 | (Implemented by Brett Cannon.) |
| 1108 | |
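The :func:`any` and :func:`all` built-ins in the list above are the one item without an example, so here is a short sketch. The sample data is invented; note in particular the results for an empty sequence, which follow from the definitions but can be surprising.

```python
values = [0, '', None, 3]

some_true = any(values)            # True: 3 is a true value
every_true = all(values)           # False: 0, '' and None are false

# Generator expressions avoid building an intermediate list.
numbers = [4, 8, 15, 16, 23, 42]
has_negative = any(n < 0 for n in numbers)     # False
all_positive = all(n > 0 for n in numbers)     # True

# Edge case: an empty iterable has no true values and no false ones.
any_of_nothing = any([])           # False
all_of_nothing = all([])           # True
```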
| 1109 | .. ====================================================================== |
| 1110 | |
| 1111 | |
| 1112 | .. _25interactive: |
| 1113 | |
| 1114 | Interactive Interpreter Changes |
| 1115 | ------------------------------- |
| 1116 | |
| 1117 | In the interactive interpreter, ``quit`` and ``exit`` have long been strings so |
| 1118 | that new users get a somewhat helpful message when they try to quit:: |
| 1119 | |
| 1120 | >>> quit |
| 1121 | 'Use Ctrl-D (i.e. EOF) to exit.' |
| 1122 | |
| 1123 | In Python 2.5, ``quit`` and ``exit`` are now objects that still produce string |
| 1124 | representations of themselves, but are also callable. Newbies who try ``quit()`` |
| 1125 | or ``exit()`` will now exit the interpreter as they expect. (Implemented by |
| 1126 | Georg Brandl.) |
| 1127 | |
| 1128 | The Python executable now accepts the standard long options :option:`--help` |
| 1129 | and :option:`--version`; on Windows, it also accepts the :option:`/?` option |
| 1130 | for displaying a help message. (Implemented by Georg Brandl.) |
| 1131 | |
| 1132 | .. ====================================================================== |
| 1133 | |
| 1134 | |
| 1135 | .. _opts: |
| 1136 | |
| 1137 | Optimizations |
| 1138 | ------------- |
| 1139 | |
| 1140 | Several of the optimizations were developed at the NeedForSpeed sprint, an event |
| 1141 | held in Reykjavik, Iceland, from May 21--28 2006. The sprint focused on speed |
| 1142 | enhancements to the CPython implementation and was funded by EWT LLC with local |
| 1143 | support from CCP Games. Those optimizations added at this sprint are specially |
| 1144 | marked in the following list. |
| 1145 | |
| 1146 | * When they were introduced in Python 2.4, the built-in :class:`set` and |
| 1147 | :class:`frozenset` types were built on top of Python's dictionary type. In 2.5 |
| 1148 | the internal data structure has been customized for implementing sets, and as a |
| 1149 | result sets will use a third less memory and are somewhat faster. (Implemented |
| 1150 | by Raymond Hettinger.) |
| 1151 | |
| 1152 | * The speed of some Unicode operations, such as finding substrings, string |
| 1153 | splitting, and character map encoding and decoding, has been improved. |
| 1154 | (Substring search and splitting improvements were added by Fredrik Lundh and |
| 1155 | Andrew Dalke at the NeedForSpeed sprint. Character maps were improved by Walter |
| 1156 | Dörwald and Martin von Löwis.) |
| 1157 | |
| 1158 | .. Patch 1313939, 1359618 |
| 1159 | |
| 1160 | * The :func:`long(str, base)` function is now faster on long digit strings |
| 1161 | because fewer intermediate results are calculated. The peak is for strings of |
| 1162 | around 800--1000 digits where the function is 6 times faster. (Contributed by |
| 1163 | Alan McIntyre and committed at the NeedForSpeed sprint.) |
| 1164 | |
| 1165 | .. Patch 1442927 |
| 1166 | |
| 1167 | * It's now illegal to mix iterating over a file with ``for line in file`` and |
| 1168 | calling the file object's :meth:`read`/:meth:`readline`/:meth:`readlines` |
| 1169 | methods. Iteration uses an internal buffer and the :meth:`read\*` methods |
| 1170 | don't use that buffer. Instead they would return the data following the |
| 1171 | buffer, causing the data to appear out of order. Mixing iteration and these |
| 1172 | methods will now trigger a :exc:`ValueError` from the :meth:`read\*` method. |
| 1173 | (Implemented by Thomas Wouters.) |
| 1174 | |
| 1175 | .. Patch 1397960 |
| 1176 | |
* The :mod:`struct` module now compiles structure format strings into an
  internal representation and caches this representation, yielding a 20% speedup.
  (Contributed by Bob Ippolito at the NeedForSpeed sprint.)

* The :mod:`re` module got a 1 or 2% speedup by switching to Python's allocator
  functions instead of the system's :cfunc:`malloc` and :cfunc:`free`.
  (Contributed by Jack Diederich at the NeedForSpeed sprint.)

* The code generator's peephole optimizer now performs simple constant folding
  in expressions. If you write something like ``a = 2+3``, the code generator
  will do the arithmetic and produce code corresponding to ``a = 5``. (Proposed
  and implemented by Raymond Hettinger.)

* Function calls are now faster because code objects now keep the most recently
  finished frame (a "zombie frame") in an internal field of the code object,
  reusing it the next time the code object is invoked. (Original patch by Michael
  Hudson, modified by Armin Rigo and Richard Jones; committed at the NeedForSpeed
  sprint.) Frame objects are also slightly smaller, which may improve cache
  locality and reduce memory usage a bit. (Contributed by Neal Norwitz.)

.. Patch 876206
.. Patch 1337051

* Python's built-in exceptions are now new-style classes, a change that speeds
  up instantiation considerably. Exception handling in Python 2.5 is therefore
  about 30% faster than in 2.4. (Contributed by Richard Jones, Georg Brandl and
  Sean Reifschneider at the NeedForSpeed sprint.)

* Importing now caches the paths tried, recording whether they exist or not so
  that the interpreter makes fewer :cfunc:`open` and :cfunc:`stat` calls on
  startup. (Contributed by Martin von Löwis and Georg Brandl.)

.. Patch 921466

.. ======================================================================

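The constant folding mentioned in the list above can be observed by compiling a
statement and looking at the constants stored in the resulting code object. The
following is only a minimal sketch: it assumes CPython, the ``<example>``
filename is an arbitrary placeholder, and other implementations are free not to
fold constants at all.

```python
# Compile the statement from the text and inspect the code object's constants.
code = compile("a = 2+3", "<example>", "exec")

# If the compiler folded the expression, the precomputed sum 5 is stored
# among the constants of the code object.
folded = 5 in code.co_consts

# Running the code binds the name as usual, folded or not.
namespace = {}
exec(code, namespace)
```

The :mod:`dis` module can be used on the same code object to look at the
generated bytecode in more detail.
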

.. _25modules:

New, Improved, and Removed Modules
==================================

The standard library received many enhancements and bug fixes in Python 2.5.
Here's a partial list of the most notable changes, sorted alphabetically by
module name. Consult the :file:`Misc/NEWS` file in the source tree for a more
complete list of changes, or look through the SVN logs for all the details.

* The :mod:`audioop` module now supports the a-LAW encoding, and the code for
  u-LAW encoding has been improved. (Contributed by Lars Immisch.)

* The :mod:`codecs` module gained support for incremental codecs. The
  :func:`codecs.lookup` function now returns a :class:`CodecInfo` instance instead
  of a tuple. :class:`CodecInfo` instances behave like a 4-tuple to preserve
  backward compatibility but also have the attributes :attr:`encode`,
  :attr:`decode`, :attr:`incrementalencoder`, :attr:`incrementaldecoder`,
  :attr:`streamwriter`, and :attr:`streamreader`. Incremental codecs can receive
  input and produce output in multiple chunks; the output is the same as if the
  entire input were fed to the non-incremental codec. See the :mod:`codecs` module
  documentation for details. (Designed and implemented by Walter Dörwald.)

.. Patch 1436130

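The incremental codecs described above can be sketched as follows. The example
is written for a current Python 3 interpreter, where byte strings are spelled
``b'...'``; the sample data and the point at which it is split are arbitrary
choices made for illustration.

```python
import codecs

# In UTF-8 the last two bytes encode a single character (U+00E9).
data = b"caf\xc3\xa9"

# codecs.lookup() returns a CodecInfo; its incrementaldecoder attribute is a
# factory for decoder objects that keep state between calls.
info = codecs.lookup("utf-8")
decoder = info.incrementaldecoder()

# Feed the input in two chunks that split the multi-byte character.
first = decoder.decode(data[:4])               # the trailing byte is held back
second = decoder.decode(data[4:], final=True)  # completes the character

incremental = first + second
one_shot = data.decode("utf-8")                # same result in a single step
```

Decoding each chunk independently with the non-incremental codec would fail
here, because neither chunk on its own ends on a character boundary.
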
* The :mod:`collections` module gained a new type, :class:`defaultdict`, that
  subclasses the standard :class:`dict` type. The new type mostly behaves like a
  dictionary but constructs a default value when a key isn't present,
  automatically adding it to the dictionary for the requested key value.

  The first argument to :class:`defaultdict`'s constructor is a factory function
  that gets called whenever a key is requested but not found. This factory
  function receives no arguments, so you can use built-in type constructors such
  as :func:`list` or :func:`int`. For example, you can make an index of words
  based on their initial letter like this::

     from collections import defaultdict

     words = """Nel mezzo del cammin di nostra vita
     mi ritrovai per una selva oscura
     che la diritta via era smarrita""".lower().split()

     index = defaultdict(list)

     for w in words:
         init_letter = w[0]
         index[init_letter].append(w)

  Printing ``index`` results in the following output::

     defaultdict(<type 'list'>, {'c': ['cammin', 'che'], 'e': ['era'],
             'd': ['del', 'di', 'diritta'], 'm': ['mezzo', 'mi'],
             'l': ['la'], 'o': ['oscura'], 'n': ['nel', 'nostra'],
             'p': ['per'], 's': ['selva', 'smarrita'],
             'r': ['ritrovai'], 'u': ['una'], 'v': ['vita', 'via']})

  (Contributed by Guido van Rossum.)

* The :class:`deque` double-ended queue type supplied by the :mod:`collections`
  module now has a :meth:`remove(value)` method that removes the first occurrence
  of *value* in the queue, raising :exc:`ValueError` if the value isn't found.
  (Contributed by Raymond Hettinger.)

* New module: The :mod:`contextlib` module contains helper functions for use
  with the new ':keyword:`with`' statement. See section :ref:`contextlibmod`
  for more about this module.

* New module: The :mod:`cProfile` module is a C implementation of the existing
  :mod:`profile` module that has much lower overhead. The module's interface is
  the same as :mod:`profile`: you run ``cProfile.run('main()')`` to profile a
  function, can save profile data to a file, etc. It's not yet known if the
  Hotshot profiler, which is also written in C but doesn't match the
  :mod:`profile` module's interface, will continue to be maintained in future
  versions of Python. (Contributed by Armin Rigo.)

  Also, the :mod:`pstats` module for analyzing the data measured by the profiler
  now supports directing the output to any file object by supplying a *stream*
  argument to the :class:`Stats` constructor. (Contributed by Skip Montanaro.)

* The :mod:`csv` module, which parses files in comma-separated value format,
  received several enhancements and a number of bugfixes. You can now set the
  maximum size in bytes of a field by calling the
  :func:`csv.field_size_limit(new_limit)` function; omitting the *new_limit*
  argument will return the currently-set limit. The :class:`reader` class now has
  a :attr:`line_num` attribute that counts the number of physical lines read from
  the source; records can span multiple physical lines, so :attr:`line_num` is not
  the same as the number of records read.

  The CSV parser is now stricter about multi-line quoted fields. Previously, if a
  line ended within a quoted field without a terminating newline character, a
  newline would be inserted into the returned field. This behavior caused problems
  when reading files that contained carriage return characters within fields, so
  the code was changed to return the field without inserting newlines. As a
  consequence, if newlines embedded within fields are important, the input should
  be split into lines in a manner that preserves the newline characters.

  (Contributed by Skip Montanaro and Andrew McNamara.)

* The :class:`datetime` class in the :mod:`datetime` module now has a
  :meth:`strptime(string, format)` method for parsing date strings, contributed
  by Josh Spoerri. It uses the same format characters as :func:`time.strptime` and
  :func:`time.strftime`::

     from datetime import datetime

     ts = datetime.strptime('10:13:15 2006-03-07',
                            '%H:%M:%S %Y-%m-%d')

* The :meth:`SequenceMatcher.get_matching_blocks` method in the :mod:`difflib`
  module now guarantees to return a minimal list of blocks describing matching
  subsequences. Previously, the algorithm would occasionally break a block of
  matching elements into two list entries. (Enhancement by Tim Peters.)

* The :mod:`doctest` module gained a ``SKIP`` option that keeps an example from
  being executed at all. This is intended for code snippets that serve as usage
  examples for the reader and aren't actually test cases.

  An *encoding* parameter was added to the :func:`testfile` function and the
  :class:`DocFileSuite` class to specify the file's encoding. This makes it
  easier to use non-ASCII characters in tests contained within a docstring.
  (Contributed by Bjorn Tillenius.)

.. Patch 1080727

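The difference between :attr:`line_num` and the number of records, noted in the
:mod:`csv` entry above, can be illustrated with a quoted field that spans two
physical lines. This is a small sketch written for a current Python 3
interpreter; the in-memory sample data is made up for the example.

```python
import csv
import io

# Three physical lines, but only two records: the quoted field in the
# second record contains an embedded newline.
source = io.StringIO('id,comment\n1,"first line\nsecond line"\n')

reader = csv.reader(source)
records = list(reader)

physical_lines = reader.line_num   # lines consumed from the source
record_count = len(records)        # records produced by the parser
```

Error messages that refer to a position in the input file will usually want
:attr:`line_num` rather than a running count of records.
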
* The :mod:`email` package has been updated to version 4.0. (Contributed by
  Barry Warsaw.)

.. XXX need to provide some more detail here

* The :mod:`fileinput` module was made more flexible. Unicode filenames are now
  supported, and a *mode* parameter that defaults to ``"r"`` was added to the
  :func:`input` function to allow opening files in binary or universal-newline
  mode. Another new parameter, *openhook*, lets you use a function other than
  :func:`open` to open the input files. Once you're iterating over the set of
  files, the :class:`FileInput` object's new :meth:`fileno` returns the file
  descriptor for the currently opened file. (Contributed by Georg Brandl.)

* In the :mod:`gc` module, the new :func:`get_count` function returns a 3-tuple
  containing the current collection counts for the three GC generations. This is
  accounting information for the garbage collector; when these counts reach a
  specified threshold, a garbage collection sweep will be made. The existing
  :func:`gc.collect` function now takes an optional *generation* argument of 0, 1,
  or 2 to specify which generation to collect. (Contributed by Barry Warsaw.)

* The :func:`nsmallest` and :func:`nlargest` functions in the :mod:`heapq`
  module now support a ``key`` keyword parameter similar to the one provided by
  the :func:`min`/:func:`max` functions and the :meth:`sort` methods. For
  example::

     >>> import heapq
     >>> L = ["short", 'medium', 'longest', 'longer still']
     >>> heapq.nsmallest(2, L)  # Return two lowest elements, lexicographically
     ['longer still', 'longest']
     >>> heapq.nsmallest(2, L, key=len)   # Return two shortest elements
     ['short', 'medium']

  (Contributed by Raymond Hettinger.)

* The :func:`itertools.islice` function now accepts ``None`` for the start and
  step arguments. This makes it more compatible with the attributes of slice
  objects, so that you can now write the following::

     s = slice(5)     # Create slice object
     itertools.islice(iterable, s.start, s.stop, s.step)

  (Contributed by Raymond Hettinger.)

* The :func:`format` function in the :mod:`locale` module has been modified and
  two new functions were added, :func:`format_string` and :func:`currency`.

  The :func:`format` function's *format* parameter could previously be any string
  as long as no more than one %char specifier appeared; now the parameter must be
  exactly one %char specifier with no surrounding text. An optional *monetary*
  parameter was also added which, if ``True``, will use the locale's rules for
  formatting currency in placing a separator between groups of three digits.

  To format strings with multiple %char specifiers, use the new
  :func:`format_string` function that works like :func:`format` but also supports
  mixing %char specifiers with arbitrary text.

  A new :func:`currency` function was also added that formats a number according
  to the current locale's settings.

  (Contributed by Georg Brandl.)

.. Patch 1180296

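The :mod:`gc` additions described above can be tried out in a few lines. The
sketch below only uses :func:`gc.get_count`, :func:`gc.collect` and the
long-standing :func:`gc.get_threshold`; the actual numbers vary from run to run
and between interpreter versions, so none are shown.

```python
import gc

# Current collection counts and the thresholds they are compared against,
# one entry per generation (youngest first).
counts = gc.get_count()
thresholds = gc.get_threshold()

# Collect only the youngest generation; the return value is the number of
# unreachable objects that were found.
found = gc.collect(0)

# A full collection is still the default when no generation is given.
found_full = gc.collect()
```
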
* The :mod:`mailbox` module underwent a massive rewrite to add the capability to
  modify mailboxes in addition to reading them. A new set of classes that include
  :class:`mbox`, :class:`MH`, and :class:`Maildir` are used to read mailboxes, and
  have an :meth:`add(message)` method to add messages, :meth:`remove(key)` to
  remove messages, and :meth:`lock`/:meth:`unlock` to lock/unlock the mailbox.
  The following example converts a maildir-format mailbox into an mbox-format
  one::

     import mailbox

     # 'factory=None' uses email.Message.Message as the class representing
     # individual messages.
     src = mailbox.Maildir('maildir', factory=None)
     dest = mailbox.mbox('/tmp/mbox')

     for msg in src:
         dest.add(msg)

  (Contributed by Gregory K. Johnson. Funding was provided by Google's 2005
  Summer of Code.)

* New module: the :mod:`msilib` module allows creating Microsoft Installer
  :file:`.msi` files and CAB files. Some support for reading the :file:`.msi`
  database is also included. (Contributed by Martin von Löwis.)

* The :mod:`nis` module now supports accessing domains other than the system
  default domain by supplying a *domain* argument to the :func:`nis.match` and
  :func:`nis.maps` functions. (Contributed by Ben Bell.)

* The :mod:`operator` module's :func:`itemgetter` and :func:`attrgetter`
  functions now support multiple fields. A call such as
  ``operator.attrgetter('a', 'b')`` will return a function that retrieves the
  :attr:`a` and :attr:`b` attributes. Combining this new feature with the
  :meth:`sort` method's ``key`` parameter lets you easily sort lists using
  multiple fields. (Contributed by Raymond Hettinger.)

* The :mod:`optparse` module was updated to version 1.5.1 of the Optik library.
  The :class:`OptionParser` class gained an :attr:`epilog` attribute, a string
  that will be printed after the help message, and a :meth:`destroy` method to
  break reference cycles created by the object. (Contributed by Greg Ward.)

* The :mod:`os` module underwent several changes. The :attr:`stat_float_times`
  variable now defaults to true, meaning that :func:`os.stat` will now return time
  values as floats. (This doesn't necessarily mean that :func:`os.stat` will
  return times that are precise to fractions of a second; not all systems support
  such precision.)

  Constants named :attr:`os.SEEK_SET`, :attr:`os.SEEK_CUR`, and
  :attr:`os.SEEK_END` have been added; these are the parameters to the
  :func:`os.lseek` function. Two new constants for locking are
  :attr:`os.O_SHLOCK` and :attr:`os.O_EXLOCK`.

  Two new functions, :func:`wait3` and :func:`wait4`, were added. They're similar
  to the :func:`waitpid` function, which waits for a child process to exit and
  returns a tuple of the process ID and its exit status, but :func:`wait3` and
  :func:`wait4` return additional information. :func:`wait3` doesn't take a
  process ID as input, so it waits for any child process to exit and returns a
  3-tuple of *process-id*, *exit-status*, *resource-usage* as returned from the
  :func:`resource.getrusage` function. :func:`wait4(pid)` does take a process ID.
  (Contributed by Chad J. Schroeder.)

  On FreeBSD, the :func:`os.stat` function now returns times with nanosecond
  resolution, and the returned object now has :attr:`st_gen` and
  :attr:`st_birthtime`. The :attr:`st_flags` member is also available, if the
  platform supports it. (Contributed by Antti Louko and Diego Pettenò.)

.. (Patch 1180695, 1212117)

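The multi-field form of :func:`attrgetter` and :func:`itemgetter` described in
the :mod:`operator` entry above can be sketched like this. The ``Employee``
class and the sample data are hypothetical and exist only for the example.

```python
from operator import attrgetter, itemgetter


class Employee(object):
    """Hypothetical record type used only for this illustration."""

    def __init__(self, dept, name):
        self.dept = dept
        self.name = name


staff = [Employee("ops", "Zed"), Employee("dev", "Mia"), Employee("dev", "Al")]

# attrgetter('dept', 'name') returns a (dept, name) tuple for each object,
# so the list is ordered by department first and by name within a department.
staff.sort(key=attrgetter("dept", "name"))
ordered = [(e.dept, e.name) for e in staff]

# itemgetter works the same way for sequences: sort by count, then by label.
rows = [("pear", 3), ("apple", 3), ("fig", 1)]
rows.sort(key=itemgetter(1, 0))
```

Because the key function returns a tuple, ties on the first field are broken by
the second one without writing a custom comparison function.
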
* The Python debugger provided by the :mod:`pdb` module can now store lists of
  commands to execute when a breakpoint is reached and execution stops. Once
  breakpoint #1 has been created, enter ``commands 1`` and enter a series of
  commands to be executed, finishing the list with ``end``. The command list can
  include commands that resume execution, such as ``continue`` or ``next``.
  (Contributed by Grégoire Dooms.)

.. Patch 790710

* The :mod:`pickle` and :mod:`cPickle` modules no longer accept a return value
  of ``None`` from the :meth:`__reduce__` method; the method must return a tuple
  of arguments instead. The ability to return ``None`` was deprecated in Python
  2.4, so this completes the removal of the feature.

* The :mod:`pkgutil` module, containing various utility functions for finding
  packages, was enhanced to support PEP 302's import hooks and now also works for
  packages stored in ZIP-format archives. (Contributed by Phillip J. Eby.)

* The pybench benchmark suite by Marc-André Lemburg is now included in the
  :file:`Tools/pybench` directory. The pybench suite is an improvement on the
  commonly used :file:`pystone.py` program because pybench provides a more
  detailed measurement of the interpreter's speed. It times particular operations
  such as function calls, tuple slicing, method lookups, and numeric operations,
  instead of performing many different operations and reducing the result to a
  single number as :file:`pystone.py` does.

* The :mod:`pyexpat` module now uses version 2.0 of the Expat parser.
  (Contributed by Trent Mick.)

* The :class:`Queue` class provided by the :mod:`Queue` module gained two new
  methods. :meth:`join` blocks until all items in the queue have been retrieved
  and all processing work on the items has been completed. Worker threads call
  the other new method, :meth:`task_done`, to signal that processing for an item
  has been completed. (Contributed by Raymond Hettinger.)

* The old :mod:`regex` and :mod:`regsub` modules, which have been deprecated
  ever since Python 2.0, have finally been deleted. Other deleted modules:
  :mod:`statcache`, :mod:`tzparse`, :mod:`whrandom`.

* The :file:`lib-old` directory, which included ancient modules such as
  :mod:`dircmp` and :mod:`ni`, was also removed. :file:`lib-old` wasn't on the
  default ``sys.path``, so unless your programs explicitly added the directory to
  ``sys.path``, this removal shouldn't affect your code.

* The :mod:`rlcompleter` module is no longer dependent on importing the
  :mod:`readline` module and therefore now works on non-Unix platforms. (Patch
  from Robert Kiendl.)

.. Patch #1472854

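The interplay of :meth:`join` and :meth:`task_done` described in the
:class:`Queue` entry above is easiest to see in a small worker pool. This is a
sketch rather than a recommended design; it is written for a current Python 3
interpreter, where the module is named ``queue``, and the squaring "work" is a
stand-in for real processing.

```python
import threading

try:
    import queue              # module name in Python 3
except ImportError:
    import Queue as queue     # module name used in the text

work = queue.Queue()
results = []
results_lock = threading.Lock()


def worker():
    while True:
        item = work.get()
        try:
            with results_lock:
                results.append(item * item)   # placeholder for real work
        finally:
            work.task_done()                  # exactly one call per get()


for _ in range(3):
    thread = threading.Thread(target=worker)
    thread.daemon = True                      # don't block interpreter exit
    thread.start()

for n in range(10):
    work.put(n)

# Returns only after task_done() has been called once for every item put.
work.join()
```

Calling :meth:`task_done` in a ``finally`` clause keeps :meth:`join` from
blocking forever if processing an item raises an exception.
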
* The :mod:`SimpleXMLRPCServer` and :mod:`DocXMLRPCServer` classes now have a
  :attr:`rpc_paths` attribute that constrains XML-RPC operations to a limited set
  of URL paths; the default is to allow only ``'/'`` and ``'/RPC2'``. Setting
  :attr:`rpc_paths` to ``None`` or an empty tuple disables this path checking.

.. Bug #1473048

* The :mod:`socket` module now supports :const:`AF_NETLINK` sockets on Linux,
  thanks to a patch from Philippe Biondi. Netlink sockets are a Linux-specific
  mechanism for communications between a user-space process and kernel code; an
  introductory article about them is at http://www.linuxjournal.com/article/7356.
  In Python code, netlink addresses are represented as a tuple of 2 integers,
  ``(pid, group_mask)``.

  Two new methods on socket objects, :meth:`recv_into(buffer)` and
  :meth:`recvfrom_into(buffer)`, store the received data in an object that
  supports the buffer protocol instead of returning the data as a string. This
  means you can put the data directly into an array or a memory-mapped file.

  Socket objects also gained :meth:`getfamily`, :meth:`gettype`, and
  :meth:`getproto` accessor methods to retrieve the family, type, and protocol
  values for the socket.

* New module: the :mod:`spwd` module provides functions for accessing the shadow
  password database on systems that support shadow passwords.

* The :mod:`struct` module is now faster because it compiles format strings into
  :class:`Struct` objects with :meth:`pack` and :meth:`unpack` methods. This is
  similar to how the :mod:`re` module lets you create compiled regular expression
  objects. You can still use the module-level :func:`pack` and :func:`unpack`
  functions; they'll create :class:`Struct` objects and cache them. Or you can
  use :class:`Struct` instances directly::

     import struct

     s = struct.Struct('ih3s')

     data = s.pack(1972, 187, 'abc')
     year, number, name = s.unpack(data)

  You can also pack and unpack data to and from buffer objects directly using the
  :meth:`pack_into(buffer, offset, v1, v2, ...)` and
  :meth:`unpack_from(buffer, offset)` methods. This lets you store data directly
  into an array or a memory-mapped file.

  (:class:`Struct` objects were implemented by Bob Ippolito at the NeedForSpeed
  sprint. Support for buffer objects was added by Martin Blais, also at the
  NeedForSpeed sprint.)

* The Python developers switched from CVS to Subversion during the 2.5
  development process. Information about the exact build version is available as
  the ``sys.subversion`` variable, a 3-tuple of ``(interpreter-name, branch-name,
  revision-range)``. For example, at the time of writing my copy of 2.5 was
  reporting ``('CPython', 'trunk', '45313:45315')``.

  This information is also available to C extensions via the
  :cfunc:`Py_GetBuildInfo` function that returns a string of build information
  like this: ``"trunk:45355:45356M, Apr 13 2006, 07:42:19"``. (Contributed by
  Barry Warsaw.)

* Another new function, :func:`sys._current_frames`, returns the current stack
  frames for all running threads as a dictionary mapping thread identifiers to the
  topmost stack frame currently active in that thread at the time the function is
  called. (Contributed by Tim Peters.)

* The :class:`TarFile` class in the :mod:`tarfile` module now has an
  :meth:`extractall` method that extracts all members from the archive into the
  current working directory. It's also possible to set a different directory as
  the extraction target, and to unpack only a subset of the archive's members.

  The compression used for a tarfile opened in stream mode can now be autodetected
  using the mode ``'r|*'``. (Contributed by Lars Gustäbel.)

.. patch 918101

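The :meth:`pack_into` and :meth:`unpack_from` methods described in the
:mod:`struct` entry above can be sketched as follows. The example targets a
current Python 3 interpreter and uses a ``bytearray`` as the writable buffer;
the ``<`` prefix (little-endian, no padding) and the offset of 4 are
assumptions added here so that the record size is predictable.

```python
import struct

# Little-endian layout without padding: 4-byte int, 2-byte short, 3 bytes.
record = struct.Struct("<ih3s")

# A preallocated, writable buffer; an array or a memory-mapped file could
# be used in the same way.
buf = bytearray(16)

# Write the record in place, starting 4 bytes into the buffer.
record.pack_into(buf, 4, 1972, 187, b"abc")

# Read it back from the same offset without slicing or copying the buffer.
year, number, name = record.unpack_from(buf, 4)
```

Since the data is written straight into ``buf``, no intermediate string has to
be created and then copied into place.
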
* The :mod:`threading` module now lets you set the stack size used when new
  threads are created. The :func:`stack_size([*size*])` function returns the
  currently configured stack size, and supplying the optional *size* parameter
  sets a new value. Not all platforms support changing the stack size, but
  Windows, POSIX threading, and OS/2 all do. (Contributed by Andrew MacIntyre.)

.. Patch 1454481

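A short sketch of :func:`threading.stack_size` as described above follows. The
512 KiB figure is an arbitrary value chosen for illustration, and the exception
types caught are those raised by current Python 3 interpreters when a size is
rejected or the platform cannot change it.

```python
import threading

default_size = threading.stack_size()   # 0 means "use the platform default"
requested = 512 * 1024                  # illustrative value: 512 KiB
changed = False

try:
    threading.stack_size(requested)     # applies to threads created afterwards
    changed = True
except (ValueError, RuntimeError):
    # ValueError: the size was rejected; RuntimeError: not supported here.
    pass

seen = []
thread = threading.Thread(target=lambda: seen.append(threading.stack_size()))
thread.start()
thread.join()

if changed:
    threading.stack_size(default_size)  # restore the earlier setting
```
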
* The :mod:`unicodedata` module has been updated to use version 4.1.0 of the
  Unicode character database. Version 3.2.0 is required by some specifications,
  so it's still available as :attr:`unicodedata.ucd_3_2_0`.

* New module: the :mod:`uuid` module generates universally unique identifiers
  (UUIDs) according to :rfc:`4122`. The RFC defines several different UUID
  versions that are generated from a starting string, from system properties, or
  purely randomly. This module contains a :class:`UUID` class and functions
  named :func:`uuid1`, :func:`uuid3`, :func:`uuid4`, and :func:`uuid5` to
  generate different versions of UUID. (Version 2 UUIDs are not specified in
  :rfc:`4122` and are not supported by this module.) ::

     >>> import uuid
     >>> # make a UUID based on the host ID and current time
     >>> uuid.uuid1()
     UUID('a8098c1a-f86e-11da-bd1a-00112444be1e')

     >>> # make a UUID using an MD5 hash of a namespace UUID and a name
     >>> uuid.uuid3(uuid.NAMESPACE_DNS, 'python.org')
     UUID('6fa459ea-ee8a-3ca4-894e-db77e160355e')

     >>> # make a random UUID
     >>> uuid.uuid4()
     UUID('16fd2706-8baf-433b-82eb-8c7fada847da')

     >>> # make a UUID using a SHA-1 hash of a namespace UUID and a name
     >>> uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org')
     UUID('886313e1-3b8a-5372-9b90-0c9aee199e5d')

  (Contributed by Ka-Ping Yee.)

* The :mod:`weakref` module's :class:`WeakKeyDictionary` and
  :class:`WeakValueDictionary` types gained new methods for iterating over the
  weak references contained in the dictionary. :meth:`iterkeyrefs` and
  :meth:`keyrefs` methods were added to :class:`WeakKeyDictionary`, and
  :meth:`itervaluerefs` and :meth:`valuerefs` were added to
  :class:`WeakValueDictionary`. (Contributed by Fred L. Drake, Jr.)

* The :mod:`webbrowser` module received a number of enhancements. It's now
  usable as a script with ``python -m webbrowser``, taking a URL as the argument;
  there are a number of switches to control the behaviour (:option:`-n` for a new
  browser window, :option:`-t` for a new tab). New module-level functions,
  :func:`open_new` and :func:`open_new_tab`, were added to support this. The
  module's :func:`open` function supports an additional feature, an *autoraise*
  parameter that signals whether to raise the open window when possible. A number
  of additional browsers were added to the supported list such as Firefox, Opera,
  Konqueror, and elinks. (Contributed by Oleg Broytmann and Georg Brandl.)

.. Patch #754022

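The :meth:`keyrefs` method described in the :mod:`weakref` entry above returns
the weak references themselves rather than the keys they point to. A minimal
sketch, using a hypothetical ``Token`` class as the key type:

```python
import weakref


class Token(object):
    """Hypothetical key type; instances of plain classes can be weakly referenced."""


a, b = Token(), Token()

registry = weakref.WeakKeyDictionary()
registry[a] = "first"
registry[b] = "second"

# keyrefs() hands back weak reference objects; calling one returns the key
# while it is still alive, without creating a lasting strong reference.
refs = registry.keyrefs()
alive = [ref() for ref in refs]
```

Holding on to the references instead of the keys means that the list does not
keep the key objects alive by itself.
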
* The :mod:`xmlrpclib` module now supports returning :class:`datetime` objects
  for the XML-RPC date type. Supply ``use_datetime=True`` to the :func:`loads`
  function or the :class:`Unmarshaller` class to enable this feature. (Contributed
  by Skip Montanaro.)

.. Patch 1120353

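The effect of ``use_datetime=True`` described above can be seen by marshalling
a date and unmarshalling it twice. The sketch is written for a current Python 3
interpreter, where the module lives at ``xmlrpc.client``; the timestamp is an
arbitrary sample value.

```python
import datetime

try:
    import xmlrpc.client as xmlrpclib   # module location in Python 3
except ImportError:
    import xmlrpclib                    # module name used in the text

when = datetime.datetime(2006, 9, 19, 12, 30, 0)

# Marshal a one-element parameter tuple into an XML-RPC request body.
payload = xmlrpclib.dumps((when,))

# By default the date comes back wrapped in the module's own DateTime type.
plain_params, _ = xmlrpclib.loads(payload)

# With use_datetime=True it is converted to a datetime.datetime instead.
dt_params, _ = xmlrpclib.loads(payload, use_datetime=True)
```
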
* The :mod:`zipfile` module now supports the ZIP64 version of the format,
  meaning that a .zip archive can now be larger than 4 GiB and can contain
  individual files larger than 4 GiB. (Contributed by Ronald Oussoren.)

.. Patch 1446489

* The :mod:`zlib` module's :class:`Compress` and :class:`Decompress` objects now
  support a :meth:`copy` method that makes a copy of the object's internal state
  and returns a new :class:`Compress` or :class:`Decompress` object.
  (Contributed by Chris AtLee.)

.. Patch 1435422

.. ======================================================================

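One use for the :meth:`copy` method described in the :mod:`zlib` entry above is
compressing several streams that share a common prefix without compressing the
prefix more than once. The following sketch assumes a current Python 3
interpreter; the header and body contents are made-up sample data.

```python
import zlib

header = b"common header\n" * 50

# Compress the shared prefix once.
compressor = zlib.compressobj()
prefix = compressor.compress(header)

# Duplicate the compressor's internal state before the streams diverge.
fork = compressor.copy()

# Each object now finishes its own stream independently.
out_a = prefix + compressor.compress(b"body A") + compressor.flush()
out_b = prefix + fork.compress(b"body B") + fork.flush()
```

Both results are complete zlib streams, so each one can be passed to
:func:`zlib.decompress` on its own.
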
| 1670 | |
| 1671 | .. _module-ctypes: |
| 1672 | |
| 1673 | The ctypes package |
| 1674 | ------------------ |
| 1675 | |
| 1676 | The :mod:`ctypes` package, written by Thomas Heller, has been added to the |
| 1677 | standard library. :mod:`ctypes` lets you call arbitrary functions in shared |
| 1678 | libraries or DLLs. Long-time users may remember the :mod:`dl` module, which |
| 1679 | provides functions for loading shared libraries and calling functions in them. |
| 1680 | The :mod:`ctypes` package is much fancier. |
| 1681 | |
| 1682 | To load a shared library or DLL, you must create an instance of the |
| 1683 | :class:`CDLL` class and provide the name or path of the shared library or DLL. |
| 1684 | Once that's done, you can call arbitrary functions by accessing them as |
| 1685 | attributes of the :class:`CDLL` object. :: |
| 1686 | |
| 1687 | import ctypes |
| 1688 | |
| 1689 | libc = ctypes.CDLL('libc.so.6') |
| 1690 | result = libc.printf("Line of output\n") |
| 1691 | |
| 1692 | Type constructors for the various C types are provided: :func:`c_int`, |
| 1693 | :func:`c_float`, :func:`c_double`, :func:`c_char_p` (equivalent to :ctype:`char |
| 1694 | \*`), and so forth. Unlike Python's types, the C versions are all mutable; you |
| 1695 | can assign to their :attr:`value` attribute to change the wrapped value. Python |
| 1696 | integers and strings will be automatically converted to the corresponding C |
| 1697 | types, but for other types you must call the correct type constructor. (And I |
| 1698 | mean *must*; getting it wrong will often result in the interpreter crashing |
| 1699 | with a segmentation fault.) |
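
As a small illustration of the mutable :attr:`value` attribute, the sketch
below wraps a few Python values in C types. It loads no shared library, so it
should run on any platform. It assumes a current Python interpreter, where
:func:`c_char_p` takes a bytes object; the variable names are arbitrary.

```python
import ctypes

# Wrap a Python int in a mutable C int.
counter = ctypes.c_int(5)
counter.value += 10          # the wrapped value changes in place

# Wrap a Python float explicitly with the matching constructor.
ratio = ctypes.c_double(2.5)

# c_char_p wraps a C string; on current Python versions it takes bytes.
label = ctypes.c_char_p(b"spam")
```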

You shouldn't use :func:`c_char_p` with a Python string when the C function will
be modifying the memory area, because Python strings are supposed to be
immutable; breaking this rule will cause puzzling bugs. When you need a
modifiable memory area, use :func:`create_string_buffer`::

   s = "this is a string"
   buf = ctypes.create_string_buffer(s)
   libc.strfry(buf)
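
The ``strfry`` call above needs the GNU C library. A more portable way to see
that the buffer really is modifiable is to let ``ctypes.memset`` overwrite part
of it. This is only a sketch, assuming a current Python interpreter where the
initial contents must be a bytes object.

```python
import ctypes

# On current Python versions the initial contents must be bytes.
buf = ctypes.create_string_buffer(b"this is a string")

# The buffer holds the text plus a trailing NUL byte.
size = ctypes.sizeof(buf)

# Let C-level code overwrite the first four bytes in place.
ctypes.memset(buf, ord("x"), 4)

# .value returns the bytes up to the first NUL.
contents = buf.value
```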

C functions are assumed to return integers, but you can set the :attr:`restype`
attribute of the function object to change this::

   >>> libc.atof('2.71828')
   -1783957616
   >>> libc.atof.restype = ctypes.c_double
   >>> libc.atof('2.71828')
   2.71828

:mod:`ctypes` also provides a wrapper for Python's C API as the
``ctypes.pythonapi`` object. This object does *not* release the global
interpreter lock before calling a function, because the lock must be held when
calling into the interpreter's code. There's a :class:`py_object()` type
constructor that will create a :ctype:`PyObject \*` pointer. A simple usage::

   import ctypes

   d = {}
   ctypes.pythonapi.PyObject_SetItem(ctypes.py_object(d),
             ctypes.py_object("abc"), ctypes.py_object(1))
   # d is now {'abc': 1}.

Don't forget to use :class:`py_object()`; if it's omitted you end up with a
segmentation fault.

:mod:`ctypes` has been around for a while, but people still write and
distribute hand-coded extension modules because you can't rely on
:mod:`ctypes` being present. Perhaps developers will begin to write Python
wrappers atop a library accessed through :mod:`ctypes` instead of extension
modules, now that :mod:`ctypes` is included with core Python.


.. seealso::

   http://starship.python.net/crew/theller/ctypes/
      The ctypes web page, with a tutorial, reference, and FAQ.

   The documentation for the :mod:`ctypes` module.

.. ======================================================================


.. _module-etree:

The ElementTree package
-----------------------

A subset of Fredrik Lundh's ElementTree library for processing XML has been
added to the standard library as :mod:`xml.etree`. The available modules are
:mod:`ElementTree`, :mod:`ElementPath`, and :mod:`ElementInclude` from
ElementTree 1.2.6. The :mod:`cElementTree` accelerator module is also
included.

The rest of this section will provide a brief overview of using ElementTree.
Full documentation for ElementTree is available at
http://effbot.org/zone/element-index.htm.

ElementTree represents an XML document as a tree of element nodes. The text
content of the document is stored as the :attr:`text` and :attr:`tail`
attributes of each element. (This is one of the major differences between
ElementTree and the Document Object Model; in the DOM there are many different
types of node, including :class:`TextNode`.)

The most commonly used parsing function is :func:`parse`, which takes either a
string (assumed to contain a filename) or a file-like object and returns an
:class:`ElementTree` instance::

   import urllib
   from xml.etree import ElementTree as ET

   tree = ET.parse('ex-1.xml')

   feed = urllib.urlopen(
               'http://planet.python.org/rss10.xml')
   tree = ET.parse(feed)

Once you have an :class:`ElementTree` instance, you can call its :meth:`getroot`
method to get the root :class:`Element` node.

There's also an :func:`XML` function that takes a string literal and returns an
:class:`Element` node (not an :class:`ElementTree`). This function provides a
tidy way to incorporate XML fragments, approaching the convenience of an XML
literal::

   svg = ET.XML("""<svg width="10px" version="1.0">
                </svg>""")
   svg.set('height', '320px')
   svg.append(elem1)          # elem1 is an Element built elsewhere

Each XML element supports some dictionary-like and some list-like access
methods. Dictionary-like operations are used to access attribute values, and
list-like operations are used to access child nodes.

+-------------------------------+--------------------------------------------+
| Operation                     | Result                                     |
+===============================+============================================+
| ``elem[n]``                   | Returns n'th child element.                |
+-------------------------------+--------------------------------------------+
| ``elem[m:n]``                 | Returns list of m'th through (n-1)'th      |
|                               | child elements.                            |
+-------------------------------+--------------------------------------------+
| ``len(elem)``                 | Returns number of child elements.          |
+-------------------------------+--------------------------------------------+
| ``list(elem)``                | Returns list of child elements.            |
+-------------------------------+--------------------------------------------+
| ``elem.append(elem2)``        | Adds *elem2* as a child.                   |
+-------------------------------+--------------------------------------------+
| ``elem.insert(index, elem2)`` | Inserts *elem2* at the specified location. |
+-------------------------------+--------------------------------------------+
| ``del elem[n]``               | Deletes n'th child element.                |
+-------------------------------+--------------------------------------------+
| ``elem.keys()``               | Returns list of attribute names.           |
+-------------------------------+--------------------------------------------+
| ``elem.get(name)``            | Returns value of attribute *name*.         |
+-------------------------------+--------------------------------------------+
| ``elem.set(name, value)``     | Sets new value for attribute *name*.       |
+-------------------------------+--------------------------------------------+
| ``elem.attrib``               | Retrieves the dictionary containing        |
|                               | attributes.                                |
+-------------------------------+--------------------------------------------+
| ``del elem.attrib[name]``     | Deletes attribute *name*.                  |
+-------------------------------+--------------------------------------------+
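
The operations in the table can be tried out on a small in-memory document.
The XML fragment and the tag and attribute names below are invented for the
example; it is a sketch rather than a complete tour of the interface.

```python
from xml.etree import ElementTree as ET

# A small, made-up document used only for this illustration.
root = ET.XML('<shelf owner="ann"><book/><dvd/><book/></shelf>')

# List-like access works on child elements.
first_tag = root[0].tag
middle = [child.tag for child in root[1:3]]
root.append(ET.Element('cd'))
del root[0]
count = len(root)

# Dictionary-like access works on attributes.
root.set('room', 'study')
owner = root.get('owner')
names = sorted(root.keys())
del root.attrib['owner']
```

After these steps the root element has three children (``dvd``, ``book``,
``cd``) and a single ``room`` attribute.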

Comments and processing instructions are also represented as :class:`Element`
nodes. To check if a node is a comment or processing instruction::

   if elem.tag is ET.Comment:
       ...
   elif elem.tag is ET.ProcessingInstruction:
       ...

To generate XML output, you should call the :meth:`ElementTree.write` method.
Like :func:`parse`, it can take either a string or a file-like object::

   # Encoding is US-ASCII
   tree.write('output.xml')

   # Encoding is UTF-8
   f = open('output.xml', 'w')
   tree.write(f, encoding='utf-8')
   f.close()

(Caution: the default encoding used for output is ASCII. For general XML work,
where an element's name may contain arbitrary Unicode characters, ASCII isn't a
very useful encoding because it will raise an exception if an element's name
contains any characters with values greater than 127. Therefore, it's best to
specify a different encoding such as UTF-8 that can handle any Unicode
character.)
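
To see the effect of the *encoding* argument without touching the disk, the
output can be written to an in-memory file-like object. The sketch below
assumes a current Python interpreter and uses ``io.BytesIO`` as the target;
the element name and text are made up.

```python
import io
from xml.etree import ElementTree as ET

# A made-up element whose text contains a non-ASCII character.
root = ET.Element('greeting')
root.text = u'caf\xe9'
tree = ET.ElementTree(root)

# ElementTree.write() accepts any file-like object, here an in-memory one.
buf = io.BytesIO()
tree.write(buf, encoding='utf-8')
data = buf.getvalue()

# Parsing the bytes again recovers the original text.
round_trip = ET.fromstring(data).text
```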

This section is only a partial description of the ElementTree interfaces. Please
read the package's official documentation for more details.


.. seealso::

   http://effbot.org/zone/element-index.htm
      Official documentation for ElementTree.

.. ======================================================================


.. _module-hashlib:

The hashlib package
-------------------

A new :mod:`hashlib` module, written by Gregory P. Smith, has been added to
replace the :mod:`md5` and :mod:`sha` modules. :mod:`hashlib` adds support for
additional secure hashes (SHA-224, SHA-256, SHA-384, and SHA-512). When
available, the module uses OpenSSL for fast, platform-optimized implementations
of algorithms.

The old :mod:`md5` and :mod:`sha` modules still exist as wrappers around hashlib
to preserve backwards compatibility. The new module's interface is very close
to that of the old modules, but not identical. The most significant difference
is that the constructor functions for creating new hashing objects are named
differently. ::

   # Old versions
   h = md5.md5()
   h = md5.new()

   # New version
   h = hashlib.md5()

   # Old versions
   h = sha.sha()
   h = sha.new()

   # New version
   h = hashlib.sha1()

   # Hashes that weren't previously available
   h = hashlib.sha224()
   h = hashlib.sha256()
   h = hashlib.sha384()
   h = hashlib.sha512()

   # Alternative form
   h = hashlib.new('md5')          # Provide algorithm as a string

Once a hash object has been created, its methods are the same as before:
:meth:`update(string)` hashes the specified string into the current digest
state, :meth:`digest` and :meth:`hexdigest` return the digest value as a binary
string or a string of hex digits, and :meth:`copy` returns a new hashing object
with the same digest state.
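
The methods named above can be exercised as follows. This is a short sketch
for a current Python interpreter, where the data passed to :meth:`update` must
be bytes; the input strings are arbitrary.

```python
import hashlib

# Feed the data in pieces; the digest matches hashing it in one call.
h = hashlib.sha256()
h.update(b"spam and ")
h.update(b"eggs")
one_shot = hashlib.sha256(b"spam and eggs")

# copy() snapshots the digest state so two continuations can diverge.
fork = h.copy()
fork.update(b" and ham")

raw = h.digest()        # binary string
text = h.hexdigest()    # the same value as hex digits
```

Updating ``fork`` leaves ``h`` untouched, so ``h`` still matches ``one_shot``.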


.. seealso::

   The documentation for the :mod:`hashlib` module.

.. ======================================================================


.. _module-sqlite:

The sqlite3 package
-------------------

The pysqlite module (http://www.pysqlite.org), a wrapper for the SQLite embedded
database, has been added to the standard library under the package name
:mod:`sqlite3`.

SQLite is a C library that provides a lightweight disk-based database that
doesn't require a separate server process and allows accessing the database
using a nonstandard variant of the SQL query language. Some applications can use
SQLite for internal data storage. It's also possible to prototype an
application using SQLite and then port the code to a larger database such as
PostgreSQL or Oracle.

pysqlite was written by Gerhard Häring and provides a SQL interface compliant
with the DB-API 2.0 specification described by :pep:`249`.

If you're compiling the Python source yourself, note that the source tree
doesn't include the SQLite code, only the wrapper module. You'll need to have
the SQLite libraries and headers installed before compiling Python, and the
build process will compile the module when the necessary headers are available.

To use the module, you must first create a :class:`Connection` object that
represents the database. Here the data will be stored in the
:file:`/tmp/example` file::

   import sqlite3

   conn = sqlite3.connect('/tmp/example')

You can also supply the special name ``:memory:`` to create a database in RAM.

Once you have a :class:`Connection`, you can create a :class:`Cursor` object
and call its :meth:`execute` method to perform SQL commands::

   c = conn.cursor()

   # Create table
   c.execute('''create table stocks
   (date text, trans text, symbol text,
    qty real, price real)''')

   # Insert a row of data
   c.execute("""insert into stocks
             values ('2006-01-05','BUY','RHAT',100,35.14)""")

Usually your SQL operations will need to use values from Python variables. You
shouldn't assemble your query using Python's string operations because doing so
is insecure; it makes your program vulnerable to an SQL injection attack.

Instead, use the DB-API's parameter substitution. Put ``?`` as a placeholder
wherever you want to use a value, and then provide a tuple of values as the
second argument to the cursor's :meth:`execute` method. (Other database modules
may use a different placeholder, such as ``%s`` or ``:1``.) For example::

   # Never do this -- insecure!
   symbol = 'IBM'
   c.execute("... where symbol = '%s'" % symbol)

   # Do this instead
   t = (symbol,)
   c.execute('select * from stocks where symbol=?', t)

   # Larger example
   for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
             ('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
             ('2006-04-06', 'SELL', 'IBM', 500, 53.00),
            ):
       c.execute('insert into stocks values (?,?,?,?,?)', t)

To retrieve data after executing a SELECT statement, you can either treat the
cursor as an iterator, call the cursor's :meth:`fetchone` method to retrieve a
single matching row, or call :meth:`fetchall` to get a list of the matching
rows.

This example uses the iterator form::

   >>> c = conn.cursor()
   >>> c.execute('select * from stocks order by price')
   >>> for row in c:
   ...    print row
   ...
   (u'2006-01-05', u'BUY', u'RHAT', 100, 35.140000000000001)
   (u'2006-03-28', u'BUY', u'IBM', 1000, 45.0)
   (u'2006-04-06', u'SELL', u'IBM', 500, 53.0)
   (u'2006-04-05', u'BUY', u'MSOFT', 1000, 72.0)
   >>>
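
The other two retrieval styles, :meth:`fetchone` and :meth:`fetchall`, look
like this. The sketch uses an in-memory database and a simplified,
hypothetical version of the ``stocks`` table (three columns instead of five)
so that it is self-contained.

```python
import sqlite3

conn = sqlite3.connect(':memory:')      # in-RAM database, nothing on disk
c = conn.cursor()
c.execute('create table stocks (symbol text, qty real, price real)')

rows = [('RHAT', 100, 35.14), ('IBM', 1000, 45.0), ('MSOFT', 1000, 72.0)]
for row in rows:
    c.execute('insert into stocks values (?,?,?)', row)

# fetchone() returns a single row, or None when nothing matches.
c.execute('select price from stocks where symbol=?', ('IBM',))
ibm_price = c.fetchone()[0]

# fetchall() returns every remaining row as a list of tuples.
c.execute('select symbol from stocks where price > ? order by price', (40,))
expensive = [symbol for (symbol,) in c.fetchall()]

conn.close()
```

Both queries pass their values through ``?`` placeholders rather than building
the SQL text by hand.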

For more information about the SQL dialect supported by SQLite, see
http://www.sqlite.org.


.. seealso::

   http://www.pysqlite.org
      The pysqlite web page.

   http://www.sqlite.org
      The SQLite web page; the documentation describes the syntax and the available
      data types for the supported SQL dialect.

   The documentation for the :mod:`sqlite3` module.

   :pep:`249` - Database API Specification 2.0
      PEP written by Marc-André Lemburg.

.. ======================================================================


.. _module-wsgiref:

The wsgiref package
-------------------

The Web Server Gateway Interface (WSGI) v1.0 defines a standard interface
between web servers and Python web applications and is described in :pep:`333`.
The :mod:`wsgiref` package is a reference implementation of the WSGI
specification.

.. XXX should this be in a PEP 333 section instead?

The package includes a basic HTTP server that will run a WSGI application; this
server is useful for debugging but isn't intended for production use. Setting
up a server takes only a few lines of code::

   from wsgiref import simple_server

   wsgi_app = ...

   host = ''
   port = 8000
   httpd = simple_server.make_server(host, port, wsgi_app)
   httpd.serve_forever()
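
The snippet above leaves the application itself unspecified. As one possible
illustration, a WSGI application is just a callable that takes the request
environment and a ``start_response`` function and returns an iterable of byte
strings. The ``hello_app`` function below is hypothetical, and the sketch calls
it directly (using ``wsgiref.util.setup_testing_defaults`` to build a request
environment) instead of starting a server.

```python
from wsgiref.util import setup_testing_defaults

def hello_app(environ, start_response):
    """Hypothetical WSGI application that greets the requested path."""
    body = ('Hello from %s\n' % environ['PATH_INFO']).encode('utf-8')
    headers = [('Content-Type', 'text/plain; charset=utf-8'),
               ('Content-Length', str(len(body)))]
    start_response('200 OK', headers)
    return [body]

# Call the application directly, without starting a server.
environ = {'PATH_INFO': '/demo'}
setup_testing_defaults(environ)      # fills in the other required keys
captured = {}

def start_response(status, headers, exc_info=None):
    # Record what the application reports instead of sending it anywhere.
    captured['status'] = status
    captured['headers'] = dict(headers)

response = b''.join(hello_app(environ, start_response))
```

A callable of this shape is what :func:`make_server` expects as its third
argument.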

.. XXX discuss structure of WSGI applications?
.. XXX provide an example using Django or some other framework?


.. seealso::

   http://www.wsgi.org
      A central web site for WSGI-related resources.

   :pep:`333` - Python Web Server Gateway Interface v1.0
      PEP written by Phillip J. Eby.

.. ======================================================================


.. _build-api:

Build and C API Changes
=======================

Changes to Python's build process and to the C API include:

* The Python source tree was converted from CVS to Subversion, in a complex
  migration procedure that was supervised and flawlessly carried out by Martin von
  Löwis. The procedure was developed as :pep:`347`.

* Coverity, a company that markets a source code analysis tool called Prevent,
  provided the results of their examination of the Python source code. The
  analysis found about 60 bugs that were quickly fixed. Many of the bugs were
  refcounting problems, often occurring in error-handling code. See
  http://scan.coverity.com for the statistics.

* The largest change to the C API came from :pep:`353`, which modifies the
  interpreter to use a :ctype:`Py_ssize_t` type definition instead of
  :ctype:`int`. See the earlier section :ref:`pep-353` for a discussion of this
  change.

* The design of the bytecode compiler has changed a great deal, no longer
  generating bytecode by traversing the parse tree. Instead the parse tree is
  converted to an abstract syntax tree (or AST), and it is the abstract syntax
  tree that's traversed to produce the bytecode.

  It's possible for Python code to obtain AST objects by using the
  :func:`compile` built-in and specifying ``_ast.PyCF_ONLY_AST`` as the value of
  the *flags* parameter::

     from _ast import PyCF_ONLY_AST
     ast = compile("""a=0
     for i in range(10):
         a += i
     """, "<string>", 'exec', PyCF_ONLY_AST)

     assignment = ast.body[0]
     for_loop = ast.body[1]

  No official documentation has been written for the AST code yet, but :pep:`339`
  discusses the design. To start learning about the code, read the definition of
  the various AST nodes in :file:`Parser/Python.asdl`. A Python script reads this
  file and generates a set of C structure definitions in
  :file:`Include/Python-ast.h`. The :cfunc:`PyParser_ASTFromString` and
  :cfunc:`PyParser_ASTFromFile` functions, defined in
  :file:`Include/pythonrun.h`, take Python source as input and return the root
  of an AST representing the contents. This AST can then be turned into a code
  object by :cfunc:`PyAST_Compile`. For more information, read the source code,
  and then ask questions on python-dev.

  The AST code was developed under Jeremy Hylton's management, and implemented by
  (in alphabetical order) Brett Cannon, Nick Coghlan, Grant Edwards, John
  Ehresman, Kurt Kaiser, Neal Norwitz, Tim Peters, Armin Rigo, and Neil
  Schemenauer, plus the participants in a number of AST sprints at conferences
  such as PyCon.

  .. List of names taken from Jeremy's python-dev post at
  .. http://mail.python.org/pipermail/python-dev/2005-October/057500.html

* Evan Jones's patch to obmalloc, first described in a talk at PyCon DC 2005,
  was applied. Python 2.4 allocated small objects in 256K-sized arenas, but never
  freed arenas. With this patch, Python will free arenas when they're empty. The
  net effect is that on some platforms, when you allocate many objects, Python's
  memory usage may actually drop when you delete them and the memory may be
  returned to the operating system. (Implemented by Evan Jones, and reworked by
  Tim Peters.)

  Note that this change means extension modules must be more careful when
  allocating memory. Python's API has many different functions for allocating
  memory that are grouped into families. For example, :cfunc:`PyMem_Malloc`,
  :cfunc:`PyMem_Realloc`, and :cfunc:`PyMem_Free` are one family that allocates
  raw memory, while :cfunc:`PyObject_Malloc`, :cfunc:`PyObject_Realloc`, and
  :cfunc:`PyObject_Free` are another family that's supposed to be used for
  creating Python objects.

  Previously these different families all reduced to the platform's
  :cfunc:`malloc` and :cfunc:`free` functions. This meant it didn't matter if
  you got things wrong and allocated memory with the :cfunc:`PyMem` function but
  freed it with the :cfunc:`PyObject` function. With 2.5's changes to obmalloc,
  these families now do different things and mismatches will probably result in a
  segfault. You should carefully test your C extension modules with Python 2.5.

* The built-in set types now have an official C API. Call :cfunc:`PySet_New`
  and :cfunc:`PyFrozenSet_New` to create a new set, :cfunc:`PySet_Add` and
  :cfunc:`PySet_Discard` to add and remove elements, and :cfunc:`PySet_Contains`
  and :cfunc:`PySet_Size` to examine the set's state. (Contributed by Raymond
  Hettinger.)

* C code can now obtain information about the exact revision of the Python
  interpreter by calling the :cfunc:`Py_GetBuildInfo` function that returns a
  string of build information like this: ``"trunk:45355:45356M, Apr 13 2006,
  07:42:19"``. (Contributed by Barry Warsaw.)

* Two new macros can be used to indicate C functions that are local to the
  current file so that a faster calling convention can be used.
  :cfunc:`Py_LOCAL(type)` declares the function as returning a value of the
  specified *type* and uses a fast-calling qualifier.
  :cfunc:`Py_LOCAL_INLINE(type)` does the same thing and also requests the
  function be inlined. If :cfunc:`PY_LOCAL_AGGRESSIVE` is defined before
  :file:`Python.h` is included, a set of more aggressive optimizations is enabled
  for the module; you should benchmark the results to find out if these
  optimizations actually make the code faster. (Contributed by Fredrik Lundh at
  the NeedForSpeed sprint.)

* :cfunc:`PyErr_NewException(name, base, dict)` can now accept a tuple of base
  classes as its *base* argument. (Contributed by Georg Brandl.)

* The :cfunc:`PyErr_Warn` function for issuing warnings is now deprecated in
  favour of :cfunc:`PyErr_WarnEx(category, message, stacklevel)` which lets you
  specify the number of stack frames separating this function and the caller. A
  *stacklevel* of 1 is the function calling :cfunc:`PyErr_WarnEx`, 2 is the
  function above that, and so forth. (Added by Neal Norwitz.)

* The CPython interpreter is still written in C, but the code can now be
  compiled with a C++ compiler without errors. (Implemented by Anthony Baxter,
  Martin von Löwis, Skip Montanaro.)

* The :cfunc:`PyRange_New` function was removed. It was never documented, never
  used in the core code, and had dangerously lax error checking. In the unlikely
  case that your extensions were using it, you can replace it by something like
  the following::

     range = PyObject_CallFunction((PyObject*) &PyRange_Type, "lll",
                                   start, stop, step);
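
Returning to the abstract syntax tree item in the list above: the object
returned by :func:`compile` with ``PyCF_ONLY_AST`` can be inspected from
Python, and on later Python versions it can also be passed back to
:func:`compile` to obtain a code object. The following sketch assumes such a
later interpreter; the variable names are arbitrary.

```python
from _ast import PyCF_ONLY_AST

source = """a=0
for i in range(10):
    a += i
"""

# Ask compile() for the syntax tree instead of a code object.
tree = compile(source, "<string>", "exec", PyCF_ONLY_AST)

assignment, for_loop = tree.body
node_types = [type(node).__name__ for node in tree.body]

# The tree can then be compiled to a code object and run like any other code.
namespace = {}
exec(compile(tree, "<string>", "exec"), namespace)
total = namespace["a"]
```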

.. ======================================================================


.. _ports:

Port-Specific Changes
---------------------

* MacOS X (10.3 and higher): dynamic loading of modules now uses the
  :cfunc:`dlopen` function instead of MacOS-specific functions.

* MacOS X: an :option:`--enable-universalsdk` switch was added to the
  :program:`configure` script that compiles the interpreter as a universal binary
  able to run on both PowerPC and Intel processors. (Contributed by Ronald
  Oussoren; :issue:`2573`.)

* Windows: :file:`.dll` is no longer supported as a filename extension for
  extension modules. :file:`.pyd` is now the only filename extension that will be
  searched for.

.. ======================================================================


.. _porting:

Porting to Python 2.5
=====================

This section lists previously described changes that may require changes to your
code:

* ASCII is now the default encoding for modules. It's now a syntax error if a
  module contains string literals with 8-bit characters but doesn't have an
  encoding declaration. In Python 2.4 this triggered a warning, not a syntax
  error.

* Previously, the :attr:`gi_frame` attribute of a generator was always a frame
  object. Because of the :pep:`342` changes described in section :ref:`pep-342`,
  it's now possible for :attr:`gi_frame` to be ``None``.

* A new warning, :class:`UnicodeWarning`, is triggered when you attempt to
  compare a Unicode string and an 8-bit string that can't be converted to Unicode
  using the default ASCII encoding. Previously such comparisons would raise a
  :class:`UnicodeDecodeError` exception.

* Library: the :mod:`csv` module is now stricter about multi-line quoted fields.
  If your files contain newlines embedded within fields, the input should be split
  into lines in a manner which preserves the newline characters.

* Library: the :mod:`locale` module's :func:`format` function would
  previously accept any string as long as no more than one %char specifier
  appeared. In Python 2.5, the argument must be exactly one %char specifier with
  no surrounding text.

* Library: The :mod:`pickle` and :mod:`cPickle` modules no longer accept a
  return value of ``None`` from the :meth:`__reduce__` method; the method must
  return a tuple of arguments instead. The modules also no longer accept the
  deprecated *bin* keyword parameter.

* Library: The :mod:`SimpleXMLRPCServer` and :mod:`DocXMLRPCServer` classes now
  have a :attr:`rpc_paths` attribute that constrains XML-RPC operations to a
  limited set of URL paths; the default is to allow only ``'/'`` and ``'/RPC2'``.
  Setting :attr:`rpc_paths` to ``None`` or an empty tuple disables this path
  checking.

* C API: Many functions now use :ctype:`Py_ssize_t` instead of :ctype:`int` to
  allow processing more data on 64-bit machines. Extension code may need to make
  the same change to avoid warnings and to support 64-bit machines. See the
  earlier section :ref:`pep-353` for a discussion of this change.

* C API: The obmalloc changes mean that you must be careful to not mix usage
  of the :cfunc:`PyMem_\*` and :cfunc:`PyObject_\*` families of functions. Memory
  allocated with one family's :cfunc:`\*_Malloc` must be freed with the
  corresponding family's :cfunc:`\*_Free` function.
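
Of the items above, the :attr:`gi_frame` change is easy to check from Python.
The sketch below uses a made-up generator and shows the defensive test that
code inspecting generator frames may need; it is an illustration, not a
required idiom.

```python
def countdown(n):
    """Hypothetical generator used only for this illustration."""
    while n > 0:
        yield n
        n -= 1

gen = countdown(2)
frame_before = gen.gi_frame          # a frame object while the generator is live

values = list(gen)                   # run the generator to exhaustion
frame_after = gen.gi_frame           # None once the generator has finished

# Defensive pattern: check for None before touching frame attributes.
lineno = gen.gi_frame.f_lineno if gen.gi_frame is not None else None
```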

.. ======================================================================


Acknowledgements
================

The author would like to thank the following people for offering suggestions,
corrections and assistance with various drafts of this article: Georg Brandl,
Nick Coghlan, Phillip J. Eby, Lars Gustäbel, Raymond Hettinger, Ralf W.
Grosse-Kunstleve, Kent Johnson, Iain Lowe, Martin von Löwis, Fredrik Lundh,
Andrew McNamara, Skip Montanaro, Gustavo Niemeyer, Paul Prescod, James Pryor,
Mike Rovner, Scott Weikart, Barry Warsaw, Thomas Wouters.