****************************
  What's New in Python 2.3
****************************

:Author: A.M. Kuchling

.. |release| replace:: 1.01

.. $Id: whatsnew23.tex 54631 2007-03-31 11:58:36Z georg.brandl $

This article explains the new features in Python 2.3. Python 2.3 was released
on July 29, 2003.

The main themes for Python 2.3 are polishing some of the features added in 2.2,
adding various small but useful enhancements to the core language, and expanding
the standard library. The new object model introduced in the previous version
has benefited from 18 months of bugfixes and from optimization efforts that have
improved the performance of new-style classes. A few new built-in functions
have been added such as :func:`sum` and :func:`enumerate`. The :keyword:`in`
operator can now be used for substring searches (e.g. ``"ab" in "abc"`` returns
:const:`True`).

Some of the many new library features include Boolean, set, heap, and date/time
data types, the ability to import modules from ZIP-format archives, metadata
support for the long-awaited Python catalog, an updated version of IDLE, and
modules for logging messages, wrapping text, parsing CSV files, processing
command-line options, using BerkeleyDB databases... the list of new and
enhanced modules is lengthy.

This article doesn't attempt to provide a complete specification of the new
features, but instead provides a convenient overview. For full details, you
should refer to the documentation for Python 2.3, such as the Python Library
Reference and the Python Reference Manual. If you want to understand the
complete implementation and design rationale, refer to the PEP for a particular
new feature.

.. ======================================================================


PEP 218: A Standard Set Datatype
================================

The new :mod:`sets` module contains an implementation of a set datatype. The
:class:`Set` class is for mutable sets, sets that can have members added and
removed. The :class:`ImmutableSet` class is for sets that can't be modified,
and instances of :class:`ImmutableSet` can therefore be used as dictionary keys.
Sets are built on top of dictionaries, so the elements within a set must be
hashable.

Here's a simple example::

   >>> import sets
   >>> S = sets.Set([1,2,3])
   >>> S
   Set([1, 2, 3])
   >>> 1 in S
   True
   >>> 0 in S
   False
   >>> S.add(5)
   >>> S.remove(3)
   >>> S
   Set([1, 2, 5])
   >>>

The union and intersection of sets can be computed with the :meth:`union` and
:meth:`intersection` methods; an alternative notation uses the bitwise operators
``|`` and ``&``. Mutable sets also have in-place versions of these methods,
:meth:`union_update` and :meth:`intersection_update`. ::

   >>> S1 = sets.Set([1,2,3])
   >>> S2 = sets.Set([4,5,6])
   >>> S1.union(S2)
   Set([1, 2, 3, 4, 5, 6])
   >>> S1 | S2                  # Alternative notation
   Set([1, 2, 3, 4, 5, 6])
   >>> S1.intersection(S2)
   Set([])
   >>> S1 & S2                  # Alternative notation
   Set([])
   >>> S1.union_update(S2)
   >>> S1
   Set([1, 2, 3, 4, 5, 6])
   >>>

It's also possible to take the symmetric difference of two sets. This is the
set of all elements in the union that aren't in the intersection. Another way
of putting it is that the symmetric difference contains all elements that are in
exactly one set. Again, there's an alternative notation (``^``), and an
in-place version with the ungainly name :meth:`symmetric_difference_update`. ::

   >>> S1 = sets.Set([1,2,3,4])
   >>> S2 = sets.Set([3,4,5,6])
   >>> S1.symmetric_difference(S2)
   Set([1, 2, 5, 6])
   >>> S1 ^ S2
   Set([1, 2, 5, 6])
   >>>

There are also :meth:`issubset` and :meth:`issuperset` methods for checking
whether one set is a subset or superset of another::

   >>> S1 = sets.Set([1,2,3])
   >>> S2 = sets.Set([2,3])
   >>> S2.issubset(S1)
   True
   >>> S1.issubset(S2)
   False
   >>> S1.issuperset(S2)
   True
   >>>

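The remark that immutable sets can serve as dictionary keys can be illustrated
with a short sketch. The :mod:`sets` module is not available in current Python
versions, so this example assumes the built-in ``frozenset`` and ``set`` types
as stand-ins for :class:`ImmutableSet` and :class:`Set`; the dictionary contents
are made up for illustration.

```python
# Assumption: built-in frozenset/set stand in for sets.ImmutableSet/sets.Set.
primary = frozenset(['red', 'green', 'blue'])
mutable = set(['red', 'green', 'blue'])

# An immutable set is hashable, so it can be used as a dictionary key.
names = {primary: 'additive primaries'}
lookup = names[frozenset(['blue', 'green', 'red'])]  # element order is irrelevant

# A mutable set is not hashable, so using it as a key raises TypeError.
try:
    names[mutable] = 'will not work'
    mutable_key_ok = True
except TypeError:
    mutable_key_ok = False
```

Two immutable sets with the same members hash alike, which is why the lookup
succeeds even though the key was rebuilt from a differently ordered list.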

.. seealso::

   :pep:`218` - Adding a Built-In Set Object Type
      PEP written by Greg V. Wilson. Implemented by Greg V. Wilson, Alex Martelli, and
      GvR.

.. ======================================================================


.. _section-generators:

PEP 255: Simple Generators
==========================

In Python 2.2, generators were added as an optional feature, to be enabled by a
``from __future__ import generators`` directive. In 2.3 generators no longer
need to be specially enabled, and are now always present; this means that
:keyword:`yield` is now always a keyword. The rest of this section is a copy of
the description of generators from the "What's New in Python 2.2" document; if
you read it back when Python 2.2 came out, you can skip the rest of this
section.

You're doubtless familiar with how function calls work in Python or C. When you
call a function, it gets a private namespace where its local variables are
created. When the function reaches a :keyword:`return` statement, the local
variables are destroyed and the resulting value is returned to the caller. A
later call to the same function will get a fresh new set of local variables.
But, what if the local variables weren't thrown away on exiting a function?
What if you could later resume the function where it left off? This is what
generators provide; they can be thought of as resumable functions.

Here's the simplest example of a generator function::

   def generate_ints(N):
       for i in range(N):
           yield i

A new keyword, :keyword:`yield`, was introduced for generators. Any function
containing a :keyword:`yield` statement is a generator function; this is
detected by Python's bytecode compiler which compiles the function specially as
a result.

When you call a generator function, it doesn't return a single value; instead it
returns a generator object that supports the iterator protocol. On executing
the :keyword:`yield` statement, the generator outputs the value of ``i``,
similar to a :keyword:`return` statement. The big difference between
:keyword:`yield` and a :keyword:`return` statement is that on reaching a
:keyword:`yield` the generator's state of execution is suspended and local
variables are preserved. On the next call to the generator's ``.next()``
method, the function will resume executing immediately after the
:keyword:`yield` statement. (For complicated reasons, the :keyword:`yield`
statement isn't allowed inside the :keyword:`try` block of a :keyword:`try`...\
:keyword:`finally` statement; read :pep:`255` for a full explanation of the
interaction between :keyword:`yield` and exceptions.)

Here's a sample usage of the :func:`generate_ints` generator::

   >>> gen = generate_ints(3)
   >>> gen
   <generator object at 0x8117f90>
   >>> gen.next()
   0
   >>> gen.next()
   1
   >>> gen.next()
   2
   >>> gen.next()
   Traceback (most recent call last):
     File "stdin", line 1, in ?
     File "stdin", line 2, in generate_ints
   StopIteration

You could equally write ``for i in generate_ints(5)``, or ``a,b,c =
generate_ints(3)``.

Inside a generator function, the :keyword:`return` statement can only be used
without a value, and signals the end of the procession of values; afterwards the
generator cannot return any further values. :keyword:`return` with a value, such
as ``return 5``, is a syntax error inside a generator function. The end of the
generator's results can also be indicated by raising :exc:`StopIteration`
manually, or by just letting the flow of execution fall off the bottom of the
function.

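As a small illustration of a bare :keyword:`return` ending a generator early,
consider the following sketch. The function name ``read_until_blank`` and its
input list are hypothetical, chosen only for this example.

```python
def read_until_blank(lines):
    """Yield lines until the first empty one (hypothetical helper)."""
    for line in lines:
        if line == '':
            return          # bare return: the generator is finished
        yield line

# 'gamma' is never produced because the generator ended at the blank entry.
collected = list(read_until_blank(['alpha', 'beta', '', 'gamma']))
```

The consumer sees an ordinary end of iteration, so the generator can be used in
a ``for`` loop or passed to :func:`list` without special handling.
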
You could achieve the effect of generators manually by writing your own class
and storing all the local variables of the generator as instance variables. For
example, returning a list of integers could be done by setting ``self.count`` to
0, and having the :meth:`next` method increment ``self.count`` and return it.
However, for a moderately complicated generator, writing a corresponding class
would be much messier. :file:`Lib/test/test_generators.py` contains a number of
more interesting examples. The simplest one implements an in-order traversal of
a tree using generators recursively. ::

   # A recursive generator that generates Tree leaves in in-order.
   def inorder(t):
       if t:
           for x in inorder(t.left):
               yield x
           yield t.label
           for x in inorder(t.right):
               yield x

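To make the comparison with a hand-written class concrete, here is a minimal
sketch of what :func:`generate_ints` might look like without generators. The
class name ``IntCounter`` is made up, and the ``__next__`` alias is an
assumption added so that the sketch also runs on Python versions newer than the
one this article describes.

```python
class IntCounter(object):
    """Hand-written equivalent of generate_ints(N) (illustrative name)."""

    def __init__(self, N):
        self.N = N
        self.count = 0      # plays the role of the generator's local ``i``

    def __iter__(self):
        return self

    def next(self):
        if self.count >= self.N:
            raise StopIteration
        value = self.count
        self.count += 1
        return value

    __next__ = next         # assumption: alias so newer Pythons accept it too

values = list(IntCounter(3))
```

Even for this trivial case the class needs explicit state and an explicit
end-of-iteration check, both of which the generator version gets for free.
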
Two other examples in :file:`Lib/test/test_generators.py` produce solutions for
the N-Queens problem (placing N queens on an NxN chess board so that no
queen threatens another) and the Knight's Tour (a route that takes a knight to
every square of an NxN chessboard without visiting any square twice).

The idea of generators comes from other programming languages, especially Icon
(http://www.cs.arizona.edu/icon/), where the idea of generators is central. In
Icon, every expression and function call behaves like a generator. One example
from "An Overview of the Icon Programming Language" at
http://www.cs.arizona.edu/icon/docs/ipd266.htm gives an idea of what this looks
like::

   sentence := "Store it in the neighboring harbor"
   if (i := find("or", sentence)) > 5 then write(i)

In Icon the :func:`find` function returns the indexes at which the substring
"or" is found: 3, 23, 33. In the :keyword:`if` statement, ``i`` is first
assigned a value of 3, but 3 is less than 5, so the comparison fails, and Icon
retries it with the second value of 23. 23 is greater than 5, so the comparison
now succeeds, and the code prints the value 23 to the screen.

Python doesn't go nearly as far as Icon in adopting generators as a central
concept. Generators are considered part of the core Python language, but
learning or using them isn't compulsory; if they don't solve any problems that
you have, feel free to ignore them. One novel feature of Python's interface as
compared to Icon's is that a generator's state is represented as a concrete
object (the iterator) that can be passed around to other functions or stored in
a data structure.


.. seealso::

   :pep:`255` - Simple Generators
      Written by Neil Schemenauer, Tim Peters, Magnus Lie Hetland. Implemented mostly
      by Neil Schemenauer and Tim Peters, with other fixes from the Python Labs crew.

.. ======================================================================


.. _section-encodings:

PEP 263: Source Code Encodings
==============================

Python source files can now be declared as being in different character set
encodings. Encodings are declared by including a specially formatted comment in
the first or second line of the source file. For example, a UTF-8 file can be
declared with::

   #!/usr/bin/env python
   # -*- coding: UTF-8 -*-

Without such an encoding declaration, the default encoding used is 7-bit ASCII.
Executing or importing modules that contain string literals with 8-bit
characters and have no encoding declaration will result in a
:exc:`DeprecationWarning` being signalled by Python 2.3; in 2.4 this will be a
syntax error.

The encoding declaration only affects Unicode string literals, which will be
converted to Unicode using the specified encoding. Note that Python identifiers
are still restricted to ASCII characters, so you can't have variable names that
use characters outside of the usual alphanumerics.


.. seealso::

   :pep:`263` - Defining Python Source Code Encodings
      Written by Marc-André Lemburg and Martin von Löwis; implemented by Suzuki Hisao
      and Martin von Löwis.

.. ======================================================================


PEP 273: Importing Modules from ZIP Archives
============================================

The new :mod:`zipimport` module adds support for importing modules from a
ZIP-format archive. You don't need to import the module explicitly; it will be
automatically imported if a ZIP archive's filename is added to ``sys.path``.
For example::

   amk@nyman:~/src/python$ unzip -l /tmp/example.zip
   Archive:  /tmp/example.zip
     Length     Date   Time    Name
    --------    ----   ----    ----
       8467  11-26-02 22:30   jwzthreading.py
    --------                   -------
       8467                   1 file
   amk@nyman:~/src/python$ ./python
   Python 2.3 (#1, Aug 1 2003, 19:54:32)
   >>> import sys
   >>> sys.path.insert(0, '/tmp/example.zip')  # Add .zip file to front of path
   >>> import jwzthreading
   >>> jwzthreading.__file__
   '/tmp/example.zip/jwzthreading.py'
   >>>

An entry in ``sys.path`` can now be the filename of a ZIP archive. The ZIP
archive can contain any kind of files, but only files named :file:`\*.py`,
:file:`\*.pyc`, or :file:`\*.pyo` can be imported. If an archive only contains
:file:`\*.py` files, Python will not attempt to modify the archive by adding the
corresponding :file:`\*.pyc` file, meaning that if a ZIP archive doesn't contain
:file:`\*.pyc` files, importing may be rather slow.

A path within the archive can also be specified to only import from a
subdirectory; for example, the path :file:`/tmp/example.zip/lib/` would only
import from the :file:`lib/` subdirectory within the archive.

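The subdirectory form can be tried out with a throwaway archive. The following
is a minimal sketch, not taken from the original article: the archive name
``demo.zip`` and the module name ``zipdemo_mod`` are invented, and the archive
is built with the standard :mod:`zipfile` module in a temporary directory.

```python
import os
import sys
import tempfile
import zipfile

# Build a throwaway archive holding one module in a lib/ subdirectory.
# The names demo.zip and zipdemo_mod are made up for this example.
tmpdir = tempfile.mkdtemp()
archive = os.path.join(tmpdir, 'demo.zip')
zf = zipfile.ZipFile(archive, 'w')
zf.writestr('lib/zipdemo_mod.py', 'ANSWER = 42\n')
zf.close()

# Only the lib/ subdirectory of the archive is placed on the path.
sys.path.insert(0, os.path.join(archive, 'lib'))
import zipdemo_mod

answer = zipdemo_mod.ANSWER
origin = zipdemo_mod.__file__   # a path that points inside the archive
```

Because the archive holds only a :file:`\*.py` file, the module is compiled on
every import, which is the slowdown the text above warns about.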

.. seealso::

   :pep:`273` - Import Modules from Zip Archives
      Written by James C. Ahlstrom, who also provided an implementation. Python 2.3
      follows the specification in :pep:`273`, but uses an implementation written by
      Just van Rossum that uses the import hooks described in :pep:`302`. See section
      :ref:`section-pep302` for a description of the new import hooks.

.. ======================================================================


PEP 277: Unicode file name support for Windows NT
=================================================

On Windows NT, 2000, and XP, the system stores file names as Unicode strings.
Traditionally, Python has represented file names as byte strings, which is
inadequate because it renders some file names inaccessible.

Python now allows using arbitrary Unicode strings (within the limitations of the
file system) for all functions that expect file names, most notably the
:func:`open` built-in function. If a Unicode string is passed to
:func:`os.listdir`, Python now returns a list of Unicode strings. A new
function, :func:`os.getcwdu`, returns the current directory as a Unicode string.

Byte strings still work as file names, and on Windows Python will transparently
convert them to Unicode using the ``mbcs`` encoding.

Other systems also allow Unicode strings as file names but convert them to byte
strings before passing them to the system, which can cause a :exc:`UnicodeError`
to be raised. Applications can test whether arbitrary Unicode strings are
supported as file names by checking :attr:`os.path.supports_unicode_filenames`,
a Boolean value.

Under MacOS, :func:`os.listdir` may now return Unicode filenames.


.. seealso::

   :pep:`277` - Unicode file name support for Windows NT
      Written by Neil Hodgson; implemented by Neil Hodgson, Martin von Löwis, and Mark
      Hammond.

.. ======================================================================


.. index::
   single: universal newlines; What's new

PEP 278: Universal Newline Support
==================================

The three major operating systems used today are Microsoft Windows, Apple's
Macintosh OS, and the various Unix derivatives. A minor irritation of
cross-platform work is that these three platforms all use different characters to
mark the ends of lines in text files. Unix uses the linefeed (ASCII character
10), MacOS uses the carriage return (ASCII character 13), and Windows uses a
two-character sequence of a carriage return plus a newline.

Python's file objects can now support end of line conventions other than the one
followed by the platform on which Python is running. Opening a file with the
mode ``'U'`` or ``'rU'`` will open a file for reading in
:term:`universal newlines` mode.
All three line ending conventions will be translated to a ``'\n'`` in the
strings returned by the various file methods such as :meth:`read` and
:meth:`readline`.

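The translation can be observed with a file that mixes all three conventions.
This is a sketch under one stated assumption: it is written for current Python
versions, where text mode translates line endings by default, whereas under
Python 2.3 the file would be opened with mode ``'rU'`` as described above. The
file contents are invented for the example.

```python
import os
import tempfile

# Write one file that mixes the Unix, Mac OS and Windows line endings.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(b'unix\nmac\rwindows\r\nlast')

# Python 2.3 would need open(path, 'rU'); current versions translate
# line endings in text mode by default.
with open(path) as f:
    lines = f.readlines()

os.remove(path)
```

Every line that had a terminator comes back ending in ``'\n'``, regardless of
which convention the file used at that point.
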
Universal newline support is also used when importing modules and when executing
a file with the :func:`execfile` function. This means that Python modules can
be shared between all three operating systems without needing to convert the
line-endings.

This feature can be disabled when compiling Python by specifying the
:option:`--without-universal-newlines` switch when running Python's
:program:`configure` script.


.. seealso::

   :pep:`278` - Universal Newline Support
      Written and implemented by Jack Jansen.

.. ======================================================================


.. _section-enumerate:

PEP 279: enumerate()
====================

A new built-in function, :func:`enumerate`, will make certain loops a bit
clearer. ``enumerate(thing)``, where *thing* is either an iterator or a
sequence, returns an iterator that will return ``(0, thing[0])``, ``(1,
thing[1])``, ``(2, thing[2])``, and so forth.

A common idiom to change every element of a list looks like this::

   for i in range(len(L)):
       item = L[i]
       # ... compute some result based on item ...
       L[i] = result

This can be rewritten using :func:`enumerate` as::

   for i, item in enumerate(L):
       # ... compute some result based on item ...
       L[i] = result

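For a version that can be run as-is, the elided computation has to be filled in
with something. The sketch below squares each element; that choice, and the
sample data, are placeholders rather than anything from the original text.

```python
L = [1, 2, 3, 4]
for i, item in enumerate(L):
    result = item * item      # the "computed result" is just the square here
    L[i] = result

# enumerate() pairs each element with its index, starting from 0.
pairs = list(enumerate(['a', 'b', 'c']))
```

After the loop ``L`` holds the squares in place, and ``pairs`` shows the
``(index, element)`` tuples that the loop header unpacks.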

.. seealso::

   :pep:`279` - The enumerate() built-in function
      Written and implemented by Raymond D. Hettinger.

.. ======================================================================


PEP 282: The logging Package
============================

A standard package for writing logs, :mod:`logging`, has been added to Python
2.3. It provides a powerful and flexible mechanism for generating logging
output which can then be filtered and processed in various ways. A
configuration file written in a standard format can be used to control the
logging behavior of a program. Python includes handlers that will write log
records to standard error or to a file or socket, send them to the system log,
or even e-mail them to a particular address; of course, it's also possible to
write your own handler classes.

The :class:`Logger` class is the primary class. Most application code will deal
with one or more :class:`Logger` objects, each one used by a particular
subsystem of the application. Each :class:`Logger` is identified by a name, and
names are organized into a hierarchy using ``.`` as the component separator.
For example, you might have :class:`Logger` instances named ``server``,
``server.auth`` and ``server.network``. The latter two instances are below
``server`` in the hierarchy. This means that if you turn up the verbosity for
``server`` or direct ``server`` messages to a different handler, the changes
will also apply to records logged to ``server.auth`` and ``server.network``.
There's also a root :class:`Logger` that's the parent of all other loggers.

For simple uses, the :mod:`logging` package contains some convenience functions
that always use the root log::

   import logging

   logging.debug('Debugging information')
   logging.info('Informational message')
   logging.warning('Warning:config file %s not found', 'server.conf')
   logging.error('Error occurred')
   logging.critical('Critical error -- shutting down')

This produces the following output::

   WARNING:root:Warning:config file server.conf not found
   ERROR:root:Error occurred
   CRITICAL:root:Critical error -- shutting down

In the default configuration, informational and debugging messages are
suppressed and the output is sent to standard error. You can enable the display
of informational and debugging messages by calling the :meth:`setLevel` method
on the root logger.

Notice the :func:`warning` call's use of string formatting operators; all of the
functions for logging messages take the arguments ``(msg, arg1, arg2, ...)`` and
log the string resulting from ``msg % (arg1, arg2, ...)``.

There's also an :func:`exception` function that records the most recent
traceback. Any of the other functions will also record the traceback if you
specify a true value for the keyword argument *exc_info*. ::

   def f():
       try:    1/0
       except: logging.exception('Problem recorded')

   f()

This produces the following output::

   ERROR:root:Problem recorded
   Traceback (most recent call last):
     File "t.py", line 6, in f
       1/0
   ZeroDivisionError: integer division or modulo by zero

Slightly more advanced programs will use a logger other than the root logger.
The :func:`getLogger(name)` function is used to get a particular log, creating
it if it doesn't exist yet. :func:`getLogger(None)` returns the root logger. ::

   log = logging.getLogger('server')
    ...
   log.info('Listening on port %i', port)
    ...
   log.critical('Disk full')
    ...

Log records are usually propagated up the hierarchy, so a message logged to
``server.auth`` is also seen by ``server`` and ``root``, but a :class:`Logger`
can prevent this by setting its :attr:`propagate` attribute to :const:`False`.

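The effect of the hierarchy and of :attr:`propagate` can be seen in a small
experiment. This is only a sketch: the ``ListHandler`` class is a hypothetical
helper written for this example (it is not part of the :mod:`logging` package),
and the messages are invented.

```python
import logging

class ListHandler(logging.Handler):
    """Hypothetical handler that stores formatted messages in a list."""
    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())

server_handler = ListHandler()
server = logging.getLogger('server')
server.setLevel(logging.INFO)       # let informational messages through
server.addHandler(server_handler)

auth = logging.getLogger('server.auth')
auth.info('login from %s', 'alice')     # propagates up to 'server'

auth.propagate = False
auth.info('this record stops at server.auth')

seen = server_handler.messages
```

Only the first message reaches the handler attached to ``server``; once
propagation is switched off, records logged to ``server.auth`` no longer travel
up the hierarchy.
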
There are more classes provided by the :mod:`logging` package that can be
customized. When a :class:`Logger` instance is told to log a message, it
creates a :class:`LogRecord` instance that is sent to any number of different
:class:`Handler` instances. Loggers and handlers can also have an attached list
of filters, and each filter can cause the :class:`LogRecord` to be ignored or
can modify the record before passing it along. When they're finally output,
:class:`LogRecord` instances are converted to text by a :class:`Formatter`
class. All of these classes can be replaced by your own specially-written
classes.

With all of these features the :mod:`logging` package should provide enough
flexibility for even the most complicated applications. This is only an
incomplete overview of its features, so please see the package's reference
documentation for all of the details. Reading :pep:`282` will also be helpful.


.. seealso::

   :pep:`282` - A Logging System
      Written by Vinay Sajip and Trent Mick; implemented by Vinay Sajip.

.. ======================================================================


.. _section-bool:

PEP 285: A Boolean Type
=======================

A Boolean type was added to Python 2.3. Two new constants were added to the
:mod:`__builtin__` module, :const:`True` and :const:`False`. (:const:`True` and
:const:`False` constants were added to the built-ins in Python 2.2.1, but the
2.2.1 versions are simply set to integer values of 1 and 0 and aren't a
different type.)

The type object for this new type is named :class:`bool`; the constructor for it
takes any Python value and converts it to :const:`True` or :const:`False`. ::

   >>> bool(1)
   True
   >>> bool(0)
   False
   >>> bool([])
   False
   >>> bool( (1,) )
   True

Most of the standard library modules and built-in functions have been changed to
return Booleans. ::

   >>> obj = []
   >>> hasattr(obj, 'append')
   True
   >>> isinstance(obj, list)
   True
   >>> isinstance(obj, tuple)
   False

Python's Booleans were added with the primary goal of making code clearer. For
example, if you're reading a function and encounter the statement ``return 1``,
you might wonder whether the ``1`` represents a Boolean truth value, an index,
or a coefficient that multiplies some other quantity. If the statement is
``return True``, however, the meaning of the return value is quite clear.

Python's Booleans were *not* added for the sake of strict type-checking. A very
strict language such as Pascal would also prevent you performing arithmetic with
Booleans, and would require that the expression in an :keyword:`if` statement
always evaluate to a Boolean result. Python is not this strict and never will
be, as :pep:`285` explicitly says. This means you can still use any expression
in an :keyword:`if` statement, even ones that evaluate to a list or tuple or
some random object. The Boolean type is a subclass of the :class:`int` class so
that arithmetic using a Boolean still works. ::

   >>> True + 1
   2
   >>> False + 1
   1
   >>> False * 75
   0
   >>> True * 75
   75

To sum up :const:`True` and :const:`False` in a sentence: they're alternative
ways to spell the integer values 1 and 0, with the single difference that
:func:`str` and :func:`repr` return the strings ``'True'`` and ``'False'``
instead of ``'1'`` and ``'0'``.

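That one-sentence summary can be checked directly. The following sketch simply
exercises the claims made above; the variable names are arbitrary.

```python
# bool is a subclass of int, so the values compare equal to 1 and 0 ...
is_subclass = issubclass(bool, int)
same_value = (True == 1) and (False == 0)

# ... and take part in arithmetic like integers.
total = True + True + False

# The visible difference is how the values are converted to strings.
text = (str(True), repr(False), str(1), repr(0))
```

Here ``total`` is an ordinary integer, while the string conversions are the
only place where Booleans and the integers 1 and 0 part ways.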

.. seealso::

   :pep:`285` - Adding a bool type
      Written and implemented by GvR.

.. ======================================================================


PEP 293: Codec Error Handling Callbacks
=======================================

When encoding a Unicode string into a byte string, unencodable characters may be
encountered. So far, Python has allowed specifying the error processing as
either "strict" (raising :exc:`UnicodeError`), "ignore" (skipping the
character), or "replace" (using a question mark in the output string), with
"strict" being the default behavior. It may be desirable to specify alternative
processing of such errors, such as inserting an XML character reference or HTML
entity reference into the converted string.

Python now has a flexible framework to add different processing strategies. New
error handlers can be added with :func:`codecs.register_error`, and codecs then
can access the error handler with :func:`codecs.lookup_error`. An equivalent C
API has been added for codecs written in C. The error handler gets the necessary
state information such as the string being converted, the position in the string
where the error was detected, and the target encoding. The handler can then
either raise an exception or return a replacement string.

Two additional error handlers have been implemented using this framework:
"backslashreplace" uses Python backslash quoting to represent unencodable
characters and "xmlcharrefreplace" emits XML character references.

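A short sketch shows the two new handlers next to a custom one. The handler
name ``hyphenate`` and the sample string are invented for this example. Note
that a handler registered with :func:`codecs.register_error` returns a tuple
holding the replacement text and the position at which encoding should resume.

```python
import codecs

text = u'caf\xe9 \u20ac5'

# The two handlers added in 2.3.
xml_form = text.encode('ascii', 'xmlcharrefreplace')
escaped = text.encode('ascii', 'backslashreplace')

def hyphenate(exc):
    """Hypothetical handler: replace each unencodable character with '-'."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    # Return the replacement text and the position at which to resume.
    return (u'-' * (exc.end - exc.start), exc.end)

codecs.register_error('hyphenate', hyphenate)
custom = text.encode('ascii', 'hyphenate')
```

The exception object passed to the handler carries the state mentioned above:
the string being converted, the start and end of the offending span, and the
name of the target encoding.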

.. seealso::

   :pep:`293` - Codec Error Handling Callbacks
      Written and implemented by Walter Dörwald.

.. ======================================================================


.. _section-pep301:

PEP 301: Package Index and Metadata for Distutils
=================================================

Support for the long-requested Python catalog makes its first appearance in 2.3.

The heart of the catalog is the new Distutils :command:`register` command.
Running ``python setup.py register`` will collect the metadata describing a
package, such as its name, version, maintainer, description, etc., and send it to
a central catalog server. The resulting catalog is available from
http://www.python.org/pypi.

To make the catalog a bit more useful, a new optional *classifiers* keyword
argument has been added to the Distutils :func:`setup` function. A list of
`Trove <http://catb.org/~esr/trove/>`_-style strings can be supplied to help
classify the software.

Here's an example :file:`setup.py` with classifiers, written to be compatible
with older versions of the Distutils::

   from distutils import core
   kw = {'name': "Quixote",
         'version': "0.5.1",
         'description': "A highly Pythonic Web application framework",
         # ...
         }

   if (hasattr(core, 'setup_keywords') and
       'classifiers' in core.setup_keywords):
       kw['classifiers'] = \
           ['Topic :: Internet :: WWW/HTTP :: Dynamic Content',
            'Environment :: No Input/Output (Daemon)',
            'Intended Audience :: Developers']

   core.setup(**kw)

The full list of classifiers can be obtained by running ``python setup.py
register --list-classifiers``.


.. seealso::

   :pep:`301` - Package Index and Metadata for Distutils
      Written and implemented by Richard Jones.

.. ======================================================================


699.. _section-pep302:
700
701PEP 302: New Import Hooks
702=========================
703
704While it's been possible to write custom import hooks ever since the
705:mod:`ihooks` module was introduced in Python 1.3, no one has ever been really
706happy with it because writing new import hooks is difficult and messy. There
707have been various proposed alternatives such as the :mod:`imputil` and :mod:`iu`
708modules, but none of them has ever gained much acceptance, and none of them were
709easily usable from C code.
710
711:pep:`302` borrows ideas from its predecessors, especially from Gordon
712McMillan's :mod:`iu` module. Three new items are added to the :mod:`sys`
713module:
714
715* ``sys.path_hooks`` is a list of callable objects; most often they'll be
716 classes. Each callable takes a string containing a path and either returns an
717 importer object that will handle imports from this path or raises an
718 :exc:`ImportError` exception if it can't handle this path.
719
720* ``sys.path_importer_cache`` caches importer objects for each path, so
721 ``sys.path_hooks`` will only need to be traversed once for each path.
722
723* ``sys.meta_path`` is a list of importer objects that will be traversed before
724 ``sys.path`` is checked. This list is initially empty, but user code can add
725 objects to it. Additional built-in and frozen modules can be imported by an
726 object added to this list.
727
728Importer objects must have a single method, :meth:`find_module(fullname,
729path=None)`. *fullname* will be a module or package name, e.g. ``string`` or
730``distutils.core``. :meth:`find_module` must return a loader object that has a
731single method, :meth:`load_module(fullname)`, that creates and returns the
732corresponding module object.
733
734Pseudo-code for Python's new import logic, therefore, looks something like this
735(simplified a bit; see :pep:`302` for the full details)::
736
   for mp in sys.meta_path:
       loader = mp.find_module(fullname)
       if loader is not None:
           <module> = loader.load_module(fullname)

   for path in sys.path:
       for hook in sys.path_hooks:
           try:
               importer = hook(path)
           except ImportError:
               # ImportError, so try the other path hooks
               pass
           else:
               loader = importer.find_module(fullname)
               if loader is not None:
                   <module> = loader.load_module(fullname)

   # Not found!
   raise ImportError
755
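The importer and loader protocol described above can be tried out without touching the real import machinery. The following is a minimal sketch written for a current Python 3 interpreter, where the import system no longer calls ``find_module`` itself. The names ``DictImporter``, ``DictLoader``, and ``import_via_meta_path`` are hypothetical, and the helper function only mimics the first loop of the pseudo-code; it does not register anything in ``sys.meta_path`` or ``sys.modules``.

```python
import types


class DictLoader:
    """Loader: builds a module from a source string (hypothetical helper)."""

    def __init__(self, source):
        self.source = source

    def load_module(self, fullname):
        # Create the module object and run the stored source in its namespace.
        module = types.ModuleType(fullname)
        exec(self.source, module.__dict__)
        return module


class DictImporter:
    """Importer: answers find_module() from an in-memory dict of sources."""

    def __init__(self, sources):
        self.sources = sources

    def find_module(self, fullname, path=None):
        # Return a loader if we can handle the name, otherwise None.
        if fullname in self.sources:
            return DictLoader(self.sources[fullname])
        return None


def import_via_meta_path(fullname, meta_path):
    """Mimic the first loop of the pseudo-code: ask each importer in turn."""
    for importer in meta_path:
        loader = importer.find_module(fullname)
        if loader is not None:
            return loader.load_module(fullname)
    raise ImportError(fullname)


hooks = [DictImporter({"greeting": "message = 'hello'"})]
mod = import_via_meta_path("greeting", hooks)
```

Here ``mod.message`` holds the string defined in the in-memory source, and asking for a name that no importer recognizes raises :exc:`ImportError`.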
756
757.. seealso::
758
759 :pep:`302` - New Import Hooks
760 Written by Just van Rossum and Paul Moore. Implemented by Just van Rossum.
761
.. ======================================================================
764
765.. _section-pep305:
766
767PEP 305: Comma-separated Files
768==============================
769
770Comma-separated files are a format frequently used for exporting data from
771databases and spreadsheets. Python 2.3 adds a parser for comma-separated files.
772
773Comma-separated format is deceptively simple at first glance::
774
775 Costs,150,200,3.95
776
777Read a line and call ``line.split(',')``: what could be simpler? But toss in
778string data that can contain commas, and things get more complicated::
779
780 "Costs",150,200,3.95,"Includes taxes, shipping, and sundry items"
781
782A big ugly regular expression can parse this, but using the new :mod:`csv`
783package is much simpler::
784
   import csv

   input = open('datafile', 'rb')
   reader = csv.reader(input)
   for line in reader:
       print line
791
792The :func:`reader` function takes a number of different options. The field
793separator isn't limited to the comma and can be changed to any character, and so
794can the quoting and line-ending characters.
795
Different dialects of comma-separated files can be defined and registered;
currently there are two dialects, both used by Microsoft Excel. A separate
:func:`csv.writer` function returns a writer object that generates
comma-separated files from a succession of tuples or lists, quoting strings
that contain the delimiter.
800
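As a rough illustration of the writer and of the delimiter option, here is a sketch that writes a row to an in-memory buffer and reads it back. It assumes a current Python 3 interpreter, where the :mod:`csv` module works on text streams (hence :class:`io.StringIO` rather than a file opened in ``'rb'`` mode as in the 2.3 example above); the variable names are illustrative only.

```python
import csv
import io

rows = [
    ["Costs", 150, 200, 3.95, "Includes taxes, shipping, and sundry items"],
]

# Write to an in-memory text buffer instead of a real file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerows(rows)
text = buffer.getvalue()

# Read the text back; every field comes back as a string.
parsed = list(csv.reader(io.StringIO(text)))

# A tab-separated variant: only the delimiter option changes.
tab_buffer = io.StringIO()
csv.writer(tab_buffer, delimiter="\t").writerows(rows)
tab_text = tab_buffer.getvalue()
```

Only the last field is quoted in ``text``, because it is the only one that contains the delimiter. In the tab-separated output the embedded commas are no longer special, so nothing needs quoting.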
801
802.. seealso::
803
804 :pep:`305` - CSV File API
805 Written and implemented by Kevin Altis, Dave Cole, Andrew McNamara, Skip
806 Montanaro, Cliff Wells.
807
.. ======================================================================
810
811.. _section-pep307:
812
813PEP 307: Pickle Enhancements
814============================
815
816The :mod:`pickle` and :mod:`cPickle` modules received some attention during the
8172.3 development cycle. In 2.2, new-style classes could be pickled without
818difficulty, but they weren't pickled very compactly; :pep:`307` quotes a trivial
819example where a new-style class results in a pickled string three times longer
820than that for a classic class.
821
822The solution was to invent a new pickle protocol. The :func:`pickle.dumps`
823function has supported a text-or-binary flag for a long time. In 2.3, this
824flag is redefined from a Boolean to an integer: 0 is the old text-mode pickle
825format, 1 is the old binary format, and now 2 is a new 2.3-specific format. A
826new constant, :const:`pickle.HIGHEST_PROTOCOL`, can be used to select the
827fanciest protocol available.
828
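The integer protocol argument can be sketched as follows. The example assumes a current Python 3 interpreter, where protocols 0 and 2 are still accepted and ``HIGHEST_PROTOCOL`` is larger than 2; the sample data is made up for illustration.

```python
import pickle

data = {"name": "spam", "values": [1, 2, 3]}

# Protocol 0 is the original text format; 2 is the format added in 2.3.
text_pickle = pickle.dumps(data, 0)
binary_pickle = pickle.dumps(data, 2)

# HIGHEST_PROTOCOL selects the newest protocol the running interpreter knows.
newest_pickle = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)

# loads() works out the format on its own, so no protocol is passed here.
restored = [pickle.loads(p) for p in (text_pickle, binary_pickle, newest_pickle)]
```

All three pickles load back to an equal dictionary. A pickle written with ``HIGHEST_PROTOCOL`` may not be readable by older interpreters, so a fixed protocol number is the safer choice when pickles are exchanged between Python versions.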
829Unpickling is no longer considered a safe operation. 2.2's :mod:`pickle`
830provided hooks for trying to prevent unsafe classes from being unpickled
831(specifically, a :attr:`__safe_for_unpickling__` attribute), but none of this
832code was ever audited and therefore it's all been ripped out in 2.3. You should
833not unpickle untrusted data in any version of Python.
834
835To reduce the pickling overhead for new-style classes, a new interface for
836customizing pickling was added using three special methods:
837:meth:`__getstate__`, :meth:`__setstate__`, and :meth:`__getnewargs__`. Consult
838:pep:`307` for the full semantics of these methods.
839
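To give a feel for how the three special methods fit together, here is a small sketch, assuming a current Python 3 interpreter and protocol 2. The ``Temperature`` class is hypothetical; it subclasses :class:`float` so that :meth:`__new__` needs arguments, which is the case :meth:`__getnewargs__` exists for. See :pep:`307` for the authoritative rules.

```python
import pickle


class Temperature(float):
    """A float subclass whose __new__ needs an extra argument."""

    def __new__(cls, degrees, unit):
        self = super().__new__(cls, degrees)
        self.unit = unit
        return self

    def __getnewargs__(self):
        # Arguments passed to __new__ when a protocol 2 pickle is loaded.
        return (float(self), self.unit)

    def __getstate__(self):
        # State saved alongside the constructor arguments.
        return {"unit": self.unit}

    def __setstate__(self, state):
        # Restore the saved state on the freshly created instance.
        self.__dict__.update(state)


original = Temperature(21.5, "C")
copy = pickle.loads(pickle.dumps(original, 2))
```

The copy is a new ``Temperature`` instance with the same numeric value and the same ``unit`` attribute as the original.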
840As a way to compress pickles yet further, it's now possible to use integer codes
841instead of long strings to identify pickled classes. The Python Software
842Foundation will maintain a list of standardized codes; there's also a range of
843codes for private use. Currently no codes have been specified.
844
845
846.. seealso::
847
848 :pep:`307` - Extensions to the pickle protocol
849 Written and implemented by Guido van Rossum and Tim Peters.
850
.. ======================================================================
853
854.. _section-slices:
855
856Extended Slices
857===============
858
859Ever since Python 1.4, the slicing syntax has supported an optional third "step"
860or "stride" argument. For example, these are all legal Python syntax:
861``L[1:10:2]``, ``L[:-1:1]``, ``L[::-1]``. This was added to Python at the
862request of the developers of Numerical Python, which uses the third argument
863extensively. However, Python's built-in list, tuple, and string sequence types
864have never supported this feature, raising a :exc:`TypeError` if you tried it.
865Michael Hudson contributed a patch to fix this shortcoming.
866
867For example, you can now easily extract the elements of a list that have even
868indexes::
869
870 >>> L = range(10)
871 >>> L[::2]
872 [0, 2, 4, 6, 8]
873
874Negative values also work to make a copy of the same list in reverse order::
875
876 >>> L[::-1]
877 [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
878
879This also works for tuples, arrays, and strings::
880
881 >>> s='abcd'
882 >>> s[::2]
883 'ac'
884 >>> s[::-1]
885 'dcba'
886
887If you have a mutable sequence such as a list or an array you can assign to or
888delete an extended slice, but there are some differences between assignment to
889extended and regular slices. Assignment to a regular slice can be used to
890change the length of the sequence::
891
892 >>> a = range(3)
893 >>> a
894 [0, 1, 2]
895 >>> a[1:3] = [4, 5, 6]
896 >>> a
897 [0, 4, 5, 6]
898
899Extended slices aren't this flexible. When assigning to an extended slice, the
900list on the right hand side of the statement must contain the same number of
901items as the slice it is replacing::
902
903 >>> a = range(4)
904 >>> a
905 [0, 1, 2, 3]
906 >>> a[::2]
907 [0, 2]
908 >>> a[::2] = [0, -1]
909 >>> a
910 [0, 1, -1, 3]
911 >>> a[::2] = [0,1,2]
912 Traceback (most recent call last):
913 File "<stdin>", line 1, in ?
914 ValueError: attempt to assign sequence of size 3 to extended slice of size 2
915
916Deletion is more straightforward::
917
918 >>> a = range(4)
919 >>> a
920 [0, 1, 2, 3]
921 >>> a[::2]
922 [0, 2]
923 >>> del a[::2]
924 >>> a
925 [1, 3]
926
927One can also now pass slice objects to the :meth:`__getitem__` methods of the
928built-in sequences::
929
930 >>> range(10).__getitem__(slice(0, 5, 2))
931 [0, 2, 4]
932
933Or use slice objects directly in subscripts::
934
935 >>> range(10)[slice(0, 5, 2)]
936 [0, 2, 4]
937
938To simplify implementing sequences that support extended slicing, slice objects
939now have a method :meth:`indices(length)` which, given the length of a sequence,
940returns a ``(start, stop, step)`` tuple that can be passed directly to
941:func:`range`. :meth:`indices` handles omitted and out-of-bounds indices in a
942manner consistent with regular slices (and this innocuous phrase hides a welter
943of confusing details!). The method is intended to be used like this::
944
   class FakeSeq:
       ...
       def calc_item(self, i):
           ...
       def __getitem__(self, item):
           if isinstance(item, slice):
               indices = item.indices(len(self))
               return FakeSeq([self.calc_item(i) for i in range(*indices)])
           else:
               return self.calc_item(item)
955
956From this example you can also see that the built-in :class:`slice` object is
957now the type object for the slice type, and is no longer a function. This is
958consistent with Python 2.2, where :class:`int`, :class:`str`, etc., underwent
959the same change.
960
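A complete, runnable variant of the ``FakeSeq`` idea might look like the sketch below. The ``Squares`` class is hypothetical, and unlike ``FakeSeq`` it returns a plain list for slices to keep the example short; it runs unchanged on a current Python 3 interpreter.

```python
class Squares:
    """Read-only sequence of the first n squares, computed on demand."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def calc_item(self, i):
        # Support negative indexes the same way lists do.
        if i < 0:
            i += self.n
        if not 0 <= i < self.n:
            raise IndexError("index out of range")
        return i * i

    def __getitem__(self, item):
        if isinstance(item, slice):
            # indices() clips start/stop to the length and fills in defaults.
            start, stop, step = item.indices(len(self))
            return [self.calc_item(i) for i in range(start, stop, step)]
        return self.calc_item(item)


sq = Squares(10)
every_third = sq[::3]
clipped = sq[7:100]
```

Because :meth:`indices` clips the out-of-range stop value, ``sq[7:100]`` behaves like the same slice on a ten-element list instead of raising an error.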
.. ======================================================================
963
964Other Language Changes
965======================
966
967Here are all of the changes that Python 2.3 makes to the core Python language.
968
969* The :keyword:`yield` statement is now always a keyword, as described in
970 section :ref:`section-generators` of this document.
971
972* A new built-in function :func:`enumerate` was added, as described in section
973 :ref:`section-enumerate` of this document.
974
* Two new constants, :const:`True` and :const:`False`, were added along with the
  built-in :class:`bool` type, as described in section :ref:`section-bool` of this
  document.
978
979* The :func:`int` type constructor will now return a long integer instead of
980 raising an :exc:`OverflowError` when a string or floating-point number is too
981 large to fit into an integer. This can lead to the paradoxical result that
982 ``isinstance(int(expression), int)`` is false, but that seems unlikely to cause
983 problems in practice.
984
985* Built-in types now support the extended slicing syntax, as described in
986 section :ref:`section-slices` of this document.
987
988* A new built-in function, :func:`sum(iterable, start=0)`, adds up the numeric
989 items in the iterable object and returns their sum. :func:`sum` only accepts
990 numbers, meaning that you can't use it to concatenate a bunch of strings.
991 (Contributed by Alex Martelli.)
992
993* ``list.insert(pos, value)`` used to insert *value* at the front of the list
994 when *pos* was negative. The behaviour has now been changed to be consistent
995 with slice indexing, so when *pos* is -1 the value will be inserted before the
996 last element, and so forth.
997
998* ``list.index(value)``, which searches for *value* within the list and returns
999 its index, now takes optional *start* and *stop* arguments to limit the search
1000 to only part of the list.
1001
* Dictionaries have a new method, :meth:`pop(key[, default])`, that returns
  the value corresponding to *key* and removes that key/value pair from the
  dictionary. If the requested key isn't present in the dictionary, *default* is
  returned if it's specified and :exc:`KeyError` raised if it isn't. ::

     >>> d = {1:2}
     >>> d
     {1: 2}
     >>> d.pop(4)
     Traceback (most recent call last):
       File "<stdin>", line 1, in ?
     KeyError: 4
     >>> d.pop(1)
     2
     >>> d.pop(1)
     Traceback (most recent call last):
       File "<stdin>", line 1, in ?
     KeyError: 'pop(): dictionary is empty'
     >>> d
     {}

  There's also a new class method, :meth:`dict.fromkeys(iterable, value)`, that
  creates a dictionary with keys taken from the supplied *iterable* and
  all values set to *value*, defaulting to ``None``.

  (Patches contributed by Raymond Hettinger.)

  Also, the :func:`dict` constructor now accepts keyword arguments to simplify
  creating small dictionaries::

     >>> dict(red=1, blue=2, green=3, black=4)
     {'blue': 2, 'black': 4, 'green': 3, 'red': 1}

  (Contributed by Just van Rossum.)
1037
1038* The :keyword:`assert` statement no longer checks the ``__debug__`` flag, so
1039 you can no longer disable assertions by assigning to ``__debug__``. Running
1040 Python with the :option:`-O` switch will still generate code that doesn't
1041 execute any assertions.
1042
* Most type objects are now callable, so you can use them to create new objects
  such as functions, classes, and modules. (This means that the :mod:`new` module
  can be deprecated in a future Python version, because you can now use the type
  objects available in the :mod:`types` module.) For example, you can create a new
  module object with the following code::

     >>> import types
     >>> m = types.ModuleType('abc','docstring')
     >>> m
     <module 'abc' (built-in)>
     >>> m.__doc__
     'docstring'
1057
* A new warning, :exc:`PendingDeprecationWarning`, was added to indicate features
  which are in the process of being deprecated. The warning will *not* be printed
  by default. To check for use of features that will be deprecated in the future,
  supply :option:`-Walways::PendingDeprecationWarning::` on the command line or
  use :func:`warnings.filterwarnings`.
1063
1064* The process of deprecating string-based exceptions, as in ``raise "Error
1065 occurred"``, has begun. Raising a string will now trigger
1066 :exc:`PendingDeprecationWarning`.
1067
1068* Using ``None`` as a variable name will now result in a :exc:`SyntaxWarning`
1069 warning. In a future version of Python, ``None`` may finally become a keyword.
1070
1071* The :meth:`xreadlines` method of file objects, introduced in Python 2.1, is no
1072 longer necessary because files now behave as their own iterator.
1073 :meth:`xreadlines` was originally introduced as a faster way to loop over all
1074 the lines in a file, but now you can simply write ``for line in file_obj``.
1075 File objects also have a new read-only :attr:`encoding` attribute that gives the
1076 encoding used by the file; Unicode strings written to the file will be
1077 automatically converted to bytes using the given encoding.
1078
1079* The method resolution order used by new-style classes has changed, though
1080 you'll only notice the difference if you have a really complicated inheritance
1081 hierarchy. Classic classes are unaffected by this change. Python 2.2
1082 originally used a topological sort of a class's ancestors, but 2.3 now uses the
1083 C3 algorithm as described in the paper `"A Monotonic Superclass Linearization
1084 for Dylan" <http://www.webcom.com/haahr/dylan/linearization-oopsla96.html>`_. To
1085 understand the motivation for this change, read Michele Simionato's article
1086 `"Python 2.3 Method Resolution Order" <http://www.python.org/2.3/mro.html>`_, or
1087 read the thread on python-dev starting with the message at
1088 http://mail.python.org/pipermail/python-dev/2002-October/029035.html. Samuele
1089 Pedroni first pointed out the problem and also implemented the fix by coding the
1090 C3 algorithm.
1091
1092* Python runs multithreaded programs by switching between threads after
1093 executing N bytecodes. The default value for N has been increased from 10 to
1094 100 bytecodes, speeding up single-threaded applications by reducing the
1095 switching overhead. Some multithreaded applications may suffer slower response
1096 time, but that's easily fixed by setting the limit back to a lower number using
1097 :func:`sys.setcheckinterval(N)`. The limit can be retrieved with the new
1098 :func:`sys.getcheckinterval` function.
1099
1100* One minor but far-reaching change is that the names of extension types defined
1101 by the modules included with Python now contain the module and a ``'.'`` in
1102 front of the type name. For example, in Python 2.2, if you created a socket and
1103 printed its :attr:`__class__`, you'd get this output::
1104
1105 >>> s = socket.socket()
1106 >>> s.__class__
1107 <type 'socket'>
1108
1109 In 2.3, you get this::
1110
1111 >>> s.__class__
1112 <type '_socket.socket'>
1113
1114* One of the noted incompatibilities between old- and new-style classes has been
1115 removed: you can now assign to the :attr:`__name__` and :attr:`__bases__`
1116 attributes of new-style classes. There are some restrictions on what can be
1117 assigned to :attr:`__bases__` along the lines of those relating to assigning to
1118 an instance's :attr:`__class__` attribute.
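
Several of the list and dictionary changes in the list above can be tried in a few lines. This is a small sketch with made-up values; it sticks to calls that behave the same way on a current Python 3 interpreter.

```python
# list.insert() with a negative position counts from the end.
letters = ["a", "b", "c"]
letters.insert(-1, "x")          # inserted before the last element

# list.index() accepts optional start and stop bounds.
values = [5, 1, 5, 2, 5]
second_five = values.index(5, 1)       # search from index 1 onward
# values.index(5, 3, 4) would raise ValueError: no 5 in that range

# dict.pop() removes a key and returns its value, or the default.
d = {1: 2}
popped = d.pop(1)
fallback = d.pop(1, "missing")

# dict.fromkeys() builds a dictionary with one shared value.
flags = dict.fromkeys(["red", "blue"], 0)

# sum() adds numbers, starting from the optional second argument.
total = sum([1, 2, 3], 10)
```

After these statements ``letters`` is ``['a', 'b', 'x', 'c']``, ``d`` is empty, and ``total`` is 16.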
1119
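The method resolution order change mentioned in the list above is easiest to see on a small hierarchy. The sketch below uses hypothetical classes ``A`` through ``D`` and assumes an interpreter that uses the C3 algorithm (2.3 or later, including Python 3). It shows the linearization of a simple diamond and a base-class order that C3 refuses to linearize.

```python
class A(object):
    pass


class B(A):
    pass


class C(A):
    pass


class D(B, C):
    pass


# C3 keeps each class ahead of its own bases: D, B, C, A, object.
mro_names = [cls.__name__ for cls in D.__mro__]

# A hierarchy with no consistent ordering is rejected outright:
# listing A before its own subclass B cannot be linearized.
try:
    class Broken(A, B):
        pass
except TypeError as exc:
    error = exc
else:
    error = None
```

In the second case the class statement itself raises :exc:`TypeError`, so the problem surfaces when the class is defined rather than when a method is looked up.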
.. ======================================================================
1122
1123String Changes
1124--------------
1125
1126* The :keyword:`in` operator now works differently for strings. Previously, when
1127 evaluating ``X in Y`` where *X* and *Y* are strings, *X* could only be a single
1128 character. That's now changed; *X* can be a string of any length, and ``X in Y``
1129 will return :const:`True` if *X* is a substring of *Y*. If *X* is the empty
1130 string, the result is always :const:`True`. ::
1131
1132 >>> 'ab' in 'abcd'
1133 True
1134 >>> 'ad' in 'abcd'
1135 False
1136 >>> '' in 'abcd'
1137 True
1138
1139 Note that this doesn't tell you where the substring starts; if you need that
1140 information, use the :meth:`find` string method.
1141
1142* The :meth:`strip`, :meth:`lstrip`, and :meth:`rstrip` string methods now have
1143 an optional argument for specifying the characters to strip. The default is
1144 still to remove all whitespace characters::
1145
1146 >>> ' abc '.strip()
1147 'abc'
1148 >>> '><><abc<><><>'.strip('<>')
1149 'abc'
1150 >>> '><><abc<><><>\n'.strip('<>')
1151 'abc<><><>\n'
1152 >>> u'\u4000\u4001abc\u4000'.strip(u'\u4000')
1153 u'\u4001abc'
1154 >>>
1155
1156 (Suggested by Simon Brunning and implemented by Walter Dörwald.)
1157
1158* The :meth:`startswith` and :meth:`endswith` string methods now accept negative
1159 numbers for the *start* and *end* parameters.
1160
1161* Another new string method is :meth:`zfill`, originally a function in the
1162 :mod:`string` module. :meth:`zfill` pads a numeric string with zeros on the
1163 left until it's the specified width. Note that the ``%`` operator is still more
1164 flexible and powerful than :meth:`zfill`. ::
1165
1166 >>> '45'.zfill(4)
1167 '0045'
1168 >>> '12345'.zfill(4)
1169 '12345'
1170 >>> 'goofy'.zfill(6)
1171 '0goofy'
1172
1173 (Contributed by Walter Dörwald.)
1174
1175* A new type object, :class:`basestring`, has been added. Both 8-bit strings and
1176 Unicode strings inherit from this type, so ``isinstance(obj, basestring)`` will
1177 return :const:`True` for either kind of string. It's a completely abstract
1178 type, so you can't create :class:`basestring` instances.
1179
1180* Interned strings are no longer immortal and will now be garbage-collected in
1181 the usual way when the only reference to them is from the internal dictionary of
1182 interned strings. (Implemented by Oren Tirosh.)
1183
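The string changes listed above can be summarized in a short sketch. The sample strings are made up, and the calls shown behave the same way on a current Python 3 interpreter; :class:`basestring` is left out because it exists only in Python 2.

```python
# Substring tests with "in"; find() reports the position.
has_ab = "ab" in "abcd"
position = "abcd".find("cd")

# strip() with an explicit set of characters to remove.
trimmed = "><><abc<><><>".strip("<>")
left_only = "xxabcxx".lstrip("x")

# Negative start/end positions for startswith() and endswith().
tail_match = "whatsnew.rst".startswith("rst", -3)
stem_match = "whatsnew.rst".endswith("new", 0, -4)

# zfill() pads on the left with zeros up to the given width.
padded = "45".zfill(4)
```

Note that the argument to :meth:`strip` is treated as a set of characters, not as a prefix or suffix, which is why ``'<>'`` removes the mixed run of ``<`` and ``>`` on both sides.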
.. ======================================================================
1186
1187Optimizations
1188-------------
1189
1190* The creation of new-style class instances has been made much faster; they're
1191 now faster than classic classes!
1192
1193* The :meth:`sort` method of list objects has been extensively rewritten by Tim
1194 Peters, and the implementation is significantly faster.
1195
1196* Multiplication of large long integers is now much faster thanks to an
1197 implementation of Karatsuba multiplication, an algorithm that scales better than
1198 the O(n\*n) required for the grade-school multiplication algorithm. (Original
1199 patch by Christopher A. Craig, and significantly reworked by Tim Peters.)
1200
* The ``SET_LINENO`` opcode is now gone. This may provide a small speed
  increase, depending on your compiler's idiosyncrasies. See section
  :ref:`23section-other` for a longer explanation. (Removed by Michael Hudson.)

* :func:`xrange` objects now have their own iterator, making ``for i in
  xrange(n)`` slightly faster than ``for i in range(n)``. (Patch by Raymond
  Hettinger.)
1208
1209* A number of small rearrangements have been made in various hotspots to improve
1210 performance, such as inlining a function or removing some code. (Implemented
1211 mostly by GvR, but lots of people have contributed single changes.)
1212
1213The net result of the 2.3 optimizations is that Python 2.3 runs the pystone
1214benchmark around 25% faster than Python 2.2.
1215
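For readers curious about the Karatsuba multiplication mentioned in the list above, here is an illustrative pure-Python sketch. It is not the interpreter's C implementation; the function name, the bit-based splitting, and the cutoff of 16 are choices made for this example only. It relies on :meth:`int.bit_length`, which exists only in much later Python versions.

```python
def karatsuba(x, y):
    """Multiply two non-negative integers with Karatsuba's method."""
    # Small operands: fall back to ordinary multiplication.
    if x < 16 or y < 16:
        return x * y

    # Split both numbers at half the bit length of the larger one.
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask

    # Three recursive products instead of four.
    low = karatsuba(x_lo, y_lo)
    high = karatsuba(x_hi, y_hi)
    cross = karatsuba(x_lo + x_hi, y_lo + y_hi) - low - high

    return (high << (2 * half)) + (cross << half) + low
```

The saving comes from computing the middle term with a single extra multiplication, which is what lets the algorithm scale better than the grade-school method on very large operands.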
.. ======================================================================
1218
1219New, Improved, and Deprecated Modules
1220=====================================
1221
1222As usual, Python's standard library received a number of enhancements and bug
1223fixes. Here's a partial list of the most notable changes, sorted alphabetically
1224by module name. Consult the :file:`Misc/NEWS` file in the source tree for a more
1225complete list of changes, or look through the CVS logs for all the details.
1226
1227* The :mod:`array` module now supports arrays of Unicode characters using the
1228 ``'u'`` format character. Arrays also now support using the ``+=`` assignment
1229 operator to add another array's contents, and the ``*=`` assignment operator to
1230 repeat an array. (Contributed by Jason Orendorff.)
1231
1232* The :mod:`bsddb` module has been replaced by version 4.1.6 of the `PyBSDDB
1233 <http://pybsddb.sourceforge.net>`_ package, providing a more complete interface
1234 to the transactional features of the BerkeleyDB library.
1235
1236 The old version of the module has been renamed to :mod:`bsddb185` and is no
1237 longer built automatically; you'll have to edit :file:`Modules/Setup` to enable
1238 it. Note that the new :mod:`bsddb` package is intended to be compatible with
1239 the old module, so be sure to file bugs if you discover any incompatibilities.
1240 When upgrading to Python 2.3, if the new interpreter is compiled with a new
1241 version of the underlying BerkeleyDB library, you will almost certainly have to
1242 convert your database files to the new version. You can do this fairly easily
1243 with the new scripts :file:`db2pickle.py` and :file:`pickle2db.py` which you
1244 will find in the distribution's :file:`Tools/scripts` directory. If you've
1245 already been using the PyBSDDB package and importing it as :mod:`bsddb3`, you
1246 will have to change your ``import`` statements to import it as :mod:`bsddb`.
1247
1248* The new :mod:`bz2` module is an interface to the bz2 data compression library.
1249 bz2-compressed data is usually smaller than corresponding :mod:`zlib`\
1250 -compressed data. (Contributed by Gustavo Niemeyer.)
1251
1252* A set of standard date/time types has been added in the new :mod:`datetime`
1253 module. See the following section for more details.
1254
1255* The Distutils :class:`Extension` class now supports an extra constructor
1256 argument named *depends* for listing additional source files that an extension
1257 depends on. This lets Distutils recompile the module if any of the dependency
1258 files are modified. For example, if :file:`sampmodule.c` includes the header
1259 file :file:`sample.h`, you would create the :class:`Extension` object like
1260 this::
1261
1262 ext = Extension("samp",
1263 sources=["sampmodule.c"],
1264 depends=["sample.h"])
1265
1266 Modifying :file:`sample.h` would then cause the module to be recompiled.
1267 (Contributed by Jeremy Hylton.)
1268
1269* Other minor changes to Distutils: it now checks for the :envvar:`CC`,
1270 :envvar:`CFLAGS`, :envvar:`CPP`, :envvar:`LDFLAGS`, and :envvar:`CPPFLAGS`
1271 environment variables, using them to override the settings in Python's
1272 configuration (contributed by Robert Weber).
1273
* Previously the :mod:`doctest` module would only search the docstrings of
  public methods and functions for test cases, but it now examines private
  ones as well. The :func:`DocTestSuite` function creates a
  :class:`unittest.TestSuite` object from a set of :mod:`doctest` tests.
1278
1279* The new :func:`gc.get_referents(object)` function returns a list of all the
1280 objects referenced by *object*.
1281
1282* The :mod:`getopt` module gained a new function, :func:`gnu_getopt`, that
1283 supports the same arguments as the existing :func:`getopt` function but uses
1284 GNU-style scanning mode. The existing :func:`getopt` stops processing options as
1285 soon as a non-option argument is encountered, but in GNU-style mode processing
1286 continues, meaning that options and arguments can be mixed. For example::
1287
1288 >>> getopt.getopt(['-f', 'filename', 'output', '-v'], 'f:v')
1289 ([('-f', 'filename')], ['output', '-v'])
1290 >>> getopt.gnu_getopt(['-f', 'filename', 'output', '-v'], 'f:v')
1291 ([('-f', 'filename'), ('-v', '')], ['output'])
1292
  (Contributed by Peter Åstrand.)
1294
1295* The :mod:`grp`, :mod:`pwd`, and :mod:`resource` modules now return enhanced
1296 tuples::
1297
1298 >>> import grp
1299 >>> g = grp.getgrnam('amk')
1300 >>> g.gr_name, g.gr_gid
1301 ('amk', 500)
1302
1303* The :mod:`gzip` module can now handle files exceeding 2 GiB.
1304
1305* The new :mod:`heapq` module contains an implementation of a heap queue
1306 algorithm. A heap is an array-like data structure that keeps items in a
1307 partially sorted order such that, for every index *k*, ``heap[k] <=
1308 heap[2*k+1]`` and ``heap[k] <= heap[2*k+2]``. This makes it quick to remove the
1309 smallest item, and inserting a new item while maintaining the heap property is
1310 O(lg n). (See http://www.nist.gov/dads/HTML/priorityque.html for more
1311 information about the priority queue data structure.)
1312
1313 The :mod:`heapq` module provides :func:`heappush` and :func:`heappop` functions
1314 for adding and removing items while maintaining the heap property on top of some
1315 other mutable Python sequence type. Here's an example that uses a Python list::
1316
     >>> import heapq
     >>> heap = []
     >>> for item in [3, 7, 5, 11, 1]:
     ...    heapq.heappush(heap, item)
     ...
     >>> heap
     [1, 3, 5, 11, 7]
     >>> heapq.heappop(heap)
     1
     >>> heapq.heappop(heap)
     3
     >>> heap
     [5, 7, 11]
1330
1331 (Contributed by Kevin O'Connor.)
1332
1333* The IDLE integrated development environment has been updated using the code
1334 from the IDLEfork project (http://idlefork.sf.net). The most notable feature is
1335 that the code being developed is now executed in a subprocess, meaning that
1336 there's no longer any need for manual ``reload()`` operations. IDLE's core code
1337 has been incorporated into the standard library as the :mod:`idlelib` package.
1338
1339* The :mod:`imaplib` module now supports IMAP over SSL. (Contributed by Piers
1340 Lauder and Tino Lange.)
1341
* The :mod:`itertools` module contains a number of useful functions for use with
  iterators, inspired by various functions provided by the ML and Haskell
  languages. For example, ``itertools.ifilter(predicate, iterator)`` returns all
  elements in the iterator for which the function :func:`predicate` returns
  :const:`True`, and ``itertools.repeat(obj, N)`` returns ``obj`` *N* times.
  There are a number of other functions in the module; see the module's reference
  documentation for details.
  (Contributed by Raymond Hettinger.)
1350
1351* Two new functions in the :mod:`math` module, :func:`degrees(rads)` and
1352 :func:`radians(degs)`, convert between radians and degrees. Other functions in
1353 the :mod:`math` module such as :func:`math.sin` and :func:`math.cos` have always
1354 required input values measured in radians. Also, an optional *base* argument
1355 was added to :func:`math.log` to make it easier to compute logarithms for bases
1356 other than ``e`` and ``10``. (Contributed by Raymond Hettinger.)
1357
* Several new POSIX functions (:func:`getpgid`, :func:`killpg`, :func:`lchown`,
  :func:`getloadavg`, :func:`major`, :func:`makedev`, :func:`minor`, and
  :func:`mknod`) were added to the :mod:`posix` module that underlies the
  :mod:`os` module. (Contributed by Gustavo Niemeyer, Geert Jansen, and Denis S.
  Otkidach.)
1363
1364* In the :mod:`os` module, the :func:`\*stat` family of functions can now report
1365 fractions of a second in a timestamp. Such time stamps are represented as
1366 floats, similar to the value returned by :func:`time.time`.
1367
  During testing, it was found that some applications will break if time stamps
  are floats. For compatibility, time stamps obtained through the tuple interface
  of :class:`stat_result` will be represented as integers. When using
  named fields (a feature first introduced in Python 2.2), time stamps are still
  represented as integers, unless :func:`os.stat_float_times` is invoked to enable
  float return values::
1374
1375 >>> os.stat("/tmp").st_mtime
1376 1034791200
1377 >>> os.stat_float_times(True)
1378 >>> os.stat("/tmp").st_mtime
1379 1034791200.6335014
1380
1381 In Python 2.4, the default will change to always returning floats.
1382
1383 Application developers should enable this feature only if all their libraries
1384 work properly when confronted with floating point time stamps, or if they use
1385 the tuple API. If used, the feature should be activated on an application level
1386 instead of trying to enable it on a per-use basis.
1387
1388* The :mod:`optparse` module contains a new parser for command-line arguments
1389 that can convert option values to a particular Python type and will
1390 automatically generate a usage message. See the following section for more
1391 details.
1392
1393* The old and never-documented :mod:`linuxaudiodev` module has been deprecated,
1394 and a new version named :mod:`ossaudiodev` has been added. The module was
1395 renamed because the OSS sound drivers can be used on platforms other than Linux,
1396 and the interface has also been tidied and brought up to date in various ways.
1397 (Contributed by Greg Ward and Nicholas FitzRoy-Dale.)
1398
1399* The new :mod:`platform` module contains a number of functions that try to
1400 determine various properties of the platform you're running on. There are
1401 functions for getting the architecture, CPU type, the Windows OS version, and
1402 even the Linux distribution version. (Contributed by Marc-André Lemburg.)
1403
1404* The parser objects provided by the :mod:`pyexpat` module can now optionally
1405 buffer character data, resulting in fewer calls to your character data handler
1406 and therefore faster performance. Setting the parser object's
1407 :attr:`buffer_text` attribute to :const:`True` will enable buffering.
1408
1409* The :func:`sample(population, k)` function was added to the :mod:`random`
1410 module. *population* is a sequence or :class:`xrange` object containing the
1411 elements of a population, and :func:`sample` chooses *k* elements from the
1412 population without replacing chosen elements. *k* can be any value up to
1413 ``len(population)``. For example::
1414
1415 >>> days = ['Mo', 'Tu', 'We', 'Th', 'Fr', 'St', 'Sn']
1416 >>> random.sample(days, 3) # Choose 3 elements
1417 ['St', 'Sn', 'Th']
1418 >>> random.sample(days, 7) # Choose 7 elements
1419 ['Tu', 'Th', 'Mo', 'We', 'St', 'Fr', 'Sn']
1420 >>> random.sample(days, 7) # Choose 7 again
1421 ['We', 'Mo', 'Sn', 'Fr', 'Tu', 'St', 'Th']
1422 >>> random.sample(days, 8) # Can't choose eight
1423 Traceback (most recent call last):
1424 File "<stdin>", line 1, in ?
1425 File "random.py", line 414, in sample
1426 raise ValueError, "sample larger than population"
1427 ValueError: sample larger than population
1428 >>> random.sample(xrange(1,10000,2), 10) # Choose ten odd nos. under 10000
1429 [3407, 3805, 1505, 7023, 2401, 2267, 9733, 3151, 8083, 9195]
1430
1431 The :mod:`random` module now uses a new algorithm, the Mersenne Twister,
1432 implemented in C. It's faster and more extensively studied than the previous
1433 algorithm.
1434
1435 (All changes contributed by Raymond Hettinger.)
1436
1437* The :mod:`readline` module also gained a number of new functions:
1438 :func:`get_history_item`, :func:`get_current_history_length`, and
1439 :func:`redisplay`.
1440
1441* The :mod:`rexec` and :mod:`Bastion` modules have been declared dead, and
1442 attempts to import them will fail with a :exc:`RuntimeError`. New-style classes
1443 provide new ways to break out of the restricted execution environment provided
1444 by :mod:`rexec`, and no one has interest in fixing them or time to do so. If
1445 you have applications using :mod:`rexec`, rewrite them to use something else.
1446
1447 (Sticking with Python 2.2 or 2.1 will not make your applications any safer
1448 because there are known bugs in the :mod:`rexec` module in those versions. To
1449 repeat: if you're using :mod:`rexec`, stop using it immediately.)
1450
1451* The :mod:`rotor` module has been deprecated because the algorithm it uses for
1452 encryption is not believed to be secure. If you need encryption, use one of the
1453 several AES Python modules that are available separately.
1454
1455* The :mod:`shutil` module gained a :func:`move(src, dest)` function that
1456 recursively moves a file or directory to a new location.
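
  A minimal sketch of moving a directory tree, assuming a scratch directory
  created with :func:`tempfile.mkdtemp`; the file and directory names are
  arbitrary. ::

     import os
     import shutil
     import tempfile

     base = tempfile.mkdtemp()

     # Build a small source directory containing one file
     src = os.path.join(base, 'src')
     os.mkdir(src)
     f = open(os.path.join(src, 'notes.txt'), 'w')
     f.write('hello\n')
     f.close()

     # Move the whole directory, contents included, to a new location
     dest = os.path.join(base, 'dest')
     shutil.move(src, dest)

     moved = os.path.exists(os.path.join(dest, 'notes.txt'))
     source_gone = not os.path.exists(src)

     shutil.rmtree(base)                # clean up the scratch directory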
1457
* Support for more advanced POSIX signal handling was added to the :mod:`signal`
  module but then removed again, as it proved impossible to make it work
  reliably across platforms.
1461
1462* The :mod:`socket` module now supports timeouts. You can call the
1463 :meth:`settimeout(t)` method on a socket object to set a timeout of *t* seconds.
1464 Subsequent socket operations that take longer than *t* seconds to complete will
1465 abort and raise a :exc:`socket.timeout` exception.
1466
1467 The original timeout implementation was by Tim O'Malley. Michael Gilfix
1468 integrated it into the Python :mod:`socket` module and shepherded it through a
1469 lengthy review. After the code was checked in, Guido van Rossum rewrote parts
1470 of it. (This is a good example of a collaborative development process in
1471 action.)
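
  The following sketch shows a timeout firing. To stay self-contained it uses a
  UDP socket bound to the local machine that nothing ever sends to; the
  half-second timeout is an arbitrary choice for the example. ::

     import socket

     sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
     sock.bind(('127.0.0.1', 0))        # any free local port
     sock.settimeout(0.5)

     try:
         sock.recvfrom(1024)            # no datagram will ever arrive
         timed_out = False
     except socket.timeout:
         # The operation took longer than the timeout and was aborted
         timed_out = True

     sock.close()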
1472
1473* On Windows, the :mod:`socket` module now ships with Secure Sockets Layer
1474 (SSL) support.
1475
1476* The value of the C :const:`PYTHON_API_VERSION` macro is now exposed at the
1477 Python level as ``sys.api_version``. The current exception can be cleared by
1478 calling the new :func:`sys.exc_clear` function.
1479
1480* The new :mod:`tarfile` module allows reading from and writing to
1481 :program:`tar`\ -format archive files. (Contributed by Lars Gustäbel.)
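
  As a small sketch, the code below writes a one-member archive and reads the
  member names back. The file names are arbitrary, and the archive lives in a
  scratch directory from :func:`tempfile.mkdtemp`. ::

     import os
     import shutil
     import tarfile
     import tempfile

     base = tempfile.mkdtemp()
     member = os.path.join(base, 'hello.txt')
     f = open(member, 'w')
     f.write('hello\n')
     f.close()

     # Write the archive; the second argument to add() is the name that
     # the member is stored under inside the archive
     archive = os.path.join(base, 'demo.tar')
     tar = tarfile.open(archive, 'w')
     tar.add(member, 'hello.txt')
     tar.close()

     # Read it back
     tar = tarfile.open(archive, 'r')
     names = tar.getnames()
     tar.close()

     shutil.rmtree(base)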
1482
1483* The new :mod:`textwrap` module contains functions for wrapping strings
1484 containing paragraphs of text. The :func:`wrap(text, width)` function takes a
1485 string and returns a list containing the text split into lines of no more than
1486 the chosen width. The :func:`fill(text, width)` function returns a single
  string, reformatted to fit into lines no longer than the chosen width. (As you
  can guess, :func:`fill` is built on top of :func:`wrap`.) For example::
1489
1490 >>> import textwrap
1491 >>> paragraph = "Not a whit, we defy augury: ... more text ..."
1492 >>> textwrap.wrap(paragraph, 60)
1493 ["Not a whit, we defy augury: there's a special providence in",
1494 "the fall of a sparrow. If it be now, 'tis not to come; if it",
1495 ...]
1496 >>> print textwrap.fill(paragraph, 35)
1497 Not a whit, we defy augury: there's
1498 a special providence in the fall of
1499 a sparrow. If it be now, 'tis not
1500 to come; if it be not to come, it
1501 will be now; if it be not now, yet
1502 it will come: the readiness is all.
1503 >>>
1504
1505 The module also contains a :class:`TextWrapper` class that actually implements
1506 the text wrapping strategy. Both the :class:`TextWrapper` class and the
1507 :func:`wrap` and :func:`fill` functions support a number of additional keyword
1508 arguments for fine-tuning the formatting; consult the module's documentation
1509 for details. (Contributed by Greg Ward.)
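
  Two of those keyword arguments, *initial_indent* and *subsequent_indent*, can
  be sketched as follows. The sample sentence and the width of 20 are arbitrary
  choices for the illustration. ::

     import textwrap

     text = "The quick brown fox jumps over the lazy dog."

     # Prefix the first line with a bullet and indent the remaining lines
     wrapper = textwrap.TextWrapper(width=20,
                                    initial_indent='* ',
                                    subsequent_indent='  ')
     lines = wrapper.wrap(text)

     # The convenience function accepts the same keyword arguments
     block = textwrap.fill(text, 20, initial_indent='* ',
                           subsequent_indent='  ')

  The indentation counts towards the width, so no resulting line is longer than
  20 characters.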
1510
1511* The :mod:`thread` and :mod:`threading` modules now have companion modules,
1512 :mod:`dummy_thread` and :mod:`dummy_threading`, that provide a do-nothing
1513 implementation of the :mod:`thread` module's interface for platforms where
1514 threads are not supported. The intention is to simplify thread-aware modules
1515 (ones that *don't* rely on threads to run) by putting the following code at the
1516 top::
1517
     try:
         import threading as _threading
     except ImportError:
         import dummy_threading as _threading
1522
1523 In this example, :mod:`_threading` is used as the module name to make it clear
1524 that the module being used is not necessarily the actual :mod:`threading`
1525 module. Code can call functions and use classes in :mod:`_threading` whether or
1526 not threads are supported, avoiding an :keyword:`if` statement and making the
1527 code slightly clearer. This module will not magically make multithreaded code
1528 run without threads; code that waits for another thread to return or to do
1529 something will simply hang forever.
1530
1531* The :mod:`time` module's :func:`strptime` function has long been an annoyance
1532 because it uses the platform C library's :func:`strptime` implementation, and
1533 different platforms sometimes have odd bugs. Brett Cannon contributed a
1534 portable implementation that's written in pure Python and should behave
1535 identically on all platforms.
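
  A brief sketch of a call that should now give the same answer everywhere. The
  format uses only numeric directives, so it does not depend on the locale; the
  timestamp itself is arbitrary. ::

     import time

     parsed = time.strptime('2003-07-29 14:30', '%Y-%m-%d %H:%M')

     # The first five fields of the result are year, month, day, hour, minute
     year, month, day, hour, minute = parsed[:5]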
1536
1537* The new :mod:`timeit` module helps measure how long snippets of Python code
1538 take to execute. The :file:`timeit.py` file can be run directly from the
1539 command line, or the module's :class:`Timer` class can be imported and used
1540 directly. Here's a short example that figures out whether it's faster to
1541 convert an 8-bit string to Unicode by appending an empty Unicode string to it or
1542 by using the :func:`unicode` function::
1543
1544 import timeit
1545
1546 timer1 = timeit.Timer('unicode("abc")')
1547 timer2 = timeit.Timer('"abc" + u""')
1548
1549 # Run three trials
1550 print timer1.repeat(repeat=3, number=100000)
1551 print timer2.repeat(repeat=3, number=100000)
1552
1553 # On my laptop this outputs:
1554 # [0.36831796169281006, 0.37441694736480713, 0.35304892063140869]
1555 # [0.17574405670166016, 0.18193507194519043, 0.17565798759460449]
1556
1557* The :mod:`Tix` module has received various bug fixes and updates for the
1558 current version of the Tix package.
1559
1560* The :mod:`Tkinter` module now works with a thread-enabled version of Tcl.
1561 Tcl's threading model requires that widgets only be accessed from the thread in
1562 which they're created; accesses from another thread can cause Tcl to panic. For
1563 certain Tcl interfaces, :mod:`Tkinter` will now automatically avoid this when a
1564 widget is accessed from a different thread by marshalling a command, passing it
1565 to the correct thread, and waiting for the results. Other interfaces can't be
1566 handled automatically but :mod:`Tkinter` will now raise an exception on such an
1567 access so that you can at least find out about the problem. See
1568 http://mail.python.org/pipermail/python-dev/2002-December/031107.html for a more
1569 detailed explanation of this change. (Implemented by Martin von Löwis.)
1570
* Calling Tcl methods through :mod:`_tkinter` no longer returns only strings.
1572 Instead, if Tcl returns other objects those objects are converted to their
1573 Python equivalent, if one exists, or wrapped with a :class:`_tkinter.Tcl_Obj`
1574 object if no Python equivalent exists. This behavior can be controlled through
1575 the :meth:`wantobjects` method of :class:`tkapp` objects.
1576
1577 When using :mod:`_tkinter` through the :mod:`Tkinter` module (as most Tkinter
1578 applications will), this feature is always activated. It should not cause
1579 compatibility problems, since Tkinter would always convert string results to
1580 Python types where possible.
1581
1582 If any incompatibilities are found, the old behavior can be restored by setting
1583 the :attr:`wantobjects` variable in the :mod:`Tkinter` module to false before
1584 creating the first :class:`tkapp` object. ::
1585
1586 import Tkinter
1587 Tkinter.wantobjects = 0
1588
1589 Any breakage caused by this change should be reported as a bug.
1590
1591* The :mod:`UserDict` module has a new :class:`DictMixin` class which defines
1592 all dictionary methods for classes that already have a minimum mapping
1593 interface. This greatly simplifies writing classes that need to be
1594 substitutable for dictionaries, such as the classes in the :mod:`shelve`
1595 module.
1596
1597 Adding the mix-in as a superclass provides the full dictionary interface
1598 whenever the class defines :meth:`__getitem__`, :meth:`__setitem__`,
1599 :meth:`__delitem__`, and :meth:`keys`. For example::
1600
     >>> import UserDict
     >>> class SeqDict(UserDict.DictMixin):
     ...     """Dictionary lookalike implemented with lists."""
     ...     def __init__(self):
     ...         self.keylist = []
     ...         self.valuelist = []
     ...     def __getitem__(self, key):
     ...         try:
     ...             i = self.keylist.index(key)
     ...         except ValueError:
     ...             raise KeyError
     ...         return self.valuelist[i]
     ...     def __setitem__(self, key, value):
     ...         try:
     ...             i = self.keylist.index(key)
     ...             self.valuelist[i] = value
     ...         except ValueError:
     ...             self.keylist.append(key)
     ...             self.valuelist.append(value)
     ...     def __delitem__(self, key):
     ...         try:
     ...             i = self.keylist.index(key)
     ...         except ValueError:
     ...             raise KeyError
     ...         self.keylist.pop(i)
     ...         self.valuelist.pop(i)
     ...     def keys(self):
     ...         return list(self.keylist)
     ...
     >>> s = SeqDict()
     >>> dir(s)      # See that other dictionary methods are implemented
     ['__cmp__', '__contains__', '__delitem__', '__doc__', '__getitem__',
      '__init__', '__iter__', '__len__', '__module__', '__repr__',
      '__setitem__', 'clear', 'get', 'has_key', 'items', 'iteritems',
      'iterkeys', 'itervalues', 'keylist', 'keys', 'pop', 'popitem',
      'setdefault', 'update', 'valuelist', 'values']
1637
1638 (Contributed by Raymond Hettinger.)
1639
1640* The DOM implementation in :mod:`xml.dom.minidom` can now generate XML output
1641 in a particular encoding by providing an optional encoding argument to the
1642 :meth:`toxml` and :meth:`toprettyxml` methods of DOM nodes.
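
  A minimal sketch of the difference, using a one-element document invented for
  the example. ::

     from xml.dom import minidom

     doc = minidom.parseString('<greeting>hello</greeting>')

     # Without an argument, the XML declaration names no encoding
     plain = doc.toxml()

     # With an encoding, the output is encoded accordingly and the
     # XML declaration records the encoding's name
     encoded = doc.toxml('utf-8')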
1643
1644* The :mod:`xmlrpclib` module now supports an XML-RPC extension for handling nil
1645 data values such as Python's ``None``. Nil values are always supported on
1646 unmarshalling an XML-RPC response. To generate requests containing ``None``,
1647 you must supply a true value for the *allow_none* parameter when creating a
1648 :class:`Marshaller` instance.
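
  The behaviour can be sketched as below. The fallback import is an assumption
  added so the snippet also runs on later Python versions, where the module is
  provided as ``xmlrpc.client``; everything else uses only the
  :class:`Marshaller` class and its *allow_none* parameter described above. ::

     try:
         import xmlrpclib
     except ImportError:
         # Assumption: newer Pythons ship the module under this name
         import xmlrpc.client as xmlrpclib

     # By default, None cannot be marshalled
     strict = xmlrpclib.Marshaller()
     try:
         strict.dumps((None,))
         rejected = False
     except TypeError:
         rejected = True

     # With allow_none set, None is written using the nil extension
     lenient = xmlrpclib.Marshaller(allow_none=True)
     payload = lenient.dumps((None,))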
1649
1650* The new :mod:`DocXMLRPCServer` module allows writing self-documenting XML-RPC
1651 servers. Run it in demo mode (as a program) to see it in action. Pointing the
1652 Web browser to the RPC server produces pydoc-style documentation; pointing
1653 xmlrpclib to the server allows invoking the actual methods. (Contributed by
1654 Brian Quinlan.)
1655
1656* Support for internationalized domain names (RFCs 3454, 3490, 3491, and 3492)
1657 has been added. The "idna" encoding can be used to convert between a Unicode
1658 domain name and the ASCII-compatible encoding (ACE) of that name. ::
1659
     >>> u"www.Alliancefrançaise.nu".encode("idna")
     'www.xn--alliancefranaise-npb.nu'
1662
1663 The :mod:`socket` module has also been extended to transparently convert
1664 Unicode hostnames to the ACE version before passing them to the C library.
  Modules that deal with hostnames (such as :mod:`httplib` and :mod:`ftplib`)
1666 also support Unicode host names; :mod:`httplib` also sends HTTP ``Host``
1667 headers using the ACE version of the domain name. :mod:`urllib` supports
1668 Unicode URLs with non-ASCII host names as long as the ``path`` part of the URL
1669 is ASCII only.
1670
1671 To implement this change, the :mod:`stringprep` module, the ``mkstringprep``
1672 tool and the ``punycode`` encoding have been added.
1673
.. ======================================================================
1676
1677Date/Time Type
1678--------------
1679
1680Date and time types suitable for expressing timestamps were added as the
1681:mod:`datetime` module. The types don't support different calendars or many
1682fancy features, and just stick to the basics of representing time.
1683
1684The three primary types are: :class:`date`, representing a day, month, and year;
1685:class:`time`, consisting of hour, minute, and second; and :class:`datetime`,
1686which contains all the attributes of both :class:`date` and :class:`time`.
1687There's also a :class:`timedelta` class representing differences between two
1688points in time, and time zone logic is implemented by classes inheriting from
1689the abstract :class:`tzinfo` class.
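
As a sketch of how such a subclass might look, here is a time zone with a
constant offset from UTC. The ``FixedOffset`` class is hypothetical and not part
of the module; it simply implements the three methods a concrete
:class:`tzinfo` is expected to provide. ::

   import datetime

   class FixedOffset(datetime.tzinfo):
       """Hypothetical time zone with a constant offset from UTC."""

       def __init__(self, minutes, name):
           self.offset = datetime.timedelta(minutes=minutes)
           self.name = name

       def utcoffset(self, dt):
           return self.offset

       def tzname(self, dt):
           return self.name

       def dst(self, dt):
           # No daylight saving adjustment in this simplified zone
           return datetime.timedelta(0)

   eastern = FixedOffset(-5 * 60, 'EST')
   stamp = datetime.datetime(2002, 12, 30, 21, 27, 3, tzinfo=eastern)
   text = stamp.isoformat()      # '2002-12-30T21:27:03-05:00'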
1690
1691You can create instances of :class:`date` and :class:`time` by either supplying
1692keyword arguments to the appropriate constructor, e.g.
1693``datetime.date(year=1972, month=10, day=15)``, or by using one of a number of
1694class methods. For example, the :meth:`date.today` class method returns the
1695current local date.
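
A short sketch of both styles of construction; the dates are arbitrary, and the
class methods shown are only a sample of those available. ::

   import datetime

   # Keyword arguments to the constructors
   birthday = datetime.date(year=1972, month=10, day=15)
   noon = datetime.time(hour=12)

   # Class methods that build instances in other ways
   today = datetime.date.today()
   same_day = datetime.date.fromordinal(birthday.toordinal())
   lunch = datetime.datetime.combine(birthday, noon)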
1696
1697Once created, instances of the date/time classes are all immutable. There are a
1698number of methods for producing formatted strings from objects::
1699
1700 >>> import datetime
1701 >>> now = datetime.datetime.now()
1702 >>> now.isoformat()
1703 '2002-12-30T21:27:03.994956'
1704 >>> now.ctime() # Only available on date, datetime
1705 'Mon Dec 30 21:27:03 2002'
1706 >>> now.strftime('%Y %d %b')
1707 '2002 30 Dec'
1708
1709The :meth:`replace` method allows modifying one or more fields of a
1710:class:`date` or :class:`datetime` instance, returning a new instance::
1711
1712 >>> d = datetime.datetime.now()
1713 >>> d
1714 datetime.datetime(2002, 12, 30, 22, 15, 38, 827738)
1715 >>> d.replace(year=2001, hour = 12)
1716 datetime.datetime(2001, 12, 30, 12, 15, 38, 827738)
1717 >>>
1718
1719Instances can be compared, hashed, and converted to strings (the result is the
1720same as that of :meth:`isoformat`). :class:`date` and :class:`datetime`
1721instances can be subtracted from each other, and added to :class:`timedelta`
1722instances. The largest missing feature is that there's no standard library
1723support for parsing strings and getting back a :class:`date` or
1724:class:`datetime`.
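
These operations can be sketched with two arbitrary dates::

   import datetime

   start = datetime.date(2002, 12, 30)
   release = datetime.date(2003, 7, 29)

   # Subtracting two dates gives a timedelta
   gap = release - start                  # gap.days is 211

   # Adding a timedelta to a date gives another date
   week_later = release + datetime.timedelta(days=7)

   # Instances can be compared, and hashed for use as dictionary keys
   in_order = start < release
   labels = {start: 'first', release: 'second'}

   # str() gives the same result as isoformat()
   text = str(release)                    # '2003-07-29'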
1725
1726For more information, refer to the module's reference documentation.
1727(Contributed by Tim Peters.)
1728
.. ======================================================================
1731
1732The optparse Module
1733-------------------
1734
1735The :mod:`getopt` module provides simple parsing of command-line arguments. The
1736new :mod:`optparse` module (originally named Optik) provides more elaborate
1737command-line parsing that follows the Unix conventions, automatically creates
1738the output for :option:`--help`, and can perform different actions for different
1739options.
1740
1741You start by creating an instance of :class:`OptionParser` and telling it what
1742your program's options are. ::
1743
1744 import sys
1745 from optparse import OptionParser
1746
1747 op = OptionParser()
1748 op.add_option('-i', '--input',
1749 action='store', type='string', dest='input',
1750 help='set input filename')
1751 op.add_option('-l', '--length',
1752 action='store', type='int', dest='length',
1753 help='set maximum length of output')
1754
1755Parsing a command line is then done by calling the :meth:`parse_args` method. ::
1756
1757 options, args = op.parse_args(sys.argv[1:])
1758 print options
1759 print args
1760
1761This returns an object containing all of the option values, and a list of
1762strings containing the remaining arguments.
1763
1764Invoking the script with the various arguments now works as you'd expect it to.
1765Note that the length argument is automatically converted to an integer. ::
1766
1767 $ ./python opt.py -i data arg1
1768 <Values at 0x400cad4c: {'input': 'data', 'length': None}>
1769 ['arg1']
1770 $ ./python opt.py --input=data --length=4
1771 <Values at 0x400cad2c: {'input': 'data', 'length': 4}>
1772 []
1773 $
1774
1775The help message is automatically generated for you::
1776
   $ ./python opt.py --help
   usage: opt.py [options]

   options:
     -h, --help            show this help message and exit
     -iINPUT, --input=INPUT
                           set input filename
     -lLENGTH, --length=LENGTH
                           set maximum length of output
   $

See the module's documentation for more details.
1789
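To give a feel for how different options can trigger different actions, here is
a small sketch that combines a flag using the ``store_true`` action with an
integer-valued option. The ``-v``/``--verbose`` option is made up for this
illustration, and an explicit argument list is passed so that the result does
not depend on the real command line. ::

   from optparse import OptionParser

   op = OptionParser()
   # A flag: present means True, absent leaves the default of False
   op.add_option('-v', '--verbose',
                 action='store_true', dest='verbose', default=False,
                 help='print progress messages')
   # An option taking a value, converted to an integer
   op.add_option('-l', '--length',
                 action='store', type='int', dest='length',
                 help='set maximum length of output')

   options, args = op.parse_args(['-v', '--length=4', 'arg1'])

   # options.verbose is True, options.length is 4, args is ['arg1']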
1790
1791Optik was written by Greg Ward, with suggestions from the readers of the Getopt
1792SIG.
1793
.. ======================================================================
1796
1797.. _section-pymalloc:
1798
1799Pymalloc: A Specialized Object Allocator
1800========================================
1801
Pymalloc, a specialized object allocator written by Vladimir Marangozov, was a
feature added to Python 2.1. Pymalloc is intended to be faster than the system
:c:func:`malloc` and to have less memory overhead for allocation patterns typical
of Python programs. The allocator uses C's :c:func:`malloc` function to get large
pools of memory and then fulfills smaller memory requests from these pools.
1807
1808In 2.1 and 2.2, pymalloc was an experimental feature and wasn't enabled by
1809default; you had to explicitly enable it when compiling Python by providing the
1810:option:`--with-pymalloc` option to the :program:`configure` script. In 2.3,
1811pymalloc has had further enhancements and is now enabled by default; you'll have
1812to supply :option:`--without-pymalloc` to disable it.
1813
1814This change is transparent to code written in Python; however, pymalloc may
1815expose bugs in C extensions. Authors of C extension modules should test their
1816code with pymalloc enabled, because some incorrect code may cause core dumps at
1817runtime.
1818
There's one particularly common error that causes problems. There are a number
of memory allocation functions in Python's C API that have previously just been
aliases for the C library's :c:func:`malloc` and :c:func:`free`, meaning that if
you accidentally called mismatched functions the error wouldn't be noticeable.
When the object allocator is enabled, these functions aren't aliases of
:c:func:`malloc` and :c:func:`free` any more, and calling the wrong function to
free memory may get you a core dump. For example, if memory was allocated using
:c:func:`PyObject_Malloc`, it has to be freed using :c:func:`PyObject_Free`, not
:c:func:`free`. A few modules included with Python fell afoul of this and had to
be fixed; doubtless there are more third-party modules that will have the same
problem.
1830
1831As part of this change, the confusing multiple interfaces for allocating memory
1832have been consolidated down into two API families. Memory allocated with one
1833family must not be manipulated with functions from the other family. There is
1834one family for allocating chunks of memory and another family of functions
1835specifically for allocating Python objects.
1836
* To allocate and free an undistinguished chunk of memory use the "raw memory"
  family: :c:func:`PyMem_Malloc`, :c:func:`PyMem_Realloc`, and :c:func:`PyMem_Free`.

* The "object memory" family is the interface to the pymalloc facility described
  above and is biased towards a large number of "small" allocations:
  :c:func:`PyObject_Malloc`, :c:func:`PyObject_Realloc`, and :c:func:`PyObject_Free`.

* To allocate and free Python objects, use the "object" family
  :c:func:`PyObject_New`, :c:func:`PyObject_NewVar`, and :c:func:`PyObject_Del`.

Thanks to lots of work by Tim Peters, pymalloc in 2.3 also provides debugging
features to catch memory overwrites and doubled frees in both extension modules
and in the interpreter itself. To enable this support, compile a debugging
version of the Python interpreter by running :program:`configure` with
:option:`--with-pydebug`.
1852
1853To aid extension writers, a header file :file:`Misc/pymemcompat.h` is
1854distributed with the source to Python 2.3 that allows Python extensions to use
1855the 2.3 interfaces to memory allocation while compiling against any version of
1856Python since 1.5.2. You would copy the file from Python's source distribution
1857and bundle it with the source of your extension.
1858
1859
1860.. seealso::
1861
   http://svn.python.org/view/python/trunk/Objects/obmalloc.c
      For the full details of the pymalloc implementation, see the comments at
      the top of the file :file:`Objects/obmalloc.c` in the Python source code.
      The above link points to the file within the python.org SVN browser.

.. ======================================================================
1869
1870Build and C API Changes
1871=======================
1872
1873Changes to Python's build process and to the C API include:
1874
1875* The cycle detection implementation used by the garbage collection has proven
1876 to be stable, so it's now been made mandatory. You can no longer compile Python
1877 without it, and the :option:`--with-cycle-gc` switch to :program:`configure` has
1878 been removed.
1879
1880* Python can now optionally be built as a shared library
1881 (:file:`libpython2.3.so`) by supplying :option:`--enable-shared` when running
1882 Python's :program:`configure` script. (Contributed by Ondrej Palkovsky.)
1883
* The :c:macro:`DL_EXPORT` and :c:macro:`DL_IMPORT` macros are now deprecated.
  Initialization functions for Python extension modules should now be declared
  using the new macro :c:macro:`PyMODINIT_FUNC`, while the Python core will
  generally use the :c:macro:`PyAPI_FUNC` and :c:macro:`PyAPI_DATA` macros.

* The interpreter can be compiled without any docstrings for the built-in
  functions and modules by supplying :option:`--without-doc-strings` to the
  :program:`configure` script. This makes the Python executable about 10% smaller,
  but will also mean that you can't get help for Python's built-ins. (Contributed
  by Gustavo Niemeyer.)
1894
* The :c:func:`PyArg_NoArgs` macro is now deprecated, and code that uses it
  should be changed. For Python 2.2 and later, the method definition table can
  specify the :const:`METH_NOARGS` flag, signalling that there are no arguments,
  and the argument checking can then be removed. If compatibility with pre-2.2
  versions of Python is important, the code could use ``PyArg_ParseTuple(args,
  "")`` instead, but this will be slower than using :const:`METH_NOARGS`.
1901
* :c:func:`PyArg_ParseTuple` accepts new format characters for various sizes of
  unsigned integers: ``B`` for :c:type:`unsigned char`, ``H`` for :c:type:`unsigned
  short int`, ``I`` for :c:type:`unsigned int`, and ``K`` for :c:type:`unsigned
  long long`.
1906
* A new function, :c:func:`PyObject_DelItemString(mapping, char \*key)` was added
  as shorthand for ``PyObject_DelItem(mapping, PyString_FromString(key))``.
1909
1910* File objects now manage their internal string buffer differently, increasing
1911 it exponentially when needed. This results in the benchmark tests in
1912 :file:`Lib/test/test_bufio.py` speeding up considerably (from 57 seconds to 1.7
1913 seconds, according to one measurement).
1914
* It's now possible to define class and static methods for a C extension type by
  setting either the :const:`METH_CLASS` or :const:`METH_STATIC` flags in a
  method's :c:type:`PyMethodDef` structure.

* Python now includes a copy of the Expat XML parser's source code, removing any
  dependence on a system version or local installation of Expat.
1921
1922* If you dynamically allocate type objects in your extension, you should be
1923 aware of a change in the rules relating to the :attr:`__module__` and
1924 :attr:`__name__` attributes. In summary, you will want to ensure the type's
1925 dictionary contains a ``'__module__'`` key; making the module name the part of
1926 the type name leading up to the final period will no longer have the desired
1927 effect. For more detail, read the API reference documentation or the source.
1928
.. ======================================================================
1931
1932Port-Specific Changes
1933---------------------
1934
1935Support for a port to IBM's OS/2 using the EMX runtime environment was merged
1936into the main Python source tree. EMX is a POSIX emulation layer over the OS/2
1937system APIs. The Python port for EMX tries to support all the POSIX-like
1938capability exposed by the EMX runtime, and mostly succeeds; :func:`fork` and
1939:func:`fcntl` are restricted by the limitations of the underlying emulation
1940layer. The standard OS/2 port, which uses IBM's Visual Age compiler, also
1941gained support for case-sensitive import semantics as part of the integration of
1942the EMX port into CVS. (Contributed by Andrew MacIntyre.)
1943
1944On MacOS, most toolbox modules have been weaklinked to improve backward
1945compatibility. This means that modules will no longer fail to load if a single
1946routine is missing on the current OS version. Instead calling the missing
1947routine will raise an exception. (Contributed by Jack Jansen.)
1948
1949The RPM spec files, found in the :file:`Misc/RPM/` directory in the Python
1950source distribution, were updated for 2.3. (Contributed by Sean Reifschneider.)
1951
1952Other new platforms now supported by Python include AtheOS
1953(http://www.atheos.cx/), GNU/Hurd, and OpenVMS.
1954
.. ======================================================================
1957
.. _23section-other:

Other Changes and Fixes
=======================
1962
1963As usual, there were a bunch of other improvements and bugfixes scattered
1964throughout the source tree. A search through the CVS change logs finds there
1965were 523 patches applied and 514 bugs fixed between Python 2.2 and 2.3. Both
1966figures are likely to be underestimates.
1967
1968Some of the more notable changes are:
1969
1970* If the :envvar:`PYTHONINSPECT` environment variable is set, the Python
1971 interpreter will enter the interactive prompt after running a Python program, as
1972 if Python had been invoked with the :option:`-i` option. The environment
1973 variable can be set before running the Python interpreter, or it can be set by
1974 the Python program as part of its execution.
1975
1976* The :file:`regrtest.py` script now provides a way to allow "all resources
1977 except *foo*." A resource name passed to the :option:`-u` option can now be
1978 prefixed with a hyphen (``'-'``) to mean "remove this resource." For example,
1979 the option '``-uall,-bsddb``' could be used to enable the use of all resources
1980 except ``bsddb``.
1981
1982* The tools used to build the documentation now work under Cygwin as well as
1983 Unix.
1984
1985* The ``SET_LINENO`` opcode has been removed. Back in the mists of time, this
1986 opcode was needed to produce line numbers in tracebacks and support trace
1987 functions (for, e.g., :mod:`pdb`). Since Python 1.5, the line numbers in
1988 tracebacks have been computed using a different mechanism that works with
1989 "python -O". For Python 2.3 Michael Hudson implemented a similar scheme to
1990 determine when to call the trace function, removing the need for ``SET_LINENO``
1991 entirely.
1992
1993 It would be difficult to detect any resulting difference from Python code, apart
1994 from a slight speed up when Python is run without :option:`-O`.
1995
1996 C extensions that access the :attr:`f_lineno` field of frame objects should
1997 instead call ``PyCode_Addr2Line(f->f_code, f->f_lasti)``. This will have the
1998 added effect of making the code work as desired under "python -O" in earlier
1999 versions of Python.
2000
2001 A nifty new feature is that trace functions can now assign to the
2002 :attr:`f_lineno` attribute of frame objects, changing the line that will be
2003 executed next. A ``jump`` command has been added to the :mod:`pdb` debugger
2004 taking advantage of this new feature. (Implemented by Richie Hindle.)
2005
.. ======================================================================
2008
2009Porting to Python 2.3
2010=====================
2011
2012This section lists previously described changes that may require changes to your
2013code:
2014
2015* :keyword:`yield` is now always a keyword; if it's used as a variable name in
2016 your code, a different name must be chosen.
2017
2018* For strings *X* and *Y*, ``X in Y`` now works if *X* is more than one
2019 character long.
2020
2021* The :func:`int` type constructor will now return a long integer instead of
2022 raising an :exc:`OverflowError` when a string or floating-point number is too
2023 large to fit into an integer.
2024
2025* If you have Unicode strings that contain 8-bit characters, you must declare
2026 the file's encoding (UTF-8, Latin-1, or whatever) by adding a comment to the top
2027 of the file. See section :ref:`section-encodings` for more information.
2028
* Calling Tcl methods through :mod:`_tkinter` no longer returns only strings.
  Instead, if Tcl returns other objects those objects are converted to their
  Python equivalent, if one exists, or wrapped with a :class:`_tkinter.Tcl_Obj`
  object if no Python equivalent exists.

* Large octal and hex literals such as ``0xffffffff`` now trigger a
  :exc:`FutureWarning`. Currently they're stored as 32-bit numbers and result in a
  negative value, but in Python 2.4 they'll become positive long integers.

  There are a few ways to fix this warning. If you really need a positive number,
  just add an ``L`` to the end of the literal. If you're trying to get a 32-bit
  integer with low bits set and have previously used an expression such as
  ``~(1 << 31)``, it's probably clearest to start with all bits set and clear the
  desired upper bits. For example, to clear just the top bit (bit 31), you could
  write ``0xffffffffL & ~(1L << 31)``.

* You can no longer disable assertions by assigning to ``__debug__``.

* The Distutils :func:`setup` function has gained various new keyword arguments
  such as *depends*. Old versions of the Distutils will abort if passed unknown
  keywords. A solution is to check for the presence of the new
  :func:`get_distutil_options` function in your :file:`setup.py` and only use the
  new keywords with a version of the Distutils that supports them::

     from distutils import core

     kw = {'sources': ['foo.c'], ...}
     # Pass 'depends' only to a Distutils that understands it.
     if hasattr(core, 'get_distutil_options'):
         kw['depends'] = ['foo.h']
     ext = core.Extension(**kw)

* Using ``None`` as a variable name will now result in a :exc:`SyntaxWarning`.

* Names of extension types defined by the modules included with Python now
  contain the module and a ``'.'`` in front of the type name.

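The bit-clearing advice in the hex-literal item above comes down to a little
arithmetic. The sketch below uses current Python, where integers never overflow
and the ``L`` suffix no longer exists, so it shows the intended positive results
rather than 2.3's behavior. ``ALL_BITS_32`` and ``clear_bit`` are hypothetical
names chosen for illustration:

```python
# Start with all 32 bits set, then clear the bits that are not wanted.
ALL_BITS_32 = 0xFFFFFFFF


def clear_bit(value, bit):
    """Return value with the given bit position (0 = lowest) cleared."""
    return value & ~(1 << bit)


# Clearing the top bit (bit 31) leaves the low 31 bits set.
low_31_bits = clear_bit(ALL_BITS_32, 31)
print(hex(low_31_bits))  # 0x7fffffff
```

Starting from an explicit all-ones mask keeps the result within 32 bits whether
the interpreter treats the literal as a plain or a long integer.
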
.. ======================================================================


.. _23acks:

Acknowledgements
================

The author would like to thank the following people for offering suggestions,
corrections and assistance with various drafts of this article: Jeff Bauer,
Simon Brunning, Brett Cannon, Michael Chermside, Andrew Dalke, Scott David
Daniels, Fred L. Drake, Jr., David Fraser, Kelly Gerber, Raymond Hettinger,
Michael Hudson, Chris Lambert, Detlef Lannert, Martin von Löwis, Andrew
MacIntyre, Lalo Martins, Chad Netzer, Gustavo Niemeyer, Neal Norwitz, Hans
Nowak, Chris Reedy, Francesco Ricciardi, Vinay Sajip, Neil Schemenauer, Roman
Suzi, Jason Tishler, Just van Rossum.
