\documentclass{howto}
\usepackage{distutils}
% $Id$

\title{What's New in Python 2.3}
\release{0.11}
\author{A.M.\ Kuchling}
\authoraddress{\email{amk@amk.ca}}

\begin{document}
\maketitle
\tableofcontents

% To do:
% PYTHONINSPECT
% list.index
% file.encoding
% doctest extensions
% new version of IDLE
% XML-RPC nil extension
% MacOS framework-related changes (section of its own, probably)

%\section{Introduction \label{intro}}

{\large This article is a draft, and is currently up to date for
Python 2.3rc1. Please send any additions, comments or errata to the
author.}

This article explains the new features in Python 2.3. Python 2.3 is
tentatively scheduled for release in August 2003.

This article doesn't attempt to provide a complete specification of
the new features, but instead provides a convenient overview. For
full details, you should refer to the documentation for Python 2.3,
such as the \citetitle[../lib/lib.html]{Python Library Reference} and
the \citetitle[../ref/ref.html]{Python Reference Manual}. If you want
to understand the complete implementation and design rationale,
refer to the PEP for a particular new feature.


%======================================================================
\section{PEP 218: A Standard Set Datatype}

The new \module{sets} module contains an implementation of a set
datatype. The \class{Set} class is for mutable sets, sets that can
have members added and removed. The \class{ImmutableSet} class is for
sets that can't be modified, and instances of \class{ImmutableSet} can
therefore be used as dictionary keys. Sets are built on top of
dictionaries, so the elements within a set must be hashable.

Here's a simple example:

\begin{verbatim}
>>> import sets
>>> S = sets.Set([1,2,3])
>>> S
Set([1, 2, 3])
>>> 1 in S
True
>>> 0 in S
False
>>> S.add(5)
>>> S.remove(3)
>>> S
Set([1, 2, 5])
>>>
\end{verbatim}

The union and intersection of sets can be computed with the
\method{union()} and \method{intersection()} methods; an alternative
notation uses the bitwise operators \code{|} and \code{\&}.
Mutable sets also have in-place versions of these methods,
\method{union_update()} and \method{intersection_update()}.

\begin{verbatim}
>>> S1 = sets.Set([1,2,3])
>>> S2 = sets.Set([4,5,6])
>>> S1.union(S2)
Set([1, 2, 3, 4, 5, 6])
>>> S1 | S2                  # Alternative notation
Set([1, 2, 3, 4, 5, 6])
>>> S1.intersection(S2)
Set([])
>>> S1 & S2                  # Alternative notation
Set([])
>>> S1.union_update(S2)
>>> S1
Set([1, 2, 3, 4, 5, 6])
>>>
\end{verbatim}

It's also possible to take the symmetric difference of two sets. This
is the set of all elements in the union that aren't in the
intersection. Another way of putting it is that the symmetric
difference contains all elements that are in exactly one
set. Again, there's an alternative notation (\code{\^}), and an
in-place version with the ungainly name
\method{symmetric_difference_update()}.

\begin{verbatim}
>>> S1 = sets.Set([1,2,3,4])
>>> S2 = sets.Set([3,4,5,6])
>>> S1.symmetric_difference(S2)
Set([1, 2, 5, 6])
>>> S1 ^ S2
Set([1, 2, 5, 6])
>>>
\end{verbatim}

There are also \method{issubset()} and \method{issuperset()} methods
for checking whether one set is a subset or superset of another:

\begin{verbatim}
>>> S1 = sets.Set([1,2,3])
>>> S2 = sets.Set([2,3])
>>> S2.issubset(S1)
True
>>> S1.issubset(S2)
False
>>> S1.issuperset(S2)
True
>>>
\end{verbatim}
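The earlier point about dictionary keys can be sketched as follows.
This is only a minimal illustration: the variable names are made up,
and the fallback to the built-in \code{set} and \code{frozenset} names
is an assumption for interpreters that don't ship the \module{sets}
module.

```python
try:
    from sets import Set, ImmutableSet    # the module described in this section
except ImportError:
    # Assumption: where the sets module is unavailable, the built-in
    # set and frozenset types stand in for Set and ImmutableSet.
    Set, ImmutableSet = set, frozenset

vowels = ImmutableSet(['a', 'e', 'i', 'o', 'u'])
labels = {vowels: 'vowels'}               # immutable sets are hashable

# An equal set built in a different order finds the same entry.
print(labels[ImmutableSet(['u', 'o', 'i', 'e', 'a'])])

try:
    labels[Set(['x', 'y'])] = 'mutable'   # mutable sets are not hashable
except TypeError:
    print('a mutable Set cannot be used as a dictionary key')
```

The lookup succeeds because two sets with the same members compare
equal and hash alike, whatever order they were built in.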


\begin{seealso}

\seepep{218}{Adding a Built-In Set Object Type}{PEP written by Greg V. Wilson.
Implemented by Greg V. Wilson, Alex Martelli, and GvR.}

\end{seealso}



%======================================================================
\section{PEP 255: Simple Generators\label{section-generators}}

In Python 2.2, generators were added as an optional feature, to be
enabled by a \code{from __future__ import generators} directive. In
2.3 generators no longer need to be specially enabled, and are now
always present; this means that \keyword{yield} is now always a
keyword. The rest of this section is a copy of the description of
generators from the ``What's New in Python 2.2'' document; if you read
it back when Python 2.2 came out, you can skip the rest of this section.

You're doubtless familiar with how function calls work in Python or C.
When you call a function, it gets a private namespace where its local
variables are created. When the function reaches a \keyword{return}
statement, the local variables are destroyed and the resulting value
is returned to the caller. A later call to the same function will get
a fresh new set of local variables. But, what if the local variables
weren't thrown away on exiting a function? What if you could later
resume the function where it left off? This is what generators
provide; they can be thought of as resumable functions.

Here's the simplest example of a generator function:

\begin{verbatim}
def generate_ints(N):
    for i in range(N):
        yield i
\end{verbatim}

A new keyword, \keyword{yield}, was introduced for generators. Any
function containing a \keyword{yield} statement is a generator
function; this is detected by Python's bytecode compiler which
compiles the function specially as a result.

When you call a generator function, it doesn't return a single value;
instead it returns a generator object that supports the iterator
protocol. On executing the \keyword{yield} statement, the generator
outputs the value of \code{i}, similar to a \keyword{return}
statement. The big difference between \keyword{yield} and a
\keyword{return} statement is that on reaching a \keyword{yield} the
generator's state of execution is suspended and local variables are
preserved. On the next call to the generator's \code{.next()} method,
the function will resume executing immediately after the
\keyword{yield} statement. (For complicated reasons, the
\keyword{yield} statement isn't allowed inside the \keyword{try} block
of a \keyword{try}...\keyword{finally} statement; read \pep{255} for a full
explanation of the interaction between \keyword{yield} and
exceptions.)

Here's a sample usage of the \function{generate_ints()} generator:

\begin{verbatim}
>>> gen = generate_ints(3)
>>> gen
<generator object at 0x8117f90>
>>> gen.next()
0
>>> gen.next()
1
>>> gen.next()
2
>>> gen.next()
Traceback (most recent call last):
  File "stdin", line 1, in ?
  File "stdin", line 2, in generate_ints
StopIteration
\end{verbatim}

You could equally write \code{for i in generate_ints(5)}, or
\code{a,b,c = generate_ints(3)}.
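Both of those forms can be tried with a short script. The sketch
below repeats the \function{generate_ints()} definition so that it
stands alone; the variable \code{total} is just an illustrative name.

```python
def generate_ints(N):
    for i in range(N):
        yield i

# A for loop calls the generator repeatedly until it is exhausted.
total = 0
for i in generate_ints(5):
    total += i
print(total)            # 0 + 1 + 2 + 3 + 4 = 10

# Unpacking also consumes the generator; it must yield exactly 3 values.
a, b, c = generate_ints(3)
print((a, b, c))        # (0, 1, 2)
```

In both cases the \exception{StopIteration} exception is handled
behind the scenes, so the calling code never sees it.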

Inside a generator function, the \keyword{return} statement can only
be used without a value, and signals the end of the procession of
values; afterwards the generator cannot return any further values.
\keyword{return} with a value, such as \code{return 5}, is a syntax
error inside a generator function. The end of the generator's results
can also be indicated by raising \exception{StopIteration} manually,
or by just letting the flow of execution fall off the bottom of the
function.
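As a small sketch of a bare \keyword{return} ending a generator
early, consider the following. The function
\function{take_until_negative()} is a hypothetical helper invented
for this illustration.

```python
def take_until_negative(values):
    # Hypothetical helper: yield items until a negative one is seen.
    for v in values:
        if v < 0:
            return      # bare return: no further values are produced
        yield v

print(list(take_until_negative([3, 1, 4, -1, 5, 9])))   # [3, 1, 4]
```

The values after \code{-1} are never produced, because the generator
has already finished by the time they would be reached.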

You could achieve the effect of generators manually by writing your
own class and storing all the local variables of the generator as
instance variables. For example, returning a list of integers could
be done by setting \code{self.count} to 0, and having the
\method{next()} method increment \code{self.count} and return it.
However, for a moderately complicated generator, writing a
corresponding class would be much messier.
\file{Lib/test/test_generators.py} contains a number of more
interesting examples. The simplest one implements an in-order
traversal of a tree using generators recursively.

\begin{verbatim}
# A recursive generator that generates Tree leaves in in-order.
def inorder(t):
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x
\end{verbatim}
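To try the traversal above, some kind of tree node is needed. The
\class{Tree} class in the sketch below is a hypothetical stand-in
written for this illustration (it is not the class used in the test
suite); it assumes only the \member{left}, \member{label} and
\member{right} attributes that \function{inorder()} relies on.

```python
class Tree:
    # Hypothetical minimal node type; a missing child is None.
    def __init__(self, label, left=None, right=None):
        self.label = label
        self.left = left
        self.right = right

def inorder(t):
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x

# A small binary search tree with 4 at the root.
tree = Tree(4,
            Tree(2, Tree(1), Tree(3)),
            Tree(6, Tree(5), Tree(7)))
print(list(inorder(tree)))      # [1, 2, 3, 4, 5, 6, 7]
```

Each recursive call produces its own generator, and the outer
\keyword{for} loops simply pass the values from the subtrees along.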

Two other examples in \file{Lib/test/test_generators.py} produce
solutions for the N-Queens problem (placing $N$ queens on an $N\times N$
chess board so that no queen threatens another) and the Knight's Tour
(a route that takes a knight to every square of an $N\times N$ chessboard
without visiting any square twice).

The idea of generators comes from other programming languages,
especially Icon (\url{http://www.cs.arizona.edu/icon/}), where the
idea of generators is central. In Icon, every
expression and function call behaves like a generator. One example
from ``An Overview of the Icon Programming Language'' at
\url{http://www.cs.arizona.edu/icon/docs/ipd266.htm} gives an idea of
what this looks like:

\begin{verbatim}
sentence := "Store it in the neighboring harbor"
if (i := find("or", sentence)) > 5 then write(i)
\end{verbatim}

In Icon the \function{find()} function returns the indexes at which the
substring ``or'' is found: 3, 23, 33. In the \keyword{if} statement,
\code{i} is first assigned a value of 3, but 3 is less than 5, so the
comparison fails, and Icon retries it with the second value of 23. 23
is greater than 5, so the comparison now succeeds, and the code prints
the value 23 to the screen.

Python doesn't go nearly as far as Icon in adopting generators as a
central concept. Generators are considered part of the core
Python language, but learning or using them isn't compulsory; if they
don't solve any problems that you have, feel free to ignore them.
One novel feature of Python's interface as compared to
Icon's is that a generator's state is represented as a concrete object
(the iterator) that can be passed around to other functions or stored
in a data structure.

\begin{seealso}

\seepep{255}{Simple Generators}{Written by Neil Schemenauer, Tim
Peters, Magnus Lie Hetland. Implemented mostly by Neil Schemenauer
and Tim Peters, with other fixes from the Python Labs crew.}

\end{seealso}


%======================================================================
\section{PEP 263: Source Code Encodings \label{section-encodings}}

Python source files can now be declared as being in different
character set encodings. Encodings are declared by including a
specially formatted comment in the first or second line of the source
file. For example, a UTF-8 file can be declared with:

\begin{verbatim}
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
\end{verbatim}

Without such an encoding declaration, the default encoding used is
7-bit ASCII. Executing or importing modules that contain string
literals with 8-bit characters and have no encoding declaration will result
in a \exception{DeprecationWarning} being signalled by Python 2.3; in
2.4 this will be a syntax error.

The encoding declaration only affects Unicode string literals, which
will be converted to Unicode using the specified encoding. Note that
Python identifiers are still restricted to ASCII characters, so you
can't have variable names that use characters outside of the usual
alphanumerics.

\begin{seealso}

\seepep{263}{Defining Python Source Code Encodings}{Written by
Marc-Andr\'e Lemburg and Martin von~L\"owis; implemented by Suzuki
Hisao and Martin von~L\"owis.}

\end{seealso}


%======================================================================
\section{PEP 277: Unicode file name support for Windows NT}

On Windows NT, 2000, and XP, the system stores file names as Unicode
strings. Traditionally, Python has represented file names as byte
strings, which is inadequate because it renders some file names
inaccessible.

Python now allows using arbitrary Unicode strings (within the
limitations of the file system) for all functions that expect file
names, most notably the \function{open()} built-in function. If a Unicode
string is passed to \function{os.listdir()}, Python now returns a list
of Unicode strings. A new function, \function{os.getcwdu()}, returns
the current directory as a Unicode string.

Byte strings still work as file names, and on Windows Python will
transparently convert them to Unicode using the \code{mbcs} encoding.

Other systems also allow Unicode strings as file names but convert
them to byte strings before passing them to the system, which can
cause a \exception{UnicodeError} to be raised. Applications can test
whether arbitrary Unicode strings are supported as file names by
checking \member{os.path.supports_unicode_filenames}, a Boolean value.
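A program might use that flag along the following lines. This is
only a sketch of one possible approach: listing the current directory
and the variable name \code{names} are choices made for the
illustration.

```python
import os
import os.path

# supports_unicode_filenames is the Boolean value described above.
if os.path.supports_unicode_filenames:
    names = os.listdir(u'.')    # a Unicode argument requests Unicode names
else:
    names = os.listdir('.')     # otherwise stay with a plain string path

print(len(names))
```

Checking the flag first avoids handing the system a Unicode file name
that it may not be able to represent.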

Under MacOS, \function{os.listdir()} may now return Unicode filenames.

\begin{seealso}

\seepep{277}{Unicode file name support for Windows NT}{Written by Neil
Hodgson; implemented by Neil Hodgson, Martin von~L\"owis, and Mark
Hammond.}

\end{seealso}


%======================================================================
\section{PEP 278: Universal Newline Support}

The three major operating systems used today are Microsoft Windows,
Apple's Macintosh OS, and the various \UNIX\ derivatives. A minor
irritation of cross-platform work
is that these three platforms all use different characters
to mark the ends of lines in text files. \UNIX\ uses the linefeed
(ASCII character 10), MacOS uses the carriage return (ASCII
character 13), and Windows uses a two-character sequence of a
carriage return plus a newline.

Python's file objects can now support end of line conventions other
than the one followed by the platform on which Python is running.
Opening a file with the mode \code{'U'} or \code{'rU'} will open a file
for reading in universal newline mode. All three line ending
conventions will be translated to a \character{\e n} in the strings
returned by the various file methods such as \method{read()} and
\method{readline()}.

Universal newline support is also used when importing modules and when
executing a file with the \function{execfile()} function. This means
that Python modules can be shared between all three operating systems
without needing to convert the line-endings.

This feature can be disabled when compiling Python by specifying
the \longprogramopt{without-universal-newlines} switch when running Python's
\program{configure} script.

\begin{seealso}

\seepep{278}{Universal Newline Support}{Written
and implemented by Jack Jansen.}

\end{seealso}


%======================================================================
\section{PEP 279: enumerate()\label{section-enumerate}}

A new built-in function, \function{enumerate()}, will make
certain loops a bit clearer. \code{enumerate(thing)}, where
\var{thing} is either an iterator or a sequence, returns an iterator
that will return \code{(0, \var{thing}[0])}, \code{(1,
\var{thing}[1])}, \code{(2, \var{thing}[2])}, and so forth.

A common idiom to change every element of a list looks like this:

\begin{verbatim}
for i in range(len(L)):
    item = L[i]
    # ... compute some result based on item ...
    L[i] = result
\end{verbatim}

This can be rewritten using \function{enumerate()} as:

\begin{verbatim}
for i, item in enumerate(L):
    # ... compute some result based on item ...
    L[i] = result
\end{verbatim}
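For a concrete version of this loop, the sketch below squares every
element of a list in place. The list contents and the squaring step
are arbitrary choices made for the illustration.

```python
# enumerate() pairs each item with its index.
pairs = list(enumerate(['a', 'b', 'c']))
print(pairs)            # [(0, 'a'), (1, 'b'), (2, 'c')]

L = [1, 2, 3, 4]
for i, item in enumerate(L):
    result = item * item    # compute some result based on item
    L[i] = result
print(L)                # [1, 4, 9, 16]
```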


\begin{seealso}

\seepep{279}{The enumerate() built-in function}{Written
and implemented by Raymond D. Hettinger.}

\end{seealso}


%======================================================================
\section{PEP 282: The logging Package}

A standard package for writing logs, \module{logging}, has been added
to Python 2.3. It provides a powerful and flexible mechanism for
components to generate logging output which can then be filtered and
processed in various ways. A standard configuration file format can
be used to control the logging behavior of a program. Python's
standard library includes handlers that will write log records to
standard error or to a file or socket, send them to the system log, or
even e-mail them to a particular address, and of course it's also
possible to write your own handler classes.

The \class{Logger} class is the primary class.
Most application code will deal with one or more \class{Logger}
objects, each one used by a particular subsystem of the application.
Each \class{Logger} is identified by a name, and names are organized
into a hierarchy using \samp{.} as the component separator. For
example, you might have \class{Logger} instances named \samp{server},
\samp{server.auth} and \samp{server.network}. The latter two
instances are below \samp{server} in the hierarchy. This means that
if you turn up the verbosity for \samp{server} or direct \samp{server}
messages to a different handler, the changes will also apply to
records logged to \samp{server.auth} and \samp{server.network}.
There's also a root \class{Logger} that's the parent of all other
loggers.

For simple uses, the \module{logging} package contains some
convenience functions that always use the root log:

\begin{verbatim}
import logging

logging.debug('Debugging information')
logging.info('Informational message')
logging.warning('Warning:config file %s not found', 'server.conf')
logging.error('Error occurred')
logging.critical('Critical error -- shutting down')
\end{verbatim}

This produces the following output:

\begin{verbatim}
WARNING:root:Warning:config file server.conf not found
ERROR:root:Error occurred
CRITICAL:root:Critical error -- shutting down
\end{verbatim}

In the default configuration, informational and debugging messages are
suppressed and the output is sent to standard error. You can enable
the display of information and debugging messages by calling the
\method{setLevel()} method on the root logger.
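A minimal sketch of that \method{setLevel()} call follows. It
assumes that calling \function{getLogger()} with no name returns the
root logger and that \constant{logging.DEBUG} is the lowest of the
standard levels; the variable name \code{root} is illustrative.

```python
import logging

root = logging.getLogger()      # no name given: the root logger
root.setLevel(logging.DEBUG)    # let every standard level through

# With the level lowered, these two messages are no longer suppressed.
logging.debug('Debugging information')
logging.info('Informational message')
```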

Notice the \function{warning()} call's use of string formatting
operators; all of the functions for logging messages take the
arguments \code{(\var{msg}, \var{arg1}, \var{arg2}, ...)} and log the
string resulting from \code{\var{msg} \% (\var{arg1}, \var{arg2},
...)}.

There's also an \function{exception()} function that records the most
recent traceback. Any of the other functions will also record the
traceback if you specify a true value for the keyword argument
\var{exc_info}.

\begin{verbatim}
def f():
    try:    1/0
    except: logging.exception('Problem recorded')

f()
\end{verbatim}

This produces the following output:

\begin{verbatim}
ERROR:root:Problem recorded
Traceback (most recent call last):
  File "t.py", line 6, in f
    1/0
ZeroDivisionError: integer division or modulo by zero
\end{verbatim}

Slightly more advanced programs will use a logger other than the root
logger. The \function{getLogger(\var{name})} function is used to get
a particular log, creating it if it doesn't exist yet.
\function{getLogger(None)} returns the root logger.


\begin{verbatim}
log = logging.getLogger('server')
 ...
log.info('Listening on port %i', port)
 ...
log.critical('Disk full')
 ...
\end{verbatim}

Log records are usually propagated up the hierarchy, so a message
logged to \samp{server.auth} is also seen by \samp{server} and
\samp{root}, but a \class{Logger} can prevent this by setting its
\member{propagate} attribute to \constant{False}.
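Propagation can be observed with a small experiment. In the sketch
below, \class{ListHandler} is a hypothetical handler written for this
illustration; it merely collects formatted messages in a list so that
they can be inspected afterwards.

```python
import logging

class ListHandler(logging.Handler):
    # Hypothetical handler that stores each message in a list.
    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())

server = logging.getLogger('server')
auth = logging.getLogger('server.auth')

collected = ListHandler()
server.addHandler(collected)

# Propagated upward, so the handler attached to 'server' sees it.
auth.warning('login failed for %s', 'guest')

# With propagation switched off, 'server' no longer sees the record.
auth.propagate = False
auth.warning('this one stays local')

print(collected.messages)       # ['login failed for guest']
```

Because \samp{server.auth} has no handler of its own here, the second
message never reaches \class{ListHandler}; how the package reports a
record that finds no handler at all is outside the scope of this
sketch.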

There are more classes provided by the \module{logging} package that
can be customized. When a \class{Logger} instance is told to log a
message, it creates a \class{LogRecord} instance that is sent to any
number of different \class{Handler} instances. Loggers and handlers
can also have an attached list of filters, and each filter can cause
the \class{LogRecord} to be ignored or can modify the record before
passing it along. When they're finally output, \class{LogRecord}
instances are converted to text by a \class{Formatter} class. All of
these classes can be replaced by your own specially-written classes.

With all of these features the \module{logging} package should provide
enough flexibility for even the most complicated applications. This
is only an incomplete overview of its features, so please see the
\ulink{package's reference documentation}{../lib/module-logging.html}
for all of the details. Reading \pep{282} will also be helpful.


\begin{seealso}

\seepep{282}{A Logging System}{Written by Vinay Sajip and Trent Mick;
implemented by Vinay Sajip.}

\end{seealso}


%======================================================================
Andrew M. Kuchling433307b2003-05-13 14:23:54 +0000550\section{PEP 285: A Boolean Type\label{section-bool}}
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000551
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000552A Boolean type was added to Python 2.3. Two new constants were added
553to the \module{__builtin__} module, \constant{True} and
Andrew M. Kuchling5a224532003-01-03 16:52:27 +0000554\constant{False}. (\constant{True} and
555\constant{False} constants were added to the built-ins
Michael W. Hudson065f5fa2003-02-10 19:24:50 +0000556in Python 2.2.1, but the 2.2.1 versions simply have integer values of
Andrew M. Kuchling5a224532003-01-03 16:52:27 +00005571 and 0 and aren't a different type.)
558
559The type object for this new type is named
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000560\class{bool}; the constructor for it takes any Python value and
561converts it to \constant{True} or \constant{False}.
562
563\begin{verbatim}
564>>> bool(1)
565True
566>>> bool(0)
567False
568>>> bool([])
569False
570>>> bool( (1,) )
571True
572\end{verbatim}
573
574Most of the standard library modules and built-in functions have been
575changed to return Booleans.
576
577\begin{verbatim}
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000578>>> obj = []
579>>> hasattr(obj, 'append')
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000580True
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000581>>> isinstance(obj, list)
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000582True
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000583>>> isinstance(obj, tuple)
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000584False
585\end{verbatim}
586
Python's Booleans were added with the primary goal of making code
clearer. For example, if you're reading a function and encounter the
statement \code{return 1}, you might wonder whether the \code{1}
represents a Boolean truth value, an index, or a
coefficient that multiplies some other quantity. If the statement is
\code{return True}, however, the meaning of the return value is quite
clear.

Python's Booleans were \emph{not} added for the sake of strict
type-checking. A very strict language such as Pascal would also
prevent you from performing arithmetic with Booleans, and would require
that the expression in an \keyword{if} statement always evaluate to a
Boolean. Python is not this strict and never will be, as
\pep{285} explicitly says. This means you can still use any
expression in an \keyword{if} statement, even ones that evaluate to a
list or tuple or some random object. The Boolean type is also a
subclass of the \class{int} class, so arithmetic using a Boolean
still works.

\begin{verbatim}
>>> True + 1
2
>>> False + 1
1
>>> False * 75
0
>>> True * 75
75
\end{verbatim}

To sum up \constant{True} and \constant{False} in a sentence: they're
alternative ways to spell the integer values 1 and 0, with the single
difference that \function{str()} and \function{repr()} return the
strings \code{'True'} and \code{'False'} instead of \code{'1'} and
\code{'0'}.
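
The summary above can be checked directly. The following is only an
illustrative sketch; the variable names are invented here, and it relies
on nothing beyond the behaviour described in this section.

```python
# True and False are instances of bool, which subclasses int.
assert isinstance(True, int)
assert issubclass(bool, int)

# They compare equal to the integers 1 and 0 ...
assert True == 1 and False == 0

# ... so a list of Booleans can be added up like a list of integers.
flags = [True, False, True, True]
true_count = sum(flags)

# The only visible difference is the string form.
as_text = (str(True), repr(False), str(1), repr(0))
print(true_count, as_text)
```

Counting the true values with \function{sum()} works only because
\class{bool} is a subclass of \class{int}.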

\begin{seealso}

\seepep{285}{Adding a bool type}{Written and implemented by GvR.}

\end{seealso}


%======================================================================
\section{PEP 293: Codec Error Handling Callbacks}

When encoding a Unicode string into a byte string, unencodable
characters may be encountered. So far, Python has allowed specifying
the error processing as either ``strict'' (raising
\exception{UnicodeError}), ``ignore'' (skipping the character), or
``replace'' (using a question mark in the output string), with
``strict'' being the default behavior. It may be desirable to specify
alternative processing of such errors, such as inserting an XML
character reference or HTML entity reference into the converted
string.

Python now has a flexible framework to add different processing
strategies. New error handlers can be added with
\function{codecs.register_error}. Codecs then can access the error
handler with \function{codecs.lookup_error}. An equivalent C API has
been added for codecs written in C. The error handler gets the
necessary state information such as the string being converted, the
position in the string where the error was detected, and the target
encoding. The handler can then either raise an exception or return a
replacement string.

Two additional error handlers have been implemented using this
framework: ``backslashreplace'' uses Python backslash quoting to
represent unencodable characters and ``xmlcharrefreplace'' emits
XML character references.
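
As a rough sketch of how these pieces fit together, the following
registers a custom handler and encodes the same string with it and with
the two new built-in handlers. The handler name \code{hexmarker} and the
function \function{hex_marker()} are invented for this example and are
not part of \pep{293}; the handler deals only with encoding errors.

```python
import codecs

def hex_marker(exc):
    """Replace each unencodable character with a <U+XXXX> marker."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc  # this sketch only handles encoding errors
    bad = exc.object[exc.start:exc.end]
    replacement = u"".join([u"<U+%04X>" % ord(ch) for ch in bad])
    # Return the replacement text and the position at which to resume.
    return (replacement, exc.end)

# "hexmarker" is a made-up handler name chosen for this example.
codecs.register_error("hexmarker", hex_marker)

text = u"caf\xe9 \u20ac5"
custom = text.encode("ascii", "hexmarker")
escaped = text.encode("ascii", "backslashreplace")
xml = text.encode("ascii", "xmlcharrefreplace")

# A codec would retrieve the handler by name in the same way.
found = codecs.lookup_error("hexmarker")
```

The handler receives the exception object, which carries the state
described above, and returns a tuple of the replacement string and the
position where encoding should continue.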

\begin{seealso}

\seepep{293}{Codec Error Handling Callbacks}{Written and implemented by
Walter D\"orwald.}

\end{seealso}


%======================================================================
\section{PEP 273: Importing Modules from Zip Archives}

The new \module{zipimport} module adds support for importing
modules from a ZIP-format archive. You don't need to import the
module explicitly; it will be automatically imported if a ZIP
archive's filename is added to \code{sys.path}. For example:

\begin{verbatim}
amk@nyman:~/src/python$ unzip -l /tmp/example.zip
Archive:  /tmp/example.zip
  Length     Date   Time    Name
 --------    ----   ----    ----
     8467  11-26-02 22:30   jwzthreading.py
 --------                   -------
     8467                   1 file
amk@nyman:~/src/python$ ./python
Python 2.3a0 (#1, Dec 30 2002, 19:54:32)
>>> import sys
>>> sys.path.insert(0, '/tmp/example.zip')  # Add .zip file to front of path
>>> import jwzthreading
>>> jwzthreading.__file__
'/tmp/example.zip/jwzthreading.py'
>>>
\end{verbatim}

An entry in \code{sys.path} can now be the filename of a ZIP archive.
The ZIP archive can contain any kind of files, but only files named
\file{*.py}, \file{*.pyc}, or \file{*.pyo} can be imported. If an
archive only contains \file{*.py} files, Python will not attempt to
modify the archive by adding the corresponding \file{*.pyc} file, so
the source has to be compiled again each time it is imported; importing
from an archive without \file{*.pyc} files may therefore be
rather slow.

A path within the archive can also be specified to only import from a
subdirectory; for example, the path \file{/tmp/example.zip/lib/}
would only import from the \file{lib/} subdirectory within the
archive.
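
The following sketch builds a small archive and imports from a
subdirectory inside it. The archive, directory, and module names
(\file{example.zip}, \file{lib/}, \module{greeting}) are invented for
this example, and the archive is written to a temporary directory.

```python
import os
import sys
import tempfile
import zipfile

# Build a throwaway archive containing one module under lib/.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "example.zip")
zf = zipfile.ZipFile(archive, "w")
zf.writestr("lib/greeting.py",
            "def hello():\n    return 'hello from the archive'\n")
zf.close()

# Adding the archive's lib/ subdirectory to sys.path is all that is
# needed; zipimport is never imported explicitly.
sys.path.insert(0, os.path.join(archive, "lib"))
import greeting

message = greeting.hello()
origin = greeting.__file__
```

The module's \member{__file__} attribute points inside the archive, just
as in the interactive session shown earlier.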

\begin{seealso}

\seepep{273}{Import Modules from Zip Archives}{Written by James C. Ahlstrom,
who also provided an implementation.
Python 2.3 follows the specification in \pep{273},
but uses an implementation written by Just van~Rossum
that uses the import hooks described in \pep{302}.
See section~\ref{section-pep302} for a description of the new import hooks.
}

\end{seealso}

%======================================================================
\section{PEP 301: Package Index and Metadata for
Distutils\label{section-pep301}}

Support for the long-requested Python catalog makes its first
appearance in 2.3.

The core component is the new Distutils \command{register} command.
Running \code{python setup.py register} will collect the metadata
describing a package, such as its name, version, maintainer,
description, \&c., and send it to a central catalog server. The
catalog is available from \url{http://www.python.org/pypi}.

To make the catalog a bit more useful, a new optional
\var{classifiers} keyword argument has been added to the Distutils
\function{setup()} function. A list of
\ulink{Trove}{http://catb.org/\textasciitilde esr/trove/}-style
strings can be supplied to help classify the software.

Here's an example \file{setup.py} with classifiers, written to be compatible
with older versions of the Distutils:

\begin{verbatim}
from distutils import core
kw = {'name': "Quixote",
      'version': "0.5.1",
      'description': "A highly Pythonic Web application framework",
      # ...
      }

# Only pass 'classifiers' if this version of the Distutils supports it.
if (hasattr(core, 'setup_keywords') and
    'classifiers' in core.setup_keywords):
    kw['classifiers'] = \
        ['Topic :: Internet :: WWW/HTTP :: Dynamic Content',
         'Environment :: No Input/Output (Daemon)',
         'Intended Audience :: Developers']

core.setup(**kw)
\end{verbatim}

The full list of classifiers can be obtained by running
\code{python setup.py register --list-classifiers}.

\begin{seealso}

\seepep{301}{Package Index and Metadata for Distutils}{Written and
implemented by Richard Jones.}

\end{seealso}


%======================================================================
\section{PEP 302: New Import Hooks \label{section-pep302}}

While it's been possible to write custom import hooks ever since the
\module{ihooks} module was introduced in Python 1.3, no one has ever
been really happy with it because writing new import hooks is
difficult and messy. There have been various proposed alternatives
such as the \module{imputil} and \module{iu} modules, but none of them
has ever gained much acceptance, and none of them was easily usable
from \C{} code.

\pep{302} borrows ideas from its predecessors, especially from
Gordon McMillan's \module{iu} module. Three new items
are added to the \module{sys} module:

\begin{itemize}
  \item \code{sys.path_hooks} is a list of callable objects; most
  often they'll be classes. Each callable takes a string containing a
  path and either returns an importer object that will handle imports
  from this path or raises an \exception{ImportError} exception if it
  can't handle this path.

  \item \code{sys.path_importer_cache} caches importer objects for
  each path, so \code{sys.path_hooks} will only need to be traversed
  once for each path.

  \item \code{sys.meta_path} is a list of importer objects that will
  be traversed before \code{sys.path} is checked. This list is
  initially empty, but user code can add objects to it. Additional
  built-in and frozen modules can be imported by an object added to
  this list.

\end{itemize}

Importer objects must have a single method,
\method{find_module(\var{fullname}, \var{path}=None)}. \var{fullname}
will be a module or package name, e.g. \samp{string} or
\samp{distutils.core}. \method{find_module()} must return a loader object
that has a single method, \method{load_module(\var{fullname})}, that
creates and returns the corresponding module object.

Pseudo-code for Python's new import logic, therefore, looks something
like this (simplified a bit; see \pep{302} for the full details):

\begin{verbatim}
for mp in sys.meta_path:
    loader = mp.find_module(fullname)
    if loader is not None:
        <module> = loader.load_module(fullname)

for path in sys.path:
    for hook in sys.path_hooks:
        try:
            importer = hook(path)
        except ImportError:
            # ImportError, so try the other path hooks
            pass
        else:
            loader = importer.find_module(fullname)
            if loader is not None:
                <module> = loader.load_module(fullname)

# Not found!
raise ImportError
\end{verbatim}
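
The lookup order in the pseudo-code can be tried out with a toy model
that never touches the real import machinery. Everything named here
(\class{DictImporter}, \function{toy_import()}, \function{memory_hook()}
and the \samp{mem:} path prefix) is hypothetical and exists only for
this sketch; it assumes that \method{find_module()} returns \code{None}
when an importer cannot find the module.

```python
import types

class DictImporter:
    """Importer and loader in one object, backed by a dict of sources."""
    def __init__(self, sources):
        self.sources = sources

    def find_module(self, fullname, path=None):
        if fullname in self.sources:
            return self              # act as our own loader
        return None

    def load_module(self, fullname):
        module = types.ModuleType(fullname)
        exec(self.sources[fullname], module.__dict__)
        return module

def toy_import(fullname, meta_path, path, path_hooks):
    # 1. Objects on the meta path are consulted first.
    for mp in meta_path:
        loader = mp.find_module(fullname)
        if loader is not None:
            return loader.load_module(fullname)
    # 2. Then each path entry is offered to each path hook.
    for entry in path:
        for hook in path_hooks:
            try:
                importer = hook(entry)
            except ImportError:
                continue             # this hook can't handle the entry
            loader = importer.find_module(fullname)
            if loader is not None:
                return loader.load_module(fullname)
    # Not found!
    raise ImportError(fullname)

def memory_hook(entry):
    # Only claims path entries that use the made-up "mem:" prefix.
    if not entry.startswith("mem:"):
        raise ImportError(entry)
    return DictImporter({"spam": "value = 'from path hook'"})

meta = [DictImporter({"eggs": "value = 'from meta path'"})]
search_path = ["/usr/lib", "mem:demo"]
eggs = toy_import("eggs", meta, search_path, [memory_hook])
spam = toy_import("spam", meta, search_path, [memory_hook])
```

Here \module{eggs} is found by the meta path object before the path is
examined at all, while \module{spam} is found only once the hook accepts
the second path entry.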

\begin{seealso}

\seepep{302}{New Import Hooks}{Written by Just van~Rossum and Paul Moore.
Implemented by Just van~Rossum.
}

\end{seealso}


%======================================================================
\section{PEP 305: Comma-separated Files \label{section-pep305}}

Comma-separated files are a format frequently used for exporting data
from databases and spreadsheets. Python 2.3 adds a parser for
comma-separated files.
The format is deceptively simple at first glance:

\begin{verbatim}
Costs,150,200,3.95
\end{verbatim}

Read a line and call \code{line.split(',')}: what could be simpler?
But toss in string data that can contain commas, and things get more
complicated:

\begin{verbatim}
"Costs",150,200,3.95,"Includes taxes, shipping, and sundry items"
\end{verbatim}

A big ugly regular expression can parse this, but using the new
\module{csv} package is much simpler:

\begin{verbatim}
import csv

input = open('datafile', 'rb')
reader = csv.reader(input)
for line in reader:
    print line
\end{verbatim}

The \function{reader} function takes a number of different options.
The field separator isn't limited to the comma and can be changed to
any character, and so can the quoting and line-ending characters.

Different dialects of comma-separated files can be defined and
registered; currently there are two, both for Microsoft Excel.
A separate \class{csv.writer} class will generate comma-separated files
from a succession of tuples or lists, quoting strings that contain the
delimiter.
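
To illustrate the options, here is a sketch that reads and writes
semicolon-delimited data. It uses in-memory text streams from the
\module{io} module instead of a real file, which is an assumption of
this example (it is written for a current interpreter, where
\module{csv} works on text streams); the sample data is invented.

```python
import csv
import io

# Semicolon-delimited input; one field contains the delimiter itself.
data = io.StringIO(u'"Costs";150;200;3.95;"Includes taxes; shipping"\r\n')
rows = list(csv.reader(data, delimiter=';', quotechar='"'))

# Writing: a field containing the delimiter is quoted automatically.
out = io.StringIO()
writer = csv.writer(out, delimiter=';', lineterminator='\n')
writer.writerow(['Costs', 150, 'Includes taxes; shipping'])
written = out.getvalue()
```

Note that the reader returns every field as a string; converting
\code{'150'} to a number is left to the caller.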

\begin{seealso}

\seepep{305}{CSV File API}{Written and implemented
by Kevin Altis, Dave Cole, Andrew McNamara, Skip Montanaro, and Cliff Wells.
}

\end{seealso}

%======================================================================
\section{PEP 307: Pickle Enhancements \label{section-pep307}}

The \module{pickle} and \module{cPickle} modules received some
attention during the 2.3 development cycle. In 2.2, new-style classes
could be pickled without difficulty, but they weren't pickled very
compactly; \pep{307} quotes a trivial example where a new-style class
results in a pickled string three times longer than that for a classic
class.

The solution was to invent a new pickle protocol. The
\function{pickle.dumps()} function has supported a text-or-binary flag
for a long time. In 2.3, this flag is redefined from a Boolean to an
integer; 0 is the old text-mode pickle format, 1 is the old binary
format, and now 2 is a new 2.3-specific format. (A new constant,
\constant{pickle.HIGHEST_PROTOCOL}, can be used to select the fanciest
protocol available.)

Unpickling is no longer considered a safe operation. 2.2's
\module{pickle} provided hooks for trying to prevent unsafe classes
from being unpickled (specifically, a
\member{__safe_for_unpickling__} attribute), but none of this code
was ever audited and therefore it's all been ripped out in 2.3. You
should not unpickle untrusted data in any version of Python.

To reduce the pickling overhead for new-style classes, a new interface
for customizing pickling was added using three special methods:
\method{__getstate__}, \method{__setstate__}, and
\method{__getnewargs__}. Consult \pep{307} for the full semantics
of these methods.

As a way to compress pickles yet further, it's now possible to use
integer codes instead of long strings to identify pickled classes.
The Python Software Foundation will maintain a list of standardized
codes; there's also a range of codes for private use. Currently no
codes have been specified.
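
The protocol argument and the three special methods can be exercised
with a short sketch. The \class{Temperature} class below is invented for
this example and is not taken from \pep{307}; it is a minimal
illustration of one way the methods can cooperate, not a statement of
their full semantics.

```python
import pickle

class Temperature(float):
    """A float subclass that carries an extra 'unit' attribute."""
    def __new__(cls, degrees, unit="C"):
        self = float.__new__(cls, degrees)
        self.unit = unit
        return self

    def __getnewargs__(self):
        # Arguments handed to __new__() when the object is recreated.
        return (float(self),)

    def __getstate__(self):
        # Extra state that __new__() does not receive.
        return {"unit": self.unit}

    def __setstate__(self, state):
        self.unit = state["unit"]

original = Temperature(21.5, "F")

# Protocol 2 is the format introduced in 2.3; HIGHEST_PROTOCOL asks for
# whatever the running interpreter considers newest.
for protocol in (2, pickle.HIGHEST_PROTOCOL):
    data = pickle.dumps(original, protocol)
    restored = pickle.loads(data)
    assert restored == 21.5 and restored.unit == "F"
```

As the warning above says, \function{pickle.loads()} should only ever be
given data from a trusted source.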

\begin{seealso}

\seepep{307}{Extensions to the pickle protocol}{Written and implemented
by Guido van Rossum and Tim Peters.}

\end{seealso}

%======================================================================
\section{Extended Slices\label{section-slices}}

Ever since Python 1.4, the slicing syntax has supported an optional
third ``step'' or ``stride'' argument. For example, these are all
legal Python syntax: \code{L[1:10:2]}, \code{L[:-1:1]},
\code{L[::-1]}. This was added to Python at the request of
the developers of Numerical Python, which uses the third argument
extensively. However, Python's built-in list, tuple, and string
sequence types have never supported this feature, and you got a
\exception{TypeError} if you tried it. Michael Hudson contributed a
patch to fix this shortcoming.

For example, you can now easily extract the elements of a list that
have even indexes:

\begin{verbatim}
>>> L = range(10)
>>> L[::2]
[0, 2, 4, 6, 8]
\end{verbatim}

Negative values also work to make a copy of the same list in reverse
order:

\begin{verbatim}
>>> L[::-1]
[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
\end{verbatim}

This also works for tuples, arrays, and strings:

\begin{verbatim}
>>> s='abcd'
>>> s[::2]
'ac'
>>> s[::-1]
'dcba'
\end{verbatim}

If you have a mutable sequence such as a list or an array, you can
assign to or delete an extended slice, but there are some differences
between assignment to extended and regular slices. Assignment to a
regular slice can be used to change the length of the sequence:

\begin{verbatim}
>>> a = range(3)
>>> a
[0, 1, 2]
>>> a[1:3] = [4, 5, 6]
>>> a
[0, 4, 5, 6]
\end{verbatim}

Extended slices aren't this flexible. When assigning to an extended
slice, the list on the right-hand side of the statement must contain
the same number of items as the slice it is replacing:

\begin{verbatim}
>>> a = range(4)
>>> a
[0, 1, 2, 3]
>>> a[::2]
[0, 2]
>>> a[::2] = [0, -1]
>>> a
[0, 1, -1, 3]
>>> a[::2] = [0,1,2]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: attempt to assign sequence of size 3 to extended slice of size 2
\end{verbatim}

Deletion is more straightforward:

\begin{verbatim}
>>> a = range(4)
>>> a
[0, 1, 2, 3]
>>> a[::2]
[0, 2]
>>> del a[::2]
>>> a
[1, 3]
\end{verbatim}

One can also now pass slice objects to the
\method{__getitem__} methods of the built-in sequences:

\begin{verbatim}
>>> range(10).__getitem__(slice(0, 5, 2))
[0, 2, 4]
\end{verbatim}

Or use slice objects directly in subscripts:

\begin{verbatim}
>>> range(10)[slice(0, 5, 2)]
[0, 2, 4]
\end{verbatim}

To simplify implementing sequences that support extended slicing,
slice objects now have a method \method{indices(\var{length})} which,
given the length of a sequence, returns a \code{(\var{start},
\var{stop}, \var{step})} tuple that can be passed directly to
\function{range()}.
\method{indices()} handles omitted and out-of-bounds indices in a
manner consistent with regular slices (and this innocuous phrase hides
a welter of confusing details!). The method is intended to be used
like this:

\begin{verbatim}
class FakeSeq:
    ...
    def calc_item(self, i):
        ...
    def __getitem__(self, item):
        if isinstance(item, slice):
            indices = item.indices(len(self))
            return FakeSeq([self.calc_item(i) for i in range(*indices)])
        else:
            return self.calc_item(item)
\end{verbatim}

From this example you can also see that the built-in \class{slice}
object is now the type object for the slice type, and is no longer a
function. This is consistent with Python 2.2, where \class{int},
\class{str}, etc., underwent the same change.
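
For a version of the idea that can be run as it stands, the sketch below
fills in the elided parts with a lazily computed sequence of squares.
The \class{Squares} class is invented for this illustration, and it
returns a plain list for slices instead of a new sequence object, which
is a simplification.

```python
class Squares:
    """Read-only sequence of the first n squares, computed on demand."""
    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def calc_item(self, i):
        if i < 0:
            i += self.n
        if not 0 <= i < self.n:
            raise IndexError(i)
        return i * i

    def __getitem__(self, item):
        if isinstance(item, slice):
            # indices() clips start, stop and step to the sequence length.
            start, stop, step = item.indices(len(self))
            return [self.calc_item(i) for i in range(start, stop, step)]
        return self.calc_item(item)

sq = Squares(10)
evens = sq[::2]               # [0, 4, 16, 36, 64]
tail_reversed = sq[:-4:-1]    # [81, 64, 49]
clipped = slice(2, 100).indices(len(sq))   # (2, 10, 1)
```

The last line shows \method{indices()} clipping an out-of-bounds stop
value to the length of the sequence.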


%======================================================================
\section{Other Language Changes}

Here are all of the changes that Python 2.3 makes to the core Python
language.

\begin{itemize}
\item The \keyword{yield} statement is now always a keyword, as
described in section~\ref{section-generators} of this document.

\item A new built-in function \function{enumerate()}
was added, as described in section~\ref{section-enumerate} of this
document.

\item Two new constants, \constant{True} and \constant{False}, were
added along with the built-in \class{bool} type, as described in
section~\ref{section-bool} of this document.

\item The \function{int()} type constructor will now return a long
integer instead of raising an \exception{OverflowError} when a string
or floating-point number is too large to fit into an integer. This
can lead to the paradoxical result that
\code{isinstance(int(\var{expression}), int)} is false, but that seems
unlikely to cause problems in practice.

\item Built-in types now support the extended slicing syntax,
as described in section~\ref{section-slices} of this document.

\item A new built-in function, \function{sum(\var{iterable}, \var{start}=0)},
adds up the numeric items in the iterable object and returns their sum.
\function{sum()} only accepts numbers, meaning that you can't use it
to concatenate a bunch of strings, for example. (Contributed by Alex
Martelli.)

\item \code{list.insert(\var{pos}, \var{value})} used to
insert \var{value} at the front of the list when \var{pos} was
negative. The behaviour has now been changed to be consistent with
slice indexing, so when \var{pos} is -1 the value will be inserted
before the last element, and so forth.

\item Dictionaries have a new method, \method{pop(\var{key}\optional{,
\var{default}})}, that returns the value corresponding to \var{key}
and removes that key/value pair from the dictionary. If the requested
key isn't present in the dictionary, \var{default} is returned if it's
specified and \exception{KeyError} is raised if it isn't.

\begin{verbatim}
>>> d = {1:2}
>>> d
{1: 2}
>>> d.pop(4)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: 4
>>> d.pop(1)
2
>>> d.pop(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: 'pop(): dictionary is empty'
>>> d
{}
>>>
\end{verbatim}

There's also a new class method,
\method{dict.fromkeys(\var{iterable}, \var{value})}, that
creates a dictionary with keys taken from the supplied
\var{iterable} and all values set to \var{value}; if \var{value} is
omitted, it defaults to \code{None}.

(Patches contributed by Raymond Hettinger.)

Also, the \function{dict()} constructor now accepts keyword arguments to
simplify creating small dictionaries:

\begin{verbatim}
>>> dict(red=1, blue=2, green=3, black=4)
{'blue': 2, 'black': 4, 'green': 3, 'red': 1}
\end{verbatim}

(Contributed by Just van~Rossum.)

\item The \keyword{assert} statement no longer checks the \code{__debug__}
flag, so you can no longer disable assertions by assigning to \code{__debug__}.
Running Python with the \programopt{-O} switch will still generate
code that doesn't execute any assertions.

\item Most type objects are now callable, so you can use them
to create new objects such as functions, classes, and modules. (This
means that the \module{new} module can be deprecated in a future
Python version, because you can now use the type objects available in
the \module{types} module.)
% XXX should new.py use PendingDeprecationWarning?
For example, you can create a new module object with the following code:

\begin{verbatim}
>>> import types
>>> m = types.ModuleType('abc','docstring')
>>> m
<module 'abc' (built-in)>
>>> m.__doc__
'docstring'
\end{verbatim}

\item
A new warning, \exception{PendingDeprecationWarning}, was added to
indicate features which are in the process of being
deprecated. The warning will \emph{not} be printed by default. To
check for use of features that will be deprecated in the future,
supply \programopt{-Walways::PendingDeprecationWarning::} on the
command line or use \function{warnings.filterwarnings()}.

\item The process of deprecating string-based exceptions, as
in \code{raise "Error occurred"}, has begun. Raising a string will
now trigger \exception{PendingDeprecationWarning}.

\item Using \code{None} as a variable name will now result in a
\exception{SyntaxWarning} warning. In a future version of Python,
\code{None} may finally become a keyword.

\item The method resolution order used by new-style classes has
changed, though you'll only notice the difference if you have a really
complicated inheritance hierarchy. (Classic classes are unaffected by
this change.) Python 2.2 originally used a topological sort of a
class's ancestors, but 2.3 now uses the C3 algorithm as described in
the paper \ulink{``A Monotonic Superclass Linearization for
Dylan''}{http://www.webcom.com/haahr/dylan/linearization-oopsla96.html}.
To understand the motivation for this change,
read Michele Simionato's article
\ulink{``Python 2.3 Method Resolution Order''}
      {http://www.python.org/2.3/mro.html}, or
read the thread on python-dev starting with the message at
\url{http://mail.python.org/pipermail/python-dev/2002-October/029035.html}.
Samuele Pedroni first pointed out the problem and also implemented the
fix by coding the C3 algorithm.

\item Python runs multithreaded programs by switching between threads
after executing N bytecodes. The default value for N has been
increased from 10 to 100 bytecodes, speeding up single-threaded
applications by reducing the switching overhead. Some multithreaded
applications may suffer slower response time, but that's easily fixed
by setting the limit back to a lower number using
\function{sys.setcheckinterval(\var{N})}.
The limit can be retrieved with the new
\function{sys.getcheckinterval()} function.

Andrew M. Kuchling6974aa92002-08-20 00:54:36 +00001213\item One minor but far-reaching change is that the names of extension
1214types defined by the modules included with Python now contain the
Fred Drake5c4cf152002-11-13 14:59:06 +00001215module and a \character{.} in front of the type name. For example, in
Andrew M. Kuchling6974aa92002-08-20 00:54:36 +00001216Python 2.2, if you created a socket and printed its
1217\member{__class__}, you'd get this output:
1218
1219\begin{verbatim}
1220>>> s = socket.socket()
1221>>> s.__class__
1222<type 'socket'>
1223\end{verbatim}
1224
1225In 2.3, you get this:
1226\begin{verbatim}
1227>>> s.__class__
1228<type '_socket.socket'>
1229\end{verbatim}
1230
\item One of the noted incompatibilities between old- and new-style
  classes has been removed: you can now assign to the
  \member{__name__} and \member{__bases__} attributes of new-style
  classes. There are some restrictions on what can be assigned to
  \member{__bases__} along the lines of those relating to assigning to
  an instance's \member{__class__} attribute.

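A short sketch of what this permits follows. The classes are
hypothetical and exist only for the demonstration; both bases are
plain subclasses of \class{object}, which keeps the reassignment within
the restrictions mentioned above:

\begin{verbatim}
class Base1(object):
    def hello(self):
        return "Base1"

class Base2(object):
    def hello(self):
        return "Base2"

class Child(Base1):
    pass

c = Child()
before = c.hello()            # found on Base1

Child.__bases__ = (Base2,)    # re-parent the class
after = c.hello()             # existing instances now see Base2

Child.__name__ = "Renamed"
print(before + " " + after + " " + Child.__name__)
\end{verbatim}
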
\end{itemize}


%======================================================================
\subsection{String Changes}

\begin{itemize}

\item The \keyword{in} operator now works differently for strings.
Previously, when evaluating \code{\var{X} in \var{Y}} where \var{X}
and \var{Y} are strings, \var{X} could only be a single character.
That's now changed; \var{X} can be a string of any length, and
\code{\var{X} in \var{Y}} will return \constant{True} if \var{X} is a
substring of \var{Y}. If \var{X} is the empty string, the result is
always \constant{True}.

\begin{verbatim}
>>> 'ab' in 'abcd'
True
>>> 'ad' in 'abcd'
False
>>> '' in 'abcd'
True
\end{verbatim}

Note that this doesn't tell you where the substring starts; if you
need that information, you must use the \method{find()} method
instead.

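For instance, a membership test can be paired with \method{find()}
when the position matters. The variable names below are arbitrary;
\method{find()} returns the index of the first match, or \code{-1} if
there is none:

\begin{verbatim}
haystack = 'abcd'

found = 'cd' in haystack          # membership only
position = haystack.find('cd')    # index where the match starts
missing = haystack.find('xy')     # -1 signals "not found"

print((found, position, missing))
\end{verbatim}
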
\item The \method{strip()}, \method{lstrip()}, and \method{rstrip()}
string methods now have an optional argument for specifying the
characters to strip. The default is still to remove all whitespace
characters:

\begin{verbatim}
>>> '   abc '.strip()
'abc'
>>> '><><abc<><><>'.strip('<>')
'abc'
>>> '><><abc<><><>\n'.strip('<>')
'abc<><><>\n'
>>> u'\u4000\u4001abc\u4000'.strip(u'\u4000')
u'\u4001abc'
>>>
\end{verbatim}

(Suggested by Simon Brunning and implemented by Walter D\"orwald.)

\item The \method{startswith()} and \method{endswith()}
string methods now accept negative numbers for the start and end
parameters.

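Negative values are interpreted the same way as in slices, counting
from the end of the string. A minimal sketch, using an arbitrary
sample string:

\begin{verbatim}
s = 'abcdef'

# Start the comparison two characters before the end.
tail_matches = s.startswith('ef', -2)

# Only consider s[0:-2], that is, 'abcd'.
head_matches = s.endswith('cd', 0, -2)

print((tail_matches, head_matches))
\end{verbatim}
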
\item Another new string method is \method{zfill()}, originally a
function in the \module{string} module. \method{zfill()} pads a
numeric string with zeros on the left until it's the specified width.
Note that the \code{\%} operator is still more flexible and powerful
than \method{zfill()}.

\begin{verbatim}
>>> '45'.zfill(4)
'0045'
>>> '12345'.zfill(4)
'12345'
>>> 'goofy'.zfill(6)
'0goofy'
\end{verbatim}

(Contributed by Walter D\"orwald.)

\item A new type object, \class{basestring}, has been added.
  Both 8-bit strings and Unicode strings inherit from this type, so
  \code{isinstance(obj, basestring)} will return \constant{True} for
  either kind of string. It's a completely abstract type, so you
  can't create \class{basestring} instances.

\item Interned strings are no longer immortal, and will now be
garbage-collected in the usual way when the only reference to them is
from the internal dictionary of interned strings. (Implemented by
Oren Tirosh.)

\end{itemize}


%======================================================================
\subsection{Optimizations}

\begin{itemize}

\item The creation of new-style class instances has been made much
faster; they're now faster than classic classes!

\item The \method{sort()} method of list objects has been extensively
rewritten by Tim Peters, and the implementation is significantly
faster.

\item Multiplication of large long integers is now much faster thanks
to an implementation of Karatsuba multiplication, an algorithm that
scales better than the O(n*n) required for the grade-school
multiplication algorithm. (Original patch by Christopher A. Craig,
and significantly reworked by Tim Peters.)

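To give a feel for the idea, here is a pure-Python sketch of
Karatsuba's method. It is not the C code used by the interpreter; the
function names, the \var{cutoff_bits} parameter, and the splitting
strategy are assumptions made for this illustration. The trick is that
the two cross products are obtained from a single extra
multiplication, so each step needs three recursive multiplications
instead of four:

\begin{verbatim}
def bit_length(n):
    """Number of bits needed to represent the non-negative integer n."""
    bits = 0
    while n:
        n >>= 1
        bits += 1
    return bits

def karatsuba(x, y, cutoff_bits=32):
    """Multiply two non-negative integers with Karatsuba's method."""
    # Small operands: fall back on ordinary multiplication.
    if bit_length(x) <= cutoff_bits or bit_length(y) <= cutoff_bits:
        return x * y
    # Split both operands around half the size of the larger one.
    half = max(bit_length(x), bit_length(y)) // 2
    base = 2 ** half
    x_hi, x_lo = divmod(x, base)
    y_hi, y_lo = divmod(y, base)
    low = karatsuba(x_lo, y_lo, cutoff_bits)
    high = karatsuba(x_hi, y_hi, cutoff_bits)
    # (x_lo + x_hi) * (y_lo + y_hi) - low - high == x_lo*y_hi + x_hi*y_lo
    cross = karatsuba(x_lo + x_hi, y_lo + y_hi, cutoff_bits) - low - high
    return high * base * base + cross * base + low

a = 3 ** 500 + 12345
b = 7 ** 400 + 6789
print(karatsuba(a, b) == a * b)
\end{verbatim}
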
\item The \code{SET_LINENO} opcode is now gone. This may provide a
small speed increase, depending on your compiler's idiosyncrasies.
See section~\ref{section-other} for a longer explanation.
(Removed by Michael Hudson.)

\item \function{xrange()} objects now have their own iterator, making
\code{for i in xrange(n)} slightly faster than
\code{for i in range(n)}. (Patch by Raymond Hettinger.)

\item A number of small rearrangements have been made in various
hotspots to improve performance, inlining a function here, removing
some code there. (Implemented mostly by GvR, but lots of people have
contributed single changes.)

\end{itemize}


%======================================================================
\section{New, Improved, and Deprecated Modules}

As usual, Python's standard library received a number of enhancements and
bug fixes. Here's a partial list of the most notable changes, sorted
alphabetically by module name. Consult the
\file{Misc/NEWS} file in the source tree for a more
complete list of changes, or look through the CVS logs for all the
details.

\begin{itemize}

\item The \module{array} module now supports arrays of Unicode
characters using the \character{u} format character. Arrays also now
support using the \code{+=} assignment operator to add another array's
contents, and the \code{*=} assignment operator to repeat an array.
(Contributed by Jason Orendorff.)

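The two assignment operators can be tried out as follows. This sketch
uses an array of signed integers (type code \character{i}) with
made-up contents:

\begin{verbatim}
from array import array

a = array('i', [1, 2, 3])
a += array('i', [4, 5])   # append another array's contents in place
a *= 2                    # repeat the array in place

values = a.tolist()
print(values)
\end{verbatim}
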
\item The \module{bsddb} module has been replaced by version 4.1.1
of the \ulink{PyBSDDB}{http://pybsddb.sourceforge.net} package,
providing a more complete interface to the transactional features of
the BerkeleyDB library.
The old version of the module has been renamed to
\module{bsddb185} and is no longer built automatically; you'll
have to edit \file{Modules/Setup} to enable it. Note that the new
\module{bsddb} package is intended to be compatible with the
old module, so be sure to file bugs if you discover any
incompatibilities. When upgrading to Python 2.3, if you also change
the underlying BerkeleyDB library, you will almost certainly have to
convert your database files to the new version. You can do this
fairly easily with the new scripts \file{db2pickle.py} and
\file{pickle2db.py}, which you will find in the distribution's
\file{Tools/scripts} directory. If you've already been using the PyBSDDB
package and importing it as \module{bsddb3}, you will have to change your
\code{import} statements to import it as \module{bsddb}.

\item The new \module{bz2} module is an interface to the bzip2 data
compression library. bzip2 usually produces output that's smaller than
the compressed output from the \module{zlib} module, meaning that it
achieves a higher compression ratio. (Contributed by Gustavo Niemeyer.)

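A minimal round trip through the module's one-shot
\function{compress()} and \function{decompress()} functions might look
like this. The sample text is made up, and the sketch only prints the
two compressed sizes without claiming which will be smaller for any
particular input:

\begin{verbatim}
import bz2
import zlib

text = 'What is new in Python 2.3? ' * 200
original = text.encode('ascii')      # the functions work on byte strings

compressed = bz2.compress(original)
restored = bz2.decompress(compressed)

# Sizes of the original, the bz2 output and the zlib output.
sizes = (len(original), len(compressed), len(zlib.compress(original)))
print(sizes)
\end{verbatim}
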
\item The Distutils \class{Extension} class now supports
an extra constructor argument named \var{depends} for listing
additional source files that an extension depends on. This lets
Distutils recompile the module if any of the dependency files are
modified. For example, if \file{sampmodule.c} includes the header
file \file{sample.h}, you would create the \class{Extension} object like
this:

\begin{verbatim}
ext = Extension("samp",
                sources=["sampmodule.c"],
                depends=["sample.h"])
\end{verbatim}

Modifying \file{sample.h} would then cause the module to be recompiled.
(Contributed by Jeremy Hylton.)

\item Other minor changes to Distutils:
it now checks for the \envvar{CC}, \envvar{CFLAGS}, \envvar{CPP},
\envvar{LDFLAGS}, and \envvar{CPPFLAGS} environment variables, using
them to override the settings in Python's configuration (contributed
by Robert Weber).

\item The new \function{gc.get_referents(\var{object})} function returns a
list of all the objects referenced by \var{object}.

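As a small sketch, a dictionary that holds a list refers directly to
that list, so the list shows up among the dictionary's referents. The
object names are arbitrary, and the comparison is done by identity
rather than equality:

\begin{verbatim}
import gc

inner = ['spam']
outer = {'key': inner}

referents = gc.get_referents(outer)

# Look for the very same list object among the referents.
inner_is_referenced = False
for obj in referents:
    if obj is inner:
        inner_is_referenced = True
print(inner_is_referenced)
\end{verbatim}
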
\item The \module{getopt} module gained a new function,
\function{gnu_getopt()}, that supports the same arguments as the existing
\function{getopt()} function but uses GNU-style scanning mode.
The existing \function{getopt()} stops processing options as soon as a
non-option argument is encountered, but in GNU-style mode processing
continues, meaning that options and arguments can be mixed. For
example:

\begin{verbatim}
>>> getopt.getopt(['-f', 'filename', 'output', '-v'], 'f:v')
([('-f', 'filename')], ['output', '-v'])
>>> getopt.gnu_getopt(['-f', 'filename', 'output', '-v'], 'f:v')
([('-f', 'filename'), ('-v', '')], ['output'])
\end{verbatim}

(Contributed by Peter \AA{strand}.)

\item The \module{grp}, \module{pwd}, and \module{resource} modules
now return enhanced tuples:

\begin{verbatim}
>>> import grp
>>> g = grp.getgrnam('amk')
>>> g.gr_name, g.gr_gid
('amk', 500)
\end{verbatim}

\item The \module{gzip} module can now handle files exceeding 2~GB.

\item The new \module{heapq} module contains an implementation of a
heap queue algorithm. A heap is an array-like data structure that
keeps items in a partially sorted order such that, for every index
\var{k}, \code{heap[\var{k}] <= heap[2*\var{k}+1]} and
\code{heap[\var{k}] <= heap[2*\var{k}+2]}. This makes it quick to
remove the smallest item, and inserting a new item while maintaining
the heap property is O(log~n). (See
\url{http://www.nist.gov/dads/HTML/priorityque.html} for more
information about the priority queue data structure.)

The \module{heapq} module provides \function{heappush()} and
\function{heappop()} functions for adding and removing items while
maintaining the heap property on top of some other mutable Python
sequence type. For example:

\begin{verbatim}
>>> import heapq
>>> heap = []
>>> for item in [3, 7, 5, 11, 1]:
...    heapq.heappush(heap, item)
...
>>> heap
[1, 3, 5, 11, 7]
>>> heapq.heappop(heap)
1
>>> heapq.heappop(heap)
3
>>> heap
[5, 7, 11]
\end{verbatim}

(Contributed by Kevin O'Connor.)

\item The \module{imaplib} module now supports IMAP over SSL.
(Contributed by Piers Lauder and Tino Lange.)

\item The new \module{itertools} module contains a number of useful
functions for use with iterators, inspired by various functions
provided by the ML and Haskell languages. For example,
\code{itertools.ifilter(predicate, iterator)} returns all elements in
the iterator for which the function \function{predicate()} returns
\constant{True}, and \code{itertools.repeat(obj, \var{N})} returns
\code{obj} \var{N} times. There are a number of other functions in
the module; see the \ulink{module's reference
documentation}{../lib/module-itertools.html} for details.
(Contributed by Raymond Hettinger.)

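Here is a brief sketch of \function{repeat()} together with three of
the module's other functions, \function{count()}, \function{islice()},
and \function{chain()}. The values are arbitrary; each call returns an
iterator, so \function{list()} or a \keyword{for} loop is used to
consume it:

\begin{verbatim}
import itertools

# The same object, three times over.
greeting = list(itertools.repeat('spam', 3))

# count(1) never ends; islice() cuts it off after five values.
first_squares = []
for n in itertools.islice(itertools.count(1), 5):
    first_squares.append(n * n)

# chain() walks through several iterables one after another.
combined = list(itertools.chain('ab', [1, 2]))

print((greeting, first_squares, combined))
\end{verbatim}
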
\item Two new functions in the \module{math} module,
\function{degrees(\var{rads})} and \function{radians(\var{degs})},
convert between radians and degrees. Other functions in the
\module{math} module such as \function{math.sin()} and
\function{math.cos()} have always required input values measured in
radians. Also, an optional \var{base} argument was added to
\function{math.log()} to make it easier to compute logarithms for
bases other than \code{e} and \code{10}. (Contributed by Raymond
Hettinger.)

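A quick sketch of the three additions, with arbitrary sample values.
Because these are floating-point computations, results should be
compared with a tolerance rather than tested for exact equality:

\begin{verbatim}
import math

right_angle = math.radians(90)       # close to pi / 2
half_turn = math.degrees(math.pi)    # close to 180.0
log2_of_8 = math.log(8, 2)           # logarithm of 8 to base 2, close to 3

# sin() expects radians, so convert degrees first.
sine_of_30_degrees = math.sin(math.radians(30))

print((right_angle, half_turn, log2_of_8, sine_of_30_degrees))
\end{verbatim}
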
\item Several new functions (\function{getpgid()}, \function{killpg()},
\function{lchown()}, \function{getloadavg()}, \function{major()},
\function{makedev()}, \function{minor()}, and \function{mknod()}) were
added to the \module{posix} module that underlies the \module{os} module.
(Contributed by Gustavo Niemeyer, Geert Jansen, and Denis S. Otkidach.)

\item In the \module{os} module, the \function{*stat()} family of
functions can now report fractions of a second in a time stamp. Such
time stamps are represented as floats, similar to the value returned
by \function{time.time()}.

During testing, it was found that some applications will break if time
stamps are floats. For compatibility, when using the tuple interface
of the \class{stat_result} object, time stamps are represented as integers.
When using named fields (a feature first introduced in Python 2.2),
time stamps are still represented as integers, unless
\function{os.stat_float_times()} is invoked to enable float return
values:

\begin{verbatim}
>>> os.stat("/tmp").st_mtime
1034791200
>>> os.stat_float_times(True)
>>> os.stat("/tmp").st_mtime
1034791200.6335014
\end{verbatim}

In Python 2.4, the default will change to always returning floats.

Application developers should enable this feature only if all their
libraries work properly when confronted with floating point time
stamps, or if they use the tuple API. If used, the feature should be
activated at the application level instead of trying to enable it on a
per-use basis.

\item The old and never-documented \module{linuxaudiodev} module has
been deprecated, and a new version named \module{ossaudiodev} has been
added. The module was renamed because the OSS sound drivers can be
used on platforms other than Linux, and the interface has also been
tidied and brought up to date in various ways. (Contributed by Greg
Ward and Nicholas FitzRoy-Dale.)

\item The new \module{platform} module contains a number of functions
that try to determine various properties of the platform you're
running on. There are functions for getting the architecture, CPU
type, the Windows OS version, and even the Linux distribution version.
(Contributed by Marc-Andr\'e Lemburg.)

\item The parser objects provided by the \module{pyexpat} module
can now optionally buffer character data, resulting in fewer calls to
your character data handler and therefore faster performance. Setting
the parser object's \member{buffer_text} attribute to \constant{True}
will enable buffering.

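The effect can be observed with a sketch like the following, which
reaches the parser through \module{xml.parsers.expat}. The helper
function and the sample document are invented for this illustration.
Text containing an entity reference is typically delivered to the
handler in several pieces unless buffering is turned on:

\begin{verbatim}
import xml.parsers.expat

def collect_chunks(buffer_text):
    """Parse a tiny document and return each piece of character data."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = buffer_text
    parser.CharacterDataHandler = chunks.append
    parser.Parse('<doc>fish &amp; chips</doc>', True)
    return chunks

unbuffered = collect_chunks(False)
buffered = collect_chunks(True)

# The text is the same either way; only the number of calls differs.
print((len(unbuffered), len(buffered)))
\end{verbatim}
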
\item The \function{sample(\var{population}, \var{k})} function was
added to the \module{random} module. \var{population} is a sequence
or \class{xrange} object containing the elements of a population, and
\function{sample()}
chooses \var{k} elements from the population without replacing chosen
elements. \var{k} can be any value up to \code{len(\var{population})}.
For example:

\begin{verbatim}
>>> days = ['Mo', 'Tu', 'We', 'Th', 'Fr', 'St', 'Sn']
>>> random.sample(days, 3)      # Choose 3 elements
['St', 'Sn', 'Th']
>>> random.sample(days, 7)      # Choose 7 elements
['Tu', 'Th', 'Mo', 'We', 'St', 'Fr', 'Sn']
>>> random.sample(days, 7)      # Choose 7 again
['We', 'Mo', 'Sn', 'Fr', 'Tu', 'St', 'Th']
>>> random.sample(days, 8)      # Can't choose eight
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "random.py", line 414, in sample
      raise ValueError, "sample larger than population"
ValueError: sample larger than population
>>> random.sample(xrange(1,10000,2), 10)   # Choose ten odd nos. under 10000
[3407, 3805, 1505, 7023, 2401, 2267, 9733, 3151, 8083, 9195]
\end{verbatim}

The \module{random} module now uses a new algorithm, the Mersenne
Twister, implemented in C. It's faster and more extensively studied
than the previous algorithm.

(All changes contributed by Raymond Hettinger.)

\item The \module{readline} module also gained a number of new
functions: \function{get_history_item()},
\function{get_current_history_length()}, and \function{redisplay()}.

\item The \module{rexec} and \module{Bastion} modules have been
declared dead, and attempts to import them will fail with a
\exception{RuntimeError}. New-style classes provide new ways to break
out of the restricted execution environment provided by
\module{rexec}, and no one has interest in fixing them or time to do
so. If you have applications using \module{rexec}, rewrite them to
use something else.

(Sticking with Python 2.2 or 2.1 will not make your applications any
safer because there are known bugs in the \module{rexec} module in
those versions. To repeat: if you're using \module{rexec}, stop using
it immediately.)

\item The \module{rotor} module has been deprecated because the
  algorithm it uses for encryption is not believed to be secure. If
  you need encryption, use one of the several AES Python modules
  that are available separately.

\item The \module{shutil} module gained a \function{move(\var{src},
\var{dest})} function that recursively moves a file or directory to a new
location.

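The following sketch moves a small directory tree inside a scratch
directory created with \function{tempfile.mkdtemp()}. All of the file
and directory names are made up, and the scratch directory is removed
again at the end:

\begin{verbatim}
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'old')
dest = os.path.join(workdir, 'new')

# Build a directory containing a single file.
os.mkdir(src)
f = open(os.path.join(src, 'notes.txt'), 'w')
f.write('hello\n')
f.close()

shutil.move(src, dest)    # the directory and its contents move together

moved_ok = os.path.exists(os.path.join(dest, 'notes.txt'))
source_gone = not os.path.exists(src)
print((moved_ok, source_gone))

shutil.rmtree(workdir)    # clean up the scratch directory
\end{verbatim}
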
\item Support for more advanced POSIX signal handling was added
to the \module{signal} module but then removed again as it proved
impossible to make it work reliably across platforms.

\item The \module{socket} module now supports timeouts. You
can call the \method{settimeout(\var{t})} method on a socket object to
set a timeout of \var{t} seconds. Subsequent socket operations that
take longer than \var{t} seconds to complete will abort and raise a
\exception{socket.timeout} exception.

The original timeout implementation was by Tim O'Malley. Michael
Gilfix integrated it into the Python \module{socket} module and
shepherded it through a lengthy review. After the code was checked
in, Guido van~Rossum rewrote parts of it. (This is a good example of
a collaborative development process in action.)

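One way to see a timeout in action without contacting a remote host
is sketched below. It assumes the loopback address \code{127.0.0.1} is
usable on your machine: a UDP socket is bound to a free local port and
then waits for a datagram that nobody sends, so the receive call gives
up after the chosen interval. The 0.2-second value is arbitrary:

\begin{verbatim}
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(('127.0.0.1', 0))      # port 0: let the system pick a free port
sock.settimeout(0.2)
configured = sock.gettimeout()   # read the timeout back

try:
    try:
        sock.recvfrom(1024)      # nothing is ever sent to this socket
        timed_out = False
    except socket.timeout:
        timed_out = True
finally:
    sock.close()

print((configured, timed_out))
\end{verbatim}
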
\item On Windows, the \module{socket} module now ships with Secure
Sockets Layer (SSL) support.

\item The value of the C \constant{PYTHON_API_VERSION} macro is now
exposed at the Python level as \code{sys.api_version}. The current
exception can be cleared by calling the new \function{sys.exc_clear()}
function.

\item The new \module{tarfile} module
allows reading from and writing to \program{tar}-format archive files.
(Contributed by Lars Gust\"abel.)

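A minimal sketch of writing an archive and reading its table of
contents back follows. The file names are invented, and everything
happens inside a scratch directory that is deleted afterwards:

\begin{verbatim}
import os
import shutil
import tarfile
import tempfile

workdir = tempfile.mkdtemp()
member_path = os.path.join(workdir, 'hello.txt')
f = open(member_path, 'w')
f.write('hello, tar\n')
f.close()

# Write an uncompressed archive holding the one file.
archive_path = os.path.join(workdir, 'example.tar')
archive = tarfile.open(archive_path, 'w')
archive.add(member_path, arcname='hello.txt')
archive.close()

# Reopen the archive and list the names of its members.
archive = tarfile.open(archive_path, 'r')
names = archive.getnames()
archive.close()
print(names)

shutil.rmtree(workdir)    # clean up the scratch directory
\end{verbatim}
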
\item The new \module{textwrap} module contains functions for wrapping
strings containing paragraphs of text. The \function{wrap(\var{text},
\var{width})} function takes a string and returns a list containing
the text split into lines of no more than the chosen width. The
\function{fill(\var{text}, \var{width})} function returns a single
string, reformatted to fit into lines no longer than the chosen width.
(As you can guess, \function{fill()} is built on top of
\function{wrap()}.) For example:

\begin{verbatim}
>>> import textwrap
>>> paragraph = "Not a whit, we defy augury: ... more text ..."
>>> textwrap.wrap(paragraph, 60)
["Not a whit, we defy augury: there's a special providence in",
 "the fall of a sparrow. If it be now, 'tis not to come; if it",
 ...]
>>> print textwrap.fill(paragraph, 35)
Not a whit, we defy augury: there's
a special providence in the fall of
a sparrow. If it be now, 'tis not
to come; if it be not to come, it
will be now; if it be not now, yet
it will come: the readiness is all.
>>>
\end{verbatim}

The module also contains a \class{TextWrapper} class that actually
implements the text wrapping strategy. Both the
\class{TextWrapper} class and the \function{wrap()} and
\function{fill()} functions support a number of additional keyword
arguments for fine-tuning the formatting; consult the \ulink{module's
documentation}{../lib/module-textwrap.html} for details.
(Contributed by Greg Ward.)

\item The \module{thread} and \module{threading} modules now have
companion modules, \module{dummy_thread} and \module{dummy_threading},
that provide a do-nothing implementation of the \module{thread}
module's interface for platforms where threads are not supported. The
intention is to simplify thread-aware modules (ones that \emph{don't}
rely on threads to run) by putting the following code at the top:

\begin{verbatim}
try:
    import threading as _threading
except ImportError:
    import dummy_threading as _threading
\end{verbatim}

Code can then call functions and use classes in \module{_threading}
whether or not threads are supported, avoiding an \keyword{if}
statement and making the code slightly clearer. This module will not
magically make multithreaded code run without threads; code that waits
for another thread to return or to do something will simply hang
forever. (In this example, \module{_threading} is used as the module
name to make it clear that the module being used is not necessarily
the actual \module{threading} module.)

\item The \module{time} module's \function{strptime()} function has
long been an annoyance because it uses the platform C library's
\function{strptime()} implementation, and different platforms
sometimes have odd bugs. Brett Cannon contributed a portable
implementation that's written in pure Python and should behave
identically on all platforms.

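For reference, a typical call looks like the sketch below. The date
string and format are arbitrary examples; the result is a time tuple
whose leading fields are the year, month, day, hour, and minute:

\begin{verbatim}
import time

parsed = time.strptime('2003-07-29 18:30', '%Y-%m-%d %H:%M')

date_fields = tuple(parsed[:3])     # year, month, day
time_fields = tuple(parsed[3:5])    # hour, minute

print((date_fields, time_fields))
\end{verbatim}
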
\item The new \module{timeit} module helps measure how long snippets
of Python code take to execute. The \file{timeit.py} file can be run
directly from the command line, or the module's \class{Timer} class
can be imported and used directly. Here's a short example that
figures out whether it's faster to convert an 8-bit string to Unicode
by appending an empty Unicode string to it or by using the
\function{unicode()} function:

\begin{verbatim}
import timeit

timer1 = timeit.Timer('unicode("abc")')
timer2 = timeit.Timer('"abc" + u""')

# Run three trials
print timer1.repeat(repeat=3, number=100000)
print timer2.repeat(repeat=3, number=100000)

# On my laptop this outputs:
# [0.36831796169281006, 0.37441694736480713, 0.35304892063140869]
# [0.17574405670166016, 0.18193507194519043, 0.17565798759460449]
\end{verbatim}


\item The \module{UserDict} module has a new \class{DictMixin} class which
defines all dictionary methods for classes that already have a minimum
mapping interface. This greatly simplifies writing classes that need
to be substitutable for dictionaries, such as the classes in
the \module{shelve} module.

Adding the mixin as a superclass provides the full dictionary
interface whenever the class defines \method{__getitem__},
\method{__setitem__}, \method{__delitem__}, and \method{keys}.
For example:

\begin{verbatim}
>>> import UserDict
>>> class SeqDict(UserDict.DictMixin):
    """Dictionary lookalike implemented with lists."""
    def __init__(self):
        self.keylist = []
        self.valuelist = []
    def __getitem__(self, key):
        try:
            i = self.keylist.index(key)
        except ValueError:
            raise KeyError
        return self.valuelist[i]
    def __setitem__(self, key, value):
        try:
            i = self.keylist.index(key)
            self.valuelist[i] = value
        except ValueError:
            self.keylist.append(key)
            self.valuelist.append(value)
    def __delitem__(self, key):
        try:
            i = self.keylist.index(key)
        except ValueError:
            raise KeyError
        self.keylist.pop(i)
        self.valuelist.pop(i)
    def keys(self):
        return list(self.keylist)

>>> s = SeqDict()
>>> dir(s)      # See that other dictionary methods are implemented
['__cmp__', '__contains__', '__delitem__', '__doc__', '__getitem__',
 '__init__', '__iter__', '__len__', '__module__', '__repr__',
 '__setitem__', 'clear', 'get', 'has_key', 'items', 'iteritems',
 'iterkeys', 'itervalues', 'keylist', 'keys', 'pop', 'popitem',
 'setdefault', 'update', 'valuelist', 'values']
\end{verbatim}

(Contributed by Raymond Hettinger.)

\item The \module{Tix} module has received various bug fixes and
updates for the current version of the Tix package.

\item The \module{Tkinter} module now works with a thread-enabled
version of Tcl. Tcl's threading model requires that widgets only be
accessed from the thread in which they're created; accesses from
another thread can cause Tcl to panic. For certain Tcl interfaces,
\module{Tkinter} will now automatically avoid this
when a widget is accessed from a different thread by marshalling a
command, passing it to the correct thread, and waiting for the
results. Other interfaces can't be handled automatically but
\module{Tkinter} will now raise an exception on such an access so that
at least you can find out about the problem. See
\url{http://mail.python.org/pipermail/python-dev/2002-December/031107.html} %
for a more detailed explanation of this change. (Implemented by
Martin von~L\"owis.)

\item Calling Tcl methods through \module{_tkinter} no longer
returns only strings. Instead, if Tcl returns other objects those
objects are converted to their Python equivalent, if one exists, or
wrapped with a \class{_tkinter.Tcl_Obj} object if no Python equivalent
exists. This behavior can be controlled through the
\method{wantobjects()} method of \class{tkapp} objects.

When using \module{_tkinter} through the \module{Tkinter} module (as
most Tkinter applications will), this feature is always activated. It
should not cause compatibility problems, since Tkinter would always
convert string results to Python types where possible.

If any incompatibilities are found, the old behavior can be restored
by setting the \member{wantobjects} variable in the \module{Tkinter}
module to false before creating the first \class{tkapp} object.

\begin{verbatim}
import Tkinter
Tkinter.wantobjects = 0
\end{verbatim}

Any breakage caused by this change should be reported as a bug.

\item The DOM implementation
in \module{xml.dom.minidom} can now generate XML output in a
particular encoding by providing an optional encoding argument to
the \method{toxml()} and \method{toprettyxml()} methods of DOM nodes.
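
A minimal sketch of the encoding argument follows. The document is
built by hand, and the \code{greeting} element name and its text are
invented for the example:

\begin{verbatim}
from xml.dom import minidom

# Build a tiny document by hand; 'greeting' is a made-up element name.
doc = minidom.Document()
root = doc.createElement('greeting')
root.appendChild(doc.createTextNode(u'fran\xe7ais'))
doc.appendChild(root)

# The encoding is used both for the XML declaration and for the
# characters written to the output.
latin1 = doc.toxml(encoding='iso-8859-1')
utf8 = doc.toprettyxml(indent='  ', encoding='utf-8')

print(repr(latin1))
print(repr(utf8))
\end{verbatim}

The same text node comes out as a single byte in the Latin-1 output
and as a two-byte sequence in the UTF-8 output.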

\item The new \module{DocXMLRPCServer} module allows writing
self-documenting XML-RPC servers. Run it in demo mode (as a program)
to see it in action. Pointing the Web browser to the RPC server
produces pydoc-style documentation; pointing xmlrpclib to the
server allows invoking the actual methods.
(Contributed by Brian Quinlan.)

\item Support for internationalized domain names (RFCs 3454, 3490,
3491, and 3492) has been added. The ``idna'' encoding can be used
to convert between a Unicode domain name and the ASCII-compatible
encoding (ACE) of that name.

\begin{alltt}
>{}>{}> u"www.Alliancefran\c caise.nu".encode("idna")
'www.xn--alliancefranaise-npb.nu'
\end{alltt}

The \module{socket} module has also been extended to transparently
convert Unicode hostnames to the ACE version before passing them to
the C library. Modules that deal with hostnames (such as
\module{httplib} and \module{ftplib}) also support Unicode host names;
\module{httplib} also sends HTTP \samp{Host} headers using the ACE
version of the domain name. \module{urllib} supports Unicode URLs
with non-ASCII host names as long as the \code{path} part of the URL
is ASCII only.

To implement this change, the module \module{stringprep}, the tool
\code{mkstringprep} and the \code{punycode} encoding have been added.
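
The conversion also works in the other direction. The following
sketch reuses the host name from the example above, decodes the ACE
form back to Unicode, and applies the \code{punycode} encoding to a
single label. Note that the non-ASCII label comes back in lower case,
because it is normalized before being encoded:

\begin{verbatim}
name = u"www.Alliancefran\xe7aise.nu"

# Unicode -> ASCII-compatible encoding (ACE), and back again
ace = name.encode("idna")
back = ace.decode("idna")

# The punycode encoding handles one label, without the "xn--" prefix
label = u"alliancefran\xe7aise".encode("punycode")

print(repr(ace))
print(repr(back))
print(repr(label))
\end{verbatim}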

\end{itemize}


%======================================================================
\subsection{Date/Time Type}

Date and time types suitable for expressing timestamps were added as
the \module{datetime} module. The types don't support different
calendars or many fancy features, and just stick to the basics of
representing time.

The three primary types are: \class{date}, representing a day, month,
and year; \class{time}, consisting of hour, minute, and second; and
\class{datetime}, which contains all the attributes of both
\class{date} and \class{time}. There's also a
\class{timedelta} class representing differences between two points
in time, and time zone logic is implemented by classes inheriting from
the abstract \class{tzinfo} class.
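
As a rough illustration of the last point, a time zone with a fixed
offset from UTC can be written as a small \class{tzinfo} subclass.
The \class{FixedOffset} class and the \code{EST} zone below are
invented for this sketch and are not supplied by the module:

\begin{verbatim}
import datetime

class FixedOffset(datetime.tzinfo):
    """Hypothetical zone with a constant offset and no DST."""
    def __init__(self, minutes, name):
        self.offset = datetime.timedelta(minutes=minutes)
        self.name = name
    def utcoffset(self, dt):
        return self.offset
    def tzname(self, dt):
        return self.name
    def dst(self, dt):
        return datetime.timedelta(0)

eastern = FixedOffset(-300, 'EST')      # five hours west of UTC
dt = datetime.datetime(2002, 12, 30, 21, 27, 3, tzinfo=eastern)

print(dt.isoformat())                   # 2002-12-30T21:27:03-05:00
print(dt.tzname())                      # EST
\end{verbatim}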

You can create instances of \class{date} and \class{time} by either
supplying keyword arguments to the appropriate constructor,
e.g. \code{datetime.date(year=1972, month=10, day=15)}, or by using
one of a number of class methods. For example, the \method{date.today()}
class method returns the current local date.
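
The different ways of creating instances can be sketched as follows;
the particular dates and times are arbitrary:

\begin{verbatim}
import datetime

# Keyword arguments to the constructor
d = datetime.date(year=1972, month=10, day=15)

# Positional arguments also work; omitted time fields default to zero
t = datetime.time(14, 30)

# Class methods: the current local date, and a datetime assembled
# from an existing date and time
today = datetime.date.today()
dt = datetime.datetime.combine(d, t)

print(d.isoformat())        # 1972-10-15
print(t.isoformat())        # 14:30:00
print(dt.isoformat())       # 1972-10-15T14:30:00
\end{verbatim}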

Once created, instances of the date/time classes are all immutable.
There are a number of methods for producing formatted strings from
objects:

\begin{verbatim}
>>> import datetime
>>> now = datetime.datetime.now()
>>> now.isoformat()
'2002-12-30T21:27:03.994956'
>>> now.ctime()  # Only available on date, datetime
'Mon Dec 30 21:27:03 2002'
>>> now.strftime('%Y %d %b')
'2002 30 Dec'
\end{verbatim}

The \method{replace()} method allows modifying one or more fields
of a \class{date} or \class{datetime} instance:

\begin{verbatim}
>>> d = datetime.datetime.now()
>>> d
datetime.datetime(2002, 12, 30, 22, 15, 38, 827738)
>>> d.replace(year=2001, hour = 12)
datetime.datetime(2001, 12, 30, 12, 15, 38, 827738)
>>>
\end{verbatim}

Instances can be compared, hashed, and converted to strings (the
result is the same as that of \method{isoformat()}, except that
\class{datetime} instances use a space instead of \samp{T} to separate
the date from the time). \class{date} and
\class{datetime} instances can be subtracted from each other, and
added to \class{timedelta} instances. The largest missing feature is
that there's no support for parsing strings and getting back a
\class{date} or \class{datetime}.
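
The arithmetic described above can be sketched like this, using
arbitrary timestamps. The last two lines show one possible workaround
for the missing parser, namely going through
\function{time.strptime()}; this is an assumption of the sketch and
not something the \module{datetime} module provides itself:

\begin{verbatim}
import datetime
import time

start = datetime.datetime(2002, 12, 30, 21, 27, 3)
end = datetime.datetime(2003, 1, 1, 0, 0, 0)

# Subtracting two datetimes gives a timedelta
delta = end - start
print(delta.days)           # 1
print(delta.seconds)        # 9177

# Adding a timedelta gives a new datetime
later = start + datetime.timedelta(days=7, hours=3)
print(later.isoformat())    # 2003-01-07T00:27:03

# Instances can be compared and used as dictionary keys
print(start < end)          # True
runs = {start: 'first run'}

# Workaround for parsing: let time.strptime() split up the string,
# then pass the first six fields to the datetime constructor.
fields = time.strptime('2002-12-30', '%Y-%m-%d')[:6]
parsed = datetime.datetime(*fields)
\end{verbatim}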

For more information, refer to the \ulink{module's reference
documentation}{../lib/module-datetime.html}.
(Contributed by Tim Peters.)


%======================================================================
\subsection{The optparse Module}

The \module{getopt} module provides simple parsing of command-line
arguments. The new \module{optparse} module (originally named Optik)
provides more elaborate command-line parsing that follows the Unix
conventions, automatically creates the output for \longprogramopt{help},
and can perform different actions for different options.

You start by creating an instance of \class{OptionParser} and telling
it what your program's options are.

\begin{verbatim}
import sys
from optparse import OptionParser

op = OptionParser()
op.add_option('-i', '--input',
              action='store', type='string', dest='input',
              help='set input filename')
op.add_option('-l', '--length',
              action='store', type='int', dest='length',
              help='set maximum length of output')
\end{verbatim}

Parsing a command line is then done by calling the \method{parse_args()}
method of the \class{OptionParser} instance.

\begin{verbatim}
# 'op' is the OptionParser instance created above
options, args = op.parse_args(sys.argv[1:])
print options
print args
\end{verbatim}

This returns an object containing all of the option values,
and a list of strings containing the remaining arguments.
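
The option values are available as attributes named after each
option's \code{dest}. The following self-contained sketch passes an
explicit argument list instead of \code{sys.argv} so the result is
predictable; the \programopt{-v} flag is an extra option added only
to illustrate a second kind of action:

\begin{verbatim}
from optparse import OptionParser

op = OptionParser()
op.add_option('-i', '--input',
              action='store', type='string', dest='input',
              help='set input filename')
op.add_option('-l', '--length',
              action='store', type='int', dest='length',
              help='set maximum length of output')
# Hypothetical extra flag: 'store_true' just records that it was seen
op.add_option('-v', '--verbose',
              action='store_true', dest='verbose', default=False,
              help='print progress messages')

options, args = op.parse_args(['-i', 'data', '--length=4', '-v', 'arg1'])

print(options.input)        # data
print(options.length)       # 4
print(options.verbose)      # True
print(args)                 # ['arg1']
\end{verbatim}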

Invoking the script with the various arguments now works as you'd
expect it to. Note that the length argument is automatically
converted to an integer.

\begin{verbatim}
$ ./python opt.py -i data arg1
<Values at 0x400cad4c: {'input': 'data', 'length': None}>
['arg1']
$ ./python opt.py --input=data --length=4
<Values at 0x400cad2c: {'input': 'data', 'length': 4}>
[]
$
\end{verbatim}

The help message is automatically generated for you:

\begin{verbatim}
$ ./python opt.py --help
usage: opt.py [options]

options:
  -h, --help            show this help message and exit
  -iINPUT, --input=INPUT
                        set input filename
  -lLENGTH, --length=LENGTH
                        set maximum length of output
$
\end{verbatim}
% $ prevent Emacs tex-mode from getting confused

See the \ulink{module's documentation}{../lib/module-optparse.html}
for more details.

Optik was written by Greg Ward, with suggestions from the readers of
the Getopt SIG.


%======================================================================
\section{Pymalloc: A Specialized Object Allocator\label{section-pymalloc}}

Pymalloc, a specialized object allocator written by Vladimir
Marangozov, was a feature added to Python 2.1. Pymalloc is intended
to be faster than the system \cfunction{malloc()} and to have less
memory overhead for allocation patterns typical of Python programs.
The allocator uses C's \cfunction{malloc()} function to get large
pools of memory and then fulfills smaller memory requests from these
pools.

In 2.1 and 2.2, pymalloc was an experimental feature and wasn't
enabled by default; you had to explicitly enable it when compiling
Python by providing the
\longprogramopt{with-pymalloc} option to the \program{configure}
script. In 2.3, pymalloc has had further enhancements and is now
enabled by default; you'll have to supply
\longprogramopt{without-pymalloc} to disable it.

This change is transparent to code written in Python; however,
pymalloc may expose bugs in C extensions. Authors of C extension
modules should test their code with pymalloc enabled,
because some incorrect code may cause core dumps at runtime.

There's one particularly common error that causes problems. There are
a number of memory allocation functions in Python's C API that have
previously just been aliases for the C library's \cfunction{malloc()}
and \cfunction{free()}, meaning that if you accidentally called
mismatched functions the error wouldn't be noticeable. When the
object allocator is enabled, these functions aren't aliases of
\cfunction{malloc()} and \cfunction{free()} any more, and calling the
wrong function to free memory may get you a core dump. For example,
if memory was allocated using \cfunction{PyObject_Malloc()}, it has to
be freed using \cfunction{PyObject_Free()}, not \cfunction{free()}. A
few modules included with Python fell afoul of this and had to be
fixed; doubtless there are more third-party modules that will have the
same problem.

As part of this change, the confusing multiple interfaces for
allocating memory have been consolidated down into two API families.
Memory allocated with one family must not be manipulated with
functions from the other family. There is one family for allocating
chunks of memory, and another family of functions specifically for
allocating Python objects.

\begin{itemize}
  \item To allocate and free an undistinguished chunk of memory use
  the ``raw memory'' family: \cfunction{PyMem_Malloc()},
  \cfunction{PyMem_Realloc()}, and \cfunction{PyMem_Free()}.

  \item The ``object memory'' family is the interface to the pymalloc
  facility described above and is biased towards a large number of
  ``small'' allocations: \cfunction{PyObject_Malloc()},
  \cfunction{PyObject_Realloc()}, and \cfunction{PyObject_Free()}.

  \item To allocate and free Python objects, use the ``object'' family:
  \cfunction{PyObject_New()}, \cfunction{PyObject_NewVar()}, and
  \cfunction{PyObject_Del()}.
\end{itemize}

Thanks to lots of work by Tim Peters, pymalloc in 2.3 also provides
debugging features to catch memory overwrites and doubled frees in
both extension modules and in the interpreter itself. To enable this
support, compile a debugging version of the Python interpreter by
running \program{configure} with \longprogramopt{with-pydebug}.

To aid extension writers, a header file \file{Misc/pymemcompat.h} is
distributed with the source to Python 2.3 that allows Python
extensions to use the 2.3 interfaces to memory allocation while
compiling against any version of Python since 1.5.2. You would copy
the file from Python's source distribution and bundle it with the
source of your extension.

\begin{seealso}

\seeurl{http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Objects/obmalloc.c}
{For the full details of the pymalloc implementation, see
the comments at the top of the file \file{Objects/obmalloc.c} in the
Python source code. The above link points to the file within the
SourceForge CVS browser.}

\end{seealso}


% ======================================================================
\section{Build and C API Changes}

Changes to Python's build process and to the C API include:

\begin{itemize}

\item The C-level interface to the garbage collector has been changed,
to make it easier to write extension types that support garbage
collection, and to make it easier to debug misuses of the functions.
Various functions have slightly different semantics, so a bunch of
functions had to be renamed. Extensions that use the old API will
still compile but will \emph{not} participate in garbage collection,
so updating them for 2.3 should be considered fairly high priority.

To upgrade an extension module to the new API, perform the following
steps:

\begin{itemize}

\item Rename \cfunction{Py_TPFLAGS_GC} to \cfunction{Py_TPFLAGS_HAVE_GC}.

\item Use \cfunction{PyObject_GC_New} or \cfunction{PyObject_GC_NewVar} to
allocate objects, and \cfunction{PyObject_GC_Del} to deallocate them.

\item Rename \cfunction{PyObject_GC_Init} to \cfunction{PyObject_GC_Track} and
\cfunction{PyObject_GC_Fini} to \cfunction{PyObject_GC_UnTrack}.

\item Remove \cfunction{PyGC_HEAD_SIZE} from object size calculations.

\item Remove calls to \cfunction{PyObject_AS_GC} and \cfunction{PyObject_FROM_GC}.

\end{itemize}

\item The cycle detection implementation used by the garbage collection
has proven to be stable, so it's now being made mandatory; you can no
longer compile Python without it, and the
\longprogramopt{with-cycle-gc} switch to \program{configure} has been removed.

\item Python can now optionally be built as a shared library
(\file{libpython2.3.so}) by supplying \longprogramopt{enable-shared}
when running Python's \program{configure} script. (Contributed by Ondrej
Palkovsky.)

\item The \csimplemacro{DL_EXPORT} and \csimplemacro{DL_IMPORT} macros
are now deprecated. Initialization functions for Python extension
modules should now be declared using the new macro
\csimplemacro{PyMODINIT_FUNC}, while the Python core will generally
use the \csimplemacro{PyAPI_FUNC} and \csimplemacro{PyAPI_DATA}
macros.

\item The interpreter can be compiled without any docstrings for
the built-in functions and modules by supplying
\longprogramopt{without-doc-strings} to the \program{configure} script.
This makes the Python executable about 10\% smaller, but will also
mean that you can't get help for Python's built-ins. (Contributed by
Gustavo Niemeyer.)

\item The \cfunction{PyArg_NoArgs()} macro is now deprecated, and code
that uses it should be changed. For Python 2.2 and later, the method
definition table can specify the
\constant{METH_NOARGS} flag, signalling that there are no arguments, and
the argument checking can then be removed. If compatibility with
pre-2.2 versions of Python is important, the code could use
\code{PyArg_ParseTuple(\var{args}, "")} instead, but this will be slower
than using \constant{METH_NOARGS}.

\item A new function, \cfunction{PyObject_DelItemString(\var{mapping},
char *\var{key})} was added
as shorthand for
\code{PyObject_DelItem(\var{mapping}, PyString_FromString(\var{key}))}.

\item The \method{xreadlines()} method of file objects, introduced in
Python 2.1, is no longer necessary because files now behave as their
own iterator. \method{xreadlines()} was originally introduced as a
faster way to loop over all the lines in a file, but now you can
simply write \code{for line in file_obj}.
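
A small sketch of the new idiom follows. It writes a scratch file
first so that it is self-contained; the file's contents are of course
made up:

\begin{verbatim}
import os
import tempfile

# Create a scratch file with three lines in it
fd, path = tempfile.mkstemp()
f = os.fdopen(fd, 'w')
f.write('first\nsecond\nthird\n')
f.close()

# The file object is its own iterator; no xreadlines() call is needed
file_obj = open(path)
lines = []
for line in file_obj:
    lines.append(line.rstrip('\n'))
file_obj.close()
os.remove(path)

print(lines)                # ['first', 'second', 'third']
\end{verbatim}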

\item File objects now manage their internal string buffer
differently, increasing it exponentially when needed. This results in
the benchmark tests in \file{Lib/test/test_bufio.py} speeding up
considerably (from 57 seconds to 1.7 seconds, according to one
measurement).

\item It's now possible to define class and static methods for a C
extension type by setting either the \constant{METH_CLASS} or
\constant{METH_STATIC} flags in a method's \ctype{PyMethodDef}
structure.

\item Python now includes a copy of the Expat XML parser's source code,
removing any dependence on a system version or local installation of
Expat.

\item If you dynamically allocate type objects in your extension, you
should be aware of a change in the rules relating to the
\member{__module__} and \member{__name__} attributes. In summary,
you will want to ensure the type's dictionary contains a
\code{'__module__'} key; making the module name the part of the type
name leading up to the final period will no longer have the desired
effect. For more detail, read the API reference documentation or the
source.

\end{itemize}


%======================================================================
\subsection{Port-Specific Changes}

Support for a port to IBM's OS/2 using the EMX runtime environment was
merged into the main Python source tree. EMX is a POSIX emulation
layer over the OS/2 system APIs. The Python port for EMX tries to
support all the POSIX-like capability exposed by the EMX runtime, and
mostly succeeds; \function{fork()} and \function{fcntl()} are
restricted by the limitations of the underlying emulation layer. The
standard OS/2 port, which uses IBM's Visual Age compiler, also gained
support for case-sensitive import semantics as part of the integration
of the EMX port into CVS. (Contributed by Andrew MacIntyre.)

On MacOS, most toolbox modules have been weaklinked to improve
backward compatibility. This means that modules will no longer fail
to load if a single routine is missing on the current OS version.
Instead, calling the missing routine will raise an exception.
(Contributed by Jack Jansen.)

The RPM spec files, found in the \file{Misc/RPM/} directory in the
Python source distribution, were updated for 2.3. (Contributed by
Sean Reifschneider.)

Other new platforms now supported by Python include AtheOS
(\url{http://www.atheos.cx/}), GNU/Hurd, and OpenVMS.


%======================================================================
\section{Other Changes and Fixes \label{section-other}}

As usual, there were a bunch of other improvements and bugfixes
scattered throughout the source tree. A search through the CVS change
logs finds there were 523 patches applied and 514 bugs fixed between
Python 2.2 and 2.3. Both figures are likely to be underestimates.

Some of the more notable changes are:

\begin{itemize}

\item The \file{regrtest.py} script now provides a way to allow ``all
resources except \var{foo}.'' A resource name passed to the
\programopt{-u} option can now be prefixed with a hyphen
(\character{-}) to mean ``remove this resource.'' For example, the
option `\code{\programopt{-u}all,-bsddb}' could be used to enable the
use of all resources except \code{bsddb}.

\item The tools used to build the documentation now work under Cygwin
as well as \UNIX.

\item The \code{SET_LINENO} opcode has been removed. Back in the
mists of time, this opcode was needed to produce line numbers in
tracebacks and support trace functions (for, e.g., \module{pdb}).
Since Python 1.5, the line numbers in tracebacks have been computed
using a different mechanism that works with ``python -O''. For Python
2.3 Michael Hudson implemented a similar scheme to determine when to
call the trace function, removing the need for \code{SET_LINENO}
entirely.

It would be difficult to detect any resulting difference from Python
code, apart from a slight speed up when Python is run without
\programopt{-O}.

C extensions that access the \member{f_lineno} field of frame objects
should instead call \code{PyCode_Addr2Line(f->f_code, f->f_lasti)}.
This will have the added effect of making the code work as desired
under ``python -O'' in earlier versions of Python.

A nifty new feature is that trace functions can now assign to the
\member{f_lineno} attribute of frame objects, changing the line that
will be executed next. A \samp{jump} command has been added to the
\module{pdb} debugger taking advantage of this new feature.
(Implemented by Richie Hindle.)

\end{itemize}


%======================================================================
\section{Porting to Python 2.3}

This section lists previously described changes that may require
changes to your code:

\begin{itemize}

\item \keyword{yield} is now always a keyword; if it's used as a
variable name in your code, a different name must be chosen.

\item For strings \var{X} and \var{Y}, \code{\var{X} in \var{Y}} now works
if \var{X} is more than one character long.
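
A quick sketch of the substring test, using an arbitrary sample
string:

\begin{verbatim}
haystack = 'abcdef'

found = 'cd' in haystack        # substring present
missing = 'ce' in haystack      # characters present, but not adjacent
empty = '' in haystack          # the empty string is always found

print(found)                    # True
print(missing)                  # False
print(empty)                    # True
\end{verbatim}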

\item The \function{int()} type constructor will now return a long
integer instead of raising an \exception{OverflowError} when a string
or floating-point number is too large to fit into an integer.
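
For instance, both of the following conversions now succeed; the
values were simply chosen to be larger than a 32-bit integer can hold:

\begin{verbatim}
# A string of digits too long for a 32-bit integer
from_string = int('12345678901234567890')

# A floating-point number far outside the 32-bit range
from_float = int(1e30)

print(from_string > 2147483647)     # True
print(from_float > 2147483647)      # True
\end{verbatim}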

\item If you have Unicode strings that contain 8-bit characters, you
must declare the file's encoding (UTF-8, Latin-1, or whatever) by
adding a comment to the top of the file. See
section~\ref{section-encodings} for more information.

\item Calling Tcl methods through \module{_tkinter} no longer
returns only strings. Instead, if Tcl returns other objects those
objects are converted to their Python equivalent, if one exists, or
wrapped with a \class{_tkinter.Tcl_Obj} object if no Python equivalent
exists.

\item Large octal and hex literals such as
\code{0xffffffff} now trigger a \exception{FutureWarning}. Currently
they're stored as 32-bit numbers and result in a negative value, but
in Python 2.4 they'll become positive long integers.

There are a few ways to fix this warning. If you really need a
positive number, just add an \samp{L} to the end of the literal. If
you're trying to get a 32-bit integer with low bits set and have
previously used an expression such as \code{~(1 << 31)}, it's probably
clearest to start with all bits set and clear the desired upper bits.
For example, to clear just the top bit (bit 31), you could write
\code{0xffffffffL {\&}{\textasciitilde}(1L<<31)}.

\item You can no longer disable assertions by assigning to \code{__debug__}.

\item The Distutils \function{setup()} function has gained various new
keyword arguments such as \var{depends}. Old versions of the
Distutils will abort if passed unknown keywords. The fix is to check
for the presence of the new \function{get_distutil_options()} function
in your \file{setup.py}, and to pass the new keywords only when the
installed version of the Distutils supports them:

\begin{verbatim}
from distutils import core
from distutils.core import Extension

# Arguments accepted by every version of the Distutils
kw = {'sources': ['foo.c'], ...}
# Add 'depends' only if this version of the Distutils supports it
if hasattr(core, 'get_distutil_options'):
    kw['depends'] = ['foo.h']
ext = Extension(**kw)
\end{verbatim}

\item Using \code{None} as a variable name will now result in a
\exception{SyntaxWarning}.

\item Names of extension types defined by the modules included with
Python now contain the module name and a \character{.} in front of the
type name.

\end{itemize}
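
A few of the items above can be checked with a handful of lines of
Python. The following is only a sketch, not code from the Python
distribution. It is written in syntax that current interpreters accept,
so the literals omit the \samp{L} suffix that Python 2.3 code would use
to avoid the \exception{FutureWarning}, and the helper name
\function{clear_top_bit()} is hypothetical.

```python
def clear_top_bit(value):
    """Return value with bit 31 cleared, keeping only the low 31 bits.

    Hypothetical helper: start with all 32 bits set and clear the
    unwanted upper bit, rather than relying on a negative literal.
    """
    all_bits = 0xffffffff            # all 32 bits set
    return value & all_bits & ~(1 << 31)


# Substring test: the left operand may be longer than one character.
assert 'ab' in 'abc'
assert 'ac' not in 'abc'

# int() converts a string too large for a 32-bit integer
# instead of raising OverflowError.
big = int('12345678901234567890')
assert big == 12345678901234567890

# Clearing bit 31 of an all-ones 32-bit value leaves 0x7fffffff.
assert clear_top_bit(0xffffffff) == 0x7fffffff
assert clear_top_bit(0x80000001) == 1
```

On Python 2.3 itself the mask would be spelled \code{0xffffffffL}, as
described in the list above.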


%======================================================================
\section{Acknowledgements \label{acks}}

The author would like to thank the following people for offering
suggestions, corrections and assistance with various drafts of this
article: Jeff Bauer, Simon Brunning, Brett Cannon, Michael Chermside,
Andrew Dalke, Scott David Daniels, Fred~L. Drake, Jr., Kelly Gerber,
Raymond Hettinger, Michael Hudson, Chris Lambert, Detlef Lannert,
Martin von~L\"owis, Andrew MacIntyre, Lalo Martins, Chad Netzer,
Gustavo Niemeyer, Neal Norwitz, Hans Nowak, Chris Reedy, Francesco
Ricciardi, Vinay Sajip, Neil Schemenauer, Roman Suzi, Jason Tishler,
Just van~Rossum.

\end{document}