\documentclass{howto}
% $Id$

\title{What's New in Python 2.3}
\release{0.03}
\author{A.M. Kuchling}
\authoraddress{\email{akuchlin@mems-exchange.org}}

\begin{document}
\maketitle
\tableofcontents

% Optik (or whatever it gets called)
%
% MacOS framework-related changes (section of its own, probably)
%
% New sorting code
%
% xreadlines obsolete; files are their own iterator

%\section{Introduction \label{intro}}

{\large This article is a draft, and is currently up to date for some
random version of the CVS tree around mid-July 2002. Please send any
additions, comments or errata to the author.}

This article explains the new features in Python 2.3. The tentative
release date of Python 2.3 is currently scheduled for some undefined
time before the end of 2002.

This article doesn't attempt to provide a complete specification of
the new features, but instead provides a convenient overview. For
full details, you should refer to the documentation for Python 2.3,
such as the
\citetitle[http://www.python.org/doc/2.3/lib/lib.html]{Python Library
Reference} and the
\citetitle[http://www.python.org/doc/2.3/ref/ref.html]{Python
Reference Manual}. If you want to understand the complete
implementation and design rationale for a change, refer to the PEP for
a particular new feature.


%======================================================================
\section{PEP 218: A Standard Set Datatype}

The new \module{sets} module contains an implementation of a set
datatype. The \class{Set} class is for mutable sets, sets that can
have members added and removed. The \class{ImmutableSet} class is for
sets that can't be modified, and can be used as dictionary keys. Sets
are built on top of dictionaries, so the elements within a set must be
hashable.

As a simple example,

\begin{verbatim}
>>> import sets
>>> S = sets.Set([1,2,3])
>>> S
Set([1, 2, 3])
>>> 1 in S
True
>>> 0 in S
False
>>> S.add(5)
>>> S.remove(3)
>>> S
Set([1, 2, 5])
>>>
\end{verbatim}
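
The hashability rules described above can be tried out with a short
sketch. It assumes a current Python 3 interpreter, where the
\module{sets} module is no longer available and the built-in
\class{set} and \class{frozenset} types play the roles of
\class{Set} and \class{ImmutableSet}; the variable names are purely
illustrative.

```python
# Python 3 sketch: the built-in set/frozenset types stand in for the
# sets.Set and sets.ImmutableSet classes named in this article.
mutable = set([1, 2, 3])
frozen = frozenset([1, 2, 3])

# An immutable set is hashable, so it can serve as a dictionary key.
labels = {frozen: "small numbers"}
lookup = labels[frozenset([3, 2, 1])]   # element order doesn't matter

# A mutable set is not hashable and is rejected as a key.
try:
    {mutable: "oops"}
    mutable_key_ok = True
except TypeError:
    mutable_key_ok = False

# Set elements must be hashable too, so a list can't be a member.
try:
    set([[1, 2]])
    list_member_ok = True
except TypeError:
    list_member_ok = False
```

Both failures are reported as \exception{TypeError}, because the
underlying dictionary needs a hash value for every key.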

The union and intersection of sets can be computed with the
\method{union()} and \method{intersection()} methods, or,
alternatively, using the bitwise operators \samp{|} and \samp{\&}.
Mutable sets also have in-place versions of these methods,
\method{union_update()} and \method{intersection_update()}.

\begin{verbatim}
>>> S1 = sets.Set([1,2,3])
>>> S2 = sets.Set([4,5,6])
>>> S1.union(S2)
Set([1, 2, 3, 4, 5, 6])
>>> S1 | S2                  # Alternative notation
Set([1, 2, 3, 4, 5, 6])
>>> S1.intersection(S2)
Set([])
>>> S1 & S2                  # Alternative notation
Set([])
>>> S1.union_update(S2)
>>> S1
Set([1, 2, 3, 4, 5, 6])
>>>
\end{verbatim}

It's also possible to take the symmetric difference of two sets. This
is the set of all elements in the union that aren't in the
intersection. An alternative way of expressing the symmetric
difference is that it contains all elements that are in exactly one
set. Again, there's an in-place version, with the ungainly name
\method{symmetric_difference_update()}.

\begin{verbatim}
>>> S1 = sets.Set([1,2,3,4])
>>> S2 = sets.Set([3,4,5,6])
>>> S1.symmetric_difference(S2)
Set([1, 2, 5, 6])
>>> S1 ^ S2
Set([1, 2, 5, 6])
>>>
\end{verbatim}
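
The two descriptions of the symmetric difference given above can be
checked against each other. This is only a sketch, again assuming
Python 3's built-in \class{set} in place of \class{sets.Set}:

```python
s1 = set([1, 2, 3, 4])
s2 = set([3, 4, 5, 6])

# Definition 1: everything in the union that isn't in the intersection.
by_definition = (s1 | s2) - (s1 & s2)

# Definition 2: every element that belongs to exactly one of the sets.
exactly_one = set(x for x in s1 | s2 if (x in s1) != (x in s2))

# The operator form.
sym = s1 ^ s2

# The in-place version modifies s1 and returns None.
returned = s1.symmetric_difference_update(s2)
```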

There are also \method{issubset()} and \method{issuperset()} methods
for checking whether one set is a subset or superset of
another:

\begin{verbatim}
>>> S1 = sets.Set([1,2,3])
>>> S2 = sets.Set([2,3])
>>> S2.issubset(S1)
True
>>> S1.issubset(S2)
False
>>> S1.issuperset(S2)
True
>>>
\end{verbatim}


\begin{seealso}

\seepep{218}{Adding a Built-In Set Object Type}{PEP written by Greg V. Wilson.
Implemented by Greg V. Wilson, Alex Martelli, and GvR.}

\end{seealso}



%======================================================================
\section{PEP 255: Simple Generators\label{section-generators}}

In Python 2.2, generators were added as an optional feature, to be
enabled by a \code{from __future__ import generators} directive. In
2.3 generators no longer need to be specially enabled, and are now
always present; this means that \keyword{yield} is now always a
keyword. The rest of this section is a copy of the description of
generators from the ``What's New in Python 2.2'' document; if you read
it when 2.2 came out, you can skip the rest of this section.

You're doubtless familiar with how function calls work in Python or C.
When you call a function, it gets a private namespace where its local
variables are created. When the function reaches a \keyword{return}
statement, the local variables are destroyed and the resulting value
is returned to the caller. A later call to the same function will get
a fresh new set of local variables. But, what if the local variables
weren't thrown away on exiting a function? What if you could later
resume the function where it left off? This is what generators
provide; they can be thought of as resumable functions.

Here's the simplest example of a generator function:

\begin{verbatim}
def generate_ints(N):
    for i in range(N):
        yield i
\end{verbatim}

A new keyword, \keyword{yield}, was introduced for generators. Any
function containing a \keyword{yield} statement is a generator
function; this is detected by Python's bytecode compiler which
compiles the function specially as a result.

When you call a generator function, it doesn't return a single value;
instead it returns a generator object that supports the iterator
protocol. On executing the \keyword{yield} statement, the generator
outputs the value of \code{i}, similar to a \keyword{return}
statement. The big difference between \keyword{yield} and a
\keyword{return} statement is that on reaching a \keyword{yield} the
generator's state of execution is suspended and local variables are
preserved. On the next call to the generator's \code{.next()} method,
the function will resume executing immediately after the
\keyword{yield} statement. (For complicated reasons, the
\keyword{yield} statement isn't allowed inside the \keyword{try} block
of a \code{try...finally} statement; read \pep{255} for a full
explanation of the interaction between \keyword{yield} and
exceptions.)

Here's a sample usage of the \function{generate_ints} generator:

\begin{verbatim}
>>> gen = generate_ints(3)
>>> gen
<generator object at 0x8117f90>
>>> gen.next()
0
>>> gen.next()
1
>>> gen.next()
2
>>> gen.next()
Traceback (most recent call last):
  File "stdin", line 1, in ?
  File "stdin", line 2, in generate_ints
StopIteration
\end{verbatim}

You could equally write \code{for i in generate_ints(5)}, or
\code{a,b,c = generate_ints(3)}.
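
Spelled out, those two forms look like the following sketch. The
\code{collected} list is only there to make the result visible; the
code runs unchanged on a current Python 3 interpreter.

```python
def generate_ints(N):
    for i in range(N):
        yield i

# A for loop calls the generator's next method behind the scenes and
# stops cleanly when StopIteration is raised.
collected = []
for i in generate_ints(5):
    collected.append(i)

# Tuple unpacking consumes the generator the same way; the number of
# targets must match the number of values produced.
a, b, c = generate_ints(3)
```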

Inside a generator function, the \keyword{return} statement can only
be used without a value, and signals the end of the procession of
values; afterwards the generator cannot return any further values.
\keyword{return} with a value, such as \code{return 5}, is a syntax
error inside a generator function. The end of the generator's results
can also be indicated by raising \exception{StopIteration} manually,
or by just letting the flow of execution fall off the bottom of the
function.

You could achieve the effect of generators manually by writing your
own class and storing all the local variables of the generator as
instance variables. For example, returning a list of integers could
be done by setting \code{self.count} to 0, and having the
\method{next()} method increment \code{self.count} and return it.
However, for a moderately complicated generator, writing a
corresponding class would be much messier.
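
As a rough sketch of that hand-written alternative, the class below
mimics \code{generate_ints()}. The name \class{IntCounter} is made up
for this illustration, and the code targets a current Python 3
interpreter, where the iterator method is spelled
\method{__next__()}; an alias keeps the \method{next()} spelling used
in this article.

```python
class IntCounter:
    """Hand-written counterpart of a generator yielding 0 .. limit-1."""

    def __init__(self, limit):
        self.limit = limit
        self.count = 0          # state a generator would keep in a local

    def __iter__(self):
        return self

    def __next__(self):
        if self.count >= self.limit:
            raise StopIteration  # signals the end of the values
        value = self.count
        self.count += 1
        return value

    next = __next__              # the method name used by Python 2.3


values = list(IntCounter(3))
```

Even for this trivial case the class needs more bookkeeping than the
three-line generator.
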
\file{Lib/test/test_generators.py} contains a number of more
interesting examples. The simplest one implements an in-order
traversal of a tree using generators recursively.

\begin{verbatim}
# A recursive generator that generates Tree leaves in in-order.
def inorder(t):
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x
\end{verbatim}
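
To try the traversal out, some tree type is needed. The
\class{Tree} class below is a hypothetical stand-in written for this
sketch, not the one used in \file{Lib/test/test_generators.py}; it
only assumes that a node has \member{label}, \member{left} and
\member{right} attributes and that an empty subtree is \code{None}.

```python
class Tree:
    """Hypothetical minimal binary tree node, for illustration only."""

    def __init__(self, label, left=None, right=None):
        self.label = label
        self.left = left
        self.right = right


def inorder(t):
    if t:                        # None marks an empty subtree
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x


#       2
#      / \
#     1   3
#          \
#           4
root = Tree(2, Tree(1), Tree(3, None, Tree(4)))
labels = list(inorder(root))
```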

Two other examples in \file{Lib/test/test_generators.py} produce
solutions for the N-Queens problem (placing $N$ queens on an $N\times N$
chess board so that no queen threatens another) and the Knight's Tour
(a route that takes a knight to every square of an $N\times N$ chessboard
without visiting any square twice).

The idea of generators comes from other programming languages,
especially Icon (\url{http://www.cs.arizona.edu/icon/}), where the
idea of generators is central. In Icon, every
expression and function call behaves like a generator. One example
from ``An Overview of the Icon Programming Language'' at
\url{http://www.cs.arizona.edu/icon/docs/ipd266.htm} gives an idea of
what this looks like:

\begin{verbatim}
sentence := "Store it in the neighboring harbor"
if (i := find("or", sentence)) > 5 then write(i)
\end{verbatim}

In Icon the \function{find()} function returns the indexes at which the
substring ``or'' is found: 3, 23, 33. In the \keyword{if} statement,
\code{i} is first assigned a value of 3, but 3 is less than 5, so the
comparison fails, and Icon retries it with the second value of 23. 23
is greater than 5, so the comparison now succeeds, and the code prints
the value 23 to the screen.
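
For comparison, here is one way the same search might be written with
a Python generator. The helper \function{find_all()} is an invented
name for this sketch, and it yields 1-based positions only so that the
numbers match the Icon example. Python does not retry the comparison
automatically, so an explicit loop takes Icon's place.

```python
def find_all(sub, text):
    """Yield each 1-based position of sub in text (Icon counts from 1)."""
    start = text.find(sub)
    while start != -1:
        yield start + 1
        start = text.find(sub, start + 1)


sentence = "Store it in the neighboring harbor"
positions = list(find_all("or", sentence))

# Keep asking the generator for values until one passes the test.
first_past_5 = None
for i in find_all("or", sentence):
    if i > 5:
        first_past_5 = i
        break
```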

Python doesn't go nearly as far as Icon in adopting generators as a
central concept. Generators are considered a new part of the core
Python language, but learning or using them isn't compulsory; if they
don't solve any problems that you have, feel free to ignore them.
One novel feature of Python's interface as compared to
Icon's is that a generator's state is represented as a concrete object
(the iterator) that can be passed around to other functions or stored
in a data structure.

\begin{seealso}

\seepep{255}{Simple Generators}{Written by Neil Schemenauer, Tim
Peters, Magnus Lie Hetland. Implemented mostly by Neil Schemenauer
and Tim Peters, with other fixes from the Python Labs crew.}

\end{seealso}


%======================================================================
\section{PEP 263: Source Code Encodings \label{section-encodings}}

Python source files can now be declared as being in different
character set encodings. Encodings are declared by including a
specially formatted comment in the first or second line of the source
file. For example, a UTF-8 file can be declared with:

\begin{verbatim}
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
\end{verbatim}

Without such an encoding declaration, the default encoding used is
ISO-8859-1, also known as Latin1.

The encoding declaration only affects Unicode string literals; the
text in the source code will be converted to Unicode using the
specified encoding. Note that Python identifiers are still restricted
to ASCII characters, so you can't have variable names that use
characters outside of the usual alphanumerics.

\begin{seealso}

\seepep{263}{Defining Python Source Code Encodings}{Written by
Marc-Andr\'e Lemburg and Martin von L\"owis; implemented by SUZUKI
Hisao and Martin von L\"owis.}

\end{seealso}


%======================================================================
\section{PEP 277: Unicode file name support for Windows NT}

On Windows NT, 2000, and XP, the system stores file names as Unicode
strings. Traditionally, Python has represented file names as byte
strings, which is inadequate because it renders some file names
inaccessible.

Python now allows using arbitrary Unicode strings (within the
limitations of the file system) for all functions that expect file
names, in particular the \function{open()} built-in. If a Unicode
string is passed to \function{os.listdir()}, Python now returns a list
of Unicode strings. A new function, \function{os.getcwdu()}, returns
the current directory as a Unicode string.

Byte strings still work as file names, and Python will transparently
convert them to Unicode using the \code{mbcs} encoding.

Other systems also allow Unicode strings as file names, but convert
them to byte strings before passing them to the system, which may cause
a \exception{UnicodeError} to be raised. Applications can test whether
arbitrary Unicode strings are supported as file names by checking
\member{os.path.unicode_file_names}, a Boolean value.

\begin{seealso}

\seepep{277}{Unicode file name support for Windows NT}{Written by Neil
Hodgson; implemented by Neil Hodgson, Martin von L\"owis, and Mark
Hammond.}

\end{seealso}


%======================================================================
\section{PEP 278: Universal Newline Support}

The three major operating systems used today are Microsoft Windows,
Apple's Macintosh OS, and the various \UNIX\ derivatives. A minor
irritation is that these three platforms all use different characters
to mark the ends of lines in text files. \UNIX\ uses character 10,
the ASCII linefeed, while MacOS uses character 13, the ASCII carriage
return, and Windows uses a two-character sequence of a carriage return
plus a linefeed.

Python's file objects can now support end of line conventions other
than the one followed by the platform on which Python is running.
Opening a file with the mode \samp{U} or \samp{rU} will open a file
for reading in universal newline mode. All three line ending
conventions will be translated to a \samp{\e n} in the strings
returned by the various file methods such as \method{read()} and
\method{readline()}.
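
The translation can be observed with a file that mixes all three
conventions. The sketch below is written for a current Python 3
interpreter, where universal newline translation is the default
behaviour of text mode (\code{newline=None}) rather than something
requested with the \samp{U} mode described here; the file contents are
invented for the demonstration.

```python
import os
import tempfile

# Write one file that mixes all three line-ending conventions.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"windows\r\nmacos\runix\n")

# In Python 3 text mode, newline=None enables universal newlines;
# Python 2.3 asks for the same thing with mode 'U' or 'rU'.
with open(path, "r", newline=None) as f:
    lines = f.readlines()

os.remove(path)
```

Every line comes back terminated by a single \samp{\e n}, whichever
convention was used in the file.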

Universal newline support is also used when importing modules and when
executing a file with the \function{execfile()} function. This means
that Python modules can be shared between all three operating systems
without needing to convert the line-endings.

This feature can be disabled at compile-time by specifying
\longprogramopt{without-universal-newlines} when running Python's
\file{configure} script.

\begin{seealso}

\seepep{278}{Universal Newline Support}{Written
and implemented by Jack Jansen.}

\end{seealso}


%======================================================================
\section{PEP 279: The \function{enumerate()} Built-in Function\label{section-enumerate}}

A new built-in function, \function{enumerate()}, will make
certain loops a bit clearer. \code{enumerate(thing)}, where
\var{thing} is either an iterator or a sequence, returns an iterator
that will return \code{(0, \var{thing[0]})}, \code{(1,
\var{thing[1]})}, \code{(2, \var{thing[2]})}, and so forth. Fairly
often you'll see code to change every element of a list that looks
like this:

\begin{verbatim}
for i in range(len(L)):
    item = L[i]
    # ... compute some result based on item ...
    L[i] = result
\end{verbatim}

This can be rewritten using \function{enumerate()} as:

\begin{verbatim}
for i, item in enumerate(L):
    # ... compute some result based on item ...
    L[i] = result
\end{verbatim}
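
A complete version of that loop might look like the sketch below;
upper-casing each string is an arbitrary stand-in for ``compute some
result based on item''.

```python
L = ["spam", "eggs", "ham"]

# enumerate() pairs each element with its index.
pairs = list(enumerate(L))

# Replace every element in place, using the index supplied by enumerate().
for i, item in enumerate(L):
    result = item.upper()
    L[i] = result

# An iterator works as the argument too, not just a sequence.
from_iterator = list(enumerate(iter("ab")))
```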


\begin{seealso}

\seepep{279}{The enumerate() built-in function}{Written
by Raymond D. Hettinger.}

\end{seealso}


%======================================================================
\section{PEP 285: The \class{bool} Type\label{section-bool}}

A Boolean type was added to Python 2.3. Two new constants were added
to the \module{__builtin__} module, \constant{True} and
\constant{False}. The type object for this new type is named
\class{bool}; the constructor for it takes any Python value and
converts it to \constant{True} or \constant{False}.

\begin{verbatim}
>>> bool(1)
True
>>> bool(0)
False
>>> bool([])
False
>>> bool( (1,) )
True
\end{verbatim}

Most of the standard library modules and built-in functions have been
changed to return Booleans.

\begin{verbatim}
>>> obj = []
>>> hasattr(obj, 'append')
True
>>> isinstance(obj, list)
True
>>> isinstance(obj, tuple)
False
\end{verbatim}

Python's Booleans were added with the primary goal of making code
clearer. For example, if you're reading a function and encounter the
statement \code{return 1}, you might wonder whether the \samp{1}
represents a truth value, or whether it's an index, or whether it's a
coefficient that multiplies some other quantity. If the statement is
\code{return True}, however, the meaning of the return value is quite
clearly a truth value.

Python's Booleans were not added for the sake of strict type-checking.
A very strict language such as Pascal would also prevent you
from performing arithmetic with Booleans, and would require that the
expression in an \keyword{if} statement always evaluate to a Boolean.
Python is not this strict, and it never will be. (\pep{285}
explicitly says so.) So you can still use any expression in an
\keyword{if}, even ones that evaluate to a list or tuple or some
random object. The Boolean type is a subclass of the
\class{int} class, so arithmetic using a Boolean still works.

\begin{verbatim}
>>> True + 1
2
>>> False + 1
1
>>> False * 75
0
>>> True * 75
75
\end{verbatim}

To sum up \constant{True} and \constant{False} in a sentence: they're
alternative ways to spell the integer values 1 and 0, with the single
difference that \function{str()} and \function{repr()} return the
strings \samp{True} and \samp{False} instead of \samp{1} and \samp{0}.
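
That one-sentence summary can be checked directly. The sketch below
uses only built-ins; the variable names are illustrative.

```python
# bool is a subclass of int, and its two values compare equal to 1 and 0.
is_int_subclass = issubclass(bool, int)
equal_to_ints = (True == 1) and (False == 0)

# The visible difference is in how the values are converted to strings.
as_text = (str(True), repr(False), str(1), repr(0))

# Because Booleans are integers, they can be summed to count matches.
count = sum([True, False, True])
```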

\begin{seealso}

\seepep{285}{Adding a bool type}{Written and implemented by GvR.}

\end{seealso}


%======================================================================
\section{PEP 293: Codec Error Handling Callbacks}

When encoding a Unicode string into a byte string, unencodable
characters may be encountered. So far, Python has allowed specifying
the error processing as either ``strict'' (raising
\exception{UnicodeError}), ``ignore'' (skip the character), or
``replace'' (with question mark), defaulting to ``strict''. It may be
desirable to specify an alternative processing of the error, e.g. by
inserting an XML character reference or HTML entity reference into the
converted string.

Python now has a flexible framework to add additional processing
strategies. New error handlers can be added with
\function{codecs.register_error}. Codecs then can access the error
handler with \function{codecs.lookup_error}. An equivalent C API has
been added for codecs written in C. The error handler gets the
necessary state information, such as the string being converted, the
position in the string where the error was detected, and the target
encoding. The handler can then either raise an exception, or return a
replacement string.

Two additional error handlers have been implemented using this
framework: ``backslashreplace'' uses Python backslash quoting to
represent the unencodable character, and ``xmlcharrefreplace'' emits
XML character references.
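
The following sketch registers a custom handler and compares it with
the two new built-in ones. The handler name \samp{bracket-codepoint}
and the function \function{bracket_codepoint()} are invented for this
example, and the code is written for a current Python 3 interpreter,
where the handler returns the replacement together with the position
at which encoding should resume.

```python
import codecs


def bracket_codepoint(exc):
    """Hypothetical handler: write each unencodable character as [U+XXXX]."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]     # the offending characters
    replacement = "".join("[U+%04X]" % ord(ch) for ch in bad)
    return replacement, exc.end             # (replacement, resume position)


codecs.register_error("bracket-codepoint", bracket_codepoint)

text = "caf\u00e9"
custom = text.encode("ascii", "bracket-codepoint")
backslash = text.encode("ascii", "backslashreplace")
xmlref = text.encode("ascii", "xmlcharrefreplace")

# A codec retrieves a handler by name with lookup_error().
same_handler = codecs.lookup_error("bracket-codepoint") is bracket_codepoint
```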

\begin{seealso}

\seepep{293}{Codec Error Handling Callbacks}{Written and implemented by
Walter D\"orwald.}

\end{seealso}


%======================================================================
\section{Extended Slices\label{section-slices}}

Ever since Python 1.4, the slicing syntax has supported an optional
third ``step'' or ``stride'' argument. For example, these are all
legal Python syntax: \code{L[1:10:2]}, \code{L[:-1:1]},
\code{L[::-1]}. This was added to Python at the request of
the developers of Numerical Python. However, the built-in sequence
types of lists, tuples, and strings have never supported this feature,
and you got a \exception{TypeError} if you tried it. Michael Hudson
contributed a patch that was applied to Python 2.3 and fixed this
shortcoming.

For example, you can now easily extract the elements of a list that
have even indexes:

\begin{verbatim}
>>> L = range(10)
>>> L[::2]
[0, 2, 4, 6, 8]
\end{verbatim}

Negative values also work, so you can make a copy of the same list in
reverse order:

\begin{verbatim}
>>> L[::-1]
[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
\end{verbatim}

This also works for tuples, arrays, and strings:

\begin{verbatim}
>>> s='abcd'
>>> s[::2]
'ac'
>>> s[::-1]
'dcba'
\end{verbatim}

If you have a mutable sequence (i.e. a list or an array), you can
assign to or delete an extended slice, but there are some differences
in assignment to extended and regular slices. Assignment to a regular
slice can be used to change the length of the sequence:

\begin{verbatim}
>>> a = range(3)
>>> a
[0, 1, 2]
>>> a[1:3] = [4, 5, 6]
>>> a
[0, 4, 5, 6]
\end{verbatim}

When assigning to an extended slice, however, the list on the
right-hand side of the statement must contain the same number of items
as the slice it is replacing:

\begin{verbatim}
>>> a = range(4)
>>> a
[0, 1, 2, 3]
>>> a[::2]
[0, 2]
>>> a[::2] = range(0, -2, -1)
>>> a
[0, 1, -1, 3]
>>> a[::2] = range(3)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: attempt to assign list of size 3 to extended slice of size 2
\end{verbatim}

Deletion is more straightforward:

\begin{verbatim}
>>> a = range(4)
>>> a[::2]
[0, 2]
>>> del a[::2]
>>> a
[1, 3]
\end{verbatim}

One can also now pass slice objects to the \method{__getitem__}
methods of the built-in sequences:

\begin{verbatim}
>>> range(10).__getitem__(slice(0, 5, 2))
[0, 2, 4]
\end{verbatim}

Or use them directly in subscripts:

\begin{verbatim}
>>> range(10)[slice(0, 5, 2)]
[0, 2, 4]
\end{verbatim}

To make implementing sequences that support extended slicing in Python
easier, slice objects now have a method \method{indices} which, given
the length of a sequence, returns \code{(start, stop, step)}, handling
omitted and out-of-bounds indices in a manner consistent with regular
slices (and this innocuous phrase hides a welter of confusing
details!). The method is intended to be used like this:

\begin{verbatim}
class FakeSeq:
    ...
    def calc_item(self, i):
        ...
    def __getitem__(self, item):
        if isinstance(item, slice):
            return FakeSeq([self.calc_item(i)
                            for i in range(*item.indices(len(self)))])
        else:
            return self.calc_item(item)
\end{verbatim}

From this example you can also see that the builtin ``\class{slice}''
object is now the type object for the slice type, and is no longer a
function. This is consistent with Python 2.2, where \class{int},
\class{str}, etc., underwent the same change.
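
To see what \method{indices} actually returns, here is a small
runnable sketch. The \class{Squares} class is a made-up sequence whose
item \var{i} is \code{i*i}; to stay short it returns a plain list for
slices and doesn't handle negative integer indexes.

```python
class Squares:
    """Hypothetical read-only sequence whose item i is i*i."""

    def __init__(self, length):
        self.length = length

    def __len__(self):
        return self.length

    def __getitem__(self, item):
        if isinstance(item, slice):
            # indices() fills in omitted values and clips to the length.
            return [i * i for i in range(*item.indices(len(self)))]
        if not 0 <= item < self.length:
            raise IndexError(item)
        return item * item


# What indices() computes for a sequence of length 10:
omitted = slice(None, None, 2).indices(10)
negative = slice(-3, None).indices(10)
clipped = slice(0, 100, 3).indices(10)

sq = Squares(10)
every_third = sq[::3]
last_two = sq[-2:]
backwards = sq[::-4]
single = sq[4]
```

Omitted bounds, negative bounds and bounds past the end are all
normalised before \function{range()} sees them, so the class needs no
special cases of its own.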


%======================================================================
\section{Other Language Changes}

Here are all of the changes that Python 2.3 makes to the core Python
language.

\begin{itemize}
\item The \keyword{yield} statement is now always a keyword, as
described in section~\ref{section-generators} of this document.

\item A new built-in function \function{enumerate()}
was added, as described in section~\ref{section-enumerate} of this
document.

\item Two new constants, \constant{True} and \constant{False}, were
added along with the built-in \class{bool} type, as described in
section~\ref{section-bool} of this document.

\item Built-in types now support the extended slicing syntax,
as described in section~\ref{section-slices} of this document.

\item Dictionaries have a new method, \method{pop(\var{key})}, that
returns the value corresponding to \var{key} and removes that
key/value pair from the dictionary. \method{pop()} will raise a
\exception{KeyError} if the requested key isn't present in the
dictionary:

\begin{verbatim}
>>> d = {1:2}
>>> d
{1: 2}
>>> d.pop(4)
Traceback (most recent call last):
  File "stdin", line 1, in ?
KeyError: 4
>>> d.pop(1)
2
>>> d.pop(1)
Traceback (most recent call last):
  File "stdin", line 1, in ?
KeyError: pop(): dictionary is empty
>>> d
{}
>>>
\end{verbatim}

(Patch contributed by Raymond Hettinger.)

\item The \keyword{assert} statement no longer checks the \code{__debug__}
flag, so you can no longer disable assertions by assigning to \code{__debug__}.
Running Python with the \programopt{-O} switch will still generate
code that doesn't execute any assertions.

\item Most type objects are now callable, so you can use them
to create new objects such as functions, classes, and modules. (This
means that the \module{new} module can be deprecated in a future
Python version, because you can now use the type objects available
in the \module{types} module.)
% XXX should new.py use PendingDeprecationWarning?
For example, you can create a new module object with the following code:

\begin{verbatim}
>>> import types
>>> m = types.ModuleType('abc','docstring')
>>> m
<module 'abc' (built-in)>
>>> m.__doc__
'docstring'
\end{verbatim}

\item
A new warning, \exception{PendingDeprecationWarning}, was added to
indicate features which are in the process of being
deprecated. The warning will \emph{not} be printed by default. To
check for use of features that will be deprecated in the future,
supply \programopt{-Walways::PendingDeprecationWarning::} on the
command line or use \function{warnings.filterwarnings()}.

\item Using \code{None} as a variable name will now result in a
\exception{SyntaxWarning} warning. In a future version of Python,
\code{None} may finally become a keyword.

\item Python runs multithreaded programs by switching between threads
after executing N bytecodes. The default value for N has been
increased from 10 to 100 bytecodes, speeding up single-threaded
applications by reducing the switching overhead. Some multithreaded
applications may suffer slower response time, but that's easily fixed
by setting the limit back to a lower number by calling
\function{sys.setcheckinterval(\var{N})}.

\item One minor but far-reaching change is that the names of extension
types defined by the modules included with Python now contain the
module and a \samp{.} in front of the type name. For example, in
Python 2.2, if you created a socket and printed its
\member{__class__}, you'd get this output:

\begin{verbatim}
>>> s = socket.socket()
>>> s.__class__
<type 'socket'>
\end{verbatim}

In 2.3, you get this:
\begin{verbatim}
>>> s.__class__
<type '_socket.socket'>
\end{verbatim}

\end{itemize}
764
765
\subsection{String Changes}

\begin{itemize}

\item The \code{in} operator now works differently for strings.
Previously, when evaluating \code{\var{X} in \var{Y}} where \var{X}
and \var{Y} are strings, \var{X} could only be a single character.
That's now changed; \var{X} can be a string of any length, and
\code{\var{X} in \var{Y}} will return \constant{True} if \var{X} is a
substring of \var{Y}.  If \var{X} is the empty string, the result is
always \constant{True}.

\begin{verbatim}
>>> 'ab' in 'abcd'
True
>>> 'ad' in 'abcd'
False
>>> '' in 'abcd'
True
\end{verbatim}

Note that this doesn't tell you where the substring starts; the
\method{find()} method is still necessary to figure that out.

\item The \method{strip()}, \method{lstrip()}, and \method{rstrip()}
string methods now have an optional argument for specifying the
characters to strip.  The default is still to remove all whitespace
characters:

\begin{verbatim}
>>> ' abc '.strip()
'abc'
>>> '><><abc<><><>'.strip('<>')
'abc'
>>> '><><abc<><><>\n'.strip('<>')
'abc<><><>\n'
>>> u'\u4000\u4001abc\u4000'.strip(u'\u4000')
u'\u4001abc'
>>>
\end{verbatim}

(Contributed by Simon Brunning.)

\item The \method{startswith()} and \method{endswith()}
string methods now accept negative numbers for the start and end
parameters.
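
A short sketch of what negative positions mean here: they count from
the end of the string, just as they do in slices.  The file name used
below is only an illustrative value.

```python
name = 'whatsnew23.tex'

# Negative start/end values are interpreted like slice indices.
has_tex_suffix = name.endswith('.tex', -4)        # examines name[-4:]
version_at_end = name.startswith('23', -6)        # examines name[-6:]
stem_ends_with_new = name.endswith('new', 0, -6)  # examines name[0:-6]
```

All three expressions evaluate to \constant{True} for this string.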

\item Another new string method is \method{zfill()}, originally a
function in the \module{string} module.  \method{zfill()} pads a
numeric string with zeros on the left until it's the specified width.
Note that the \code{\%} operator is still more flexible and powerful
than \method{zfill()}.

\begin{verbatim}
>>> '45'.zfill(4)
'0045'
>>> '12345'.zfill(4)
'12345'
>>> 'goofy'.zfill(6)
'0goofy'
\end{verbatim}

(Contributed by Walter D\"orwald.)

\item A new type object, \class{basestring}, has been added.
   Both 8-bit strings and Unicode strings inherit from this type, so
   \code{isinstance(obj, basestring)} will return \constant{True} for
   either kind of string.  It's a completely abstract type, so you
   can't create \class{basestring} instances.

\item Interned strings are no longer immortal.  Interned strings will
now be garbage-collected in the usual way when the only reference to
them is from the internal dictionary of interned strings.  (Implemented
by Oren Tirosh.)

\end{itemize}


\subsection{Optimizations}

\begin{itemize}

\item The \method{sort()} method of list objects has been extensively
rewritten by Tim Peters, and the implementation is significantly
faster.

\item Multiplication of large long integers is now much faster thanks
to an implementation of Karatsuba multiplication, an algorithm that
scales better than the O(n*n) required for the grade-school
multiplication algorithm.  (Original patch by Christopher A. Craig,
and significantly reworked by Tim Peters.)
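
To illustrate the idea only, here is a pure-Python sketch of Karatsuba
multiplication.  It is not the interpreter's implementation, which is
written in C and works on the internal digit arrays of long integers;
the function name \function{karatsuba()} and the \var{cutoff_bits}
threshold are assumptions made for this example.  The trick is to split
each operand into a high and a low half and to compute the product with
three recursive multiplications instead of four.

```python
def karatsuba(x, y, cutoff_bits=64):
    """Multiply two non-negative integers using Karatsuba's method."""
    # Small operands: fall back to the built-in multiplication.
    if x.bit_length() <= cutoff_bits or y.bit_length() <= cutoff_bits:
        return x * y

    # Split both numbers at the same bit position.
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask

    # Three recursive products instead of the four a naive split needs.
    hi = karatsuba(x_hi, y_hi, cutoff_bits)
    lo = karatsuba(x_lo, y_lo, cutoff_bits)
    mid = karatsuba(x_hi + x_lo, y_hi + y_lo, cutoff_bits) - hi - lo

    return (hi << (2 * half)) + (mid << half) + lo


a = 3 ** 2000 + 7
b = 7 ** 1500 + 11
product = karatsuba(a, b)
```

Because each level does three half-size multiplications, the running
time grows roughly as n to the power 1.585 rather than n squared.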

\item The \code{SET_LINENO} opcode is now gone.  This may provide a
small speed increase, subject to your compiler's idiosyncrasies.
(Removed by Michael Hudson.)

\item A number of small rearrangements have been made in various
hotspots to improve performance, inlining a function here, removing
some code there.  (Implemented mostly by GvR, but lots of people have
contributed to one change or another.)

\end{itemize}


%======================================================================
\section{New and Improved Modules}

As usual, Python's standard modules had a number of enhancements and
bug fixes.  Here's a partial list of the most notable changes, sorted
alphabetically by module name.  Consult the
\file{Misc/NEWS} file in the source tree for a more
complete list of changes, or look through the CVS logs for all the
details.

\begin{itemize}

\item The \module{array} module now supports arrays of Unicode
characters using the \samp{u} format character.  Arrays also now
support using the \code{+=} assignment operator to add another array's
contents, and the \code{*=} assignment operator to repeat an array.
(Contributed by Jason Orendorff.)
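
A brief sketch of the two in-place operators, using an array of
integers (type code \samp{i}) rather than Unicode characters so that it
runs unchanged on current interpreters:

```python
from array import array

numbers = array('i', [1, 2, 3])
numbers += array('i', [4, 5])   # append another array's contents in place
numbers *= 2                    # repeat the array's contents in place

as_list = numbers.tolist()
```

Both operators modify the existing array object instead of creating a
new one.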

\item The Distutils \class{Extension} class now supports
an extra constructor argument named \samp{depends} for listing
additional source files that an extension depends on.  This lets
Distutils recompile the module if any of the dependency files are
modified.  For example, if \samp{sampmodule.c} includes the header
file \file{sample.h}, you would create the \class{Extension} object like
this:

\begin{verbatim}
ext = Extension("samp",
                sources=["sampmodule.c"],
                depends=["sample.h"])
\end{verbatim}

Modifying \file{sample.h} would then cause the module to be recompiled.
(Contributed by Jeremy Hylton.)

\item The \module{getopt} module gained a new function,
\function{gnu_getopt()}, that supports the same arguments as the existing
\function{getopt()} function but uses GNU-style scanning mode.
The existing \function{getopt()} stops processing options as soon as a
non-option argument is encountered, but in GNU-style mode processing
continues, meaning that options and arguments can be mixed.  For
example:

\begin{verbatim}
>>> getopt.getopt(['-f', 'filename', 'output', '-v'], 'f:v')
([('-f', 'filename')], ['output', '-v'])
>>> getopt.gnu_getopt(['-f', 'filename', 'output', '-v'], 'f:v')
([('-f', 'filename'), ('-v', '')], ['output'])
\end{verbatim}

(Contributed by Peter \AA{strand}.)

\item The \module{grp}, \module{pwd}, and \module{resource} modules
now return enhanced tuples:

\begin{verbatim}
>>> import grp
>>> g = grp.getgrnam('amk')
>>> g.gr_name, g.gr_gid
('amk', 500)
\end{verbatim}

\item The new \module{heapq} module contains an implementation of a
heap queue algorithm.  A heap is an array-like data structure that
keeps items in a partially sorted order such that, for every index k,
heap[k] <= heap[2*k+1] and heap[k] <= heap[2*k+2].  This makes it quick
to remove the smallest item, and inserting a new item while maintaining
the heap property is O(lg~n).  (See
\url{http://www.nist.gov/dads/HTML/priorityque.html} for more
information about the priority queue data structure.)

The Python \module{heapq} module provides \function{heappush()} and
\function{heappop()} functions for adding and removing items while
maintaining the heap property on top of some other mutable Python
sequence type.  For example:

\begin{verbatim}
>>> import heapq
>>> heap = []
>>> for item in [3, 7, 5, 11, 1]:
...    heapq.heappush(heap, item)
...
>>> heap
[1, 3, 5, 11, 7]
>>> heapq.heappop(heap)
1
>>> heapq.heappop(heap)
3
>>> heap
[5, 7, 11]
>>>
\end{verbatim}

(Contributed by Kevin O'Connor.)
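
The heap property stated above can be checked directly, and repeatedly
popping the smallest item yields a simple sort.  The sketch below shows
both; the helper names \function{is_heap()} and \function{heapsort()}
are made up for this example and are not part of the module.

```python
import heapq


def is_heap(items):
    """Return True if items[k] <= items[2*k+1] and items[k] <= items[2*k+2]."""
    n = len(items)
    for k in range(n):
        for child in (2 * k + 1, 2 * k + 2):
            if child < n and items[k] > items[child]:
                return False
    return True


def heapsort(values):
    """Sort values by pushing them onto a heap and popping them off."""
    heap = []
    for value in values:
        heapq.heappush(heap, value)
    result = []
    while heap:
        result.append(heapq.heappop(heap))
    return result


ordered = heapsort([3, 7, 5, 11, 1])
```

Note that a list satisfying the heap property, such as
\code{[1, 3, 5, 11, 7]}, is generally not fully sorted; only the
smallest item is guaranteed to be at index 0.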

\item Two new functions in the \module{math} module,
\function{degrees(\var{rads})} and \function{radians(\var{degs})},
convert between radians and degrees.  Other functions in the
\module{math} module such as
\function{math.sin()} and \function{math.cos()} have always required
input values measured in radians.  (Contributed by Raymond Hettinger.)
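
A small sketch of how the two conversions combine with the
trigonometric functions; the variable names are only illustrative.

```python
import math

right_angle = math.radians(90.0)       # 90 degrees expressed in radians
half_turn = math.degrees(math.pi)      # pi radians expressed in degrees

# Convert first, because math.sin() expects radians.
sine_of_30_degrees = math.sin(math.radians(30.0))
```

Because of floating-point rounding, results such as the sine of 30
degrees are close to, but not necessarily exactly, the textbook value.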

\item Four new functions, \function{getpgid()}, \function{killpg()},
\function{lchown()}, and \function{mknod()}, were added to the
\module{posix} module that
underlies the \module{os} module.  (Contributed by Gustavo Niemeyer
and Geert Jansen.)

\item The parser objects provided by the \module{pyexpat} module
can now optionally buffer character data, resulting in fewer calls to
your character data handler and therefore faster performance.  Setting
the parser object's \member{buffer_text} attribute to \constant{True}
will enable buffering.
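
The following sketch compares the calls made to a character data
handler with and without buffering.  It reaches the parser through the
public \module{xml.parsers.expat} interface; the helper
\function{collect_chunks()} and the sample document are assumptions
made for this example.

```python
import xml.parsers.expat


def collect_chunks(data, buffer_text):
    """Parse data and return the strings passed to the character handler."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = buffer_text
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)
    return chunks


document = '<doc>line one\nline two &amp; more</doc>'
unbuffered = collect_chunks(document, False)
buffered = collect_chunks(document, True)
```

With buffering enabled the text arrives as one string; without it, the
same text may be delivered in several pieces, so handlers should never
assume that one call corresponds to one complete text node.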

\item The \module{readline} module gained a number of new
functions: \function{get_history_item()},
\function{get_current_history_length()}, and \function{redisplay()}.

\item Support for more advanced POSIX signal handling was added
to the \module{signal} module by adding the \function{sigpending()},
\function{sigprocmask()} and \function{sigsuspend()} functions, where
supported by the platform.  These functions make it possible to avoid
some previously unavoidable race conditions.

\item The \module{socket} module now supports timeouts.  You
can call the \method{settimeout(\var{t})} method on a socket object to
set a timeout of \var{t} seconds.  Subsequent socket operations that
take longer than \var{t} seconds to complete will abort and raise a
\exception{socket.error} exception.
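
A minimal sketch of a timeout in action, assuming that the local
machine allows binding a listening socket to the loopback address.
Nothing ever connects to the socket, so \method{accept()} gives up after
the timeout expires.  The 0.2 second value is arbitrary.

```python
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('127.0.0.1', 0))   # port 0 lets the OS pick a free port
server.listen(1)
server.settimeout(0.2)          # give up after 0.2 seconds

try:
    server.accept()             # no client will connect
    timed_out = False
except socket.error:
    # The timeout is reported as a socket.error (or a subclass of it).
    timed_out = True
finally:
    server.close()
```

Catching \exception{socket.error} keeps the sketch independent of
exactly which exception subclass a given Python version uses for
timeouts.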

The original timeout implementation was by Tim O'Malley.  Michael
Gilfix integrated it into the Python \module{socket} module, after the
patch had undergone a lengthy review.  After it was checked in, Guido
van~Rossum rewrote parts of it.  This is a good example of the free
software development process in action.

\item The value of the C \constant{PYTHON_API_VERSION} macro is now exposed
at the Python level as \code{sys.api_version}.

\item The new \module{textwrap} module contains functions for wrapping
strings containing paragraphs of text.  The \function{wrap(\var{text},
\var{width})} function takes a string and returns a list containing
the text split into lines of no more than the chosen width.  The
\function{fill(\var{text}, \var{width})} function returns a single
string, reformatted to fit into lines no longer than the chosen width.
(As you can guess, \function{fill()} is built on top of
\function{wrap()}.)  For example:

\begin{verbatim}
>>> import textwrap
>>> paragraph = "Not a whit, we defy augury: ... more text ..."
>>> textwrap.wrap(paragraph, 60)
["Not a whit, we defy augury: there's a special providence in",
 "the fall of a sparrow. If it be now, 'tis not to come; if it",
 ...]
>>> print textwrap.fill(paragraph, 35)
Not a whit, we defy augury: there's
a special providence in the fall of
a sparrow. If it be now, 'tis not
to come; if it be not to come, it
will be now; if it be not now, yet
it will come: the readiness is all.
>>>
\end{verbatim}

The module also contains a \class{TextWrapper} class that actually
implements the text wrapping strategy.  Both the
\class{TextWrapper} class and the \function{wrap()} and
\function{fill()} functions support a number of additional keyword
arguments for fine-tuning the formatting; consult the module's
documentation for details.
% XXX add a link to the module docs?
(Contributed by Greg Ward.)
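
As a sketch of the keyword arguments mentioned above, the following
uses \class{TextWrapper} with the \var{initial_indent} and
\var{subsequent_indent} options to format a paragraph as a bulleted
item.  The width of 30 is an arbitrary choice for this example.

```python
import textwrap

text = ("Not a whit, we defy augury: there's a special providence "
        "in the fall of a sparrow.")

wrapper = textwrap.TextWrapper(width=30,
                               initial_indent='* ',
                               subsequent_indent='  ')
lines = wrapper.wrap(text)
formatted = wrapper.fill(text)   # the same lines joined with newlines
```

Creating one \class{TextWrapper} instance and reusing it is convenient
when many paragraphs need the same formatting.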

\item The \module{time} module's \function{strptime()} function has
long been an annoyance because it uses the platform C library's
\function{strptime()} implementation, and different platforms
sometimes have odd bugs.  Brett Cannon contributed a portable
implementation that's written in pure Python, which should behave
identically on all platforms.
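
A short usage sketch: the format string below uses only numeric
directives, so the result does not depend on locale-specific month or
day names.  The timestamp is just sample data.

```python
import time

parsed = time.strptime('2002-07-22 19:21:06', '%Y-%m-%d %H:%M:%S')

# The first six fields are year, month, day, hour, minute and second.
date_and_time = tuple(parsed[:6])
```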

\item The DOM implementation
in \module{xml.dom.minidom} can now generate XML output in a
particular encoding, by specifying an optional encoding argument to
the \method{toxml()} and \method{toprettyxml()} methods of DOM nodes.
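
A minimal sketch of the encoding argument, written on the assumption
that it is run under a current Python 3 interpreter, where the encoded
output is a bytes object.  The sample document is made up for this
example.

```python
from xml.dom import minidom

document = minidom.parseString('<greeting>caf\u00e9</greeting>')

# Ask for the serialized document in a specific encoding.
encoded = document.toxml(encoding='utf-8')
decoded = encoded.decode('utf-8')
```

The requested encoding is also recorded in the XML declaration at the
start of the output.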

\end{itemize}


%======================================================================
\section{Specialized Object Allocator (pymalloc)\label{section-pymalloc}}

An experimental feature added to Python 2.1 was a specialized object
allocator called pymalloc, written by Vladimir Marangozov.  Pymalloc
was intended to be faster than the system \cfunction{malloc()} and have
less memory overhead for typical allocation patterns of Python
programs.  The allocator uses C's \cfunction{malloc()} function to get
large pools of memory, and then fulfills smaller memory requests from
these pools.

In 2.1 and 2.2, pymalloc was an experimental feature and wasn't
enabled by default; you had to explicitly turn it on by providing the
\longprogramopt{with-pymalloc} option to the \program{configure}
script.  In 2.3, pymalloc has had further enhancements and is now
enabled by default; you'll have to supply
\longprogramopt{without-pymalloc} to disable it.

This change is transparent to code written in Python; however,
pymalloc may expose bugs in C extensions.  Authors of C extension
modules should test their code with the object allocator enabled,
because some incorrect code may cause core dumps at runtime.  There
are a number of memory allocation functions in Python's C API that have
previously been just aliases for the C library's \cfunction{malloc()}
and \cfunction{free()}, meaning that if you accidentally called
mismatched functions, the error wouldn't be noticeable.  When the
object allocator is enabled, these functions aren't aliases of
\cfunction{malloc()} and \cfunction{free()} any more, and calling the
wrong function to free memory may get you a core dump.  For example,
if memory was allocated using \cfunction{PyObject_Malloc()}, it has to
be freed using \cfunction{PyObject_Free()}, not \cfunction{free()}.  A
few modules included with Python fell afoul of this and had to be
fixed; doubtless there are more third-party modules that will have the
same problem.

As part of this change, the confusing multiple interfaces for
allocating memory have been consolidated down into two API families.
Memory allocated with one family must not be manipulated with
functions from the other family.

In addition, there is a separate family of functions specifically for
allocating Python \emph{objects} (as opposed to memory).  The families
are as follows:

\begin{itemize}
  \item To allocate and free an undistinguished chunk of memory use
  the ``raw memory'' family: \cfunction{PyMem_Malloc()},
  \cfunction{PyMem_Realloc()}, and \cfunction{PyMem_Free()}.

  \item The ``object memory'' family is the interface to the pymalloc
  facility described above and is biased towards a large number of
  ``small'' allocations: \cfunction{PyObject_Malloc()},
  \cfunction{PyObject_Realloc()}, and \cfunction{PyObject_Free()}.

  \item To allocate and free Python objects, use the ``object'' family:
  \cfunction{PyObject_New()}, \cfunction{PyObject_NewVar()}, and
  \cfunction{PyObject_Del()}.
\end{itemize}

Thanks to lots of work by Tim Peters, pymalloc in 2.3 also provides
debugging features to catch memory overwrites and double frees in
both extension modules and in the interpreter itself.  To enable this
support, turn on the Python interpreter's debugging code by running
\program{configure} with \longprogramopt{with-pydebug}.

To aid extension writers, a header file \file{Misc/pymemcompat.h} is
distributed with the source to Python 2.3 that allows Python
extensions to use the 2.3 interfaces to memory allocation and compile
against any version of Python since 1.5.2.  You would copy the file
from Python's source distribution and bundle it with the source of
your extension.

\begin{seealso}

\seeurl{http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Objects/obmalloc.c}
{For the full details of the pymalloc implementation, see
the comments at the top of the file \file{Objects/obmalloc.c} in the
Python source code.  The above link points to the file within the
SourceForge CVS browser.}

\end{seealso}


% ======================================================================
\section{Build and C API Changes}

Changes to Python's build process and to the C API include:

\begin{itemize}

\item The C-level interface to the garbage collector has been changed,
to make it easier to write extension types that support garbage
collection, and to make it easier to debug misuses of the functions.
Various functions have slightly different semantics, so a number of
functions had to be renamed.  Extensions that use the old API will
still compile but will \emph{not} participate in garbage collection,
so updating them for 2.3 should be considered fairly high priority.

To upgrade an extension module to the new API, perform the following
steps:

\begin{itemize}

\item Rename \cfunction{Py_TPFLAGS_GC} to \cfunction{Py_TPFLAGS_HAVE_GC}.

\item Use \cfunction{PyObject_GC_New} or \cfunction{PyObject_GC_NewVar} to
allocate objects, and \cfunction{PyObject_GC_Del} to deallocate them.

\item Rename \cfunction{PyObject_GC_Init} to \cfunction{PyObject_GC_Track} and
\cfunction{PyObject_GC_Fini} to \cfunction{PyObject_GC_UnTrack}.

\item Remove \cfunction{PyGC_HEAD_SIZE} from object size calculations.

\item Remove calls to \cfunction{PyObject_AS_GC} and \cfunction{PyObject_FROM_GC}.

\end{itemize}

\item Python can now optionally be built as a shared library
(\file{libpython2.3.so}) by supplying \longprogramopt{enable-shared}
when running Python's \file{configure} script.  (Contributed by Ondrej
Palkovsky.)

\item The \csimplemacro{DL_EXPORT} and \csimplemacro{DL_IMPORT} macros
are now deprecated.  Initialization functions for Python extension
modules should now be declared using the new macro
\csimplemacro{PyMODINIT_FUNC}, while the Python core will generally
use the \csimplemacro{PyAPI_FUNC} and \csimplemacro{PyAPI_DATA}
macros.

\item The interpreter can be compiled without any docstrings for
the built-in functions and modules by supplying
\longprogramopt{without-doc-strings} to the \file{configure} script.
This makes the Python executable about 10\% smaller, but will also
mean that you can't get help for Python's built-ins.  (Contributed by
Gustavo Niemeyer.)

\item The cycle detection implementation used by the garbage collector
has proven to be stable, so it's now being made mandatory; you can no
longer compile Python without it, and the
\longprogramopt{with-cycle-gc} switch to \file{configure} has been removed.

\item The \cfunction{PyArg_NoArgs()} macro is now deprecated, and code
that uses it should be changed.  For Python 2.2 and later, the method
definition table can specify the
\constant{METH_NOARGS} flag, signalling that there are no arguments, and
the argument checking can then be removed.  If compatibility with
pre-2.2 versions of Python is important, the code could use
\code{PyArg_ParseTuple(args, "")} instead, but this will be slower
than using \constant{METH_NOARGS}.

\item A new function, \cfunction{PyObject_DelItemString(\var{mapping},
char *\var{key})}, was added
as shorthand for
\code{PyObject_DelItem(\var{mapping}, PyString_FromString(\var{key}))}.

\item File objects now manage their internal string buffer
differently by increasing it exponentially when needed.
This results in the benchmark tests in \file{Lib/test/test_bufio.py}
speeding up from 57 seconds to 1.7 seconds, according to one
measurement.

\item It's now possible to define class and static methods for a C
extension type by setting either the \constant{METH_CLASS} or
\constant{METH_STATIC} flags in a method's \ctype{PyMethodDef}
structure.

\item Python now includes a copy of the Expat XML parser's source code,
removing any dependence on a system version or local installation of
Expat.

\end{itemize}

\subsection{Port-Specific Changes}

Support for a port to IBM's OS/2 using the EMX runtime environment was
merged into the main Python source tree.  EMX is a POSIX emulation
layer over the OS/2 system APIs.  The Python port for EMX tries to
support all the POSIX-like capability exposed by the EMX runtime, and
mostly succeeds; \function{fork()} and \function{fcntl()} are
restricted by the limitations of the underlying emulation layer.  The
standard OS/2 port, which uses IBM's Visual Age compiler, also gained
support for case-sensitive import semantics as part of the integration
of the EMX port into CVS.  (Contributed by Andrew MacIntyre.)

On MacOS, most toolbox modules have been weaklinked to improve
backward compatibility.  This means that modules will no longer fail
to load if a single routine is missing on the current OS version.
Instead, calling the missing routine will raise an exception.
(Contributed by Jack Jansen.)

The RPM spec files, found in the \file{Misc/RPM/} directory in the
Python source distribution, were updated for 2.3.  (Contributed by
Sean Reifschneider.)

Python now supports AtheOS (\url{www.atheos.cx}) and GNU/Hurd.


%======================================================================
\section{Other Changes and Fixes}

Finally, there are various miscellaneous fixes:

\begin{itemize}

\item The tools used to build the documentation now work under Cygwin
as well as \UNIX.

\item The \code{SET_LINENO} opcode has been removed.  Back in the
mists of time, this opcode was needed to produce line numbers in
tracebacks and support trace functions (for, e.g., \module{pdb}).
Since Python 1.5, the line numbers in tracebacks have been computed
using a different mechanism that works with ``python -O''.  For Python
2.3 Michael Hudson implemented a similar scheme to determine when to
call the trace function, removing the need for \code{SET_LINENO}
entirely.

Python code will be hard-pressed to notice a difference from this
change, apart from a slight speedup when Python is run without
\programopt{-O}.

C extensions that access the \member{f_lineno} field of frame objects
should instead call \code{PyCode_Addr2Line(f->f_code, f->f_lasti)}.
This will have the added effect of making the code work as desired
under ``python -O'' in earlier versions of Python.

\end{itemize}


%======================================================================
\section{Porting to Python 2.3}

XXX write this


%======================================================================
\section{Acknowledgements \label{acks}}

The author would like to thank the following people for offering
suggestions, corrections and assistance with various drafts of this
article: Michael Chermside, Scott David Daniels, Fred~L. Drake, Jr.,
Michael Hudson, Detlef Lannert, Martin von L\"owis, Andrew MacIntyre,
Lalo Martins, Gustavo Niemeyer, Neal Norwitz, Jason Tishler.

\end{document}