blob: 283a66be8050ca8f65556154febe75bb6c7d56a5 [file] [log] [blame]
Fred Drake03e10312002-03-26 19:17:43 +00001\documentclass{howto}
Andrew M. Kuchling03594bb2002-03-27 02:29:48 +00002% $Id$
3
Andrew M. Kuchling517109b2002-05-07 21:01:16 +00004% TODO:
5% Go through and get the contributor's name for all the various changes
6
Andrew M. Kuchling03594bb2002-03-27 02:29:48 +00007\title{What's New in Python 2.3}
Andrew M. Kuchling20e5abc2002-07-11 20:50:34 +00008\release{0.03}
Andrew M. Kuchling03594bb2002-03-27 02:29:48 +00009\author{A.M. Kuchling}
10\authoraddress{\email{akuchlin@mems-exchange.org}}
Fred Drake03e10312002-03-26 19:17:43 +000011
12\begin{document}
13\maketitle
14\tableofcontents
15
Andrew M. Kuchlingf70a0a82002-06-10 13:22:46 +000016% Optik (or whatever it gets called)
17%
Andrew M. Kuchling346386f2002-07-12 20:24:42 +000018% Bug #580462: changes to GC API
Andrew M. Kuchling9f6e1042002-06-17 13:40:04 +000019%
Andrew M. Kuchlingc61ec522002-08-04 01:20:05 +000020% heapq module
21%
22% MacOS framework-related changes (section of its own, probably)
23%
Andrew M. Kuchlingf70a0a82002-06-10 13:22:46 +000024
Andrew M. Kuchling03594bb2002-03-27 02:29:48 +000025%\section{Introduction \label{intro}}
26
27{\large This article is a draft, and is currently up to date for some
Andrew M. Kuchling20e5abc2002-07-11 20:50:34 +000028random version of the CVS tree around mid-July 2002. Please send any
Andrew M. Kuchling03594bb2002-03-27 02:29:48 +000029additions, comments or errata to the author.}
30
31This article explains the new features in Python 2.3. The tentative
Andrew M. Kuchling20e5abc2002-07-11 20:50:34 +000032release date of Python 2.3 is currently scheduled for some undefined
33time before the end of 2002.
Andrew M. Kuchling03594bb2002-03-27 02:29:48 +000034
35This article doesn't attempt to provide a complete specification of
36the new features, but instead provides a convenient overview. For
37full details, you should refer to the documentation for Python 2.3,
38such as the
39\citetitle[http://www.python.org/doc/2.3/lib/lib.html]{Python Library
40Reference} and the
41\citetitle[http://www.python.org/doc/2.3/ref/ref.html]{Python
42Reference Manual}. If you want to understand the complete
43implementation and design rationale for a change, refer to the PEP for
44a particular new feature.
Fred Drake03e10312002-03-26 19:17:43 +000045
46
Andrew M. Kuchling03594bb2002-03-27 02:29:48 +000047%======================================================================
Andrew M. Kuchling517109b2002-05-07 21:01:16 +000048\section{PEP 255: Simple Generators\label{section-generators}}
Andrew M. Kuchlingf4dd65d2002-04-01 19:28:09 +000049
50In Python 2.2, generators were added as an optional feature, to be
51enabled by a \code{from __future__ import generators} directive. In
522.3 generators no longer need to be specially enabled, and are now
53always present; this means that \keyword{yield} is now always a
54keyword. The rest of this section is a copy of the description of
55generators from the ``What's New in Python 2.2'' document; if you read
56it when 2.2 came out, you can skip the rest of this section.
57
Andrew M. Kuchling517109b2002-05-07 21:01:16 +000058You're doubtless familiar with how function calls work in Python or C.
59When you call a function, it gets a private namespace where its local
Andrew M. Kuchlingf4dd65d2002-04-01 19:28:09 +000060variables are created. When the function reaches a \keyword{return}
61statement, the local variables are destroyed and the resulting value
62is returned to the caller. A later call to the same function will get
Andrew M. Kuchling517109b2002-05-07 21:01:16 +000063a fresh new set of local variables. But, what if the local variables
Andrew M. Kuchlingf4dd65d2002-04-01 19:28:09 +000064weren't thrown away on exiting a function? What if you could later
65resume the function where it left off? This is what generators
66provide; they can be thought of as resumable functions.
67
68Here's the simplest example of a generator function:
69
70\begin{verbatim}
71def generate_ints(N):
72 for i in range(N):
73 yield i
74\end{verbatim}
75
76A new keyword, \keyword{yield}, was introduced for generators. Any
77function containing a \keyword{yield} statement is a generator
78function; this is detected by Python's bytecode compiler which
79compiles the function specially as a result.
80
81When you call a generator function, it doesn't return a single value;
82instead it returns a generator object that supports the iterator
83protocol. On executing the \keyword{yield} statement, the generator
84outputs the value of \code{i}, similar to a \keyword{return}
85statement. The big difference between \keyword{yield} and a
86\keyword{return} statement is that on reaching a \keyword{yield} the
87generator's state of execution is suspended and local variables are
88preserved. On the next call to the generator's \code{.next()} method,
89the function will resume executing immediately after the
90\keyword{yield} statement. (For complicated reasons, the
91\keyword{yield} statement isn't allowed inside the \keyword{try} block
92of a \code{try...finally} statement; read \pep{255} for a full
93explanation of the interaction between \keyword{yield} and
94exceptions.)
95
96Here's a sample usage of the \function{generate_ints} generator:
97
98\begin{verbatim}
99>>> gen = generate_ints(3)
100>>> gen
101<generator object at 0x8117f90>
102>>> gen.next()
1030
104>>> gen.next()
1051
106>>> gen.next()
1072
108>>> gen.next()
109Traceback (most recent call last):
Andrew M. Kuchling9f6e1042002-06-17 13:40:04 +0000110 File "stdin", line 1, in ?
111 File "stdin", line 2, in generate_ints
Andrew M. Kuchlingf4dd65d2002-04-01 19:28:09 +0000112StopIteration
113\end{verbatim}
114
115You could equally write \code{for i in generate_ints(5)}, or
116\code{a,b,c = generate_ints(3)}.
117
118Inside a generator function, the \keyword{return} statement can only
119be used without a value, and signals the end of the procession of
120values; afterwards the generator cannot return any further values.
121\keyword{return} with a value, such as \code{return 5}, is a syntax
122error inside a generator function. The end of the generator's results
123can also be indicated by raising \exception{StopIteration} manually,
124or by just letting the flow of execution fall off the bottom of the
125function.
126
127You could achieve the effect of generators manually by writing your
128own class and storing all the local variables of the generator as
129instance variables. For example, returning a list of integers could
130be done by setting \code{self.count} to 0, and having the
131\method{next()} method increment \code{self.count} and return it.
132However, for a moderately complicated generator, writing a
133corresponding class would be much messier.
134\file{Lib/test/test_generators.py} contains a number of more
135interesting examples. The simplest one implements an in-order
136traversal of a tree using generators recursively.
137
138\begin{verbatim}
139# A recursive generator that generates Tree leaves in in-order.
140def inorder(t):
141 if t:
142 for x in inorder(t.left):
143 yield x
144 yield t.label
145 for x in inorder(t.right):
146 yield x
147\end{verbatim}
148
149Two other examples in \file{Lib/test/test_generators.py} produce
150solutions for the N-Queens problem (placing $N$ queens on an $NxN$
151chess board so that no queen threatens another) and the Knight's Tour
152(a route that takes a knight to every square of an $NxN$ chessboard
153without visiting any square twice).
154
155The idea of generators comes from other programming languages,
156especially Icon (\url{http://www.cs.arizona.edu/icon/}), where the
157idea of generators is central. In Icon, every
158expression and function call behaves like a generator. One example
159from ``An Overview of the Icon Programming Language'' at
160\url{http://www.cs.arizona.edu/icon/docs/ipd266.htm} gives an idea of
161what this looks like:
162
163\begin{verbatim}
164sentence := "Store it in the neighboring harbor"
165if (i := find("or", sentence)) > 5 then write(i)
166\end{verbatim}
167
168In Icon the \function{find()} function returns the indexes at which the
169substring ``or'' is found: 3, 23, 33. In the \keyword{if} statement,
170\code{i} is first assigned a value of 3, but 3 is less than 5, so the
171comparison fails, and Icon retries it with the second value of 23. 23
172is greater than 5, so the comparison now succeeds, and the code prints
173the value 23 to the screen.
174
175Python doesn't go nearly as far as Icon in adopting generators as a
176central concept. Generators are considered a new part of the core
177Python language, but learning or using them isn't compulsory; if they
178don't solve any problems that you have, feel free to ignore them.
179One novel feature of Python's interface as compared to
180Icon's is that a generator's state is represented as a concrete object
181(the iterator) that can be passed around to other functions or stored
182in a data structure.
183
184\begin{seealso}
185
186\seepep{255}{Simple Generators}{Written by Neil Schemenauer, Tim
187Peters, Magnus Lie Hetland. Implemented mostly by Neil Schemenauer
188and Tim Peters, with other fixes from the Python Labs crew.}
189
190\end{seealso}
191
192
193%======================================================================
Andrew M. Kuchlingf3676512002-04-15 02:27:55 +0000194\section{PEP 278: Universal Newline Support}
195
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000196The three major operating systems used today are Microsoft Windows,
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000197Apple's Macintosh OS, and the various \UNIX\ derivatives. A minor
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000198irritation is that these three platforms all use different characters
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000199to mark the ends of lines in text files. \UNIX\ uses character 10,
200the ASCII linefeed, while MacOS uses character 13, the ASCII carriage
201return, and Windows uses a two-character sequence of a carriage return
202plus a newline.
Andrew M. Kuchlingf3676512002-04-15 02:27:55 +0000203
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000204Python's file objects can now support end of line conventions other
205than the one followed by the platform on which Python is running.
206Opening a file with the mode \samp{U} or \samp{rU} will open a file
207for reading in universal newline mode. All three line ending
208conventions will be translated to a \samp{\e n} in the strings
209returned by the various file methods such as \method{read()} and
210\method{readline()}.
Andrew M. Kuchlingf3676512002-04-15 02:27:55 +0000211
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000212Universal newline support is also used when importing modules and when
213executing a file with the \function{execfile()} function. This means
214that Python modules can be shared between all three operating systems
215without needing to convert the line-endings.
216
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000217This feature can be disabled at compile-time by specifying
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000218\longprogramopt{without-universal-newlines} when running Python's
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000219\file{configure} script.
Andrew M. Kuchlingf3676512002-04-15 02:27:55 +0000220
221\begin{seealso}
222
223\seepep{278}{Universal Newline Support}{Written
224and implemented by Jack Jansen.}
225
226\end{seealso}
227
Andrew M. Kuchlingfad2f592002-05-10 21:00:05 +0000228
229%======================================================================
Andrew M. Kuchlinge995d162002-07-11 20:09:50 +0000230\section{PEP 279: The \function{enumerate()} Built-in Function\label{section-enumerate}}
Andrew M. Kuchlingfad2f592002-05-10 21:00:05 +0000231
232A new built-in function, \function{enumerate()}, will make
233certain loops a bit clearer. \code{enumerate(thing)}, where
234\var{thing} is either an iterator or a sequence, returns a iterator
235that will return \code{(0, \var{thing[0]})}, \code{(1,
236\var{thing[1]})}, \code{(2, \var{thing[2]})}, and so forth. Fairly
237often you'll see code to change every element of a list that looks
238like this:
239
240\begin{verbatim}
241for i in range(len(L)):
242 item = L[i]
243 # ... compute some result based on item ...
244 L[i] = result
245\end{verbatim}
246
247This can be rewritten using \function{enumerate()} as:
248
249\begin{verbatim}
250for i, item in enumerate(L):
251 # ... compute some result based on item ...
252 L[i] = result
253\end{verbatim}
254
255
256\begin{seealso}
257
258\seepep{279}{The enumerate() built-in function}{Written
259by Raymond D. Hettinger.}
260
261\end{seealso}
262
263
Andrew M. Kuchlingf3676512002-04-15 02:27:55 +0000264%======================================================================
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000265\section{PEP 285: The \class{bool} Type\label{section-bool}}
266
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000267A Boolean type was added to Python 2.3. Two new constants were added
268to the \module{__builtin__} module, \constant{True} and
269\constant{False}. The type object for this new type is named
270\class{bool}; the constructor for it takes any Python value and
271converts it to \constant{True} or \constant{False}.
272
273\begin{verbatim}
274>>> bool(1)
275True
276>>> bool(0)
277False
278>>> bool([])
279False
280>>> bool( (1,) )
281True
282\end{verbatim}
283
284Most of the standard library modules and built-in functions have been
285changed to return Booleans.
286
287\begin{verbatim}
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000288>>> obj = []
289>>> hasattr(obj, 'append')
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000290True
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000291>>> isinstance(obj, list)
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000292True
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000293>>> isinstance(obj, tuple)
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000294False
295\end{verbatim}
296
297Python's Booleans were added with the primary goal of making code
298clearer. For example, if you're reading a function and encounter the
299statement \code{return 1}, you might wonder whether the \samp{1}
300represents a truth value, or whether it's an index, or whether it's a
301coefficient that multiplies some other quantity. If the statement is
302\code{return True}, however, the meaning of the return value is quite
303clearly a truth value.
304
305Python's Booleans were not added for the sake of strict type-checking.
Andrew M. Kuchlinga2a206b2002-05-24 21:08:58 +0000306A very strict language such as Pascal would also prevent you
307performing arithmetic with Booleans, and would require that the
308expression in an \keyword{if} statement always evaluate to a Boolean.
309Python is not this strict, and it never will be. (\pep{285}
310explicitly says so.) So you can still use any expression in an
311\keyword{if}, even ones that evaluate to a list or tuple or some
312random object, and the Boolean type is a subclass of the
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000313\class{int} class, so arithmetic using a Boolean still works.
314
315\begin{verbatim}
316>>> True + 1
3172
318>>> False + 1
3191
320>>> False * 75
3210
322>>> True * 75
32375
324\end{verbatim}
325
326To sum up \constant{True} and \constant{False} in a sentence: they're
327alternative ways to spell the integer values 1 and 0, with the single
328difference that \function{str()} and \function{repr()} return the
329strings \samp{True} and \samp{False} instead of \samp{1} and \samp{0}.
Andrew M. Kuchling3a52ff62002-04-03 22:44:47 +0000330
331\begin{seealso}
332
333\seepep{285}{Adding a bool type}{Written and implemented by GvR.}
334
335\end{seealso}
336
Michael W. Hudson5efaf7e2002-06-11 10:55:12 +0000337
Andrew M. Kuchlinge995d162002-07-11 20:09:50 +0000338\section{Extended Slices\label{section-slices}}
Michael W. Hudson5efaf7e2002-06-11 10:55:12 +0000339
Andrew M. Kuchlinge995d162002-07-11 20:09:50 +0000340Ever since Python 1.4, the slicing syntax has supported an optional
341third ``step'' or ``stride'' argument. For example, these are all
342legal Python syntax: \code{L[1:10:2]}, \code{L[:-1:1]},
343\code{L[::-1]}. This was added to Python included at the request of
344the developers of Numerical Python. However, the built-in sequence
345types of lists, tuples, and strings have never supported this feature,
346and you got a \exception{TypeError} if you tried it. Michael Hudson
347contributed a patch that was applied to Python 2.3 and fixed this
348shortcoming.
349
350For example, you can now easily extract the elements of a list that
351have even indexes:
Fred Drakedf872a22002-07-03 12:02:01 +0000352
353\begin{verbatim}
354>>> L = range(10)
355>>> L[::2]
356[0, 2, 4, 6, 8]
357\end{verbatim}
358
Andrew M. Kuchlinge995d162002-07-11 20:09:50 +0000359Negative values also work, so you can make a copy of the same list in
360reverse order:
Fred Drakedf872a22002-07-03 12:02:01 +0000361
362\begin{verbatim}
363>>> L[::-1]
364[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
365\end{verbatim}
Andrew M. Kuchling3a52ff62002-04-03 22:44:47 +0000366
Andrew M. Kuchlinge995d162002-07-11 20:09:50 +0000367This also works for strings:
368
369\begin{verbatim}
370>>> s='abcd'
371>>> s[::2]
372'ac'
373>>> s[::-1]
374'dcba'
375\end{verbatim}
376
Michael W. Hudson4da01ed2002-07-19 15:48:56 +0000377as well as tuples and arrays.
Andrew M. Kuchlinge995d162002-07-11 20:09:50 +0000378
Michael W. Hudson4da01ed2002-07-19 15:48:56 +0000379If you have a mutable sequence (i.e. a list or an array) you can
380assign to or delete an extended slice, but there are some differences
381in assignment to extended and regular slices. Assignment to a regular
382slice can be used to change the length of the sequence:
383
384\begin{verbatim}
385>>> a = range(3)
386>>> a
387[0, 1, 2]
388>>> a[1:3] = [4, 5, 6]
389>>> a
390[0, 4, 5, 6]
391\end{verbatim}
392
393but when assigning to an extended slice the list on the right hand
394side of the statement must contain the same number of items as the
395slice it is replacing:
396
397\begin{verbatim}
398>>> a = range(4)
399>>> a
400[0, 1, 2, 3]
401>>> a[::2]
402[0, 2]
403>>> a[::2] = range(0, -2, -1)
404>>> a
405[0, 1, -1, 3]
406>>> a[::2] = range(3)
407Traceback (most recent call last):
408 File "<stdin>", line 1, in ?
409ValueError: attempt to assign list of size 3 to extended slice of size 2
410\end{verbatim}
411
412Deletion is more straightforward:
413
414\begin{verbatim}
415>>> a = range(4)
416>>> a[::2]
417[0, 2]
418>>> del a[::2]
419>>> a
420[1, 3]
421\end{verbatim}
422
423One can also now pass slice objects to builtin sequences
424\method{__getitem__} methods:
425
426\begin{verbatim}
427>>> range(10).__getitem__(slice(0, 5, 2))
428[0, 2, 4]
429\end{verbatim}
430
431or use them directly in subscripts:
432
433\begin{verbatim}
434>>> range(10)[slice(0, 5, 2)]
435[0, 2, 4]
436\end{verbatim}
437
438To make implementing sequences that support extended slicing in Python
439easier, slice ojects now have a method \method{indices} which given
440the length of a sequence returns \code{(start, stop, step)} handling
441omitted and out-of-bounds indices in a manner consistent with regular
442slices (and this innocuous phrase hides a welter of confusing
443details!). The method is intended to be used like this:
444
445\begin{verbatim}
446class FakeSeq:
447 ...
448 def calc_item(self, i):
449 ...
450 def __getitem__(self, item):
451 if isinstance(item, slice):
452 return FakeSeq([self.calc_item(i)
453 in range(*item.indices(len(self)))])
454 else:
455 return self.calc_item(i)
456\end{verbatim}
457
458From this example you can also see that the builtin ``\var{slice}''
459object is now the type of slice objects, not a function (so is now
460consistent with \var{int}, \var{str}, etc from 2.2).
Andrew M. Kuchlinge995d162002-07-11 20:09:50 +0000461
Andrew M. Kuchling3a52ff62002-04-03 22:44:47 +0000462%======================================================================
Fred Drakedf872a22002-07-03 12:02:01 +0000463\section{Other Language Changes}
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000464
Andrew M. Kuchlinge995d162002-07-11 20:09:50 +0000465Here are all of the changes that Python 2.3 makes to the core Python
466language.
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000467
Andrew M. Kuchlinge995d162002-07-11 20:09:50 +0000468\begin{itemize}
469\item The \keyword{yield} statement is now always a keyword, as
470described in section~\ref{section-generators} of this document.
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000471
Andrew M. Kuchlinge995d162002-07-11 20:09:50 +0000472\item A new built-in function \function{enumerate()}
473was added, as described in section~\ref{section-enumerate} of this
474document.
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000475
Andrew M. Kuchlinge995d162002-07-11 20:09:50 +0000476\item Two new constants, \constant{True} and \constant{False} were
477added along with the built-in \class{bool} type, as described in
478section~\ref{section-bool} of this document.
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000479
Andrew M. Kuchlinge995d162002-07-11 20:09:50 +0000480\item Built-in types now support the extended slicing syntax,
481as described in section~\ref{section-slices} of this document.
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000482
Andrew M. Kuchlinge995d162002-07-11 20:09:50 +0000483\item Dictionaries have a new method, \method{pop(\var{key})}, that
484returns the value corresponding to \var{key} and removes that
485key/value pair from the dictionary. \method{pop()} will raise a
486\exception{KeyError} if the requested key isn't present in the
487dictionary:
488
489\begin{verbatim}
490>>> d = {1:2}
491>>> d
492{1: 2}
493>>> d.pop(4)
494Traceback (most recent call last):
495 File ``stdin'', line 1, in ?
496KeyError: 4
497>>> d.pop(1)
4982
499>>> d.pop(1)
500Traceback (most recent call last):
501 File ``stdin'', line 1, in ?
502KeyError: pop(): dictionary is empty
503>>> d
504{}
505>>>
506\end{verbatim}
507
508(Patch contributed by Raymond Hettinger.)
509
510\item The \method{strip()}, \method{lstrip()}, and \method{rstrip()}
511string methods now have an optional argument for specifying the
512characters to strip. The default is still to remove all whitespace
513characters:
514
515\begin{verbatim}
516>>> ' abc '.strip()
517'abc'
518>>> '><><abc<><><>'.strip('<>')
519'abc'
520>>> '><><abc<><><>\n'.strip('<>')
521'abc<><><>\n'
522>>> u'\u4000\u4001abc\u4000'.strip(u'\u4000')
523u'\u4001abc'
524>>>
525\end{verbatim}
526
Andrew M. Kuchling346386f2002-07-12 20:24:42 +0000527(Contributed by Simon Brunning.)
528
Andrew M. Kuchlinge995d162002-07-11 20:09:50 +0000529\item The \method{startswith()} and \method{endswith()}
530string methods now accept negative numbers for the start and end
531parameters.
532
533\item Another new string method is \method{zfill()}, originally a
534function in the \module{string} module. \method{zfill()} pads a
535numeric string with zeros on the left until it's the specified width.
536Note that the \code{\%} operator is still more flexible and powerful
537than \method{zfill()}.
538
539\begin{verbatim}
540>>> '45'.zfill(4)
541'0045'
542>>> '12345'.zfill(4)
543'12345'
544>>> 'goofy'.zfill(6)
545'0goofy'
546\end{verbatim}
547
Andrew M. Kuchling346386f2002-07-12 20:24:42 +0000548(Contributed by Walter D\"orwald.)
549
550\item The \keyword{assert} statement no longer checks the \code{__debug__}
551flag, so you can no longer disable assertions by assigning to \code{__debug__}.
552Running Python with the \programopt{-O} switch will still generate
553code that doesn't execute any assertions.
554
Andrew M. Kuchling20e5abc2002-07-11 20:50:34 +0000555\item A new type object, \class{basestring}, has been added.
556 Both 8-bit strings and Unicode strings inherit from this type, so
557 \code{isinstance(obj, basestring)} will return \constant{True} for
558 either kind of string. It's a completely abstract type, so you
559 can't create \class{basestring} instances.
560
561\item Most type objects are now callable, so you can use them
562to create new objects such as functions, classes, and modules. (This
563means that the \module{new} module can be deprecated in a future
564Python version, because you can now use the type objects available
565in the \module{types} module.)
566% XXX should new.py use PendingDeprecationWarning?
567For example, you can create a new module object with the following code:
568
569\begin{verbatim}
570>>> import types
571>>> m = types.ModuleType('abc','docstring')
572>>> m
573<module 'abc' (built-in)>
574>>> m.__doc__
575'docstring'
576\end{verbatim}
577
Andrew M. Kuchlinge995d162002-07-11 20:09:50 +0000578\item
Fred Drakedf872a22002-07-03 12:02:01 +0000579A new warning, \exception{PendingDeprecationWarning} was added to
Andrew M. Kuchling20e5abc2002-07-11 20:50:34 +0000580indicate features which are in the process of being
Andrew M. Kuchlinge995d162002-07-11 20:09:50 +0000581deprecated. The warning will \emph{not} be printed by default. To
582check for use of features that will be deprecated in the future,
583supply \programopt{-Walways::PendingDeprecationWarning::} on the
584command line or use \function{warnings.filterwarnings()}.
585
586\item One minor but far-reaching change is that the names of extension
587types defined by the modules included with Python now contain the
588module and a \samp{.} in front of the type name. For example, in
589Python 2.2, if you created a socket and printed its
590\member{__class__}, you'd get this output:
591
592\begin{verbatim}
593>>> s = socket.socket()
594>>> s.__class__
595<type 'socket'>
596\end{verbatim}
597
598In 2.3, you get this:
599\begin{verbatim}
600>>> s.__class__
601<type '_socket.socket'>
602\end{verbatim}
603
604\end{itemize}
Neal Norwitzd68f5172002-05-29 15:54:55 +0000605
606
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000607%======================================================================
Andrew M. Kuchling03594bb2002-03-27 02:29:48 +0000608\section{New and Improved Modules}
609
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000610As usual, Python's standard modules had a number of enhancements and
Andrew M. Kuchlinga982eb12002-07-22 18:57:36 +0000611bug fixes. Here's a partial list of the most notable changes, sorted
612alphabetically by module name. Consult the
613\file{Misc/NEWS} file in the source tree for a more
614complete list of changes, or look through the CVS logs for all the
615details.
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000616
617\begin{itemize}
618
Andrew M. Kuchlinga982eb12002-07-22 18:57:36 +0000619\item The \module{array} module now supports arrays of Unicode
620characters using the \samp{u} format character. Arrays also now
621support using the \code{+=} assignment operator to add another array's
622contents, and the \code{*=} assignment operator to repeat an array.
623(Contributed by Jason Orendorff.)
624
625\item The Distutils \class{Extension} class now supports
626an extra constructor argument named \samp{depends} for listing
627additional source files that an extension depends on. This lets
628Distutils recompile the module if any of the dependency files are
629modified. For example, if \samp{sampmodule.c} includes the header
630file \file{sample.h}, you would create the \class{Extension} object like
631this:
632
633\begin{verbatim}
634ext = Extension("samp",
635 sources=["sampmodule.c"],
636 depends=["sample.h"])
637\end{verbatim}
638
639Modifying \file{sample.h} would then cause the module to be recompiled.
640(Contributed by Jeremy Hylton.)
641
642\item Two new binary packagers were added to the Distutils.
643\code{bdist_pkgtool} builds \file{.pkg} files to use with Solaris
644\program{pkgtool}, and \code{bdist_sdux} builds \program{swinstall}
645packages for use on HP-UX.
646An abstract binary packager class,
647\module{distutils.command.bdist_packager}, was added; this may make it
648easier to write binary packaging commands. (Contributed by Mark
649Alexander.)
650
651\item The \module{getopt} module gained a new function,
652\function{gnu_getopt()}, that supports the same arguments as the existing
653\function{getopt()} function but uses GNU-style scanning mode.
654The existing \function{getopt()} stops processing options as soon as a
655non-option argument is encountered, but in GNU-style mode processing
656continues, meaning that options and arguments can be mixed. For
657example:
658
659\begin{verbatim}
660>>> getopt.getopt(['-f', 'filename', 'output', '-v'], 'f:v')
661([('-f', 'filename')], ['output', '-v'])
662>>> getopt.gnu_getopt(['-f', 'filename', 'output', '-v'], 'f:v')
663([('-f', 'filename'), ('-v', '')], ['output'])
664\end{verbatim}
665
666(Contributed by Peter \AA{strand}.)
667
668\item The \module{grp}, \module{pwd}, and \module{resource} modules
669now return enhanced tuples:
670
671\begin{verbatim}
672>>> import grp
673>>> g = grp.getgrnam('amk')
674>>> g.gr_name, g.gr_gid
675('amk', 500)
676\end{verbatim}
677
678
679\item Two new functions in the \module{math} module,
680\function{degrees(\var{rads})} and \function{radians(\var{degs})},
681convert between radians and degrees. Other functions in the
682\module{math} module such as
683\function{math.sin()} and \function{math.cos()} have always required
684input values measured in radians. (Contributed by Raymond Hettinger.)
685
Andrew M. Kuchling52f1b762002-07-28 20:29:03 +0000686\item Four new functions, \function{getpgid()}, \function{killpg()}, \function{lchown()}, and \function{mknod()}, were added to the \module{posix} module that
Andrew M. Kuchlinga982eb12002-07-22 18:57:36 +0000687underlies the \module{os} module. (Contributed by Gustavo Niemeyer
688and Geert Jansen.)
689
690\item The parser objects provided by the \module{pyexpat} module
691can now optionally buffer character data, resulting in fewer calls to
692your character data handler and therefore faster performance. Setting
693the parser object's \member{buffer_text} attribute to \constant{True}
694will enable buffering.
695
696\item The \module{readline} module also gained a number of new
697functions: \function{get_history_item()},
698\function{get_current_history_length()}, and \function{redisplay()}.
699
700\item Support for more advanced POSIX signal handling was added
701to the \module{signal} module by adding the \function{sigpending},
702\function{sigprocmask} and \function{sigsuspend} functions, where supported
703by the platform. These functions make it possible to avoid some previously
704unavoidable race conditions.
705
706\item The \module{socket} module now supports timeouts. You
707can call the \method{settimeout(\var{t})} method on a socket object to
708set a timeout of \var{t} seconds. Subsequent socket operations that
709take longer than \var{t} seconds to complete will abort and raise a
710\exception{socket.error} exception.
711
712The original timeout implementation was by Tim O'Malley. Michael
713Gilfix integrated it into the Python \module{socket} module, after the
714patch had undergone a lengthy review. After it was checked in, Guido
715van~Rossum rewrote parts of it. This is a good example of the free
716software development process in action.
717
Andrew M. Kuchling20e5abc2002-07-11 20:50:34 +0000718\item The new \module{textwrap} module contains functions for wrapping
Andrew M. Kuchlingd003a2a2002-06-26 13:23:55 +0000719strings containing paragraphs of text. The \function{wrap(\var{text},
720\var{width})} function takes a string and returns a list containing
721the text split into lines of no more than the chosen width. The
722\function{fill(\var{text}, \var{width})} function returns a single
723string, reformatted to fit into lines no longer than the chosen width.
724(As you can guess, \function{fill()} is built on top of
725\function{wrap()}. For example:
726
727\begin{verbatim}
728>>> import textwrap
729>>> paragraph = "Not a whit, we defy augury: ... more text ..."
730>>> textwrap.wrap(paragraph, 60)
731["Not a whit, we defy augury: there's a special providence in",
732 "the fall of a sparrow. If it be now, 'tis not to come; if it",
733 ...]
734>>> print textwrap.fill(paragraph, 35)
735Not a whit, we defy augury: there's
736a special providence in the fall of
737a sparrow. If it be now, 'tis not
738to come; if it be not to come, it
739will be now; if it be not now, yet
740it will come: the readiness is all.
741>>>
742\end{verbatim}
743
744The module also contains a \class{TextWrapper} class that actually
745implements the text wrapping strategy. Both the
746\class{TextWrapper} class and the \function{wrap()} and
747\function{fill()} functions support a number of additional keyword
748arguments for fine-tuning the formatting; consult the module's
749documentation for details.
750% XXX add a link to the module docs?
751(Contributed by Greg Ward.)
752
Andrew M. Kuchlingef5d06b2002-07-22 19:21:06 +0000753\item The \module{time} module's \function{strptime()} function has
754long been an annoyance because it uses the platform C library's
755\function{strptime()} implementation, and different platforms
756sometimes have odd bugs. Brett Cannon contributed a portable
757implementation that's written in pure Python, which should behave
758identically on all platforms.
759
Andrew M. Kuchling20e5abc2002-07-11 20:50:34 +0000760\item The DOM implementation
761in \module{xml.dom.minidom} can now generate XML output in a
762particular encoding, by specifying an optional encoding argument to
763the \method{toxml()} and \method{toprettyxml()} methods of DOM nodes.
764
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000765\end{itemize}
766
Andrew M. Kuchling03594bb2002-03-27 02:29:48 +0000767
Andrew M. Kuchlingef5d06b2002-07-22 19:21:06 +0000768%======================================================================
769\section{Specialized Object Allocator (pymalloc)\label{section-pymalloc}}
770
771An experimental feature added to Python 2.1 was a specialized object
772allocator called pymalloc, written by Vladimir Marangozov. Pymalloc
773was intended to be faster than the system \cfunction{malloc()} and have
774less memory overhead for typical allocation patterns of Python
775programs. The allocator uses C's \cfunction{malloc()} function to get
776large pools of memory, and then fulfills smaller memory requests from
777these pools.
778
779In 2.1 and 2.2, pymalloc was an experimental feature and wasn't
780enabled by default; you had to explicitly turn it on by providing the
781\longprogramopt{with-pymalloc} option to the \program{configure}
782script. In 2.3, pymalloc has had further enhancements and is now
783enabled by default; you'll have to supply
784\longprogramopt{without-pymalloc} to disable it.
785
786This change is transparent to code written in Python; however,
787pymalloc may expose bugs in C extensions. Authors of C extension
788modules should test their code with the object allocator enabled,
789because some incorrect code may cause core dumps at runtime. There
790are a bunch of memory allocation functions in Python's C API that have
791previously been just aliases for the C library's \cfunction{malloc()}
792and \cfunction{free()}, meaning that if you accidentally called
793mismatched functions, the error wouldn't be noticeable. When the
794object allocator is enabled, these functions aren't aliases of
795\cfunction{malloc()} and \cfunction{free()} any more, and calling the
796wrong function to free memory may get you a core dump. For example,
797if memory was allocated using \cfunction{PyObject_Malloc()}, it has to
798be freed using \cfunction{PyObject_Free()}, not \cfunction{free()}. A
799few modules included with Python fell afoul of this and had to be
800fixed; doubtless there are more third-party modules that will have the
801same problem.
802
803As part of this change, the confusing multiple interfaces for
804allocating memory have been consolidated down into two API families.
805Memory allocated with one family must not be manipulated with
806functions from the other family.
807
808There is another family of functions specifically for allocating
809Python \emph{objects} (as opposed to memory).
810
811\begin{itemize}
812 \item To allocate and free an undistinguished chunk of memory use
813 the ``raw memory'' family: \cfunction{PyMem_Malloc()},
814 \cfunction{PyMem_Realloc()}, and \cfunction{PyMem_Free()}.
815
816 \item The ``object memory'' family is the interface to the pymalloc
817 facility described above and is biased towards a large number of
818 ``small'' allocations: \cfunction{PyObject_Malloc},
819 \cfunction{PyObject_Realloc}, and \cfunction{PyObject_Free}.
820
821 \item To allocate and free Python objects, use the ``object'' family
822 \cfunction{PyObject_New()}, \cfunction{PyObject_NewVar()}, and
823 \cfunction{PyObject_Del()}.
824\end{itemize}
825
826Thanks to lots of work by Tim Peters, pymalloc in 2.3 also provides
827debugging features to catch memory overwrites and doubled frees in
828both extension modules and in the interpreter itself. To enable this
829support, turn on the Python interpreter's debugging code by running
830\program{configure} with \longprogramopt{with-pydebug}.
831
832To aid extension writers, a header file \file{Misc/pymemcompat.h} is
833distributed with the source to Python 2.3 that allows Python
834extensions to use the 2.3 interfaces to memory allocation and compile
835against any version of Python since 1.5.2. You would copy the file
836from Python's source distribution and bundle it with the source of
837your extension.
838
839\begin{seealso}
840
841\seeurl{http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/Objects/obmalloc.c}
842{For the full details of the pymalloc implementation, see
843the comments at the top of the file \file{Objects/obmalloc.c} in the
844Python source code. The above link points to the file within the
845SourceForge CVS browser.}
846
847\end{seealso}
848
849
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000850% ======================================================================
851\section{Build and C API Changes}
852
Andrew M. Kuchling3c305d92002-07-22 18:50:11 +0000853Changes to Python's build process and to the C API include:
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000854
855\begin{itemize}
856
Andrew M. Kuchlingef5d06b2002-07-22 19:21:06 +0000857\item The C-level interface to the garbage collector has been changed,
858to make it easier to write extension types that support garbage
859collection, and to make it easier to debug misuses of the functions.
860Various functions have slightly different semantics, so a bunch of
861functions had to be renamed. Extensions that use the old API will
862still compile but will \emph{not} participate in garbage collection,
863so updating them for 2.3 should be considered fairly high priority.
864
865To upgrade an extension module to the new API, perform the following
866steps:
867
868\begin{itemize}
869
870\item Rename \cfunction{Py_TPFLAGS_GC} to \cfunction{PyTPFLAGS_HAVE_GC}.
871
872\item Use \cfunction{PyObject_GC_New} or \cfunction{PyObject_GC_NewVar} to
873allocate objects, and \cfunction{PyObject_GC_Del} to deallocate them.
874
875\item Rename \cfunction{PyObject_GC_Init} to \cfunction{PyObject_GC_Track} and
876\cfunction{PyObject_GC_Fini} to \cfunction{PyObject_GC_UnTrack}.
877
878\item Remove \cfunction{PyGC_HEAD_SIZE} from object size calculations.
879
880\item Remove calls to \cfunction{PyObject_AS_GC} and \cfunction{PyObject_FROM_GC}.
881
882\end{itemize}
883
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000884\item Python can now optionally be built as a shared library
885(\file{libpython2.3.so}) by supplying \longprogramopt{enable-shared}
Andrew M. Kuchlingfad2f592002-05-10 21:00:05 +0000886when running Python's \file{configure} script. (Contributed by Ondrej
887Palkovsky.)
Andrew M. Kuchlingf4dd65d2002-04-01 19:28:09 +0000888
Andrew M. Kuchling3c305d92002-07-22 18:50:11 +0000889\item The \csimplemacro{DL_EXPORT} and \csimplemacro{DL_IMPORT} macros are now
890deprecated. Initialization functions for Python extension modules
891should now be declared using the new macro
892\csimplemacro{PyMODINIT_FUNC}, while the Python core will generally
893use the \csimplemacro{PyAPI_FUNC} and \csimplemacro{PyAPI_DATA}
894macros.
Neal Norwitzbba23a82002-07-22 13:18:59 +0000895
Andrew M. Kuchlinge995d162002-07-11 20:09:50 +0000896\item The interpreter can be compiled without any docstrings for
897the built-in functions and modules by supplying
Andrew M. Kuchling20e5abc2002-07-11 20:50:34 +0000898\longprogramopt{without-doc-strings} to the \file{configure} script.
Andrew M. Kuchlinge995d162002-07-11 20:09:50 +0000899This makes the Python executable about 10\% smaller, but will also
900mean that you can't get help for Python's built-ins. (Contributed by
901Gustavo Niemeyer.)
902
Andrew M. Kuchling20e5abc2002-07-11 20:50:34 +0000903\item The cycle detection implementation used by the garbage collection
904has proven to be stable, so it's now being made mandatory; you can no
905longer compile Python without it, and the
906\longprogramopt{with-cycle-gc} switch to \file{configure} has been removed.
907
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000908\item The \cfunction{PyArg_NoArgs()} macro is now deprecated, and code
Andrew M. Kuchling7845e7c2002-07-11 19:27:46 +0000909that uses it should be changed. For Python 2.2 and later, the method
910definition table can specify the
911\constant{METH_NOARGS} flag, signalling that there are no arguments, and
912the argument checking can then be removed. If compatibility with
913pre-2.2 versions of Python is important, the code could use
914\code{PyArg_ParseTuple(args, "")} instead, but this will be slower
915than using \constant{METH_NOARGS}.
Andrew M. Kuchling03594bb2002-03-27 02:29:48 +0000916
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000917\item A new function, \cfunction{PyObject_DelItemString(\var{mapping},
918char *\var{key})} was added
919as shorthand for
920\code{PyObject_DelItem(\var{mapping}, PyString_New(\var{key})}.
Andrew M. Kuchling03594bb2002-03-27 02:29:48 +0000921
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000922\item The source code for the Expat XML parser is now included with
923the Python source, so the \module{pyexpat} module is no longer
924dependent on having a system library containing Expat.
925
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000926\item File objects now manage their internal string buffer
927differently by increasing it exponentially when needed.
928This results in the benchmark tests in \file{Lib/test/test_bufio.py}
929speeding up from 57 seconds to 1.7 seconds, according to one
930measurement.
931
Andrew M. Kuchling72b58e02002-05-29 17:30:34 +0000932\item It's now possible to define class and static methods for a C
933extension type by setting either the \constant{METH_CLASS} or
934\constant{METH_STATIC} flags in a method's \ctype{PyMethodDef}
935structure.
Andrew M. Kuchling45afd542002-04-02 14:25:25 +0000936
Andrew M. Kuchling346386f2002-07-12 20:24:42 +0000937\item Python now includes a copy of the Expat XML parser's source code,
938removing any dependence on a system version or local installation of
939Expat.
940
Andrew M. Kuchling821013e2002-05-06 17:46:39 +0000941\end{itemize}
942
943\subsection{Port-Specific Changes}
944
Andrew M. Kuchling187b1d82002-05-29 19:20:57 +0000945Support for a port to IBM's OS/2 using the EMX runtime environment was
946merged into the main Python source tree. EMX is a POSIX emulation
947layer over the OS/2 system APIs. The Python port for EMX tries to
948support all the POSIX-like capability exposed by the EMX runtime, and
949mostly succeeds; \function{fork()} and \function{fcntl()} are
950restricted by the limitations of the underlying emulation layer. The
951standard OS/2 port, which uses IBM's Visual Age compiler, also gained
952support for case-sensitive import semantics as part of the integration
953of the EMX port into CVS. (Contributed by Andrew MacIntyre.)
Andrew M. Kuchling03594bb2002-03-27 02:29:48 +0000954
Andrew M. Kuchling72b58e02002-05-29 17:30:34 +0000955On MacOS, most toolbox modules have been weaklinked to improve
956backward compatibility. This means that modules will no longer fail
957to load if a single routine is missing on the curent OS version.
Andrew M. Kuchling187b1d82002-05-29 19:20:57 +0000958Instead calling the missing routine will raise an exception.
959(Contributed by Jack Jansen.)
Andrew M. Kuchling03594bb2002-03-27 02:29:48 +0000960
Andrew M. Kuchling187b1d82002-05-29 19:20:57 +0000961The RPM spec files, found in the \file{Misc/RPM/} directory in the
962Python source distribution, were updated for 2.3. (Contributed by
963Sean Reifschneider.)
Fred Drake03e10312002-03-26 19:17:43 +0000964
Andrew M. Kuchling20e5abc2002-07-11 20:50:34 +0000965Python now supports AtheOS (\url{www.atheos.cx}) and GNU/Hurd.
966
Fred Drake03e10312002-03-26 19:17:43 +0000967
968%======================================================================
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000969\section{Other Changes and Fixes}
970
971Finally, there are various miscellaneous fixes:
972
973\begin{itemize}
974
975\item The tools used to build the documentation now work under Cygwin
976as well as \UNIX.
977
978\end{itemize}
979
Andrew M. Kuchling187b1d82002-05-29 19:20:57 +0000980
Andrew M. Kuchling517109b2002-05-07 21:01:16 +0000981%======================================================================
Fred Drake03e10312002-03-26 19:17:43 +0000982\section{Acknowledgements \label{acks}}
983
Andrew M. Kuchling03594bb2002-03-27 02:29:48 +0000984The author would like to thank the following people for offering
985suggestions, corrections and assistance with various drafts of this
Andrew M. Kuchling7f147a72002-06-10 18:58:19 +0000986article: Michael Chermside, Scott David Daniels, Fred~L. Drake, Jr.,
Andrew M. Kuchling7845e7c2002-07-11 19:27:46 +0000987Michael Hudson, Detlef Lannert, Martin von L\"owis, Andrew MacIntyre,
988Gustavo Niemeyer, Neal Norwitz.
Fred Drake03e10312002-03-26 19:17:43 +0000989
990\end{document}