\documentclass{howto}

% $Id$

\title{What's New in Python 2.2}
\release{0.05}
\author{A.M. Kuchling}
\authoraddress{\email{akuchlin@mems-exchange.org}}
\begin{document}
\maketitle\tableofcontents

\section{Introduction}

{\large This document is a draft, and is subject to change until the
final version of Python 2.2 is released. Currently it's up to date
for Python 2.2 alpha 1. Please send any comments, bug reports, or
questions, no matter how minor, to \email{akuchlin@mems-exchange.org}.
}

This article explains the new features in Python 2.2.

Python 2.2 can be thought of as the ``cleanup release''. There are some
features such as generators and iterators that are completely new, but
most of the changes, significant and far-reaching though they may be,
are aimed at cleaning up irregularities and dark corners of the
language design.

This article doesn't attempt to provide a complete specification for
the new features, but instead provides a convenient overview. For
full details, you should refer to the documentation for Python 2.2,
such as the
\citetitle[http://python.sourceforge.net/devel-docs/lib/lib.html]{Python
Library Reference} and the
\citetitle[http://python.sourceforge.net/devel-docs/ref/ref.html]{Python
Reference Manual}.
% XXX These \citetitle marks should get the python.org URLs for the final
% release, just as soon as the docs are published there.
If you want to understand the complete implementation and design
rationale for a change, refer to the PEP for a particular new feature.

The final release of Python 2.2 is planned for October 2001.


%======================================================================
\section{PEP 252: Type and Class Changes}

XXX

I need to read and digest the relevant PEPs.

GvR's description at http://www.python.org/2.2/descrintro.html

\begin{seealso}

\seepep{252}{Making Types Look More Like Classes}{Written and implemented
by Guido van Rossum.}

\end{seealso}


%======================================================================
\section{PEP 234: Iterators}

A significant addition to 2.2 is an iteration interface at both the C
and Python levels. Objects can define how they can be looped over by
callers.

In Python versions up to 2.1, the usual way to make \code{for item in
obj} work is to define a \method{__getitem__()} method that looks
something like this:

\begin{verbatim}
    def __getitem__(self, index):
        return <next item>
\end{verbatim}

\method{__getitem__()} is more properly used to define an indexing
operation on an object so that you can write \code{obj[5]} to retrieve
the sixth element. It's a bit misleading when you're using this only
to support \keyword{for} loops. Consider some file-like object that
wants to be looped over; the \var{index} parameter is essentially
meaningless, as the class probably assumes that a series of
\method{__getitem__()} calls will be made, with \var{index}
incrementing by one each time. In other words, the presence of the
\method{__getitem__()} method doesn't mean that using \code{file[5]}
to randomly access the sixth element will work, though it really should.

In Python 2.2, iteration can be implemented separately, and
\method{__getitem__()} methods can be limited to classes that really
do support random access. The basic idea of iterators is quite
simple. A new built-in function, \function{iter(obj)}, returns an
iterator for the object \var{obj}. (It can also take two arguments:
\code{iter(\var{C}, \var{sentinel})} will call the callable \var{C}
repeatedly until it returns \var{sentinel}, which signals that the
iterator is done. This form probably won't be used very often.)
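
The two-argument form is easiest to see with a small sketch. The one
below is only an illustration and is written for a current Python 3
interpreter, which is an assumption on my part; the in-memory stream
and the variable names are made up. It calls \method{readline()}
repeatedly and stops when the empty string comes back.

```python
import io

# Hypothetical in-memory "file"; any object with a readline() method would do.
stream = io.StringIO("first\nsecond\nthird\n")

# iter(callable, sentinel): call stream.readline() repeatedly and stop
# as soon as it returns the sentinel "" (readline's end-of-file marker).
lines = []
for line in iter(stream.readline, ""):
    lines.append(line.rstrip("\n"))

print(lines)
```

The sentinel itself is never handed to the loop body; it only ends the
iteration.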

Python classes can define an \method{__iter__()} method, which should
create and return a new iterator for the object; if the object is its
own iterator, this method can just return \code{self}. In particular,
iterators will usually be their own iterators. Extension types
implemented in C can implement a \code{tp_iter} function in order to
return an iterator, and extension types that want to behave as
iterators can define a \code{tp_iternext} function.

So what do iterators do? They have one required method,
\method{next()}, which takes no arguments and returns the next value.
When there are no more values to be returned, calling \method{next()}
should raise the \exception{StopIteration} exception.

\begin{verbatim}
>>> L = [1,2,3]
>>> i = iter(L)
>>> print i
<iterator object at 0x8116870>
>>> i.next()
1
>>> i.next()
2
>>> i.next()
3
>>> i.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
StopIteration
>>>
\end{verbatim}
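
As a rough sketch of the protocol from the Python side, here is a
hypothetical \class{Countdown} class that acts as its own iterator.
The class name and its behaviour are invented for illustration. It
defines \method{__next__()}, the spelling used by current Python 3
(an assumption about where you'll run it), and aliases it to
\method{next()}, the name that 2.2 looks for.

```python
class Countdown:
    """Hypothetical example class: counts down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # The object is its own iterator, so just return self.
        return self

    def __next__(self):
        # Return the next value, or signal exhaustion with StopIteration.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    # Python 2.2 looks for a method named next(), so provide an alias.
    next = __next__


result = [n for n in Countdown(3)]
a, b = Countdown(2)
print(result, a, b)
```

Because the class keeps its position in \code{self.current}, each
\class{Countdown} instance can be looped over only once.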

In 2.2, Python's \keyword{for} statement no longer expects a sequence;
it expects something for which \function{iter()} will return an iterator.
For backward compatibility, and convenience, an iterator is
automatically constructed for sequences that don't implement
\method{__iter__()} or a \code{tp_iter} slot, so \code{for i in
[1,2,3]} will still work. Wherever the Python interpreter loops over
a sequence, it's been changed to use the iterator protocol. This
means you can do things like this:

\begin{verbatim}
>>> i = iter(L)
>>> a,b,c = i
>>> a,b,c
(1, 2, 3)
>>>
\end{verbatim}

Iterator support has been added to some of Python's basic types.
Calling \function{iter()} on a dictionary will return an iterator
which loops over its keys:

\begin{verbatim}
>>> m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
...      'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
>>> for key in m: print key, m[key]
...
Mar 3
Feb 2
Aug 8
Sep 9
May 5
Jun 6
Jul 7
Jan 1
Apr 4
Nov 11
Dec 12
Oct 10
>>>
\end{verbatim}

That's just the default behaviour. If you want to iterate over keys,
values, or key/value pairs, you can explicitly call the
\method{iterkeys()}, \method{itervalues()}, or \method{iteritems()}
methods to get an appropriate iterator. In a minor related change,
the \keyword{in} operator now works on dictionaries, so
\code{\var{key} in dict} is now equivalent to
\code{dict.has_key(\var{key})}.
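
Here is a small sketch of the two behaviours just described. It's
written so that it also runs on a current Python 3 interpreter, which
is an assumption on my part rather than anything specific to 2.2, so
it sticks to plain iteration and \keyword{in} and leaves out
\method{iterkeys()} and friends.

```python
m = {'Jan': 1, 'Feb': 2, 'Mar': 3}

# Iterating over the dictionary itself yields its keys.
keys = []
for key in m:
    keys.append(key)
keys.sort()  # sort so the result doesn't depend on dictionary ordering

# The `in` operator tests for a key, as m.has_key('Feb') did.
has_feb = 'Feb' in m
has_dec = 'Dec' in m
print(keys, has_feb, has_dec)
```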


Files also provide an iterator, which calls the \method{readline()}
method until there are no more lines in the file. This means you can
now read each line of a file using code like this:

\begin{verbatim}
for line in file:
    # do something for each line
    ...
\end{verbatim}

Note that you can only go forward in an iterator; there's no way to
get the previous element, reset the iterator, or make a copy of it.
An iterator object could provide such additional capabilities, but the
iterator protocol only requires a \method{next()} method.

\begin{seealso}

\seepep{234}{Iterators}{Written by Ka-Ping Yee and GvR; implemented
by the Python Labs crew, mostly by GvR and Tim Peters.}

\end{seealso}


%======================================================================
\section{PEP 255: Simple Generators}

Generators are another new feature, one that interacts with the
introduction of iterators.

You're doubtless familiar with how function calls work in Python or
C. When you call a function, it gets a private area where its local
variables are created. When the function reaches a \keyword{return}
statement, the local variables are destroyed and the resulting value
is returned to the caller. A later call to the same function will get
a fresh new set of local variables. But, what if the local variables
weren't destroyed on exiting a function? What if you could later
resume the function where it left off? This is what generators
provide; they can be thought of as resumable functions.

Here's the simplest example of a generator function:

\begin{verbatim}
def generate_ints(N):
    for i in range(N):
        yield i
\end{verbatim}

A new keyword, \keyword{yield}, was introduced for generators. Any
function containing a \keyword{yield} statement is a generator
function; this is detected by Python's bytecode compiler which
compiles the function specially. Because a new keyword was
introduced, generators must be explicitly enabled in a module by
including a \code{from __future__ import generators} statement near
the top of the module's source code. In Python 2.3 this statement
will become unnecessary.

When you call a generator function, it doesn't return a single value;
instead it returns a generator object that supports the iterator
interface. On executing the \keyword{yield} statement, the generator
outputs the value of \code{i}, similar to a \keyword{return}
statement. The big difference between \keyword{yield} and a
\keyword{return} statement is that, on reaching a \keyword{yield} the
generator's state of execution is suspended and local variables are
preserved. On the next call to the generator's \code{.next()} method,
the function will resume executing immediately after the
\keyword{yield} statement. (For complicated reasons, the
\keyword{yield} statement isn't allowed inside the \keyword{try} block
of a \code{try...finally} statement; read PEP 255 for a full
explanation of the interaction between \keyword{yield} and
exceptions.)

Here's a sample usage of the \function{generate_ints} generator:

\begin{verbatim}
>>> gen = generate_ints(3)
>>> gen
<generator object at 0x8117f90>
>>> gen.next()
0
>>> gen.next()
1
>>> gen.next()
2
>>> gen.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in generate_ints
StopIteration
>>>
\end{verbatim}

You could equally write \code{for i in generate_ints(5)}, or
\code{a,b,c = generate_ints(3)}.
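
Those two forms can be tried out with a short script. This is only a
sketch: it repeats the \function{generate_ints()} definition from
above, and I'm assuming an interpreter where generators need no
\code{__future__} statement (under 2.2 itself you'd add
\code{from __future__ import generators} at the top of the module).

```python
def generate_ints(N):
    # Each yield hands one value to the caller and suspends the function;
    # the local variable i survives until the generator is resumed.
    for i in range(N):
        yield i


# Drive the generator with a for loop...
squares = []
for i in generate_ints(5):
    squares.append(i * i)

# ...or unpack it directly, like any other iterable.
a, b, c = generate_ints(3)
print(squares, (a, b, c))
```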

Inside a generator function, the \keyword{return} statement can only
be used without a value, and signals the end of the procession of
values; afterwards the generator cannot return any further values.
\keyword{return} with a value, such as \code{return 5}, is a syntax
error inside a generator function. The end of the generator's results
can also be indicated by raising \exception{StopIteration} manually,
or by just letting the flow of execution fall off the bottom of the
function.
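
To illustrate the bare \keyword{return} form, here's a minimal
sketch. The helper name \function{take_while_positive()} and its
input data are hypothetical. It deliberately uses only a
\keyword{return} without a value, since that's the form described
above.

```python
def take_while_positive(values):
    # Hypothetical helper: yield items until the first non-positive one.
    for v in values:
        if v <= 0:
            return          # bare return: the generator is finished
        yield v


result = list(take_while_positive([3, 1, 4, 0, 5, 9]))
print(result)
```

Nothing after the \code{0} is ever produced, even though the input
list contains further positive numbers.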

You could achieve the effect of generators manually by writing your
own class and storing all the local variables of the generator as
instance variables. For example, returning a list of integers could
be done by setting \code{self.count} to 0, and having the
\method{next()} method increment \code{self.count} and return it.
However, for a moderately complicated generator, writing a
corresponding class would be much messier.

\file{Lib/test/test_generators.py} contains a number of more
interesting examples. The simplest one implements an in-order
traversal of a tree using generators recursively.

\begin{verbatim}
# A recursive generator that generates Tree leaves in in-order.
def inorder(t):
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x
\end{verbatim}
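
To run the traversal you need some kind of tree to feed it. The sketch
below pairs \function{inorder()} with a hypothetical, minimal
\class{Tree} class. It's my own stand-in, not the class used in the
test suite, and only assumes the \code{left}, \code{label}, and
\code{right} attributes that \function{inorder()} touches.

```python
class Tree:
    """Hypothetical minimal binary tree node (not the test suite's class)."""

    def __init__(self, label, left=None, right=None):
        self.label = label
        self.left = left
        self.right = right


def inorder(t):
    # Recursively yield labels: left subtree, this node, right subtree.
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x


#        4
#      /   \
#     2     6
#    / \   /
#   1   3 5
root = Tree(4, Tree(2, Tree(1), Tree(3)), Tree(6, Tree(5)))
labels = list(inorder(root))
print(labels)
```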

Two other examples in \file{Lib/test/test_generators.py} produce
solutions for the N-Queens problem (placing $N$ queens on an $N\times N$
chessboard so that no queen threatens another) and the Knight's Tour
(a route that takes a knight to every square of an $N\times N$ chessboard
without visiting any square twice).

The idea of generators comes from other programming languages,
especially Icon (\url{http://www.cs.arizona.edu/icon/}), where the
idea of generators is central to the language. In Icon, every
expression and function call behaves like a generator. One example
from ``An Overview of the Icon Programming Language'' at
\url{http://www.cs.arizona.edu/icon/docs/ipd266.htm} gives an idea of
what this looks like:

\begin{verbatim}
sentence := "Store it in the neighboring harbor"
if (i := find("or", sentence)) > 5 then write(i)
\end{verbatim}

The \function{find()} function returns the indexes at which the
substring ``or'' is found: 3, 23, 33. In the \keyword{if} statement,
\code{i} is first assigned a value of 3, but 3 is less than 5, so the
comparison fails, and Icon retries it with the second value of 23. 23
is greater than 5, so the comparison now succeeds, and the code prints
the value 23 to the screen.
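
For comparison, the Icon snippet can be imitated with a Python
generator. This is only a sketch: \function{find_all()} is a
hypothetical helper, not a library function, and Python has to spell
out the retry as an explicit loop where Icon does it implicitly.

```python
def find_all(sub, text):
    """Hypothetical helper: yield 1-based positions of `sub`, Icon-style."""
    start = text.find(sub)
    while start != -1:
        yield start + 1            # str.find() is 0-based; Icon counts from 1
        start = text.find(sub, start + 1)


sentence = "Store it in the neighboring harbor"

# Emulate Icon's retry: take candidates one by one until the test succeeds.
first_match = None
for i in find_all("or", sentence):
    if i > 5:
        first_match = i
        break

positions = list(find_all("or", sentence))
print(positions, first_match)
```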

Python doesn't go nearly as far as Icon in adopting generators as a
central concept. Generators are considered a new part of the core
Python language, but learning or using them isn't compulsory; if they
don't solve any problems that you have, feel free to ignore them.
One novel feature of Python's interface as compared to
Icon's is that a generator's state is represented as a concrete object
that can be passed around to other functions or stored in a data
structure.

\begin{seealso}

\seepep{255}{Simple Generators}{Written by Neil Schemenauer, Tim
Peters, and Magnus Lie Hetland. Implemented mostly by Neil Schemenauer
and Tim Peters, with other fixes from the Python Labs crew.}

\end{seealso}


%======================================================================
\section{PEP 237: Unifying Long Integers and Integers}

In recent versions, the distinction between regular integers, which
are 32-bit values on most machines, and long integers, which can be of
arbitrary size, was becoming an annoyance. For example, on platforms
that support large files (files larger than \code{2**32} bytes), the
\method{tell()} method of file objects has to return a long integer.
However, there were various bits of Python that expected plain
integers and would raise an error if a long integer was provided
instead. For example, in version XXX of Python, only regular integers
could be used as a slice index, and \code{'abc'[1L:]} would raise a
\exception{TypeError} exception with the message \samp{slice index must be
int}.

Python 2.2 will shift values from short to long integers as required.
The \samp{L} suffix is no longer needed to indicate a long integer literal,
as now the compiler will choose the appropriate type. (Using the \samp{L}
suffix will be discouraged in future 2.x versions of Python,
triggering a warning in Python 2.4, and probably dropped in Python
3.0.) Many operations that used to raise an \exception{OverflowError}
will now return a long integer as their result. For example:

\begin{verbatim}
>>> 1234567890123
XXX
>>> 2 ** 32
XXX put output here
\end{verbatim}

In most cases, integers and long integers will now be treated
identically. You can still distinguish them with the
\function{type()} built-in function, but that's rarely needed. The
\function{int()} function will now return a long integer if the value
is large enough.
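
The practical effect is that arithmetic simply keeps going past the
old boundary. The sketch below assumes a current Python 3 interpreter,
where the unification is complete and there's only one integer type;
using \code{sys.maxsize} as a stand-in for ``the largest plain
integer'' is my own choice for illustration.

```python
import sys

# Assumption for illustration: sys.maxsize plays the role of the old
# "largest plain integer" boundary.
boundary = sys.maxsize
bigger = boundary + 1          # no OverflowError; the result simply grows
power = 2 ** 64

print(bigger > boundary, power)
```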

% XXX is there a warning-enabling command-line option for this?

\begin{seealso}

\seepep{237}{Unifying Long Integers and Integers}{Written by
Moshe Zadka and Guido van Rossum. Implemented mostly by Guido van Rossum.}

\end{seealso}

%======================================================================
\section{PEP 238: Changing the Division Operator}

The most controversial change in Python 2.2 is the start of an effort
to fix an old design flaw that's been in Python from the beginning.
Currently Python's division operator, \code{/}, behaves like C's
division operator when presented with two integer arguments. It
returns an integer result that's truncated down when there would be
a fractional part. For example, \code{3/2} is 1, not 1.5, and
\code{(-1)/2} is -1, not -0.5. This means that the results of division
can vary unexpectedly depending on the type of the two operands and,
because Python is dynamically typed, it can be difficult to determine
the possible types of the operands.

(The controversy is over whether this is \emph{really} a design flaw,
and whether it's worth breaking existing code to fix this. It's
caused endless discussions on python-dev and in July erupted into a
storm of acidly sarcastic postings on \newsgroup{comp.lang.python}. I
won't argue for either side here; read PEP 238 for a summary of
arguments and counter-arguments.)

Because this change might break code, it's being introduced very
gradually. Python 2.2 begins the transition, but the switch won't be
complete until Python 3.0.

First, some terminology from PEP 238. ``True division'' is the
division that most non-programmers are familiar with: 3/2 is 1.5, 1/4
is 0.25, and so forth. ``Floor division'' is what Python's \code{/}
operator currently does when given integer operands; the result is the
floor of the value returned by true division. ``Classic division'' is
the current mixed behaviour of \code{/}; it returns the result of
floor division when the operands are integers, and returns the result
of true division when one of the operands is a floating-point number.

Here are the changes 2.2 introduces:

\begin{itemize}

\item A new operator, \code{//}, is the floor division operator.
(Yes, we know it looks like \Cpp's comment symbol.) \code{//}
\emph{always} returns the floor division no matter what the types of
its operands are, so \code{1 // 2} is 0 and \code{1.0 // 2.0} is also
0.0.

\code{//} is always available in Python 2.2; you don't need to enable
it using a \code{__future__} statement.

\item By including a \code{from __future__ import division} in a
module, the \code{/} operator will be changed to return the result of
true division, so \code{1/2} is 0.5. Without the \code{__future__}
statement, \code{/} still means classic division. The default meaning
of \code{/} will not change until Python 3.0.

\item Classes can define methods called \method{__truediv__} and
\method{__floordiv__} to overload the two division operators. At the
C level, there are also slots in the \code{PyNumberMethods} structure
so extension types can define the two operators.

% XXX a warning someday?

\end{itemize}
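
The operators and the two overloading hooks listed above can be
sketched together. The \class{Length} class is hypothetical and exists
only for this example. I'm also assuming an interpreter where \code{/}
already means true division (current Python 3); under 2.2 you would
need the \code{__future__} statement at the top of the module for
\method{__truediv__} to be used.

```python
class Length:
    """Hypothetical value type used only to show the two division hooks."""

    def __init__(self, metres):
        self.metres = metres

    def __truediv__(self, divisor):
        # Invoked for `/` when true division is in effect.
        return Length(self.metres / float(divisor))

    def __floordiv__(self, divisor):
        # Invoked for `//`; the result is rounded down.
        return Length(self.metres // divisor)


true_result = (Length(7) / 2).metres
floor_result = (Length(7) // 2).metres
negative_floor = (-1) // 2      # floors toward negative infinity
float_floor = 1.0 // 2.0        # still a float, but floored
print(true_result, floor_result, negative_floor, float_floor)
```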

\begin{seealso}

\seepep{238}{Changing the Division Operator}{Written by Moshe Zadka and
Guido van Rossum. Implemented by Guido van Rossum.}

\end{seealso}


%======================================================================
\section{Unicode Changes}

Python's Unicode support has been enhanced a bit in 2.2. Unicode
strings are usually stored as UCS-2, as 16-bit unsigned integers.
Python 2.2 can also be compiled to use UCS-4, 32-bit unsigned
integers, as its internal encoding by supplying
\longprogramopt{enable-unicode=ucs4} to the configure script. When
built to use UCS-4 (a ``wide Python''), the interpreter can natively
handle Unicode characters from U+000000 to U+10FFFF, so the range of
legal values for the \function{unichr()} function is expanded
accordingly. When using an interpreter compiled to use UCS-2 (a ``narrow
Python''), values greater than 65535 will still cause
\function{unichr()} to raise a \exception{ValueError} exception.

All this is the province of the still-unimplemented PEP 261, ``Support
for `wide' Unicode characters''; consult it for further details, and
please offer comments on the PEP and on your experiences with the
2.2 alpha releases.
% XXX update previous line once 2.2 reaches beta.

Another change is much simpler to explain. Since their introduction,
Unicode strings have supported an \method{encode()} method to convert
the string to a selected encoding such as UTF-8 or Latin-1. A
symmetric \method{decode(\optional{\var{encoding}})} method has been
added to 8-bit strings (though not to Unicode strings) in 2.2.
\method{decode()} assumes that the string is in the specified encoding
and decodes it, returning whatever is returned by the codec.

Using this new feature, codecs have been added for tasks not directly
related to Unicode. For example, codecs have been added for
uu-encoding, MIME's base64 encoding, and compression with the
\module{zlib} module:

\begin{verbatim}
>>> s = """Here is a lengthy piece of redundant, overly verbose,
... and repetitive text.
... """
>>> data = s.encode('zlib')
>>> data
'x\x9c\r\xc9\xc1\r\x80 \x10\x04\xc0?Ul...'
>>> data.decode('zlib')
'Here is a lengthy piece of redundant, overly verbose,\nand repetitive text.\n'
>>> print s.encode('uu')
begin 666 <data>
M2&5R92!I<R!A(&QE;F=T:'D@<&EE8V4@;V8@<F5D=6YD86YT+"!O=F5R;'D@
>=F5R8F]S92P*86YD(')E<&5T:71I=F4@=&5X="X*

end
>>> "sheesh".encode('rot-13')
'furrfu'
\end{verbatim}
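
If you want to try the same round trips on a current Python 3
interpreter, the string methods shown above won't accept these codecs
there, so the sketch below goes through the \module{codecs} module
instead. That substitution, and the codec names \code{zlib_codec},
\code{base64_codec}, and \code{rot_13}, are how I'd expect it to be
spelled today, not something taken from 2.2.

```python
import codecs

text = b"Here is a lengthy piece of redundant, overly verbose,\nand repetitive text.\n"

# Bytes-to-bytes codecs: compress with zlib, then reverse the transformation.
packed = codecs.encode(text, "zlib_codec")
unpacked = codecs.decode(packed, "zlib_codec")

# base64 is exposed the same way.
b64 = codecs.encode(b"sheesh", "base64_codec")
b64_back = codecs.decode(b64, "base64_codec")

# rot-13 is a text-to-text codec.
rot = codecs.encode("sheesh", "rot_13")

print(unpacked == text, b64_back, rot)
```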

\method{encode()} and \method{decode()} were implemented by
Marc-Andr\'e Lemburg. The changes to support using UCS-4 internally
were implemented by Fredrik Lundh and Martin von L\"owis.

\begin{seealso}

\seepep{261}{Support for `wide' Unicode characters}{PEP written by
Paul Prescod. Not yet accepted or fully implemented.}

\end{seealso}

%======================================================================
\section{PEP 227: Nested Scopes}

In Python 2.1, statically nested scopes were added as an optional
feature, to be enabled by a \code{from __future__ import
nested_scopes} directive. In 2.2 nested scopes no longer need to be
specially enabled, and are now always present. The rest of this section
is a copy of the description of nested scopes from my ``What's New in
Python 2.1'' document; if you read it when 2.1 came out, you can skip
the rest of this section.

The largest change introduced in Python 2.1, and made complete in 2.2,
is to Python's scoping rules. In Python 2.0, at any given time there
are at most three namespaces used to look up variable names: local,
module-level, and the built-in namespace. This often surprised people
because it didn't match their intuitive expectations. For example, a
nested recursive function definition doesn't work:

\begin{verbatim}
def f():
    ...
    def g(value):
        ...
        return g(value-1) + 1
    ...
\end{verbatim}

The function \function{g()} will always raise a \exception{NameError}
exception, because the binding of the name \samp{g} isn't in either
its local namespace or in the module-level namespace. This isn't much
of a problem in practice (how often do you recursively define interior
functions like this?), but this also made using the \keyword{lambda}
expression clumsier, and this was a problem in practice. In code which
uses \keyword{lambda} you can often find local variables being copied
by passing them as the default values of arguments.

\begin{verbatim}
def find(self, name):
    "Return list of any entries equal to 'name'"
    L = filter(lambda x, name=name: x == name,
               self.list_attribute)
    return L
\end{verbatim}

The readability of Python code written in a strongly functional style
suffers greatly as a result.

The most significant change to Python 2.2 is that static scoping has
been added to the language to fix this problem. As a first effect,
the \code{name=name} default argument is now unnecessary in the above
example. Put simply, when a given variable name is not assigned a
value within a function (by an assignment, or the \keyword{def},
\keyword{class}, or \keyword{import} statements), references to the
variable will be looked up in the local namespace of the enclosing
scope. A more detailed explanation of the rules, and a dissection of
the implementation, can be found in the PEP.
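
Both of the earlier problem cases can be sketched under the new
rules. In this illustration \function{find()} is rewritten as a plain
function taking a hypothetical \code{entries} list in place of
\code{self.list_attribute}, and \function{countdown_steps()} is a
made-up wrapper around a recursive inner function. The
\function{list()} call is only there so the result is a list on
current Python 3 as well.

```python
def find(entries, name):
    "Return list of any entries equal to 'name'"
    # `name` is looked up in the enclosing function's scope; no
    # name=name default argument is needed.
    return list(filter(lambda x: x == name, entries))


def countdown_steps(n):
    # A nested recursive function can now see its own name, because `g`
    # is bound in the enclosing scope of countdown_steps().
    def g(value):
        if value <= 0:
            return 0
        return g(value - 1) + 1
    return g(n)


matches = find(["spam", "eggs", "spam"], "spam")
steps = countdown_steps(4)
print(matches, steps)
```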

This change may cause some compatibility problems for code where the
same variable name is used both at the module level and as a local
variable within a function that contains further function definitions.
This seems rather unlikely though, since such code would have been
pretty confusing to read in the first place.

One side effect of the change is that the \code{from \var{module}
import *} and \keyword{exec} statements have been made illegal inside
a function scope under certain conditions. The Python reference
manual has said all along that \code{from \var{module} import *} is
only legal at the top level of a module, but the CPython interpreter
has never enforced this before. As part of the implementation of
nested scopes, the compiler which turns Python source into bytecodes
has to generate different code to access variables in a containing
scope. \code{from \var{module} import *} and \keyword{exec} make it
impossible for the compiler to figure this out, because they add names
to the local namespace that are unknowable at compile time.
Therefore, if a function contains function definitions or
\keyword{lambda} expressions with free variables, the compiler will
flag this by raising a \exception{SyntaxError} exception.

To make the preceding explanation a bit clearer, here's an example:

\begin{verbatim}
x = 1
def f():
    # The next line is a syntax error
    exec 'x=2'
    def g():
        return x
\end{verbatim}

Line 4 containing the \keyword{exec} statement is a syntax error,
since \keyword{exec} would define a new local variable named \samp{x}
whose value should be accessed by \function{g()}.

This shouldn't be much of a limitation, since \keyword{exec} is rarely
used in most Python code (and when it is used, it's often a sign of a
poor design anyway).

\begin{seealso}

\seepep{227}{Statically Nested Scopes}{Written and implemented by
Jeremy Hylton.}

\end{seealso}


%======================================================================
Andrew M. Kuchlinga8defaa2001-05-05 16:37:29 +0000632\section{New and Improved Modules}

\begin{itemize}

  \item The \module{xmlrpclib} module was contributed to the standard
  library by Fredrik Lundh. It provides support for writing XML-RPC
  clients; XML-RPC is a simple remote procedure call protocol built on
  top of HTTP and XML. For example, the following snippet retrieves a
  list of RSS channels from the O'Reilly Network, and then retrieves a
  list of the recent headlines for one channel:

\begin{verbatim}
import xmlrpclib
s = xmlrpclib.Server(
      'http://www.oreillynet.com/meerkat/xml-rpc/server.php')
channels = s.meerkat.getChannels()
# channels is a list of dictionaries, like this:
# [{'id': 4, 'title': 'Freshmeat Daily News'},
#  {'id': 190, 'title': '32Bits Online'},
#  {'id': 4549, 'title': '3DGamers'}, ... ]

# Get the items for one channel
items = s.meerkat.getItems( {'channel': 4} )

# 'items' is another list of dictionaries, like this:
# [{'link': 'http://freshmeat.net/releases/52719/',
#   'description': 'A utility which converts HTML to XSL FO.',
#   'title': 'html2fo 0.3 (Default)'}, ... ]
\end{verbatim}

See \url{http://www.xmlrpc.com/} for more information about XML-RPC.

  \item The \module{socket} module can be compiled to support IPv6;
  specify the \longprogramopt{enable-ipv6} option to Python's configure
  script. (Contributed by Jun-ichiro ``itojun'' Hagino.)

  \item Two new format characters were added to the \module{struct}
  module for 64-bit integers on platforms that support the C
  \ctype{long long} type. \samp{q} is for a signed 64-bit integer,
  and \samp{Q} is for an unsigned one. The value is returned in
  Python's long integer type. (Contributed by Tim Peters.)

  \item In the interpreter's interactive mode, there's a new built-in
  function \function{help()}, which uses the \module{pydoc} module
  introduced in Python 2.1 to provide interactive help.
  \code{help(\var{object})} displays any available help text about
  \var{object}. \code{help()} with no argument puts you in an online
  help utility, where you can enter the names of functions, classes,
  or modules to read their help text.
  (Contributed by Guido van Rossum, using Ka-Ping Yee's \module{pydoc} module.)

  \item Various bugfixes and performance improvements have been made
  to the SRE engine underlying the \module{re} module. For example,
  \function{re.sub()} will now use \function{string.replace()}
  automatically when the pattern and its replacement are both just
  literal strings without regex metacharacters. Another contributed
  patch speeds up certain Unicode character ranges by a factor of
  two. (SRE is maintained by Fredrik Lundh. The BIGCHARSET patch was
  contributed by Martin von L\"owis.)

  \item The \module{imaplib} module, maintained by Piers Lauder, has
  support for several new extensions: the NAMESPACE extension defined
  in \rfc{2342}, SORT, GETACL and SETACL. (Contributed by Anthony
  Baxter and Michel Pelletier.)

  \item The \module{rfc822} module's parsing of email addresses is
  now compliant with \rfc{2822}, an update to \rfc{822}. The module's
  name is \emph{not} going to be changed to \samp{rfc2822}.
  (Contributed by Barry Warsaw.)

  \item New constants \constant{ascii_letters},
  \constant{ascii_lowercase}, and \constant{ascii_uppercase} were
  added to the \module{string} module. There were several modules in
  the standard library that used \constant{string.letters} to mean the
  ranges A-Za-z, but that assumption is incorrect when locales are in
  use, because \constant{string.letters} varies depending on the set
  of legal characters defined by the current locale. The buggy
  modules have all been fixed to use \constant{ascii_letters} instead.
  (Reported by an unknown person; fixed by Fred L. Drake, Jr.)

  \item The \module{mimetypes} module now makes it easier to use
  alternative MIME-type databases by the addition of a
  \class{MimeTypes} class, which takes a list of filenames to be
  parsed. (Contributed by Fred L. Drake, Jr.)

  \item XXX threading.Timer class

\end{itemize}
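As a minimal sketch of the new \module{struct} format characters
listed above, the following packs one signed and one unsigned 64-bit
integer and unpacks them again. The \samp{<} prefix (standard
little-endian sizes) and the sample values are assumptions chosen for
illustration, not something the text above prescribes.

```python
import struct

# '<' selects standard sizes, so 'q' and 'Q' are 8 bytes each
# regardless of the platform's native C types.
packed = struct.pack('<qQ', -1, 2**64 - 1)

# Unpacking returns a tuple: the signed value first, then the unsigned one.
signed_value, unsigned_value = struct.unpack('<qQ', packed)
```

Using a standard-size prefix keeps the sketch independent of whether
the C compiler provides \ctype{long long} natively.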
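The locale problem described above for \constant{string.letters} can
be avoided roughly as follows. This is only a sketch: the helper name
\function{is_ascii_letter()} is hypothetical and is not part of the
\module{string} module.

```python
import string

def is_ascii_letter(ch):
    """Return True if ch is a single character in A-Z or a-z.

    string.ascii_letters never changes with the locale, so this test
    gives the same answer everywhere.
    """
    return len(ch) == 1 and ch in string.ascii_letters

# Keep only the ASCII letters from a mixed sample string.
ascii_only = [ch for ch in 'aZ9_' if is_ascii_letter(ch)]
```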
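The \class{MimeTypes} class mentioned above might be used along these
lines. The database contents, the \samp{application/x-hypothetical}
type, and the \samp{.hyp} extension are all made up for illustration;
the sketch assumes the database file uses the usual
\file{mime.types} layout of a type followed by its extensions.

```python
import mimetypes
import os
import tempfile

# Write a small alternative MIME-type database: one type per line,
# followed by the filename extensions that map to it.
workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, 'mime.types')
db_file = open(db_path, 'w')
db_file.write('application/x-hypothetical hyp\n')
db_file.close()

# The MimeTypes instance parses the listed files on top of its defaults.
db = mimetypes.MimeTypes([db_path])
guessed_type, encoding = db.guess_type('report.hyp')

# Clean up the temporary database.
os.remove(db_path)
os.rmdir(workdir)
```

Because the extra mappings live on the instance, they don't alter the
module-level tables used by \function{mimetypes.guess_type()}.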
%======================================================================
\section{Interpreter Changes and Fixes}

Some of the changes only affect people who deal with the Python
interpreter at the C level because they're writing Python extension
modules, embedding the interpreter, or just hacking on the interpreter
itself. If you only write Python code, none of the changes described
here will affect you very much.

\begin{itemize}

  \item Profiling and tracing functions can now be implemented in C.
  C functions can operate at much higher speeds than Python-based
  functions and should reduce the overhead of profiling and tracing,
  which will be of interest to authors of development environments for
  Python. Two new C functions were added to Python's API,
  \cfunction{PyEval_SetProfile()} and \cfunction{PyEval_SetTrace()}.
  The existing \function{sys.setprofile()} and
  \function{sys.settrace()} functions still exist, and have simply
  been changed to use the new C-level interface. (Contributed by Fred
  L. Drake, Jr.)

  \item Another low-level API, primarily of interest to implementors
  of Python debuggers and development tools, was added.
  \cfunction{PyInterpreterState_Head()} and
  \cfunction{PyInterpreterState_Next()} let a caller walk through all
  the existing interpreter objects;
  \cfunction{PyInterpreterState_ThreadHead()} and
  \cfunction{PyThreadState_Next()} allow looping over all the thread
  states for a given interpreter. (Contributed by David Beazley.)

  \item A new \samp{et} format sequence was added to
  \cfunction{PyArg_ParseTuple()}; \samp{et} takes both a parameter and
  an encoding name, and converts the parameter to the given encoding
  if the parameter turns out to be a Unicode string, or leaves it
  alone if it's an 8-bit string, assuming it to already be in the
  desired encoding. This differs from the \samp{es} format character,
  which assumes that 8-bit strings are in Python's default ASCII
  encoding and converts them to the specified new encoding.
  (Contributed by M.-A. Lemburg, and used for the MBCS support on
  Windows described in the following section.)

  \item Two new flags \constant{METH_NOARGS} and \constant{METH_O} are
  available in method definition tables to simplify implementation of
  methods with no arguments or a single untyped argument. Calling
  such methods is more efficient than calling a corresponding method
  that uses \constant{METH_VARARGS}.
  Also, the old \constant{METH_OLDARGS} style of writing C methods is
  now officially deprecated.

  \item Two new wrapper functions, \cfunction{PyOS_snprintf()} and
  \cfunction{PyOS_vsnprintf()}, were added to provide
  cross-platform implementations of the relatively new
  \cfunction{snprintf()} and \cfunction{vsnprintf()} C library APIs. In
  contrast to the standard \cfunction{sprintf()} and
  \cfunction{vsprintf()} functions, the Python versions check the
  bounds of the buffer used to protect against buffer overruns.
  (Contributed by M.-A. Lemburg.)

\end{itemize}


%======================================================================
\section{Other Changes and Fixes}

% XXX update the patch and bug figures as we go
As usual there were a bunch of other improvements and bugfixes
scattered throughout the source tree. A search through the CVS change
logs finds there were 43 patches applied, and 77 bugs fixed; both
figures are likely to be underestimates. Some of the more notable
changes are:

\begin{itemize}

  \item The code for the MacOS port for Python, maintained by Jack
  Jansen, is now kept in the main Python CVS tree, and many changes
  have been made to support MacOS X.

The most significant change is the ability to build Python as a
framework, enabled by supplying the \longprogramopt{enable-framework}
option to the configure script when compiling Python. According to
Jack Jansen, ``This installs a self-contained Python installation plus
the OSX framework "glue" into
\file{/Library/Frameworks/Python.framework} (or another location of
choice). For now there is little immediate added benefit to this
(actually, there is the disadvantage that you have to change your PATH
to be able to find Python), but it is the basis for creating a
full-blown Python application, porting the MacPython IDE, possibly
using Python as a standard OSA scripting language and much more.''

Most of the MacPython toolbox modules, which interface to MacOS APIs
such as windowing, QuickTime, scripting, etc. have been ported to OS
X, but they've been left commented out in \file{setup.py}. People who want
to experiment with these modules can uncomment them manually.

% Jack's original comments:
%The main change is the possibility to build Python as a
%framework. This installs a self-contained Python installation plus the
%OSX framework "glue" into /Library/Frameworks/Python.framework (or
%another location of choice). For now there is little immedeate added
%benefit to this (actually, there is the disadvantage that you have to
%change your PATH to be able to find Python), but it is the basis for
%creating a fullblown Python application, porting the MacPython IDE,
%possibly using Python as a standard OSA scripting language and much
%more. You enable this with "configure --enable-framework".

%The other change is that most MacPython toolbox modules, which
%interface to all the MacOS APIs such as windowing, quicktime,
%scripting, etc. have been ported. Again, most of these are not of
%immedeate use, as they need a full application to be really useful, so
%they have been commented out in setup.py. People wanting to experiment
%can uncomment them. Gestalt and Internet Config modules are enabled by
%default.


  \item Keyword arguments passed to built-in functions that don't take them
  now cause a \exception{TypeError} exception to be raised, with the
  message ``\var{function} takes no keyword arguments''.

  \item A new script, \file{Tools/scripts/cleanfuture.py} by Tim
  Peters, automatically removes obsolete \code{__future__} statements
  from Python source code.

  \item The new license introduced with Python 1.6 wasn't
  GPL-compatible. This is fixed by some minor textual changes to the
  2.2 license, so Python can now be embedded inside a GPLed program
  again. The license changes were also applied to the Python 2.0.1
  and 2.1.1 releases.

  \item When presented with a Unicode filename on Windows, Python will
  now convert it to an MBCS-encoded string, as used by the Microsoft
  file APIs. As MBCS is explicitly used by the file APIs, Python's
  choice of ASCII as the default encoding turns out to be an
  annoyance.

  (Contributed by Mark Hammond with assistance from Marc-Andr\'e
  Lemburg.)

  \item The \file{Tools/scripts/ftpmirror.py} script
  now parses a \file{.netrc} file, if you have one.
  (Contributed by Mike Romberg.)

  \item Some features of the object returned by the
  \function{xrange()} function are now deprecated, and trigger
  warnings when they're accessed; they'll disappear in Python 2.3.
  \class{xrange} objects tried to pretend they were full sequence
  types by supporting slicing, sequence multiplication, and the
  \keyword{in} operator, but these features were rarely used and
  therefore buggy. The \method{tolist()} method and the
  \member{start}, \member{stop}, and \member{step} attributes are also
  being deprecated. At the C level, the fourth argument to the
  \cfunction{PyRange_New()} function, \samp{repeat}, has also been
  deprecated.

  \item There were a bunch of patches to the dictionary
  implementation, mostly to fix potential core dumps if a dictionary
  contains objects that sneakily changed their hash value, or mutated
  the dictionary they were contained in. For a while python-dev fell
  into a gentle rhythm of Michael Hudson finding a case that dumped
  core, Tim Peters fixing it, Michael finding another case, and round
  and round it went.

  \item On Windows, Python can now be compiled with Borland C thanks
  to a number of patches contributed by Stephen Hansen, though the
  result isn't fully functional yet. (But this \emph{is} progress...)

  \item Another Windows enhancement: Wise Solutions generously offered
  PythonLabs use of their InstallerMaster 8.1 system. Earlier
  PythonLabs Windows installers used Wise 5.0a, which was beginning to
  show its age. (Packaged up by Tim Peters.)

  \item Files ending in \samp{.pyw} can now be imported on Windows.
  \samp{.pyw} is a Windows-only thing, used to indicate that a script
  needs to be run using PYTHONW.EXE instead of PYTHON.EXE in order to
  prevent a DOS console from popping up to display the output. This
  patch makes it possible to import such scripts, in case they're also
  usable as modules. (Implemented by David Bolen.)

  \item On platforms where Python uses the C \cfunction{dlopen()} function
  to load extension modules, it's now possible to set the flags used
  by \cfunction{dlopen()} using the \function{sys.getdlopenflags()} and
  \function{sys.setdlopenflags()} functions. (Contributed by Bram Stolk.)

  \item The \function{pow()} built-in function no longer supports 3
  arguments when floating-point numbers are supplied.
  \code{pow(\var{x}, \var{y}, \var{z})} returns \code{(x**y) \% z}, but
  this is never useful for floating-point numbers, and the final
  result varies unpredictably depending on the platform. A call such
  as \code{pow(2.0, 8.0, 7.0)} will now raise a \exception{XXX}
  exception.

\end{itemize}
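The keyword-argument check on built-in functions described in the
list above can be observed with a small sketch like this one. The
helper \function{call_with_keyword()} and the keyword name
\samp{sequence} are hypothetical, and the \samp{except ... as} syntax
assumes a current Python interpreter rather than 2.2 itself.

```python
def call_with_keyword(func, **kwargs):
    """Call func with keyword arguments only.

    Returns the text of the TypeError if the call is rejected,
    or None if the call succeeds.
    """
    try:
        func(**kwargs)
    except TypeError as exc:
        return str(exc)
    return None

# len() is a built-in that takes no keyword arguments.
message = call_with_keyword(len, sequence=[1, 2, 3])
```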
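One way the \function{sys.getdlopenflags()} and
\function{sys.setdlopenflags()} pair mentioned above might be used is
to change the flags temporarily and then restore them. This is a
sketch under assumptions: the helper
\function{call_with_dlopen_flags()} is hypothetical, and taking
\code{RTLD_GLOBAL} from the \module{os} module reflects current Python
versions, not anything stated above.

```python
import os
import sys

def call_with_dlopen_flags(extra_flags, func):
    """Call func() with extra dlopen() flags set, then restore the old flags."""
    if not hasattr(sys, 'setdlopenflags'):
        # Platform doesn't load extensions through dlopen(); nothing to adjust.
        return func()
    saved = sys.getdlopenflags()
    sys.setdlopenflags(saved | extra_flags)
    try:
        return func()
    finally:
        sys.setdlopenflags(saved)

# Remember the starting flags (None where the functions don't exist).
if hasattr(sys, 'getdlopenflags'):
    flags_before = sys.getdlopenflags()
else:
    flags_before = None

# Fall back to 0 (no extra flags) if the constant is unavailable.
RTLD_GLOBAL = getattr(os, 'RTLD_GLOBAL', 0)

# A stand-in for importing an extension module that needs global symbols.
result = call_with_dlopen_flags(RTLD_GLOBAL, lambda: 'loaded')
```

Restoring the saved flags in a \keyword{finally} clause keeps the
change from leaking into later imports.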
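The change to three-argument \function{pow()} described above can be
probed with a sketch like the following. The helper
\function{three_arg_pow()} is hypothetical; since the exception type
isn't settled in the text above, it simply reports the name of
whatever exception the running interpreter raises. The
\samp{except ... as} syntax assumes a current Python interpreter.

```python
def three_arg_pow(x, y, z):
    """Return pow(x, y, z), or the name of the exception it raises."""
    try:
        return pow(x, y, z)
    except Exception as exc:
        return type(exc).__name__

# Integers: computes (2**8) modulo 7.
integer_result = three_arg_pow(2, 8, 7)

# Floats: the three-argument form is rejected, so an exception name comes back.
float_result = three_arg_pow(2.0, 8.0, 7.0)
```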
%======================================================================
\section{Acknowledgements}

The author would like to thank the following people for offering
suggestions and corrections to various drafts of this article: Fred
Bremmer, Keith Briggs, Fred L. Drake, Jr., Carel Fellinger, Mark
Hammond, Stephen Hansen, Jack Jansen, Marc-Andr\'e Lemburg, Tim Peters, Neil
Schemenauer, Guido van Rossum.

\end{document}