\documentclass{howto}

% $Id$

\title{What's New in Python 2.2}
\release{0.05}
\author{A.M. Kuchling}
\authoraddress{\email{akuchlin@mems-exchange.org}}
\begin{document}
\maketitle\tableofcontents

\section{Introduction}

{\large This document is a draft, and is subject to change until the
final version of Python 2.2 is released. Currently it's up to date
for Python 2.2 alpha 1. Please send any comments, bug reports, or
questions, no matter how minor, to \email{akuchlin@mems-exchange.org}.
}

This article explains the new features in Python 2.2. Python 2.2
includes some significant changes that go far toward cleaning up the
language's darkest corners, and some exciting new features.

This article doesn't attempt to provide a complete specification of
the new features, but instead offers a convenient overview. For full
details, you should refer to the 2.2 documentation, such as the
\citetitle[http://python.sourceforge.net/devel-docs/lib/lib.html]{Python
Library Reference} and the
\citetitle[http://python.sourceforge.net/devel-docs/ref/ref.html]{Python
Reference Manual}, or to the PEP for a particular new feature.
% These \citetitle marks should get the python.org URLs for the final
% release, just as soon as the docs are published there.

The final release of Python 2.2 is planned for October 2001.


%======================================================================
% It looks like this set of changes will likely get into 2.2,
% so I need to read and digest the relevant PEPs.
%\section{PEP 252: Type and Class Changes}

%XXX

% GvR's description at http://www.python.org/2.2/descrintro.html

%\begin{seealso}

%\seepep{252}{Making Types Look More Like Classes}{Written and implemented
%by GvR.}

%\end{seealso}


%======================================================================
\section{PEP 234: Iterators}

A significant addition to 2.2 is an iteration interface at both the C
and Python levels. Objects can define how they can be looped over by
callers.

In Python versions up to 2.1, the usual way to make \code{for item in
obj} work is to define a \method{__getitem__()} method that looks
something like this:

\begin{verbatim}
    def __getitem__(self, index):
        return <next item>
\end{verbatim}

\method{__getitem__()} is more properly used to define an indexing
operation on an object so that you can write \code{obj[5]} to retrieve
the sixth element. It's a bit misleading when you're using this only
to support \keyword{for} loops. Consider some file-like object that
wants to be looped over; the \var{index} parameter is essentially
meaningless, as the class probably assumes that a series of
\method{__getitem__()} calls will be made, with \var{index}
incrementing by one each time. In other words, the presence of the
\method{__getitem__()} method doesn't mean that \code{file[5]} will
work, though it really should.

In Python 2.2, iteration can be implemented separately, and
\method{__getitem__()} methods can be limited to classes that really
do support random access. The basic idea of iterators is quite
simple. A new built-in function, \function{iter(obj)}, returns an
iterator for the object \var{obj}. (It can also take two arguments:
\code{iter(\var{C}, \var{sentinel})} will call the callable \var{C}
repeatedly until it returns \var{sentinel}, which signals that the
iterator is done. This form probably won't be used very often.)

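As a rough illustration of the two-argument form, the following sketch
loops over the results of a callable until it returns an empty string.
The \function{make_reader()} helper is hypothetical and exists only to
supply a callable to hand to \function{iter()}; it isn't part of the
standard library.

```python
# Hypothetical helper: builds a callable that hands out one value per call.
def make_reader(values):
    pending = list(values)
    def read():
        if pending:
            return pending.pop(0)
        return ''              # nothing left: return the sentinel
    return read

read = make_reader(['spam\n', 'eggs\n', ''])

# iter(callable, sentinel) keeps calling read() and stops, without
# yielding the sentinel itself, as soon as read() returns ''.
lines = []
for line in iter(read, ''):
    lines.append(line)
```

A real use would more likely pass a bound method such as a file
object's \method{readline()} as the callable.
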
Python classes can define an \method{__iter__()} method, which should
create and return a new iterator for the object; if the object is its
own iterator, this method can just return \code{self}. In particular,
iterators will usually be their own iterators. Extension types
implemented in C can implement a \code{tp_iter} function in order to
return an iterator, and extension types that want to behave as
iterators can define a \code{tp_iternext} function.

So what do iterators do? They have one required method,
\method{next()}, which takes no arguments and returns the next value.
When there are no more values to be returned, calling \method{next()}
should raise the \exception{StopIteration} exception.

\begin{verbatim}
>>> L = [1,2,3]
>>> i = iter(L)
>>> print i
<iterator object at 0x8116870>
>>> i.next()
1
>>> i.next()
2
>>> i.next()
3
>>> i.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
StopIteration
>>>
\end{verbatim}

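The protocol can also be implemented by hand in a Python class. The
following is a minimal sketch, assuming a made-up \class{Countdown}
class. The \code{__next__} alias is an addition of mine for later
Python versions that look up \method{__next__()} instead of
\method{next()}; the protocol described here only needs
\method{next()}.

```python
class Countdown:
    """Hypothetical iterator that counts from 'start' down to 1."""

    def __init__(self, start):
        self.count = start

    def __iter__(self):
        # An iterator is its own iterator.
        return self

    def next(self):
        if self.count <= 0:
            # No more values: signal the end of the iteration.
            raise StopIteration
        value = self.count
        self.count = self.count - 1
        return value

    # Assumption: alias so the sketch also runs where the method is
    # looked up under the name __next__.
    __next__ = next

result = []
for n in Countdown(3):
    result.append(n)
```

All the state that a loop would otherwise keep in local variables
lives in \code{self.count}, which is why such classes get clumsy as
the iteration logic grows.
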
In 2.2, Python's \keyword{for} statement no longer expects a sequence;
it expects something for which \function{iter()} will return an
iterator. For backward compatibility and convenience, an iterator is
automatically constructed for sequences that don't implement
\method{__iter__()} or a \code{tp_iter} slot, so \code{for i in
[1,2,3]} will still work. Wherever the Python interpreter loops over
a sequence, it's been changed to use the iterator protocol. This
means you can do things like this:

\begin{verbatim}
>>> i = iter(L)
>>> a,b,c = i
>>> a,b,c
(1, 2, 3)
>>>
\end{verbatim}

Iterator support has been added to some of Python's basic types.
Calling \function{iter()} on a dictionary will return an iterator
which loops over its keys:

\begin{verbatim}
>>> m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
...      'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
>>> for key in m: print key, m[key]
...
Mar 3
Feb 2
Aug 8
Sep 9
May 5
Jun 6
Jul 7
Jan 1
Apr 4
Nov 11
Dec 12
Oct 10
>>>
\end{verbatim}

That's just the default behaviour. If you want to iterate over keys,
values, or key/value pairs, you can explicitly call the
\method{iterkeys()}, \method{itervalues()}, or \method{iteritems()}
methods to get an appropriate iterator. In a minor related change,
the \keyword{in} operator now works on dictionaries, so
\code{\var{key} in dict} is now equivalent to
\code{dict.has_key(\var{key})}.

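To make the dictionary behaviour concrete, here's a small sketch using
a made-up three-entry dictionary. It only relies on plain iteration
and on the \keyword{in} operator, and sorts the collected keys because
the iteration order is arbitrary.

```python
m = {'Jan': 1, 'Feb': 2, 'Mar': 3}

# Looping over the dictionary itself yields its keys.
keys = []
for key in m:
    keys.append(key)
keys.sort()                # iteration order is arbitrary, so sort

# 'in' tests for the presence of a key, never of a value.
has_feb = 'Feb' in m
has_two = 2 in m
```
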
Files also provide an iterator, which calls the \method{readline()}
method until there are no more lines in the file. This means you can
now read each line of a file using code like this:

\begin{verbatim}
for line in file:
    # do something for each line
\end{verbatim}

Note that you can only go forward in an iterator; there's no way to
get the previous element, reset the iterator, or make a copy of it.
An iterator object could provide such additional capabilities, but the
iterator protocol only requires a \method{next()} method.

\begin{seealso}

\seepep{234}{Iterators}{Written by Ka-Ping Yee and GvR; implemented
by the Python Labs crew, mostly by GvR and Tim Peters.}

\end{seealso}


%======================================================================
\section{PEP 255: Simple Generators}

Generators are another new feature, one that interacts with the
introduction of iterators.

You're doubtless familiar with how function calls work in Python or
C. When you call a function, it gets a private area where its local
variables are created. When the function reaches a \keyword{return}
statement, the local variables are destroyed and the resulting value
is returned to the caller. A later call to the same function will get
a fresh new set of local variables. But what if the local variables
weren't destroyed on exiting a function? What if you could later
resume the function where it left off? This is what generators
provide; they can be thought of as resumable functions.

Here's the simplest example of a generator function:

\begin{verbatim}
def generate_ints(N):
    for i in range(N):
        yield i
\end{verbatim}

A new keyword, \keyword{yield}, was introduced for generators. Any
function containing a \keyword{yield} statement is a generator
function; this is detected by Python's bytecode compiler, which
compiles the function specially. Because a new keyword was
introduced, generators must be explicitly enabled in a module by
including a \code{from __future__ import generators} statement near
the top of the module's source code. In Python 2.3 this statement
will become unnecessary.

When you call a generator function, it doesn't return a single value;
instead it returns a generator object that supports the iterator
interface. On executing the \keyword{yield} statement, the generator
outputs the value of \code{i}, similar to a \keyword{return}
statement. The big difference between \keyword{yield} and a
\keyword{return} statement is that, on reaching a \keyword{yield}, the
generator's state of execution is suspended and local variables are
preserved. On the next call to the generator's \code{.next()} method,
the function will resume executing immediately after the
\keyword{yield} statement. (For complicated reasons, the
\keyword{yield} statement isn't allowed inside the \keyword{try} block
of a \code{try...finally} statement; read PEP 255 for a full
explanation of the interaction between \keyword{yield} and
exceptions.)

Here's a sample usage of the \function{generate_ints} generator:

\begin{verbatim}
>>> gen = generate_ints(3)
>>> gen
<generator object at 0x8117f90>
>>> gen.next()
0
>>> gen.next()
1
>>> gen.next()
2
>>> gen.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in generate_ints
StopIteration
>>>
\end{verbatim}

You could equally write \code{for i in generate_ints(5)}, or
\code{a,b,c = generate_ints(3)}.

Inside a generator function, the \keyword{return} statement can only
be used without a value, and signals the end of the procession of
values; afterwards the generator cannot return any further values.
\keyword{return} with a value, such as \code{return 5}, is a syntax
error inside a generator function. The end of the generator's results
can also be indicated by raising \exception{StopIteration} manually,
or by just letting the flow of execution fall off the bottom of the
function.

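A bare \keyword{return} is the usual way to stop early. The following
sketch assumes a made-up generator,
\function{take_until_negative()}, that stops producing values at the
first negative number it sees.

```python
# On Python 2.2 itself, this module would need to start with:
#     from __future__ import generators

def take_until_negative(values):
    """Hypothetical generator: yield values until a negative one appears."""
    for v in values:
        if v < 0:
            return         # bare return: the generator is finished
        yield v

collected = []
for v in take_until_negative([3, 1, 4, -1, 5]):
    collected.append(v)
```

The \keyword{for} loop never sees the \exception{StopIteration}
exception; it simply ends when the generator does.
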
You could achieve the effect of generators manually by writing your
own class and storing all the local variables of the generator as
instance variables. For example, returning a list of integers could
be done by setting \code{self.count} to 0, and having the
\method{next()} method increment \code{self.count} and return it.
However, for a moderately complicated generator, writing a
corresponding class would be much messier.

\file{Lib/test/test_generators.py} contains a number of more
interesting examples. The simplest one implements an in-order
traversal of a tree using generators recursively.

\begin{verbatim}
# A recursive generator that generates Tree leaves in in-order.
def inorder(t):
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x
\end{verbatim}

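To try \function{inorder()} out you need some kind of tree. The
\class{Tree} class below is a hypothetical stand-in of my own, not the
one from the test suite; it assumes that an empty subtree is
represented by \code{None}.

```python
# On Python 2.2 itself, this module would need to start with:
#     from __future__ import generators

class Tree:
    """Hypothetical binary tree node; None stands for an empty subtree."""
    def __init__(self, label, left=None, right=None):
        self.label = label
        self.left = left
        self.right = right

# The recursive generator from the text, repeated so the sketch runs
# on its own.
def inorder(t):
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x

#       2
#      / \
#     1   3
#          \
#           4
tree = Tree(2, Tree(1), Tree(3, None, Tree(4)))

labels = []
for label in inorder(tree):
    labels.append(label)
```

Each recursive call creates its own generator object, and the inner
\keyword{for} loops pass the values up one level at a time.
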
Two other examples in \file{Lib/test/test_generators.py} produce
solutions for the N-Queens problem (placing $N$ queens on an $N\times N$
chess board so that no queen threatens another) and the Knight's Tour
(a route that takes a knight to every square of an $N\times N$ chessboard
without visiting any square twice).

The idea of generators comes from other programming languages,
especially Icon (\url{http://www.cs.arizona.edu/icon/}), where the
idea of generators is central to the language. In Icon, every
expression and function call behaves like a generator. One example
from ``An Overview of the Icon Programming Language'' at
\url{http://www.cs.arizona.edu/icon/docs/ipd266.htm} gives an idea of
what this looks like:

\begin{verbatim}
sentence := "Store it in the neighboring harbor"
if (i := find("or", sentence)) > 5 then write(i)
\end{verbatim}

The \function{find()} function returns the indexes at which the
substring ``or'' is found: 3, 23, 33. In the \keyword{if} statement,
\code{i} is first assigned a value of 3, but 3 is less than 5, so the
comparison fails, and Icon retries it with the second value of 23. 23
is greater than 5, so the comparison now succeeds, and the code prints
the value 23 to the screen.

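For comparison, here's a rough Python rendering of the Icon example.
It's only a sketch: the \function{find()} generator is my own
approximation, written to number positions from 1 as Icon does, and
the retrying that Icon performs implicitly has to be spelled out as a
loop.

```python
# On Python 2.2 itself, this module would need to start with:
#     from __future__ import generators

def find(sub, s):
    """Yield each position of 'sub' in 's', counting from 1 as Icon does."""
    start = s.find(sub)
    while start != -1:
        yield start + 1
        start = s.find(sub, start + 1)

sentence = "Store it in the neighboring harbor"

# Try each candidate position in turn until the comparison succeeds.
first_match = None
for i in find("or", sentence):
    if i > 5:
        first_match = i
        break
```
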
Python doesn't go nearly as far as Icon in adopting generators as a
central concept. Generators are considered a new part of the core
Python language, but learning or using them isn't compulsory; if they
don't solve any problems that you have, feel free to ignore them.
This is different from Icon where the idea of generators is a basic
concept. One novel feature of Python's interface as compared to
Icon's is that a generator's state is represented as a concrete object
that can be passed around to other functions or stored in a data
structure.

\begin{seealso}

\seepep{255}{Simple Generators}{Written by Neil Schemenauer, Tim
Peters, Magnus Lie Hetland. Implemented mostly by Neil Schemenauer
and Tim Peters, with other fixes from the Python Labs crew.}

\end{seealso}


%======================================================================
\section{PEP 237: Unifying Long Integers and Integers}

XXX write this section


%======================================================================
\section{PEP 238: Changing the Division Operator}

The most controversial change in Python 2.2 is the start of an effort
to fix an old design flaw that's been in Python from the beginning.
Currently Python's division operator, \code{/}, behaves like C's
division operator when presented with two integer arguments. It
returns an integer result that's truncated down when there would be
a fractional part. For example, \code{3/2} is 1, not 1.5, and
\code{(-1)/2} is -1, not -0.5. This means that the results of division
can vary unexpectedly depending on the type of the two operands, and
because Python is dynamically typed, it can be difficult to determine
the possible types of the operands.

(The controversy is over whether this is \emph{really} a design flaw,
and whether it's worth breaking existing code to fix this. It's
caused endless discussions on python-dev and in July erupted into a
storm of acidly sarcastic postings on \newsgroup{comp.lang.python}. I
won't argue for either side here; read PEP 238 for a summary of
arguments and counter-arguments.)

Because this change might break code, it's being introduced very
gradually. Python 2.2 begins the transition, but the switch won't be
complete until Python 3.0.

First, some terminology from PEP 238. ``True division'' is the
division that most non-programmers are familiar with: 3/2 is 1.5, 1/4
is 0.25, and so forth. ``Floor division'' is what Python's \code{/}
operator currently does when given integer operands; the result is the
floor of the value returned by true division. ``Classic division'' is
the current mixed behaviour of \code{/}; it returns the result of
floor division when the operands are integers, and returns the result
of true division when one of the operands is a floating-point number.

Here are the changes 2.2 introduces:

\begin{itemize}

\item A new operator, \code{//}, is the floor division operator.
(Yes, we know it looks like \Cpp's comment symbol.) \code{//}
\emph{always} returns the floor division no matter what the types of
its operands are, so \code{1 // 2} is 0 and \code{1.0 // 2.0} is also
0.0.

\code{//} is always available in Python 2.2; you don't need to enable
it using a \code{__future__} statement.

\item By including a \code{from __future__ import division} in a
module, the \code{/} operator will be changed to return the result of
true division, so \code{1/2} is 0.5. Without the \code{__future__}
statement, \code{/} still means classic division. The default meaning
of \code{/} will not change until Python 3.0.

\item Classes can define methods called \method{__truediv__} and
\method{__floordiv__} to overload the two division operators. At the
C level, there are also slots in the \code{PyNumberMethods} structure
so extension types can define the two operators.

% XXX a warning someday?

\end{itemize}

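The following sketch pulls these pieces together. To stay independent
of what \code{/} means in a given module, it asks for true division
through \function{operator.truediv()}, which I'm using here as a
stand-in for \code{/} with true division enabled. The \class{Meters}
class is hypothetical and exists only to show the two special methods.

```python
import operator

# Floor division always rounds toward negative infinity.
floor_int = 1 // 2
floor_float = 1.0 // 2.0
negative_floor = (-1) // 2

# operator.truediv(a, b) performs true division regardless of
# whether the __future__ statement is in effect in this module.
true_quotient = operator.truediv(1, 2)

class Meters:
    """Hypothetical length type wrapping a plain number."""

    def __init__(self, value):
        self.value = value

    def __truediv__(self, divisor):
        # Invoked for true division of a Meters instance.
        return Meters(operator.truediv(self.value, divisor))

    def __floordiv__(self, divisor):
        # Invoked for the // operator.
        return Meters(self.value // divisor)

half = operator.truediv(Meters(7), 2)
whole = Meters(7) // 2
```
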
\begin{seealso}

\seepep{238}{Changing the Division Operator}{Written by Moshe Zadka and
Guido van Rossum. Implemented by Guido van Rossum.}

\end{seealso}


%======================================================================
\section{Unicode Changes}

Python's Unicode support has been enhanced a bit in 2.2. Unicode
strings are usually stored as UCS-2, as 16-bit unsigned integers.
Python 2.2 can also be compiled to use UCS-4, 32-bit unsigned
integers, as its internal encoding by supplying
\longprogramopt{enable-unicode=ucs4} to the configure script. When
built to use UCS-4 (a ``wide Python''), the interpreter can natively
handle Unicode characters from U+000000 to U+10FFFF, so the range of
legal values for the \function{unichr()} function is expanded
accordingly. On an interpreter compiled to use UCS-2 (a ``narrow
Python''), values greater than 65535 will still cause
\function{unichr()} to raise a \exception{ValueError} exception.

All this is the province of the still-unimplemented PEP 261, ``Support
for `wide' Unicode characters''; consult it for further details, and
please offer comments on the PEP and on your experiences with the
2.2 alpha releases.
% XXX update previous line once 2.2 reaches beta.

Another change is much simpler to explain. Since their introduction,
Unicode strings have supported an \method{encode()} method to convert
the string to a selected encoding such as UTF-8 or Latin-1. A
symmetric \method{decode(\optional{\var{encoding}})} method has been
added to 8-bit strings (though not to Unicode strings) in 2.2.
\method{decode()} assumes that the string is in the specified encoding
and decodes it, returning whatever is returned by the codec.

Using this new feature, codecs have been added for tasks not directly
related to Unicode. For example, codecs have been added for
uu-encoding, MIME's base64 encoding, and compression with the
\module{zlib} module:

\begin{verbatim}
>>> s = """Here is a lengthy piece of redundant, overly verbose,
... and repetitive text.
... """
>>> data = s.encode('zlib')
>>> data
'x\x9c\r\xc9\xc1\r\x80 \x10\x04\xc0?Ul...'
>>> data.decode('zlib')
'Here is a lengthy piece of redundant, overly verbose,\nand repetitive text.\n'
>>> print s.encode('uu')
begin 666 <data>
M2&5R92!I<R!A(&QE;F=T:'D@<&EE8V4@;V8@<F5D=6YD86YT+"!O=F5R;'D@
>=F5R8F]S92P*86YD(')E<&5T:71I=F4@=&5X="X*

end
>>> "sheesh".encode('rot-13')
'furrfu'
\end{verbatim}

\method{encode()} and \method{decode()} were implemented by
Marc-Andr\'e Lemburg. The changes to support using UCS-4 internally
were implemented by Fredrik Lundh and Martin von L\"owis.

\begin{seealso}

\seepep{261}{Support for `wide' Unicode characters}{PEP written by
Paul Prescod. Not yet accepted or fully implemented.}

\end{seealso}

%======================================================================
\section{PEP 227: Nested Scopes}

In Python 2.1, statically nested scopes were added as an optional
feature, to be enabled by a \code{from __future__ import
nested_scopes} directive. In 2.2 nested scopes no longer need to be
specially enabled, but are always enabled. The rest of this section
is a copy of the description of nested scopes from my ``What's New in
Python 2.1'' document; if you read it when 2.1 came out, you can skip
the rest of this section.

The largest change introduced in Python 2.1, and made complete in 2.2,
is to Python's scoping rules. In Python 2.0, at any given time there
are at most three namespaces used to look up variable names: local,
module-level, and the built-in namespace. This often surprised people
because it didn't match their intuitive expectations. For example, a
nested recursive function definition doesn't work:

\begin{verbatim}
def f():
    ...
    def g(value):
        ...
        return g(value-1) + 1
    ...
\end{verbatim}

The function \function{g()} will always raise a \exception{NameError}
exception, because the binding of the name \samp{g} isn't in either
its local namespace or in the module-level namespace. This isn't much
of a problem in practice (how often do you recursively define interior
functions like this?), but this also made using \keyword{lambda}
expressions clumsier, and this was a problem in practice. In code which
uses \keyword{lambda} you can often find local variables being copied
by passing them as the default values of arguments.

\begin{verbatim}
def find(self, name):
    "Return list of any entries equal to 'name'"
    L = filter(lambda x, name=name: x == name,
               self.list_attribute)
    return L
\end{verbatim}

The readability of Python code written in a strongly functional style
suffers greatly as a result.

The most significant change to Python 2.2 is that static scoping has
been added to the language to fix this problem. As a first effect,
the \code{name=name} default argument is now unnecessary in the above
example. Put simply, when a given variable name is not assigned a
value within a function (by an assignment, or the \keyword{def},
\keyword{class}, or \keyword{import} statements), references to the
variable will be looked up in the local namespace of the enclosing
scope. A more detailed explanation of the rules, and a dissection of
the implementation, can be found in the PEP.

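Here's a short sketch of both earlier examples as they can be written
once nested scopes are in effect. The function bodies are my own
illustrative fill-ins: \function{find()} is turned into a plain
function that takes the list as an argument, and \function{g()} is
given a made-up base case so that the recursion terminates.

```python
def find(entries, name):
    "Return list of any entries equal to 'name'"
    # 'name' isn't assigned inside the lambda, so it's looked up in the
    # enclosing scope of find(); no name=name default is needed.
    return list(filter(lambda x: x == name, entries))

def f(n):
    # The name 'g' lives in f()'s local namespace, and the body of g()
    # can now see it there, so the recursive call works.
    def g(value):
        if value <= 0:
            return 0
        return g(value - 1) + 1
    return g(n)

matches = find(['spam', 'eggs', 'spam'], 'spam')
depth = f(3)
```
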
This change may cause some compatibility problems for code where the
same variable name is used both at the module level and as a local
variable within a function that contains further function definitions.
This seems rather unlikely though, since such code would have been
pretty confusing to read in the first place.

One side effect of the change is that the \code{from \var{module}
import *} and \keyword{exec} statements have been made illegal inside
a function scope under certain conditions. The Python reference
manual has said all along that \code{from \var{module} import *} is
only legal at the top level of a module, but the CPython interpreter
has never enforced this before. As part of the implementation of
nested scopes, the compiler which turns Python source into bytecodes
has to generate different code to access variables in a containing
scope. \code{from \var{module} import *} and \keyword{exec} make it
impossible for the compiler to figure this out, because they add names
to the local namespace that are unknowable at compile time.
Therefore, if a function contains function definitions or
\keyword{lambda} expressions with free variables, the compiler will
flag this by raising a \exception{SyntaxError} exception.

To make the preceding explanation a bit clearer, here's an example:

\begin{verbatim}
x = 1
def f():
    # The next line is a syntax error
    exec 'x=2'
    def g():
        return x
\end{verbatim}

Line 4 containing the \keyword{exec} statement is a syntax error,
since \keyword{exec} would define a new local variable named \samp{x}
whose value should be accessed by \function{g()}.

This shouldn't be much of a limitation, since \keyword{exec} is rarely
used in most Python code (and when it is used, it's often a sign of a
poor design anyway).

\begin{seealso}

\seepep{227}{Statically Nested Scopes}{Written and implemented by
Jeremy Hylton.}

\end{seealso}


%======================================================================
\section{New and Improved Modules}

\begin{itemize}

  \item The \module{xmlrpclib} module was contributed to the standard
  library by Fredrik Lundh. It provides support for writing XML-RPC
  clients; XML-RPC is a simple remote procedure call protocol built on
  top of HTTP and XML. For example, the following snippet retrieves a
  list of RSS channels from the O'Reilly Network, and then retrieves a
  list of the recent headlines for one channel:

\begin{verbatim}
import xmlrpclib
s = xmlrpclib.Server(
      'http://www.oreillynet.com/meerkat/xml-rpc/server.php')
channels = s.meerkat.getChannels()
# channels is a list of dictionaries, like this:
# [{'id': 4, 'title': 'Freshmeat Daily News'},
#  {'id': 190, 'title': '32Bits Online'},
#  {'id': 4549, 'title': '3DGamers'}, ... ]

# Get the items for one channel
items = s.meerkat.getItems( {'channel': 4} )

# 'items' is another list of dictionaries, like this:
# [{'link': 'http://freshmeat.net/releases/52719/',
#   'description': 'A utility which converts HTML to XSL FO.',
#   'title': 'html2fo 0.3 (Default)'}, ... ]
\end{verbatim}

See \url{http://www.xmlrpc.com/} for more information about XML-RPC.

  \item The \module{socket} module can be compiled to support IPv6;
  specify the \longprogramopt{enable-ipv6} option to Python's configure
  script. (Contributed by Jun-ichiro ``itojun'' Hagino.)

  \item Two new format characters were added to the \module{struct}
  module for 64-bit integers on platforms that support the C
  \ctype{long long} type. \samp{q} is for a signed 64-bit integer,
  and \samp{Q} is for an unsigned one. The value is returned in
  Python's long integer type. (Contributed by Tim Peters.)

  \item In the interpreter's interactive mode, there's a new built-in
  function \function{help()}, that uses the \module{pydoc} module
  introduced in Python 2.1 to provide interactive help.
  \code{help(\var{object})} displays any available help text about
  \var{object}. \code{help()} with no argument puts you in an online
  help utility, where you can enter the names of functions, classes,
  or modules to read their help text.
  (Contributed by Guido van Rossum, using Ka-Ping Yee's \module{pydoc} module.)

  \item Various bugfixes and performance improvements have been made
  to the SRE engine underlying the \module{re} module. For example,
  \function{re.sub()} will now use \function{string.replace()}
  automatically when the pattern and its replacement are both just
  literal strings without regex metacharacters. Another contributed
  patch speeds up certain Unicode character ranges by a factor of
  two. (SRE is maintained by Fredrik Lundh. The BIGCHARSET patch was
  contributed by Martin von L\"owis.)

  \item The \module{imaplib} module, maintained by Piers Lauder, has
  support for several new extensions: the NAMESPACE extension defined
  in \rfc{2342}, SORT, GETACL and SETACL. (Contributed by Anthony
  Baxter and Michel Pelletier.)

  \item The \module{rfc822} module's parsing of email addresses is
  now compliant with \rfc{2822}, an update to \rfc{822}. The module's
  name is \emph{not} going to be changed to \samp{rfc2822}.
  (Contributed by Barry Warsaw.)

  \item New constants \constant{ascii_letters},
  \constant{ascii_lowercase}, and \constant{ascii_uppercase} were
  added to the \module{string} module. There were several modules in
  the standard library that used \constant{string.letters} to mean the
  ranges A-Za-z, but that assumption is incorrect when locales are in
  use, because \constant{string.letters} varies depending on the set
  of legal characters defined by the current locale. The buggy
  modules have all been fixed to use \constant{ascii_letters} instead.
  (Reported by an unknown person; fixed by Fred L. Drake, Jr.)

  \item The \module{mimetypes} module now makes it easier to use
  alternative MIME-type databases by the addition of a
  \class{MimeTypes} class, which takes a list of filenames to be
  parsed. (Contributed by Fred L. Drake, Jr.)

  \item XXX threading.Timer class

\end{itemize}
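As a rough illustration of the \samp{q} and \samp{Q} format characters
described in the list above, the following sketch packs and unpacks
64-bit values with the \module{struct} module. It is written for a
current Python 3 interpreter rather than for 2.2; the \samp{<} prefix
(little-endian, standard sizes) and the helper name \code{roundtrip}
are choices made for this example only.

```python
import struct


def roundtrip(fmt, value):
    """Pack value with the given struct format, then unpack it again."""
    packed = struct.pack(fmt, value)
    (unpacked,) = struct.unpack(fmt, packed)
    return packed, unpacked


# 'q' is a signed 64-bit integer; '<' selects little-endian standard sizes.
packed_signed, signed_value = roundtrip("<q", -2)

# 'Q' is the unsigned counterpart and can hold values up to 2**64 - 1.
packed_unsigned, unsigned_value = roundtrip("<Q", 2**64 - 1)

# Both formats occupy eight bytes.
print(len(packed_signed), signed_value)
print(len(packed_unsigned), unsigned_value)
```

Using an explicit byte-order prefix keeps the sizes fixed at eight
bytes; native mode (no prefix) depends on the platform's C compiler.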
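The locale-independent constants added to the \module{string} module
can be used as in the sketch below, which checks that a word consists
only of the letters A-Z and a-z. It assumes a current Python 3
interpreter, and the function name \code{is_ascii_word} is invented
for the example.

```python
import string


def is_ascii_word(text):
    """Return True if text is non-empty and uses only the letters A-Z and a-z."""
    return bool(text) and all(ch in string.ascii_letters for ch in text)


# ascii_letters never changes with the locale, unlike the old string.letters.
print(is_ascii_word("Python"))
print(is_ascii_word("L\u00f6wis"))
```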
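The \class{MimeTypes} class mentioned in the list above might be used
along these lines. This is only a sketch for a current Python 3
interpreter: the MIME type, the \samp{.exmpl} extension, and the
helper \code{load_custom_types} are all made up for illustration, and
the database file is a temporary file created by the example itself.

```python
import mimetypes
import os
import tempfile

# One line per type: the MIME type followed by its filename extensions.
# Both the type and the extension here are invented for this example.
CUSTOM_TYPES = "application/x-example exmpl\n"


def load_custom_types(text):
    """Write text to a temporary file and parse it with MimeTypes."""
    handle, path = tempfile.mkstemp(suffix=".types")
    try:
        with os.fdopen(handle, "w") as stream:
            stream.write(text)
        # The constructor reads every file in the list immediately.
        return mimetypes.MimeTypes(filenames=[path])
    finally:
        os.remove(path)


db = load_custom_types(CUSTOM_TYPES)
print(db.guess_type("report.exmpl"))
```

Because each \class{MimeTypes} instance keeps its own tables, the
extra database does not alter the module-level defaults.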
%======================================================================
\section{Interpreter Changes and Fixes}

Some of the changes only affect people who deal with the Python
interpreter at the C level because they're writing Python extension
modules, embedding the interpreter, or just hacking on the interpreter
itself. If you only write Python code, none of the changes described
here will affect you very much.

\begin{itemize}

  \item Profiling and tracing functions can now be implemented in C.
  C functions can operate at much higher speeds than Python-based ones
  and should reduce the overhead of profiling and tracing, which
  will be of interest to authors of development environments for
  Python. Two new C functions were added to Python's API,
  \cfunction{PyEval_SetProfile()} and \cfunction{PyEval_SetTrace()}.
  The existing \function{sys.setprofile()} and
  \function{sys.settrace()} functions still exist, and have simply
  been changed to use the new C-level interface. (Contributed by Fred
  L. Drake, Jr.)

  \item Another low-level API, primarily of interest to implementors
  of Python debuggers and development tools, was added.
  \cfunction{PyInterpreterState_Head()} and
  \cfunction{PyInterpreterState_Next()} let a caller walk through all
  the existing interpreter objects;
  \cfunction{PyInterpreterState_ThreadHead()} and
  \cfunction{PyThreadState_Next()} allow looping over all the thread
  states for a given interpreter. (Contributed by David Beazley.)

  \item A new \samp{et} format sequence was added to
  \cfunction{PyArg_ParseTuple()}; \samp{et} takes both a parameter and
  an encoding name, and converts the parameter to the given encoding
  if the parameter turns out to be a Unicode string, or leaves it
  alone if it's an 8-bit string, assuming it to already be in the
  desired encoding. This differs from the \samp{es} format character,
  which assumes that 8-bit strings are in Python's default ASCII
  encoding and converts them to the specified new encoding.
  (Contributed by M.-A. Lemburg, and used for the MBCS support on
  Windows described in the following section.)

  \item Two new flags \constant{METH_NOARGS} and \constant{METH_O} are
  available in method definition tables to simplify implementation of
  methods with no arguments or a single untyped argument. Calling
  such methods is more efficient than calling a corresponding method
  that uses \constant{METH_VARARGS}.
  Also, the old \constant{METH_OLDARGS} style of writing C methods is
  now officially deprecated.

  \item Two new wrapper functions, \cfunction{PyOS_snprintf()} and
  \cfunction{PyOS_vsnprintf()}, were added to provide
  cross-platform implementations for the relatively new
  \cfunction{snprintf()} and \cfunction{vsnprintf()} C lib APIs. In
  contrast to the standard \cfunction{sprintf()} and
  \cfunction{vsprintf()} functions, the Python versions check the
  bounds of the buffer used to protect against buffer overruns.
  (Contributed by M.-A. Lemburg.)

\end{itemize}
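The C-level functions themselves can only be called from C, but the
Python-level \function{sys.setprofile()} hook mentioned in the
profiling item above is easy to try out. The sketch below, written for
a current Python 3 interpreter, counts Python function calls; the
names \code{count_calls} and \code{fib} are invented for the example,
and it illustrates the Python interface only, not the C one.

```python
import sys


def count_calls(func, *args):
    """Run func(*args) and count Python-level calls by function name."""
    counts = {}

    def profiler(frame, event, arg):
        # 'call' is reported each time a Python function is entered.
        if event == "call":
            name = frame.f_code.co_name
            counts[name] = counts.get(name, 0) + 1

    sys.setprofile(profiler)
    try:
        func(*args)
    finally:
        # Always remove the hook, even if func raises.
        sys.setprofile(None)
    return counts


def fib(n):
    """Deliberately naive recursive Fibonacci, used as a workload."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


counts = count_calls(fib, 5)
print(counts["fib"])
```

A hook written in Python like this one is called for every event,
which is the kind of overhead a C implementation is meant to reduce.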


%======================================================================
\section{Other Changes and Fixes}

% XXX update the patch and bug figures as we go
As usual there were a bunch of other improvements and bugfixes
scattered throughout the source tree. A search through the CVS change
logs finds there were 43 patches applied, and 77 bugs fixed; both
figures are likely to be underestimates. Some of the more notable
changes are:

\begin{itemize}

  \item The code for the MacOS port for Python, maintained by Jack
  Jansen, is now kept in the main Python CVS tree, and many changes
  have been made to support MacOS X.

The most significant change is the ability to build Python as a
framework, enabled by supplying the \longprogramopt{enable-framework}
option to the configure script when compiling Python. According to
Jack Jansen, ``This installs a self-contained Python installation plus
the OSX framework `glue' into
\file{/Library/Frameworks/Python.framework} (or another location of
choice). For now there is little immediate added benefit to this
(actually, there is the disadvantage that you have to change your PATH
to be able to find Python), but it is the basis for creating a
full-blown Python application, porting the MacPython IDE, possibly
using Python as a standard OSA scripting language and much more.''

Most of the MacPython toolbox modules, which interface to MacOS APIs
such as windowing, QuickTime, scripting, etc., have been ported to OS
X, but they've been left commented out in \file{setup.py}. People who
want to experiment with these modules can uncomment them manually.

% Jack's original comments:
%The main change is the possibility to build Python as a
%framework. This installs a self-contained Python installation plus the
%OSX framework "glue" into /Library/Frameworks/Python.framework (or
%another location of choice). For now there is little immedeate added
%benefit to this (actually, there is the disadvantage that you have to
%change your PATH to be able to find Python), but it is the basis for
%creating a fullblown Python application, porting the MacPython IDE,
%possibly using Python as a standard OSA scripting language and much
%more. You enable this with "configure --enable-framework".

%The other change is that most MacPython toolbox modules, which
%interface to all the MacOS APIs such as windowing, quicktime,
%scripting, etc. have been ported. Again, most of these are not of
%immedeate use, as they need a full application to be really useful, so
%they have been commented out in setup.py. People wanting to experiment
%can uncomment them. Gestalt and Internet Config modules are enabled by
%default.


  \item Keyword arguments passed to builtin functions that don't take them
  now cause a \exception{TypeError} exception to be raised, with the
  message ``\var{function} takes no keyword arguments''.

  \item A new script, \file{Tools/scripts/cleanfuture.py} by Tim
  Peters, automatically removes obsolete \code{__future__} statements
  from Python source code.

  \item The new license introduced with Python 1.6 wasn't
  GPL-compatible. This is fixed by some minor textual changes to the
  2.2 license, so Python can now be embedded inside a GPLed program
  again. The license changes were also applied to the Python 2.0.1
  and 2.1.1 releases.

  \item When presented with a Unicode filename on Windows, Python will
  now convert it to an MBCS-encoded string, as used by the Microsoft
  file APIs. As MBCS is explicitly used by the file APIs, Python's
  choice of ASCII as the default encoding turns out to be an
  annoyance.

  (Contributed by Mark Hammond with assistance from Marc-Andr\'e
  Lemburg.)

  \item The \file{Tools/scripts/ftpmirror.py} script
  now parses a \file{.netrc} file, if you have one.
  (Contributed by Mike Romberg.)

  \item Some features of the object returned by the
  \function{xrange()} function are now deprecated, and trigger
  warnings when they're accessed; they'll disappear in Python 2.3.
  \class{xrange} objects tried to pretend they were full sequence
  types by supporting slicing, sequence multiplication, and the
  \keyword{in} operator, but these features were rarely used and
  therefore buggy. The \method{tolist()} method and the
  \member{start}, \member{stop}, and \member{step} attributes are also
  being deprecated. At the C level, the fourth argument to the
  \cfunction{PyRange_New()} function, \samp{repeat}, has also been
  deprecated.

  \item There were a bunch of patches to the dictionary
  implementation, mostly to fix potential core dumps if a dictionary
  contains objects that sneakily changed their hash value, or mutated
  the dictionary they were contained in. For a while python-dev fell
  into a gentle rhythm of Michael Hudson finding a case that dumped
  core, Tim Peters fixing it, Michael finding another case, and round
  and round it went.

  \item On Windows, Python can now be compiled with Borland C thanks
  to a number of patches contributed by Stephen Hansen, though the
  result isn't fully functional yet. (But this \emph{is} progress...)

  \item Another Windows enhancement: Wise Solutions generously offered
  PythonLabs use of their InstallerMaster 8.1 system. Earlier
  PythonLabs Windows installers used Wise 5.0a, which was beginning to
  show its age. (Packaged up by Tim Peters.)

  \item Files ending in \samp{.pyw} can now be imported on Windows.
  \samp{.pyw} is a Windows-only thing, used to indicate that a script
  needs to be run using PYTHONW.EXE instead of PYTHON.EXE in order to
  prevent a DOS console from popping up to display the output. This
  patch makes it possible to import such scripts, in case they're also
  usable as modules. (Implemented by David Bolen.)

  \item On platforms where Python uses the C \cfunction{dlopen()} function
  to load extension modules, it's now possible to set the flags used
  by \cfunction{dlopen()} using the \function{sys.getdlopenflags()} and
  \function{sys.setdlopenflags()} functions. (Contributed by Bram Stolk.)

  \item XXX 3-argument float pow() is gone

\end{itemize}
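The \function{sys.getdlopenflags()} and \function{sys.setdlopenflags()}
functions from the list above are typically used as a pair, saving the
old flags and restoring them afterwards. The sketch below, written for
a current Python 3 interpreter, wraps that pattern in a helper; the
name \code{call_with_dlopen_flags} is invented for the example, and on
platforms without \cfunction{dlopen()} the helper simply runs the
action unchanged.

```python
import sys


def call_with_dlopen_flags(flags, action):
    """Run action() with temporary dlopen() flags, then restore the old ones."""
    if not hasattr(sys, "setdlopenflags"):
        # This platform does not expose the dlopen() flag functions.
        return action()
    saved = sys.getdlopenflags()
    sys.setdlopenflags(flags)
    try:
        # Extension modules imported inside action() are loaded with `flags`.
        return action()
    finally:
        sys.setdlopenflags(saved)


# Re-use the current flags so the example changes nothing observable.
has_flags = hasattr(sys, "getdlopenflags")
current = sys.getdlopenflags() if has_flags else 0
result = call_with_dlopen_flags(current, lambda: "done")
print(result)
```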


%======================================================================
\section{Acknowledgements}

The author would like to thank the following people for offering
suggestions and corrections to various drafts of this article: Fred
Bremmer, Keith Briggs, Fred L. Drake, Jr., Carel Fellinger, Mark
Hammond, Stephen Hansen, Jack Jansen, Marc-Andr\'e Lemburg, Tim Peters, Neil
Schemenauer, and Guido van Rossum.

\end{document}