blob: a297e9b9b6da7a7fd9886d1afa3d479f9b090fd7 [file] [log] [blame]
Andrew M. Kuchlinga8defaa2001-05-05 16:37:29 +00001\documentclass{howto}
2
3% $Id$
4
5\title{What's New in Python 2.2}
Andrew M. Kuchling0ab31b82001-08-29 01:16:54 +00006\release{0.05}
Andrew M. Kuchlinga8defaa2001-05-05 16:37:29 +00007\author{A.M. Kuchling}
Andrew M. Kuchling7bf82772001-07-11 18:54:26 +00008\authoraddress{\email{akuchlin@mems-exchange.org}}
Andrew M. Kuchlinga8defaa2001-05-05 16:37:29 +00009\begin{document}
10\maketitle\tableofcontents
11
12\section{Introduction}
13
14{\large This document is a draft, and is subject to change until the
Andrew M. Kuchling9e9c1352001-08-11 03:06:50 +000015final version of Python 2.2 is released. Currently it's up to date
16for Python 2.2 alpha 1. Please send any comments, bug reports, or
17questions, no matter how minor, to \email{akuchlin@mems-exchange.org}.
18}
Andrew M. Kuchlinga8defaa2001-05-05 16:37:29 +000019
Andrew M. Kuchling26c39bf2001-09-10 03:20:53 +000020This article explains the new features in Python 2.2.
21
22Python 2.2 can be thought of as the "cleanup release". There are some
23features such as generators and iterators that are completely new, but
24most of the changes, significant and far-reaching though they may be,
25are aimed at cleaning up irregularities and dark corners of the
26language design.
Andrew M. Kuchlinga8defaa2001-05-05 16:37:29 +000027
28This article doesn't attempt to provide a complete specification for
Andrew M. Kuchling26c39bf2001-09-10 03:20:53 +000029the new features, but instead provides a convenient overview. For
30full details, you should refer to the documentation for Python 2.2,
Fred Drake0d002542001-07-17 13:55:33 +000031such as the
32\citetitle[http://python.sourceforge.net/devel-docs/lib/lib.html]{Python
33Library Reference} and the
34\citetitle[http://python.sourceforge.net/devel-docs/ref/ref.html]{Python
Andrew M. Kuchling26c39bf2001-09-10 03:20:53 +000035Reference Manual}.
36% XXX These \citetitle marks should get the python.org URLs for the final
Fred Drake0d002542001-07-17 13:55:33 +000037% release, just as soon as the docs are published there.
Andrew M. Kuchling26c39bf2001-09-10 03:20:53 +000038If you want to understand the complete implementation and design
39rationale for a change, refer to the PEP for a particular new feature.
Andrew M. Kuchlinga8defaa2001-05-05 16:37:29 +000040
41The final release of Python 2.2 is planned for October 2001.
42
Andrew M. Kuchling8cfa9052001-07-19 01:19:59 +000043
Andrew M. Kuchlinga8defaa2001-05-05 16:37:29 +000044%======================================================================
Andrew M. Kuchling26c39bf2001-09-10 03:20:53 +000045\section{PEP 252: Type and Class Changes}
Andrew M. Kuchlinga8defaa2001-05-05 16:37:29 +000046
Andrew M. Kuchlingd6e40e22001-09-10 16:18:50 +000047XXX I need to read and digest the relevant PEPs.
Andrew M. Kuchlinga8defaa2001-05-05 16:37:29 +000048
Andrew M. Kuchling26c39bf2001-09-10 03:20:53 +000049\begin{seealso}
Andrew M. Kuchlinga8defaa2001-05-05 16:37:29 +000050
Andrew M. Kuchling26c39bf2001-09-10 03:20:53 +000051\seepep{252}{Making Types Look More Like Classes}{Written and implemented
52by Guido van Rossum.}
53
Andrew M. Kuchlingd6e40e22001-09-10 16:18:50 +000054\seeurl{http://www.python.org/2.2/descrintro.html}{A tutorial
55on the type/class changes in 2.2.}
56
Andrew M. Kuchling26c39bf2001-09-10 03:20:53 +000057\end{seealso}
Andrew M. Kuchlinga8defaa2001-05-05 16:37:29 +000058
Andrew M. Kuchling8cfa9052001-07-19 01:19:59 +000059
Andrew M. Kuchlinga8defaa2001-05-05 16:37:29 +000060%======================================================================
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +000061\section{PEP 234: Iterators}
62
63A significant addition to 2.2 is an iteration interface at both the C
64and Python levels. Objects can define how they can be looped over by
65callers.
66
67In Python versions up to 2.1, the usual way to make \code{for item in
68obj} work is to define a \method{__getitem__()} method that looks
69something like this:
70
71\begin{verbatim}
72 def __getitem__(self, index):
73 return <next item>
74\end{verbatim}
75
76\method{__getitem__()} is more properly used to define an indexing
77operation on an object so that you can write \code{obj[5]} to retrieve
Andrew M. Kuchling8c69c912001-08-07 14:28:58 +000078the sixth element. It's a bit misleading when you're using this only
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +000079to support \keyword{for} loops. Consider some file-like object that
80wants to be looped over; the \var{index} parameter is essentially
81meaningless, as the class probably assumes that a series of
82\method{__getitem__()} calls will be made, with \var{index}
83incrementing by one each time. In other words, the presence of the
84\method{__getitem__()} method doesn't mean that \code{file[5]} will
85work, though it really should.
86
87In Python 2.2, iteration can be implemented separately, and
88\method{__getitem__()} methods can be limited to classes that really
89do support random access. The basic idea of iterators is quite
90simple. A new built-in function, \function{iter(obj)}, returns an
91iterator for the object \var{obj}. (It can also take two arguments:
Fred Drake0d002542001-07-17 13:55:33 +000092\code{iter(\var{C}, \var{sentinel})} will call the callable \var{C},
93until it returns \var{sentinel}, which will signal that the iterator
94is done. This form probably won't be used very often.)
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +000095
96Python classes can define an \method{__iter__()} method, which should
97create and return a new iterator for the object; if the object is its
98own iterator, this method can just return \code{self}. In particular,
99iterators will usually be their own iterators. Extension types
100implemented in C can implement a \code{tp_iter} function in order to
Andrew M. Kuchling4cf52a92001-07-17 12:48:48 +0000101return an iterator, and extension types that want to behave as
102iterators can define a \code{tp_iternext} function.
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000103
104So what do iterators do? They have one required method,
105\method{next()}, which takes no arguments and returns the next value.
106When there are no more values to be returned, calling \method{next()}
107should raise the \exception{StopIteration} exception.
108
109\begin{verbatim}
110>>> L = [1,2,3]
111>>> i = iter(L)
112>>> print i
113<iterator object at 0x8116870>
114>>> i.next()
1151
116>>> i.next()
1172
118>>> i.next()
1193
120>>> i.next()
121Traceback (most recent call last):
122 File "<stdin>", line 1, in ?
123StopIteration
124>>>
125\end{verbatim}
126
127In 2.2, Python's \keyword{for} statement no longer expects a sequence;
128it expects something for which \function{iter()} will return something.
129For backward compatibility, and convenience, an iterator is
130automatically constructed for sequences that don't implement
131\method{__iter__()} or a \code{tp_iter} slot, so \code{for i in
132[1,2,3]} will still work. Wherever the Python interpreter loops over
133a sequence, it's been changed to use the iterator protocol. This
134means you can do things like this:
135
136\begin{verbatim}
137>>> i = iter(L)
138>>> a,b,c = i
139>>> a,b,c
140(1, 2, 3)
141>>>
142\end{verbatim}
143
Andrew M. Kuchling9e9c1352001-08-11 03:06:50 +0000144Iterator support has been added to some of Python's basic types.
Fred Drake0d002542001-07-17 13:55:33 +0000145Calling \function{iter()} on a dictionary will return an iterator
Andrew M. Kuchling6ea9f0b2001-07-17 14:50:31 +0000146which loops over its keys:
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000147
148\begin{verbatim}
149>>> m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
150... 'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
151>>> for key in m: print key, m[key]
152...
153Mar 3
154Feb 2
155Aug 8
156Sep 9
157May 5
158Jun 6
159Jul 7
160Jan 1
161Apr 4
162Nov 11
163Dec 12
164Oct 10
165>>>
166\end{verbatim}
167
168That's just the default behaviour. If you want to iterate over keys,
169values, or key/value pairs, you can explicitly call the
170\method{iterkeys()}, \method{itervalues()}, or \method{iteritems()}
Andrew M. Kuchling9e9c1352001-08-11 03:06:50 +0000171methods to get an appropriate iterator. In a minor related change,
172the \keyword{in} operator now works on dictionaries, so
173\code{\var{key} in dict} is now equivalent to
174\code{dict.has_key(\var{key})}.
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000175
Andrew M. Kuchling9e9c1352001-08-11 03:06:50 +0000176
177Files also provide an iterator, which calls the \method{readline()}
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000178method until there are no more lines in the file. This means you can
179now read each line of a file using code like this:
180
181\begin{verbatim}
182for line in file:
183 # do something for each line
184\end{verbatim}
185
186Note that you can only go forward in an iterator; there's no way to
187get the previous element, reset the iterator, or make a copy of it.
Fred Drake0d002542001-07-17 13:55:33 +0000188An iterator object could provide such additional capabilities, but the
189iterator protocol only requires a \method{next()} method.
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000190
191\begin{seealso}
192
193\seepep{234}{Iterators}{Written by Ka-Ping Yee and GvR; implemented
194by the Python Labs crew, mostly by GvR and Tim Peters.}
195
196\end{seealso}
197
Andrew M. Kuchling8cfa9052001-07-19 01:19:59 +0000198
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000199%======================================================================
200\section{PEP 255: Simple Generators}
201
202Generators are another new feature, one that interacts with the
203introduction of iterators.
204
205You're doubtless familiar with how function calls work in Python or
206C. When you call a function, it gets a private area where its local
207variables are created. When the function reaches a \keyword{return}
208statement, the local variables are destroyed and the resulting value
209is returned to the caller. A later call to the same function will get
210a fresh new set of local variables. But, what if the local variables
211weren't destroyed on exiting a function? What if you could later
212resume the function where it left off? This is what generators
213provide; they can be thought of as resumable functions.
214
215Here's the simplest example of a generator function:
216
217\begin{verbatim}
218def generate_ints(N):
219 for i in range(N):
220 yield i
221\end{verbatim}
222
223A new keyword, \keyword{yield}, was introduced for generators. Any
224function containing a \keyword{yield} statement is a generator
225function; this is detected by Python's bytecode compiler which
Andrew M. Kuchling4cf52a92001-07-17 12:48:48 +0000226compiles the function specially. Because a new keyword was
227introduced, generators must be explicitly enabled in a module by
228including a \code{from __future__ import generators} statement near
229the top of the module's source code. In Python 2.3 this statement
230will become unnecessary.
231
232When you call a generator function, it doesn't return a single value;
233instead it returns a generator object that supports the iterator
234interface. On executing the \keyword{yield} statement, the generator
235outputs the value of \code{i}, similar to a \keyword{return}
236statement. The big difference between \keyword{yield} and a
237\keyword{return} statement is that, on reaching a \keyword{yield} the
238generator's state of execution is suspended and local variables are
239preserved. On the next call to the generator's \code{.next()} method,
240the function will resume executing immediately after the
241\keyword{yield} statement. (For complicated reasons, the
242\keyword{yield} statement isn't allowed inside the \keyword{try} block
243of a \code{try...finally} statement; read PEP 255 for a full
244explanation of the interaction between \keyword{yield} and
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000245exceptions.)
246
247Here's a sample usage of the \function{generate_ints} generator:
248
249\begin{verbatim}
250>>> gen = generate_ints(3)
251>>> gen
252<generator object at 0x8117f90>
253>>> gen.next()
2540
255>>> gen.next()
2561
257>>> gen.next()
2582
259>>> gen.next()
260Traceback (most recent call last):
261 File "<stdin>", line 1, in ?
262 File "<stdin>", line 2, in generate_ints
263StopIteration
264>>>
265\end{verbatim}
266
267You could equally write \code{for i in generate_ints(5)}, or
268\code{a,b,c = generate_ints(3)}.
269
270Inside a generator function, the \keyword{return} statement can only
Andrew M. Kuchling4cf52a92001-07-17 12:48:48 +0000271be used without a value, and signals the end of the procession of
272values; afterwards the generator cannot return any further values.
273\keyword{return} with a value, such as \code{return 5}, is a syntax
274error inside a generator function. The end of the generator's results
275can also be indicated by raising \exception{StopIteration} manually,
276or by just letting the flow of execution fall off the bottom of the
277function.
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000278
279You could achieve the effect of generators manually by writing your
Andrew M. Kuchling4cf52a92001-07-17 12:48:48 +0000280own class and storing all the local variables of the generator as
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000281instance variables. For example, returning a list of integers could
282be done by setting \code{self.count} to 0, and having the
283\method{next()} method increment \code{self.count} and return it.
Andrew M. Kuchlingc32cc7c2001-07-17 18:25:01 +0000284However, for a moderately complicated generator, writing a
285corresponding class would be much messier.
286\file{Lib/test/test_generators.py} contains a number of more
287interesting examples. The simplest one implements an in-order
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000288traversal of a tree using generators recursively.
289
290\begin{verbatim}
291# A recursive generator that generates Tree leaves in in-order.
292def inorder(t):
293 if t:
294 for x in inorder(t.left):
295 yield x
296 yield t.label
297 for x in inorder(t.right):
298 yield x
299\end{verbatim}
300
301Two other examples in \file{Lib/test/test_generators.py} produce
302solutions for the N-Queens problem (placing $N$ queens on an $NxN$
303chess board so that no queen threatens another) and the Knight's Tour
304(a route that takes a knight to every square of an $NxN$ chessboard
305without visiting any square twice).
306
307The idea of generators comes from other programming languages,
308especially Icon (\url{http://www.cs.arizona.edu/icon/}), where the
309idea of generators is central to the language. In Icon, every
310expression and function call behaves like a generator. One example
311from ``An Overview of the Icon Programming Language'' at
312\url{http://www.cs.arizona.edu/icon/docs/ipd266.htm} gives an idea of
313what this looks like:
314
315\begin{verbatim}
316sentence := "Store it in the neighboring harbor"
317if (i := find("or", sentence)) > 5 then write(i)
318\end{verbatim}
319
320The \function{find()} function returns the indexes at which the
321substring ``or'' is found: 3, 23, 33. In the \keyword{if} statement,
322\code{i} is first assigned a value of 3, but 3 is less than 5, so the
323comparison fails, and Icon retries it with the second value of 23. 23
324is greater than 5, so the comparison now succeeds, and the code prints
325the value 23 to the screen.
326
327Python doesn't go nearly as far as Icon in adopting generators as a
328central concept. Generators are considered a new part of the core
329Python language, but learning or using them isn't compulsory; if they
330don't solve any problems that you have, feel free to ignore them.
331This is different from Icon where the idea of generators is a basic
332concept. One novel feature of Python's interface as compared to
333Icon's is that a generator's state is represented as a concrete object
334that can be passed around to other functions or stored in a data
335structure.
336
337\begin{seealso}
338
Andrew M. Kuchling4cf52a92001-07-17 12:48:48 +0000339\seepep{255}{Simple Generators}{Written by Neil Schemenauer, Tim
340Peters, Magnus Lie Hetland. Implemented mostly by Neil Schemenauer
341and Tim Peters, with other fixes from the Python Labs crew.}
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000342
343\end{seealso}
344
Andrew M. Kuchling8cfa9052001-07-19 01:19:59 +0000345
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000346%======================================================================
Andrew M. Kuchling2f0047a2001-09-05 14:53:31 +0000347\section{PEP 237: Unifying Long Integers and Integers}
348
Andrew M. Kuchling26c39bf2001-09-10 03:20:53 +0000349In recent versions, the distinction between regular integers, which
350are 32-bit values on most machines, and long integers, which can be of
351arbitrary size, was becoming an annoyance. For example, on platforms
352that support large files (files larger than \code{2**32} bytes), the
353\method{tell()} method of file objects has to return a long integer.
354However, there were various bits of Python that expected plain
355integers and would raise an error if a long integer was provided
Andrew M. Kuchlingd6e40e22001-09-10 16:18:50 +0000356instead. For example, in Python 1.5, only regular integers
Andrew M. Kuchling26c39bf2001-09-10 03:20:53 +0000357could be used as a slice index, and \code{'abc'[1L:]} would raise a
358\exception{TypeError} exception with the message 'slice index must be
359int'.
Andrew M. Kuchling2f0047a2001-09-05 14:53:31 +0000360
Andrew M. Kuchling26c39bf2001-09-10 03:20:53 +0000361Python 2.2 will shift values from short to long integers as required.
362The 'L' suffix is no longer needed to indicate a long integer literal,
363as now the compiler will choose the appropriate type. (Using the 'L'
364suffix will be discouraged in future 2.x versions of Python,
365triggering a warning in Python 2.4, and probably dropped in Python
3663.0.) Many operations that used to raise an \exception{OverflowError}
367will now return a long integer as their result. For example:
368
369\begin{verbatim}
370>>> 1234567890123
Andrew M. Kuchlingd6e40e22001-09-10 16:18:50 +00003711234567890123L
372>>> 2 ** 64
37318446744073709551616L
Andrew M. Kuchling26c39bf2001-09-10 03:20:53 +0000374\end{verbatim}
375
376In most cases, integers and long integers will now be treated
377identically. You can still distinguish them with the
378\function{type()} built-in function, but that's rarely needed. The
379\function{int()} function will now return a long integer if the value
380is large enough.
381
382% XXX is there a warning-enabling command-line option for this?
383
384\begin{seealso}
385
386\seepep{237}{Unifying Long Integers and Integers}{Written by
387Moshe Zadka and Guido van Rossum. Implemented mostly by Guido van Rossum.}
388
389\end{seealso}
Andrew M. Kuchling2f0047a2001-09-05 14:53:31 +0000390
391%======================================================================
Andrew M. Kuchling9e9c1352001-08-11 03:06:50 +0000392\section{PEP 238: Changing the Division Operator}
393
394The most controversial change in Python 2.2 is the start of an effort
395to fix an old design flaw that's been in Python from the beginning.
396Currently Python's division operator, \code{/}, behaves like C's
397division operator when presented with two integer arguments. It
398returns an integer result that's truncated down when there would be
399fractional part. For example, \code{3/2} is 1, not 1.5, and
400\code{(-1)/2} is -1, not -0.5. This means that the results of divison
401can vary unexpectedly depending on the type of the two operands and
402because Python is dynamically typed, it can be difficult to determine
403the possible types of the operands.
404
405(The controversy is over whether this is \emph{really} a design flaw,
406and whether it's worth breaking existing code to fix this. It's
407caused endless discussions on python-dev and in July erupted into an
408storm of acidly sarcastic postings on \newsgroup{comp.lang.python}. I
409won't argue for either side here; read PEP 238 for a summary of
410arguments and counter-arguments.)
411
412Because this change might break code, it's being introduced very
413gradually. Python 2.2 begins the transition, but the switch won't be
414complete until Python 3.0.
415
416First, some terminology from PEP 238. ``True division'' is the
417division that most non-programmers are familiar with: 3/2 is 1.5, 1/4
418is 0.25, and so forth. ``Floor division'' is what Python's \code{/}
419operator currently does when given integer operands; the result is the
420floor of the value returned by true division. ``Classic division'' is
421the current mixed behaviour of \code{/}; it returns the result of
422floor division when the operands are integers, and returns the result
423of true division when one of the operands is a floating-point number.
424
425Here are the changes 2.2 introduces:
426
427\begin{itemize}
428
429\item A new operator, \code{//}, is the floor division operator.
430(Yes, we know it looks like \Cpp's comment symbol.) \code{//}
431\emph{always} returns the floor divison no matter what the types of
432its operands are, so \code{1 // 2} is 0 and \code{1.0 // 2.0} is also
4330.0.
434
435\code{//} is always available in Python 2.2; you don't need to enable
436it using a \code{__future__} statement.
437
438\item By including a \code{from __future__ import true_division} in a
439module, the \code{/} operator will be changed to return the result of
440true division, so \code{1/2} is 0.5. Without the \code{__future__}
441statement, \code{/} still means classic division. The default meaning
442of \code{/} will not change until Python 3.0.
443
444\item Classes can define methods called \method{__truediv__} and
445\method{__floordiv__} to overload the two division operators. At the
446C level, there are also slots in the \code{PyNumberMethods} structure
447so extension types can define the two operators.
448
449% XXX a warning someday?
450
451\end{itemize}
452
453\begin{seealso}
454
455\seepep{238}{Changing the Division Operator}{Written by Moshe Zadka and
456Guido van Rossum. Implemented by Guido van Rossum..}
457
458\end{seealso}
459
460
461%======================================================================
Andrew M. Kuchlinga43e7032001-06-27 20:32:12 +0000462\section{Unicode Changes}
463
Andrew M. Kuchling2cd712b2001-07-16 13:39:08 +0000464Python's Unicode support has been enhanced a bit in 2.2. Unicode
Andrew M. Kuchlinga6d2a042001-07-20 18:34:34 +0000465strings are usually stored as UCS-2, as 16-bit unsigned integers.
Andrew M. Kuchlingf5fec3c2001-07-19 01:48:08 +0000466Python 2.2 can also be compiled to use UCS-4, 32-bit unsigned
467integers, as its internal encoding by supplying
468\longprogramopt{enable-unicode=ucs4} to the configure script. When
Andrew M. Kuchlingab010872001-07-19 14:59:53 +0000469built to use UCS-4 (a ``wide Python''), the interpreter can natively
Andrew M. Kuchlinga6d2a042001-07-20 18:34:34 +0000470handle Unicode characters from U+000000 to U+110000, so the range of
471legal values for the \function{unichr()} function is expanded
472accordingly. Using an interpreter compiled to use UCS-2 (a ``narrow
473Python''), values greater than 65535 will still cause
474\function{unichr()} to raise a \exception{ValueError} exception.
Andrew M. Kuchlingab010872001-07-19 14:59:53 +0000475
476All this is the province of the still-unimplemented PEP 261, ``Support
477for `wide' Unicode characters''; consult it for further details, and
Andrew M. Kuchlinga6d2a042001-07-20 18:34:34 +0000478please offer comments on the PEP and on your experiences with the
4792.2 alpha releases.
480% XXX update previous line once 2.2 reaches beta.
Andrew M. Kuchlingab010872001-07-19 14:59:53 +0000481
482Another change is much simpler to explain. Since their introduction,
483Unicode strings have supported an \method{encode()} method to convert
484the string to a selected encoding such as UTF-8 or Latin-1. A
485symmetric \method{decode(\optional{\var{encoding}})} method has been
486added to 8-bit strings (though not to Unicode strings) in 2.2.
487\method{decode()} assumes that the string is in the specified encoding
488and decodes it, returning whatever is returned by the codec.
489
490Using this new feature, codecs have been added for tasks not directly
491related to Unicode. For example, codecs have been added for
492uu-encoding, MIME's base64 encoding, and compression with the
493\module{zlib} module:
Andrew M. Kuchling2cd712b2001-07-16 13:39:08 +0000494
495\begin{verbatim}
496>>> s = """Here is a lengthy piece of redundant, overly verbose,
497... and repetitive text.
498... """
499>>> data = s.encode('zlib')
500>>> data
501'x\x9c\r\xc9\xc1\r\x80 \x10\x04\xc0?Ul...'
502>>> data.decode('zlib')
503'Here is a lengthy piece of redundant, overly verbose,\nand repetitive text.\n'
504>>> print s.encode('uu')
505begin 666 <data>
506M2&5R92!I<R!A(&QE;F=T:'D@<&EE8V4@;V8@<F5D=6YD86YT+"!O=F5R;'D@
507>=F5R8F]S92P*86YD(')E<&5T:71I=F4@=&5X="X*
508
509end
510>>> "sheesh".encode('rot-13')
511'furrfu'
512\end{verbatim}
Andrew M. Kuchlinga43e7032001-06-27 20:32:12 +0000513
Andrew M. Kuchlingf5fec3c2001-07-19 01:48:08 +0000514\method{encode()} and \method{decode()} were implemented by
515Marc-Andr\'e Lemburg. The changes to support using UCS-4 internally
516were implemented by Fredrik Lundh and Martin von L\"owis.
Andrew M. Kuchlinga43e7032001-06-27 20:32:12 +0000517
Andrew M. Kuchlingf5fec3c2001-07-19 01:48:08 +0000518\begin{seealso}
519
520\seepep{261}{Support for `wide' Unicode characters}{PEP written by
521Paul Prescod. Not yet accepted or fully implemented.}
522
523\end{seealso}
Andrew M. Kuchling8cfa9052001-07-19 01:19:59 +0000524
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000525%======================================================================
526\section{PEP 227: Nested Scopes}
527
528In Python 2.1, statically nested scopes were added as an optional
529feature, to be enabled by a \code{from __future__ import
530nested_scopes} directive. In 2.2 nested scopes no longer need to be
531specially enabled, but are always enabled. The rest of this section
532is a copy of the description of nested scopes from my ``What's New in
533Python 2.1'' document; if you read it when 2.1 came out, you can skip
534the rest of this section.
535
536The largest change introduced in Python 2.1, and made complete in 2.2,
537is to Python's scoping rules. In Python 2.0, at any given time there
538are at most three namespaces used to look up variable names: local,
539module-level, and the built-in namespace. This often surprised people
540because it didn't match their intuitive expectations. For example, a
541nested recursive function definition doesn't work:
542
543\begin{verbatim}
544def f():
545 ...
546 def g(value):
547 ...
548 return g(value-1) + 1
549 ...
550\end{verbatim}
551
552The function \function{g()} will always raise a \exception{NameError}
553exception, because the binding of the name \samp{g} isn't in either
554its local namespace or in the module-level namespace. This isn't much
555of a problem in practice (how often do you recursively define interior
556functions like this?), but this also made using the \keyword{lambda}
557statement clumsier, and this was a problem in practice. In code which
558uses \keyword{lambda} you can often find local variables being copied
559by passing them as the default values of arguments.
560
561\begin{verbatim}
562def find(self, name):
563 "Return list of any entries equal to 'name'"
564 L = filter(lambda x, name=name: x == name,
565 self.list_attribute)
566 return L
567\end{verbatim}
568
569The readability of Python code written in a strongly functional style
570suffers greatly as a result.
571
572The most significant change to Python 2.2 is that static scoping has
573been added to the language to fix this problem. As a first effect,
574the \code{name=name} default argument is now unnecessary in the above
575example. Put simply, when a given variable name is not assigned a
576value within a function (by an assignment, or the \keyword{def},
577\keyword{class}, or \keyword{import} statements), references to the
578variable will be looked up in the local namespace of the enclosing
579scope. A more detailed explanation of the rules, and a dissection of
580the implementation, can be found in the PEP.
581
582This change may cause some compatibility problems for code where the
583same variable name is used both at the module level and as a local
584variable within a function that contains further function definitions.
585This seems rather unlikely though, since such code would have been
586pretty confusing to read in the first place.
587
588One side effect of the change is that the \code{from \var{module}
589import *} and \keyword{exec} statements have been made illegal inside
590a function scope under certain conditions. The Python reference
591manual has said all along that \code{from \var{module} import *} is
592only legal at the top level of a module, but the CPython interpreter
593has never enforced this before. As part of the implementation of
594nested scopes, the compiler which turns Python source into bytecodes
595has to generate different code to access variables in a containing
596scope. \code{from \var{module} import *} and \keyword{exec} make it
597impossible for the compiler to figure this out, because they add names
598to the local namespace that are unknowable at compile time.
599Therefore, if a function contains function definitions or
600\keyword{lambda} expressions with free variables, the compiler will
601flag this by raising a \exception{SyntaxError} exception.
602
603To make the preceding explanation a bit clearer, here's an example:
604
605\begin{verbatim}
606x = 1
607def f():
608 # The next line is a syntax error
609 exec 'x=2'
610 def g():
611 return x
612\end{verbatim}
613
614Line 4 containing the \keyword{exec} statement is a syntax error,
615since \keyword{exec} would define a new local variable named \samp{x}
616whose value should be accessed by \function{g()}.
617
618This shouldn't be much of a limitation, since \keyword{exec} is rarely
619used in most Python code (and when it is used, it's often a sign of a
620poor design anyway).
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000621
622\begin{seealso}
623
624\seepep{227}{Statically Nested Scopes}{Written and implemented by
625Jeremy Hylton.}
626
627\end{seealso}
628
Andrew M. Kuchlinga43e7032001-06-27 20:32:12 +0000629
630%======================================================================
Andrew M. Kuchlinga8defaa2001-05-05 16:37:29 +0000631\section{New and Improved Modules}
632
633\begin{itemize}
634
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000635 \item The \module{xmlrpclib} module was contributed to the standard
Andrew M. Kuchling8c69c912001-08-07 14:28:58 +0000636 library by Fredrik Lundh. It provides support for writing XML-RPC
637 clients; XML-RPC is a simple remote procedure call protocol built on
638 top of HTTP and XML. For example, the following snippet retrieves a
639 list of RSS channels from the O'Reilly Network, and then retrieves a
640 list of the recent headlines for one channel:
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000641
642\begin{verbatim}
643import xmlrpclib
644s = xmlrpclib.Server(
645 'http://www.oreillynet.com/meerkat/xml-rpc/server.php')
646channels = s.meerkat.getChannels()
647# channels is a list of dictionaries, like this:
648# [{'id': 4, 'title': 'Freshmeat Daily News'}
649# {'id': 190, 'title': '32Bits Online'},
650# {'id': 4549, 'title': '3DGamers'}, ... ]
651
652# Get the items for one channel
653items = s.meerkat.getItems( {'channel': 4} )
654
655# 'items' is another list of dictionaries, like this:
656# [{'link': 'http://freshmeat.net/releases/52719/',
657# 'description': 'A utility which converts HTML to XSL FO.',
658# 'title': 'html2fo 0.3 (Default)'}, ... ]
659\end{verbatim}
660
Fred Drake0d002542001-07-17 13:55:33 +0000661See \url{http://www.xmlrpc.com/} for more information about XML-RPC.
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000662
663 \item The \module{socket} module can be compiled to support IPv6;
Andrew M. Kuchlingddeb1352001-07-16 14:35:52 +0000664 specify the \longprogramopt{enable-ipv6} option to Python's configure
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000665 script. (Contributed by Jun-ichiro ``itojun'' Hagino.)
666
667 \item Two new format characters were added to the \module{struct}
668 module for 64-bit integers on platforms that support the C
669 \ctype{long long} type. \samp{q} is for a signed 64-bit integer,
670 and \samp{Q} is for an unsigned one. The value is returned in
671 Python's long integer type. (Contributed by Tim Peters.)
672
673 \item In the interpreter's interactive mode, there's a new built-in
674 function \function{help()}, that uses the \module{pydoc} module
675 introduced in Python 2.1 to provide interactive.
676 \code{help(\var{object})} displays any available help text about
677 \var{object}. \code{help()} with no argument puts you in an online
678 help utility, where you can enter the names of functions, classes,
679 or modules to read their help text.
680 (Contributed by Guido van Rossum, using Ka-Ping Yee's \module{pydoc} module.)
681
682 \item Various bugfixes and performance improvements have been made
Andrew M. Kuchling4cf52a92001-07-17 12:48:48 +0000683 to the SRE engine underlying the \module{re} module. For example,
684 \function{re.sub()} will now use \function{string.replace()}
685 automatically when the pattern and its replacement are both just
686 literal strings without regex metacharacters. Another contributed
687 patch speeds up certain Unicode character ranges by a factor of
688 two. (SRE is maintained by Fredrik Lundh. The BIGCHARSET patch was
689 contributed by Martin von L\"owis.)
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000690
Andrew M. Kuchling1efd7ad2001-09-14 16:19:27 +0000691 \item The \module{smtplib} module now supports \rfc{2487}, ``Secure
692 SMTP over TLS'', so it's now possible to encrypt the SMTP traffic
693 between a Python program and the mail transport agent being handed a
694 message. (Contributed by Gerhard H\"aring.)
695
Andrew M. Kuchlinga6d2a042001-07-20 18:34:34 +0000696 \item The \module{imaplib} module, maintained by Piers Lauder, has
697 support for several new extensions: the NAMESPACE extension defined
698 in \rfc{2342}, SORT, GETACL and SETACL. (Contributed by Anthony
699 Baxter and Michel Pelletier.)
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000700
Fred Drake0d002542001-07-17 13:55:33 +0000701 \item The \module{rfc822} module's parsing of email addresses is
Andrew M. Kuchling4cf52a92001-07-17 12:48:48 +0000702 now compliant with \rfc{2822}, an update to \rfc{822}. The module's
703 name is \emph{not} going to be changed to \samp{rfc2822}.
704 (Contributed by Barry Warsaw.)
Andrew M. Kuchling77707672001-07-31 15:51:16 +0000705
706 \item New constants \constant{ascii_letters},
707 \constant{ascii_lowercase}, and \constant{ascii_uppercase} were
708 added to the \module{string} module. There were several modules in
709 the standard library that used \constant{string.letters} to mean the
710 ranges A-Za-z, but that assumption is incorrect when locales are in
711 use, because \constant{string.letters} varies depending on the set
712 of legal characters defined by the current locale. The buggy
713 modules have all been fixed to use \constant{ascii_letters} instead.
714 (Reported by an unknown person; fixed by Fred L. Drake, Jr.)
715
Andrew M. Kuchling8c69c912001-08-07 14:28:58 +0000716 \item The \module{mimetypes} module now makes it easier to use
717 alternative MIME-type databases by the addition of a
718 \class{MimeTypes} class, which takes a list of filenames to be
719 parsed. (Contributed by Fred L. Drake, Jr.)
720
Andrew M. Kuchlingd6e40e22001-09-10 16:18:50 +0000721 \item A \class{Timer} class was added to the \module{threading}
722 module that allows scheduling an activity to happen at some future
723 time. (Contributed by Itamar Shtull-Trauring.)
Andrew M. Kuchling2f0047a2001-09-05 14:53:31 +0000724
Andrew M. Kuchling77707672001-07-31 15:51:16 +0000725\end{itemize}
726
727
728%======================================================================
729\section{Interpreter Changes and Fixes}
730
731Some of the changes only affect people who deal with the Python
732interpreter at the C level, writing Python extension modules,
733embedding the interpreter, or just hacking on the interpreter itself.
734If you only write Python code, none of the changes described here will
735affect you very much.
736
737\begin{itemize}
738
739 \item Profiling and tracing functions can now be implemented in C,
740 which can operate at much higher speeds than Python-based functions
741 and should reduce the overhead of enabling profiling and tracing, so
742 it will be of interest to authors of development environments for
743 Python. Two new C functions were added to Python's API,
744 \cfunction{PyEval_SetProfile()} and \cfunction{PyEval_SetTrace()}.
745 The existing \function{sys.setprofile()} and
746 \function{sys.settrace()} functions still exist, and have simply
747 been changed to use the new C-level interface. (Contributed by Fred
748 L. Drake, Jr.)
749
750 \item Another low-level API, primarily of interest to implementors
751 of Python debuggers and development tools, was added.
752 \cfunction{PyInterpreterState_Head()} and
753 \cfunction{PyInterpreterState_Next()} let a caller walk through all
754 the existing interpreter objects;
755 \cfunction{PyInterpreterState_ThreadHead()} and
756 \cfunction{PyThreadState_Next()} allow looping over all the thread
757 states for a given interpreter. (Contributed by David Beazley.)
758
759 \item A new \samp{et} format sequence was added to
760 \cfunction{PyArg_ParseTuple}; \samp{et} takes both a parameter and
761 an encoding name, and converts the parameter to the given encoding
762 if the parameter turns out to be a Unicode string, or leaves it
763 alone if it's an 8-bit string, assuming it to already be in the
764 desired encoding. This differs from the \samp{es} format character,
765 which assumes that 8-bit strings are in Python's default ASCII
766 encoding and converts them to the specified new encoding.
Andrew M. Kuchlingd6e40e22001-09-10 16:18:50 +0000767 (Contributed by M.-A. Lemburg.)
Andrew M. Kuchling0ab31b82001-08-29 01:16:54 +0000768
769 \item Two new flags \constant{METH_NOARGS} and \constant{METH_O} are
770 available in method definition tables to simplify implementation of
771 methods with no arguments or a single untyped argument. Calling
772 such methods is more efficient than calling a corresponding method
773 that uses \constant{METH_VARARGS}.
774 Also, the old \constant{METH_OLDARGS} style of writing C methods is
775 now officially deprecated.
776
777\item
778 Two new wrapper functions, \cfunction{PyOS_snprintf()} and
779 \cfunction{PyOS_vsnprintf()} were added. which provide a
780 cross-platform implementations for the relatively new
781 \cfunction{snprintf()} and \cfunction{vsnprintf()} C lib APIs. In
782 contrast to the standard \cfunction{sprintf()} and
783 \cfunction{vsprintf()} functions, the Python versions check the
784 bounds of the buffer used to protect against buffer overruns.
785 (Contributed by M.-A. Lemburg.)
Andrew M. Kuchling77707672001-07-31 15:51:16 +0000786
Andrew M. Kuchlinga8defaa2001-05-05 16:37:29 +0000787\end{itemize}
788
789
790%======================================================================
791\section{Other Changes and Fixes}
792
Andrew M. Kuchling8cfa9052001-07-19 01:19:59 +0000793% XXX update the patch and bug figures as we go
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000794As usual there were a bunch of other improvements and bugfixes
795scattered throughout the source tree. A search through the CVS change
Andrew M. Kuchlingd6e40e22001-09-10 16:18:50 +0000796logs finds there were 119 patches applied, and 179 bugs fixed; both
Andrew M. Kuchling4dbf8712001-07-16 02:17:14 +0000797figures are likely to be underestimates. Some of the more notable
798changes are:
Andrew M. Kuchlinga8defaa2001-05-05 16:37:29 +0000799
800\begin{itemize}
801
Andrew M. Kuchling0e03f582001-08-30 21:30:16 +0000802 \item The code for the MacOS port for Python, maintained by Jack
803 Jansen, is now kept in the main Python CVS tree, and many changes
804 have been made to support MacOS X.
805
806The most significant change is the ability to build Python as a
807framework, enabled by supplying the \longprogramopt{enable-framework}
808option to the configure script when compiling Python. According to
809Jack Jansen, ``This installs a self-contained Python installation plus
810the OSX framework "glue" into
811\file{/Library/Frameworks/Python.framework} (or another location of
812choice). For now there is little immediate added benefit to this
813(actually, there is the disadvantage that you have to change your PATH
814to be able to find Python), but it is the basis for creating a
815full-blown Python application, porting the MacPython IDE, possibly
816using Python as a standard OSA scripting language and much more.''
817
818Most of the MacPython toolbox modules, which interface to MacOS APIs
819such as windowing, QuickTime, scripting, etc. have been ported to OS
820X, but they've been left commented out in setup.py. People who want
821to experiment with these modules can uncomment them manually.
822
823% Jack's original comments:
824%The main change is the possibility to build Python as a
825%framework. This installs a self-contained Python installation plus the
826%OSX framework "glue" into /Library/Frameworks/Python.framework (or
827%another location of choice). For now there is little immedeate added
828%benefit to this (actually, there is the disadvantage that you have to
829%change your PATH to be able to find Python), but it is the basis for
830%creating a fullblown Python application, porting the MacPython IDE,
831%possibly using Python as a standard OSA scripting language and much
832%more. You enable this with "configure --enable-framework".
833
834%The other change is that most MacPython toolbox modules, which
835%interface to all the MacOS APIs such as windowing, quicktime,
836%scripting, etc. have been ported. Again, most of these are not of
837%immedeate use, as they need a full application to be really useful, so
838%they have been commented out in setup.py. People wanting to experiment
839%can uncomment them. Gestalt and Internet Config modules are enabled by
840%default.
841
842
Andrew M. Kuchling2cd712b2001-07-16 13:39:08 +0000843 \item Keyword arguments passed to builtin functions that don't take them
844 now cause a \exception{TypeError} exception to be raised, with the
845 message "\var{function} takes no keyword arguments".
846
Andrew M. Kuchling94a7eba2001-08-15 15:55:48 +0000847 \item A new script, \file{Tools/scripts/cleanfuture.py} by Tim
848 Peters, automatically removes obsolete \code{__future__} statements
849 from Python source code.
Andrew M. Kuchling2cd712b2001-07-16 13:39:08 +0000850
851 \item The new license introduced with Python 1.6 wasn't
852 GPL-compatible. This is fixed by some minor textual changes to the
853 2.2 license, so Python can now be embedded inside a GPLed program
854 again. The license changes were also applied to the Python 2.0.1
855 and 2.1.1 releases.
856
Andrew M. Kuchlingf4ccf582001-07-31 01:11:36 +0000857 \item When presented with a Unicode filename on Windows, Python will
858 now convert it to an MBCS encoded string, as used by the Microsoft
859 file APIs. As MBCS is explicitly used by the file APIs, Python's
860 choice of ASCII as the default encoding turns out to be an
861 annoyance.
Andrew M. Kuchling8cfa9052001-07-19 01:19:59 +0000862 (Contributed by Mark Hammond with assistance from Marc-Andr\'e
863 Lemburg.)
864
Andrew M. Kuchlingd6e40e22001-09-10 16:18:50 +0000865 \item Large file support is now enabled on Windows. (Contributed by
866 Tim Peters.)
867
Andrew M. Kuchling2cd712b2001-07-16 13:39:08 +0000868 \item The \file{Tools/scripts/ftpmirror.py} script
869 now parses a \file{.netrc} file, if you have one.
Andrew M. Kuchling4cf52a92001-07-17 12:48:48 +0000870 (Contributed by Mike Romberg.)
Andrew M. Kuchling2cd712b2001-07-16 13:39:08 +0000871
Andrew M. Kuchling4cf52a92001-07-17 12:48:48 +0000872 \item Some features of the object returned by the
873 \function{xrange()} function are now deprecated, and trigger
874 warnings when they're accessed; they'll disappear in Python 2.3.
875 \class{xrange} objects tried to pretend they were full sequence
876 types by supporting slicing, sequence multiplication, and the
877 \keyword{in} operator, but these features were rarely used and
878 therefore buggy. The \method{tolist()} method and the
879 \member{start}, \member{stop}, and \member{step} attributes are also
880 being deprecated. At the C level, the fourth argument to the
881 \cfunction{PyRange_New()} function, \samp{repeat}, has also been
882 deprecated.
883
Andrew M. Kuchling8cfa9052001-07-19 01:19:59 +0000884 \item There were a bunch of patches to the dictionary
885 implementation, mostly to fix potential core dumps if a dictionary
886 contains objects that sneakily changed their hash value, or mutated
887 the dictionary they were contained in. For a while python-dev fell
888 into a gentle rhythm of Michael Hudson finding a case that dump
889 core, Tim Peters fixing it, Michael finding another case, and round
890 and round it went.
891
Andrew M. Kuchling33a3b632001-09-04 21:25:58 +0000892 \item On Windows, Python can now be compiled with Borland C thanks
893 to a number of patches contributed by Stephen Hansen, though the
894 result isn't fully functional yet. (But this \emph{is} progress...)
Andrew M. Kuchling8c69c912001-08-07 14:28:58 +0000895
Andrew M. Kuchlingf4ccf582001-07-31 01:11:36 +0000896 \item Another Windows enhancement: Wise Solutions generously offered
897 PythonLabs use of their InstallerMaster 8.1 system. Earlier
898 PythonLabs Windows installers used Wise 5.0a, which was beginning to
899 show its age. (Packaged up by Tim Peters.)
900
Andrew M. Kuchling8c69c912001-08-07 14:28:58 +0000901 \item Files ending in \samp{.pyw} can now be imported on Windows.
902 \samp{.pyw} is a Windows-only thing, used to indicate that a script
903 needs to be run using PYTHONW.EXE instead of PYTHON.EXE in order to
904 prevent a DOS console from popping up to display the output. This
905 patch makes it possible to import such scripts, in case they're also
906 usable as modules. (Implemented by David Bolen.)
907
Andrew M. Kuchling8cfa9052001-07-19 01:19:59 +0000908 \item On platforms where Python uses the C \cfunction{dlopen()} function
909 to load extension modules, it's now possible to set the flags used
910 by \cfunction{dlopen()} using the \function{sys.getdlopenflags()} and
911 \function{sys.setdlopenflags()} functions. (Contributed by Bram Stolk.)
Andrew M. Kuchling2f0047a2001-09-05 14:53:31 +0000912
Andrew M. Kuchling26c39bf2001-09-10 03:20:53 +0000913 \item The \function{pow()} built-in function no longer supports 3
914 arguments when floating-point numbers are supplied.
915 \code{pow(\var{x}, \var{y}, \var{z})} returns \code{(x**y) % z}, but
916 this is never useful for floating point numbers, and the final
917 result varies unpredictably depending on the platform. A call such
Andrew M. Kuchlingd6e40e22001-09-10 16:18:50 +0000918 as \code{pow(2.0, 8.0, 7.0)} will now raise a \exception{TypeError}
Andrew M. Kuchling26c39bf2001-09-10 03:20:53 +0000919 exception.
Andrew M. Kuchling77707672001-07-31 15:51:16 +0000920
Andrew M. Kuchlinga8defaa2001-05-05 16:37:29 +0000921\end{itemize}
922
923
Andrew M. Kuchlinga8defaa2001-05-05 16:37:29 +0000924%======================================================================
925\section{Acknowledgements}
926
927The author would like to thank the following people for offering
Andrew M. Kuchling6ea9f0b2001-07-17 14:50:31 +0000928suggestions and corrections to various drafts of this article: Fred
Andrew M. Kuchling9e9c1352001-08-11 03:06:50 +0000929Bremmer, Keith Briggs, Fred L. Drake, Jr., Carel Fellinger, Mark
Andrew M. Kuchling33a3b632001-09-04 21:25:58 +0000930Hammond, Stephen Hansen, Jack Jansen, Marc-Andr\'e Lemburg, Tim Peters, Neil
Andrew M. Kuchling0e03f582001-08-30 21:30:16 +0000931Schemenauer, Guido van Rossum.
Andrew M. Kuchlinga8defaa2001-05-05 16:37:29 +0000932
933\end{document}