\documentclass{howto}

\title{What's New in Python 2.0}
\release{0.05}
\author{A.M. Kuchling and Moshe Zadka}
\authoraddress{\email{amk1@bigfoot.com}, \email{moshez@math.huji.ac.il} }
\begin{document}
\maketitle\tableofcontents

\section{Introduction}

{\large This is a draft document; please report inaccuracies and
omissions to the authors. This document should not be treated as
definitive; features described here might be removed or changed during
the beta cycle before the final release of Python 2.0.
}

A new version of Python, 2.0, will be released some time this
summer. Beta versions are already available from
\url{http://www.pythonlabs.com/products/python2.0/}. This article
covers the exciting new features in 2.0, highlights some other useful
changes, and points out a few incompatible changes that may require
rewriting code.

Python's development never completely stops between releases, and a
steady flow of bug fixes and improvements is always being submitted.
A host of minor fixes, a few optimizations, additional docstrings, and
better error messages went into 2.0; to list them all would be
impossible, but they're certainly significant. Consult the
publicly available CVS logs if you want to see the full list.

% ======================================================================
\section{What About Python 1.6?}

Python 1.6 can be thought of as the Contractual Obligations Python
release. After the core development team left CNRI in May 2000, CNRI
requested that a 1.6 release be created, containing all the work on
Python that had been performed at CNRI. Python 1.6 therefore
represents the state of the CVS tree as of May 2000, with the most
significant new feature being Unicode support. Development continued
after May, of course, so the 1.6 tree received a few fixes to ensure
that it's forward-compatible with Python 2.0. 1.6 is therefore part
of Python's evolution, and not a side branch.

So, should you take much interest in Python 1.6? Probably not. The
1.6final and 2.0beta1 releases were made on the same day (September 5,
2000), the plan being to finalize Python 2.0 within a month or so. If
you have applications to maintain, there seems little point in
breaking things by moving to 1.6, fixing them, and then having another
round of breakage within a month by moving to 2.0; you're better off
just going straight to 2.0. Most of the really interesting features
described in this document are only in 2.0, because a lot of work was
done between May and September.

% ======================================================================
\section{Unicode}

The largest new feature in Python 2.0 is a new fundamental data type:
Unicode strings. Unicode uses 16-bit numbers to represent characters
instead of the 8-bit numbers used by ASCII, meaning that 65,536
distinct characters can be supported.

The final interface for Unicode support was arrived at through
countless often-stormy discussions on the python-dev mailing list, and
was mostly implemented by Marc-Andr\'e Lemburg, based on a Unicode string
type implementation by Fredrik Lundh. A detailed explanation of the
interface is in the file \file{Misc/unicode.txt} in the Python source
distribution; it's also available on the Web at
\url{http://starship.python.net/crew/lemburg/unicode-proposal.txt}.
This article will simply cover the most significant points from the
full interface.

In Python source code, Unicode strings are written as
\code{u"string"}. Arbitrary Unicode characters can be written using a
new escape sequence, \code{\e u\var{HHHH}}, where \var{HHHH} is a
4-digit hexadecimal number from 0000 to FFFF. The existing
\code{\e x\var{HHHH}} escape sequence can also be used, and octal
escapes can be used for characters up to U+01FF, which is represented
by \code{\e 777}.

Unicode strings, just like regular strings, are an immutable sequence
type. They can be indexed and sliced, but not modified in place.
Unicode strings have an \method{encode( \optional{encoding} )} method
that returns an 8-bit string in the desired encoding. Encodings are
named by strings, such as \code{'ascii'}, \code{'utf-8'}, or
\code{'iso-8859-1'}. A codec API is defined for
implementing and registering new encodings that are then available
throughout a Python program. If an encoding isn't specified, the
default encoding is usually 7-bit ASCII, though it can be changed for
your Python installation by calling the
\function{sys.setdefaultencoding(\var{encoding})} function in a
customised version of \file{site.py}.
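
As a minimal sketch of the \method{encode()} method described above,
the same Unicode string can be encoded in several ways, and an
encoding that cannot represent a character raises
\exception{UnicodeError} by default. The sample text and the variable
names are illustrative assumptions, not taken from the Unicode
proposal.

```python
# Illustrative sketch; the sample string and names are assumptions.
text = u'caf\u00e9'                  # 4 characters; the last is U+00E9

utf8 = text.encode('utf-8')          # U+00E9 becomes two bytes in UTF-8
latin1 = text.encode('iso-8859-1')   # ...but a single byte in Latin-1

try:
    text.encode('ascii')             # U+00E9 is outside 7-bit ASCII
except UnicodeError:
    ascii_failed = True
else:
    ascii_failed = False
```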

Combining 8-bit and Unicode strings always coerces to Unicode, using
the default ASCII encoding; the result of \code{'a' + u'bc'} is
\code{u'abc'}.

New built-in functions have been added, and existing built-ins
modified to support Unicode:

\begin{itemize}
\item \code{unichr(\var{ch})} returns a Unicode string 1 character
long, containing the character whose number is the integer \var{ch}.

\item \code{ord(\var{u})}, where \var{u} is a 1-character regular or
Unicode string, returns the number of the character as an integer.

\item \code{unicode(\var{string} \optional{, \var{encoding}}
\optional{, \var{errors}} ) } creates a Unicode string from an 8-bit
string. \var{encoding} is a string naming the encoding to use.
The \var{errors} parameter specifies the treatment of characters that
are invalid for the current encoding; passing \code{'strict'} as the
value causes an exception to be raised on any encoding error, while
\code{'ignore'} causes errors to be silently ignored and
\code{'replace'} uses U+FFFD, the official replacement character, in
case of any problems.

\end{itemize}

A new module, \module{unicodedata}, provides an interface to Unicode
character properties. For example, \code{unicodedata.category(u'A')}
returns the 2-character string \code{'Lu'}, the 'L' denoting it's a letter,
and 'u' meaning that it's uppercase.
\code{unicodedata.bidirectional(u'\e u0660')} returns \code{'AN'}, meaning
that U+0660 is an Arabic number.
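
The two lookups above can be tried directly. The following is only a
small sketch; the extra sample characters (\code{u'a'} and
\code{u'7'}) and the variable names are chosen here for illustration.

```python
import unicodedata

# Category codes are 2-character strings: a major class plus a subclass.
upper = unicodedata.category(u'A')        # letter, uppercase
lower = unicodedata.category(u'a')        # letter, lowercase
digit = unicodedata.category(u'7')        # number, decimal digit

# U+0660 is ARABIC-INDIC DIGIT ZERO.
bidi = unicodedata.bidirectional(u'\u0660')
```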

The \module{codecs} module contains functions to look up existing encodings
and register new ones. Unless you want to implement a
new encoding, you'll most often use the
\function{codecs.lookup(\var{encoding})} function, which returns a
4-element tuple: \code{(\var{encode_func},
\var{decode_func}, \var{stream_reader}, \var{stream_writer})}.

\begin{itemize}
\item \var{encode_func} is a function that takes a Unicode string, and
returns a 2-tuple \code{(\var{string}, \var{length})}. \var{string}
is an 8-bit string containing a portion (perhaps all) of the Unicode
string converted into the given encoding, and \var{length} tells you
how much of the Unicode string was converted.

\item \var{decode_func} is the opposite of \var{encode_func}, taking
an 8-bit string and returning a 2-tuple \code{(\var{ustring},
\var{length})}, consisting of the resulting Unicode string
\var{ustring} and the integer \var{length} telling how much of the
8-bit string was consumed.

\item \var{stream_reader} is a class that supports decoding input from
a stream. \code{\var{stream_reader}(\var{file_obj})} returns an object that
supports the \method{read()}, \method{readline()}, and
\method{readlines()} methods. These methods will all translate from
the given encoding and return Unicode strings.

\item \var{stream_writer}, similarly, is a class that supports
encoding output to a stream. \code{\var{stream_writer}(\var{file_obj})}
returns an object that supports the \method{write()} and
\method{writelines()} methods. These methods expect Unicode strings,
translating them to the given encoding on output.
\end{itemize}

For example, the following code writes a Unicode string into a file,
encoding it as UTF-8:

\begin{verbatim}
import codecs

unistr = u'\u0660\u2000ab ...'

(UTF8_encode, UTF8_decode,
 UTF8_streamreader, UTF8_streamwriter) = codecs.lookup('UTF-8')

output = UTF8_streamwriter( open( '/tmp/output', 'wb') )
output.write( unistr )
output.close()
\end{verbatim}

The following code would then read UTF-8 input from the file:

\begin{verbatim}
input = UTF8_streamreader( open( '/tmp/output', 'rb') )
print repr(input.read())
input.close()
\end{verbatim}
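
The first two elements of the tuple returned by
\function{codecs.lookup()} can also be called directly, without going
through a file. The following sketch round-trips a short string; the
variable names are illustrative, and the sample string is a shortened
version of the one used above.

```python
import codecs

# Unpack the 4-element result described above.
encode_func, decode_func, stream_reader, stream_writer = codecs.lookup('utf-8')

unistr = u'\u0660ab'

# encode_func returns the encoded 8-bit data and the number of
# Unicode characters that were consumed.
data, chars_used = encode_func(unistr)

# decode_func reverses the process and reports how many bytes of
# the 8-bit data were consumed.
roundtrip, bytes_used = decode_func(data)
```

Here U+0660 needs two bytes in UTF-8, so the encoded data is one byte
longer than the Unicode string.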

Unicode-aware regular expressions are available through the
\module{re} module, which has a new underlying implementation called
SRE written by Fredrik Lundh of Secret Labs AB.

A \code{-U} command line option was added which causes the Python
compiler to interpret all string literals as Unicode string literals.
This is intended to be used in testing and future-proofing your Python
code, since some future version of Python may drop support for 8-bit
strings and provide only Unicode strings.

% ======================================================================
\section{List Comprehensions}

Lists are a workhorse data type in Python, and many programs
manipulate a list at some point. Two common operations on lists are
to loop over them, and either pick out the elements that meet a
certain criterion, or apply some function to each element. For
example, given a list of strings, you might want to pull out all the
strings containing a given substring, or strip off trailing whitespace
from each line.

The existing \function{map()} and \function{filter()} functions can be
used for this purpose, but they require a function as one of their
arguments. This is fine if there's an existing built-in function that
can be passed directly, but if there isn't, you have to create a
little function to do the required work, and Python's scoping rules
make the result ugly if the little function needs additional
information. Take the first example in the previous paragraph,
finding all the strings in the list containing a given substring. You
could write the following to do it:

\begin{verbatim}
import string

# Given the list L, make a list of all strings
# containing the substring S.
sublist = filter( lambda s, substring=S:
                     string.find(s, substring) != -1,
                  L)
\end{verbatim}

Because of Python's scoping rules, a default argument is used so that
the anonymous function created by the \keyword{lambda} expression knows
what substring is being searched for. List comprehensions make this
cleaner:

\begin{verbatim}
sublist = [ s for s in L if string.find(s, S) != -1 ]
\end{verbatim}

List comprehensions have the form:

\begin{verbatim}
[ expression for expr1 in sequence1
             for expr2 in sequence2 ...
             for exprN in sequenceN
             if condition ]
\end{verbatim}

The \keyword{for}...\keyword{in} clauses contain the sequences to be
iterated over. The sequences do not have to be the same length,
because they are \emph{not} iterated over in parallel, but
from left to right; this is explained more clearly in the following
paragraphs. The elements of the generated list will be the successive
values of \var{expression}. The final \keyword{if} clause is
optional; if present, \var{expression} is only evaluated and added to
the result if \var{condition} is true.
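
The two operations mentioned at the start of this section, applying
an expression to each element and picking out the elements that meet a
criterion, can be sketched as follows. The sample data and the variable
names are made up for illustration.

```python
lines = ['alpha  \n', 'beta\t', 'gamma']

# Apply an expression to every element: strip trailing whitespace.
stripped = [line.rstrip() for line in lines]

# Keep only the elements that satisfy a condition.
with_mm = [s for s in stripped if s.find('mm') != -1]

# Two for clauses: the rightmost one varies fastest, and the
# if clause filters the combined results.
pairs = [(x, y) for x in 'ab' for y in (1, 2, 3) if y != 2]
```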

To make the semantics very clear, a list comprehension is equivalent
to the following Python code:

\begin{verbatim}
for expr1 in sequence1:
    for expr2 in sequence2:
    ...
        for exprN in sequenceN:
             if (condition):
                  # Append the value of
                  # the expression to the
                  # resulting list.
\end{verbatim}

This means that when there are multiple \keyword{for}...\keyword{in}
clauses but no \keyword{if} clause, the length of the resulting list
will be equal to the product of the lengths of all the sequences. If
you have two lists of length 3, the output list is 9 elements long:

\begin{verbatim}
>>> seq1 = 'abc'
>>> seq2 = (1,2,3)
>>> [ (x,y) for x in seq1 for y in seq2]
[('a', 1), ('a', 2), ('a', 3), ('b', 1), ('b', 2), ('b', 3), ('c', 1),
('c', 2), ('c', 3)]
\end{verbatim}

To avoid introducing an ambiguity into Python's grammar, if
\var{expression} is creating a tuple, it must be surrounded with
parentheses. The first list comprehension below is a syntax error,
while the second one is correct:

\begin{verbatim}
# Syntax error
[ x,y for x in seq1 for y in seq2]
# Correct
[ (x,y) for x in seq1 for y in seq2]
\end{verbatim}

The idea of list comprehensions originally comes from the functional
programming language Haskell (\url{http://www.haskell.org}). Greg
Ewing argued most effectively for adding them to Python and wrote the
initial list comprehension patch, which was then discussed for a
seemingly endless time on the python-dev mailing list and kept
up-to-date by Skip Montanaro.

% ======================================================================
\section{Augmented Assignment}

Augmented assignment operators, another long-requested feature, have
been added to Python 2.0. Augmented assignment operators include
\code{+=}, \code{-=}, \code{*=}, and so forth. For example, the
statement \code{a += 2} increments the value of the variable
\code{a} by 2, equivalent to the slightly lengthier \code{a = a + 2}.

The full list of supported assignment operators is \code{+=},
\code{-=}, \code{*=}, \code{/=}, \code{\%=}, \code{**=}, \code{\&=},
\code{|=}, \verb|^=|, \code{>>=}, and \code{<<=}. Python classes can
override the augmented assignment operators by defining methods named
\method{__iadd__}, \method{__isub__}, etc. For example, the following
\class{Number} class stores a number and supports using \code{+=} to
create a new instance with an incremented value.

\begin{verbatim}
class Number:
    def __init__(self, value):
        self.value = value
    def __iadd__(self, increment):
        return Number( self.value + increment)

n = Number(5)
n += 3
print n.value
\end{verbatim}

The \method{__iadd__} special method is called with the value of the
increment, and should return a new instance with an appropriately
modified value; this return value is bound as the new value of the
variable on the left-hand side.
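
The rebinding described in the previous paragraph can be made visible
by keeping a second name for the original instance. This is only a
sketch built on the \class{Number} class above; the name
\code{original} is introduced here for illustration.

```python
class Number:
    def __init__(self, value):
        self.value = value
    def __iadd__(self, increment):
        # Build and return a new instance rather than changing self.
        return Number(self.value + increment)

n = Number(5)
original = n        # second name for the same instance
n += 3              # calls n.__iadd__(3) and rebinds n to the result
```

After the augmented assignment, \code{n} refers to the new instance,
while \code{original} still refers to the old one with its value
unchanged.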

Augmented assignment operators were first introduced in the C
programming language, and most C-derived languages, such as
\program{awk}, C++, Java, Perl, and PHP, also support them. The augmented
assignment patch was implemented by Thomas Wouters.

% ======================================================================
\section{String Methods}

Until now, string-manipulation functionality was in the \module{string}
module, which was usually a front-end for the \module{strop}
module written in C. The addition of Unicode posed a difficulty for
the \module{strop} module, because the functions would all need to be
rewritten in order to accept either 8-bit or Unicode strings. For
functions such as \function{string.replace()}, which takes 3 string
arguments, that means eight possible permutations, and correspondingly
complicated code.

Instead, Python 2.0 pushes the problem onto the string type, making
string manipulation functionality available through methods on both
8-bit strings and Unicode strings.

\begin{verbatim}
>>> 'andrew'.capitalize()
'Andrew'
>>> 'hostname'.replace('os', 'linux')
'hlinuxtname'
>>> 'moshe'.find('sh')
2
\end{verbatim}

One thing that hasn't changed, a noteworthy April Fools' joke
notwithstanding, is that Python strings are immutable. Thus, the
string methods return new strings, and do not modify the string on
which they operate.

The old \module{string} module is still around for backwards
compatibility, but it mostly acts as a front-end to the new string
methods.

Two methods which have no parallel in pre-2.0 versions, although they
did exist in JPython for quite some time, are \method{startswith()}
and \method{endswith()}. \code{s.startswith(t)} is equivalent to \code{s[:len(t)]
== t}, while \code{s.endswith(t)} is equivalent to \code{s[-len(t):] == t}.
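
As a small sketch of these two methods next to the slice-based
spellings they replace (the file name and the variable names are
invented for the example):

```python
s = 'python-2.0.tar.gz'
prefix = 'python'
suffix = '.gz'

starts = s.startswith(prefix)
ends = s.endswith(suffix)

# The slice-based spellings quoted above give the same answers
# for these non-empty arguments.
starts_by_slice = (s[:len(prefix)] == prefix)
ends_by_slice = (s[-len(suffix):] == suffix)
```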

One other method which deserves special mention is \method{join()}. The
\method{join()} method of a string receives one parameter, a sequence of
strings, and is equivalent to the \function{string.join()} function from
the old \module{string} module, with the arguments reversed. In other
words, \code{s.join(seq)} is equivalent to the old
\code{string.join(seq, s)}.
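
Because the argument order is easy to get backwards, a short sketch
may help; the sample lists are illustrative only.

```python
words = ['spam', 'eggs', 'ham']

# The separator is the string whose method is called; the
# sequence to be joined is the argument.
csv_line = ', '.join(words)
path = '/'.join(['usr', 'local', 'lib'])
```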

% ======================================================================
\section{Optional Collection of Cycles}

The C implementation of Python uses reference counting to implement
garbage collection. Every Python object maintains a count of the
number of references pointing to itself, and adjusts the count as
references are created or destroyed. Once the reference count reaches
zero, the object is no longer accessible, since you need to have a
reference to an object to access it, and if the count is zero, no
references exist any longer.

Reference counting has some pleasant properties: it's easy to
understand and implement, and the resulting implementation is
portable, fairly fast, and interacts well with other libraries that
implement their own memory handling schemes. The major problem with
reference counting is that it sometimes doesn't realise that objects
are no longer accessible, resulting in a memory leak. This happens
when there are cycles of references.

Consider the simplest possible cycle,
a class instance which has a reference to itself:

\begin{verbatim}
instance = SomeClass()
instance.myself = instance
\end{verbatim}

After the above two lines of code have been executed, the reference
count of \code{instance} is 2; one reference is from the variable
named \samp{instance}, and the other is from the \samp{myself}
attribute of the instance.

If the next line of code is \code{del instance}, what happens? The
reference count of \code{instance} is decreased by 1, so it has a
reference count of 1; the reference in the \samp{myself} attribute
still exists. Yet the instance is no longer accessible through Python
code, and it could be deleted. Several objects can participate in a
cycle if they have references to each other, causing all of the
objects to be leaked.

An experimental step has been made toward fixing this problem. When
compiling Python, the \verb|--with-cycle-gc| option can be specified.
This causes a cycle detection algorithm to be periodically executed,
which looks for inaccessible cycles and deletes the objects involved.
A new \module{gc} module provides functions to perform a garbage
collection, obtain debugging statistics, and tune the collector's parameters.
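
Assuming an interpreter that was built with the cycle collector, the
self-referencing instance from the example above can be reclaimed by
asking the \module{gc} module for a collection. This is only a sketch:
\class{SomeClass} is an empty placeholder class, and the exact number
reported by the collector depends on the interpreter.

```python
import gc

class SomeClass:
    pass

gc.collect()                  # clear out any garbage that already exists

instance = SomeClass()
instance.myself = instance    # the instance now refers to itself
del instance                  # unreachable, but its reference count is not zero

# collect() runs the cycle detector immediately and returns the
# number of unreachable objects it found.
found = gc.collect()
```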

Why isn't cycle detection enabled by default? Running the cycle detection
algorithm takes some time, and some tuning will be required to
minimize the overhead cost. It's not yet obvious how much performance
is lost, because benchmarking this is tricky and depends crucially
on how often the program creates and destroys objects.

Several people tackled this problem and contributed to a solution. An
early implementation of the cycle detection approach was written by
Toby Kelsey. The current algorithm was suggested by Eric Tiedemann
during a visit to CNRI, and Guido van Rossum and Neil Schemenauer
wrote two different implementations, which were later integrated by
Neil. Lots of other people offered suggestions along the way; the
March 2000 archives of the python-dev mailing list contain most of the
relevant discussion, especially in the threads titled ``Reference
cycle collection for Python'' and ``Finalization again''.

% ======================================================================
\section{Other Core Changes}

Various minor changes have been made to Python's syntax and built-in
functions. None of the changes are very far-reaching, but they're
handy conveniences.

\subsection{Minor Language Changes}

A new syntax makes it more convenient to call a given function
with a tuple of arguments and/or a dictionary of keyword arguments.
In Python 1.5 and earlier, you'd use the \function{apply()}
built-in function: \code{apply(f, \var{args}, \var{kw})} calls the
function \function{f()} with the argument tuple \var{args} and the
keyword arguments in the dictionary \var{kw}. \function{apply()}
is the same in 2.0, but thanks to a patch from
Greg Ewing, \code{f(*\var{args}, **\var{kw})} is now a shorter
and clearer way to achieve the same effect. This syntax is
symmetrical with the syntax for defining functions:

\begin{verbatim}
def f(*args, **kw):
    # args is a tuple of positional args,
    # kw is a dictionary of keyword args
    ...
\end{verbatim}
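
On the calling side, the new syntax might be used as in the following
sketch. The \function{greet()} function and its parameters are
hypothetical, invented purely to have something to call.

```python
def greet(name, greeting='Hello', punctuation='!'):
    return '%s, %s%s' % (greeting, name, punctuation)

args = ('world',)                 # positional arguments
kw = {'punctuation': '?'}         # keyword arguments

# Equivalent to greet('world', punctuation='?')
message = greet(*args, **kw)
```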

The \keyword{print} statement can now have its output directed to a
file-like object by following the \keyword{print} with
\verb|>> file|, similar to the redirection operator in Unix shells.
Previously you'd either have to use the \method{write()} method of the
file-like object, which lacks the convenience and simplicity of
\keyword{print}, or you could assign a new value to
\code{sys.stdout} and then restore the old value. For sending output to standard error,
it's much easier to write this:

\begin{verbatim}
print >> sys.stderr, "Warning: action field not supplied"
\end{verbatim}

Modules can now be renamed on importing them, using the syntax
\code{import \var{module} as \var{name}} or \code{from \var{module}
import \var{name} as \var{othername}}. The patch was submitted by
Thomas Wouters.

A new format style is available when using the \code{\%} operator;
'\%r' will insert the \function{repr()} of its argument. This was
also added for reasons of symmetry, this time with
the existing '\%s' format style, which inserts the \function{str()} of
its argument. For example, \code{'\%r \%s' \% ('abc', 'abc')} returns a
string containing \verb|'abc' abc|.
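
A quick sketch of the difference between the two format styles; the
second string is an invented example of where \function{repr()} output
is useful.

```python
value = 'abc'
both = '%r %s' % (value, value)

# %r is handy in messages, because it makes stray whitespace visible.
padded = 'name is %r' % ' amk '
```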

Previously there was no way to implement a class that overrode
Python's built-in \keyword{in} operator and implemented a custom
version. \code{\var{obj} in \var{seq}} returns true if \var{obj} is
present in the sequence \var{seq}; Python computes this by simply
trying every index of the sequence until either \var{obj} is found or
an \exception{IndexError} is encountered. Moshe Zadka contributed a
patch which adds a \method{__contains__} magic method for providing a
custom implementation for \keyword{in}. Additionally, new built-in
objects written in C can define what \keyword{in} means for them via a
new slot in the sequence protocol.
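
As a sketch of what \method{__contains__} makes possible, the
following hypothetical class answers \keyword{in} tests without
storing any elements at all, something the index-by-index search
could never do. The class and its name are invented for illustration.

```python
class EvenNumbers:
    """Hypothetical container: conceptually holds every even integer."""
    def __contains__(self, obj):
        # Called for "obj in instance"; the result is treated as a
        # true or false value.
        return obj % 2 == 0

evens = EvenNumbers()
four_is_even = 4 in evens
seven_is_even = 7 in evens
```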

Earlier versions of Python used a recursive algorithm for deleting
objects. Deeply nested data structures could cause the interpreter to
fill up the C stack and crash; Christian Tismer rewrote the deletion
logic to fix this problem. On a related note, comparing recursive
objects recursed infinitely and crashed; Jeremy Hylton rewrote the
code to no longer crash, producing a useful result instead. For
example, after this code:

\begin{verbatim}
a = []
b = []
a.append(a)
b.append(b)
\end{verbatim}

the comparison \code{a==b} returns true, because the two recursive
data structures are isomorphic. \footnote{See the thread ``trashcan
and PR\#7'' in the April 2000 archives of the python-dev mailing list
for the discussion leading up to this implementation, and some useful
relevant links.
%http://www.python.org/pipermail/python-dev/2000-April/004834.html
}

Work has been done on porting Python to 64-bit Windows on the Itanium
processor, mostly by Trent Mick of ActiveState. (Confusingly,
\code{sys.platform} is still \code{'win32'} on Win64 because it seems
that for ease of porting, MS Visual C++ treats code as 32-bit on Itanium.)
PythonWin also supports Windows CE; see the Python CE page at
\url{http://starship.python.net/crew/mhammond/ce/} for more
information.

An attempt has been made to alleviate one of Python's warts, the
often-confusing \exception{NameError} exception when code refers to a
local variable before the variable has been assigned a value. For
example, the following code raises an exception on the \keyword{print}
statement in both 1.5.2 and 2.0; in 1.5.2 a \exception{NameError}
exception is raised, while 2.0 raises a new
\exception{UnboundLocalError} exception.
\exception{UnboundLocalError} is a subclass of \exception{NameError},
so any existing code that expects \exception{NameError} to be raised
should still work.

\begin{verbatim}
def f():
    print "i=",i
    i = i + 1
f()
\end{verbatim}
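
The claim that existing \keyword{except} clauses keep working can be
checked with a sketch like the following. It is a variation on the
example above that avoids the \keyword{print} statement; the names
\code{current} and \code{caught} are introduced here for illustration.

```python
import sys

def f():
    current = i        # i is local to f(), but has no value yet
    i = current + 1
    return i

try:
    f()
except NameError:      # also catches the UnboundLocalError subclass
    caught = sys.exc_info()[0]
```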

Two new exceptions, \exception{TabError} and
\exception{IndentationError}, have been introduced. They're both
subclasses of \exception{SyntaxError}, and are raised when Python code
is found to be improperly indented.

\subsection{Changes to Built-in Functions}

A new built-in, \function{zip(\var{seq1}, \var{seq2}, ...)}, has been
added. \function{zip()} returns a list of tuples where each tuple
contains the i-th element from each of the argument sequences. The
difference between \function{zip()} and \code{map(None, \var{seq1},
\var{seq2})} is that \function{map()} pads the sequences with
\code{None} if the sequences aren't all of the same length, while
\function{zip()} truncates the returned list to the length of the
shortest argument sequence.
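
The truncating behaviour can be seen in a short sketch with sequences
of unequal length. The sample data is invented, and the call to
\function{list()} is an addition made here so that the sketch also
behaves the same on interpreters where \function{zip()} does not
build a list itself.

```python
names = ['a', 'b', 'c']
numbers = [1, 2]

# list() changes nothing when zip() already returns a list; it is
# included so the sketch also works where zip() is lazy.
pairs = list(zip(names, numbers))
```

The third element of \code{names} has no partner, so it is dropped
rather than being paired with \code{None}.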

The \function{int()} and \function{long()} functions now accept an
optional ``base'' parameter when the first argument is a string.
\code{int('123', 10)} returns 123, while \code{int('123', 16)} returns
291. \code{int(123, 16)} raises a \exception{TypeError} exception
with the message ``can't convert non-string with explicit base''.
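
The same calls, plus one extra base chosen for illustration, can be
sketched as:

```python
decimal = int('123', 10)
hexadecimal = int('123', 16)
binary = int('1010', 2)

try:
    int(123, 16)           # a base requires a string argument
except TypeError:
    rejected = True
else:
    rejected = False
```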

A new variable holding more detailed version information has been
added to the \module{sys} module. \code{sys.version_info} is a tuple
\code{(\var{major}, \var{minor}, \var{micro}, \var{level},
\var{serial})}. For example, in a hypothetical 2.0.1beta1,
\code{sys.version_info} would be \code{(2, 0, 1, 'beta', 1)}.
\var{level} is a string such as \code{"alpha"}, \code{"beta"}, or
\code{"final"} for a final release.
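
One plausible use is a feature test that avoids parsing the
\code{sys.version} string. This is a sketch under that assumption; the
name \code{has_two_oh_features} is hypothetical.

```python
import sys

# Unpack the five fields described above.
major, minor, micro, level, serial = sys.version_info

# Tuples compare element by element, so a version check needs no
# string parsing.
has_two_oh_features = sys.version_info >= (2, 0)
```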

Dictionaries have an odd new method, \method{setdefault(\var{key},
\var{default})}, which behaves similarly to the existing
\method{get()} method. However, if the key is missing,
\method{setdefault()} both returns the value of \var{default} as
\method{get()} would do, and also inserts it into the dictionary as
the value for \var{key}. Thus, the following lines of code:

\begin{verbatim}
if dict.has_key( key ): return dict[key]
else:
    dict[key] = []
    return dict[key]
\end{verbatim}

can be reduced to a single \code{return dict.setdefault(key, [])} statement.
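
A common use of this idiom is building a dictionary of lists. The
sketch below groups some made-up words by their first letter; the data
and variable names are purely illustrative.

```python
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

index = {}
for word in words:
    # Fetch the list for this initial letter, creating it on first use.
    index.setdefault(word[0], []).append(word)
```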

The interpreter sets a maximum recursion depth in order to catch
runaway recursion before filling the C stack and causing a core dump
or GPF. Previously this limit was fixed when you compiled Python,
but in 2.0 the maximum recursion depth can be read and modified using
\function{sys.getrecursionlimit} and \function{sys.setrecursionlimit}.
The default value is 1000, and a rough maximum value for a given
platform can be found by running a new script,
\file{Misc/find_recursionlimit.py}.
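
A minimal sketch of reading, raising and restoring the limit. The
increment of 500 is an arbitrary value chosen for illustration, not a
recommendation.

```python
import sys

original_limit = sys.getrecursionlimit()

# Raise the limit temporarily, for example around a deeply recursive
# computation.
sys.setrecursionlimit(original_limit + 500)
raised_limit = sys.getrecursionlimit()

# Restore the previous setting afterwards.
sys.setrecursionlimit(original_limit)
```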

% ======================================================================
\section{Porting to 2.0}

New Python releases try hard to be compatible with previous releases,
and the record has been pretty good. However, some changes are
considered useful enough, usually because they fix initial design decisions that
turned out to be actively mistaken, that breaking backward compatibility
can't always be avoided. This section lists the changes in Python 2.0
that may cause old Python code to break.

The change which will probably break the most code is tightening up
the arguments accepted by some methods. Some methods would take
multiple arguments and treat them as a tuple, particularly various
list methods such as \method{.append()} and \method{.insert()}.
In earlier versions of Python, if \code{L} is a list, \code{L.append(
1,2 )} appends the tuple \code{(1,2)} to the list. In Python 2.0 this
causes a \exception{TypeError} exception to be raised, with the
message: 'append requires exactly 1 argument; 2 given'. The fix is to
simply add an extra set of parentheses to pass both values as a tuple:
\code{L.append( (1,2) )}.
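
The difference between the two calls can be sketched as follows; the
exact wording of the error message may differ between releases, so
only the exception class is recorded here.

```python
import sys

L = []

try:
    L.append(1, 2)       # two arguments: an error from 2.0 onwards
    error_type = None
except TypeError:
    error_type = sys.exc_info()[0]

L.append((1, 2))         # one argument that happens to be a tuple
```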

The earlier versions of these methods were more forgiving because they
used an old function in Python's C interface to parse their arguments;
2.0 modernizes them to use \function{PyArg_ParseTuple}, the current
argument parsing function, which provides more helpful error messages
and treats multi-argument calls as errors. If you absolutely must use
2.0 but can't fix your code, you can edit \file{Objects/listobject.c}
and define the preprocessor symbol \code{NO_STRICT_LIST_APPEND} to
preserve the old behaviour; this isn't recommended.

Some of the functions in the \module{socket} module are still
forgiving in this way. For example, \function{socket.connect(
('hostname', 25) )} is the correct form, passing a tuple representing
an IP address, but \function{socket.connect( 'hostname', 25 )} also
works. \function{socket.connect_ex()} and \function{socket.bind()} are
similarly easy-going. 2.0alpha1 tightened these functions up, but
because the documentation actually used the erroneous multiple
argument form, many people wrote code which would break with the
stricter checking. GvR backed out the changes in the face of public
reaction, so for the \module{socket} module, the documentation was
fixed and the multiple argument form is simply marked as deprecated;
it \emph{will} be tightened up again in a future Python version.

The \code{\e x} escape in string literals now takes exactly 2 hex
digits. Previously it would consume all the hex digits following the
'x' and take the lowest 8 bits of the result, so \code{\e x123456} was
equivalent to \code{\e x56}.
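
A short sketch of the new rule, using the same literal as above. Under
the old behaviour the string would have contained a single character;
the variable names are illustrative.

```python
s = "\x123456"

# Only the two digits after 'x' belong to the escape; '3456' is
# ordinary text that follows it.
first = s[0]
rest = s[1:]
```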

The \exception{AttributeError} exception has a more friendly error message,
whose text will be something like \code{'Spam' instance has no attribute 'eggs'}.
Previously the error message was just the missing attribute name \code{eggs}, and
code written to take advantage of this fact will break in 2.0.
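
Code is more robust if it does not depend on the exact text of the
message at all. The following is a hedged sketch rather than a
prescribed fix; the class \class{Spam} and the attribute \code{eggs}
simply follow the wording above, and the precise message text varies
between releases.

```python
import sys

class Spam:
    pass

try:
    Spam().eggs
except AttributeError:
    message = str(sys.exc_info()[1])

# Fragile: comparing the message with the bare attribute name.
old_style_check = (message == "eggs")

# More robust: ask the question directly instead of parsing the message.
has_eggs = hasattr(Spam(), "eggs")
```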

Some work has been done to make integers and long integers a bit more
interchangeable. In 1.5.2, large-file support was added for Solaris,
to allow reading files larger than 2GB; this made the \method{tell()}
method of file objects return a long integer instead of a regular
integer. Some code would subtract two file offsets and attempt to use
the result to multiply a sequence or slice a string, but this raised a
\exception{TypeError}. In 2.0, long integers can be used to multiply
or slice a sequence, and it'll behave as you'd intuitively expect it
to; \code{3L * 'abc'} produces 'abcabcabc', and \code{
(0,1,2,3)[2L:4L]} produces (2,3). Long integers can also be used in
various new places where previously only integers were accepted, such
as in the \method{seek()} method of file objects.

The subtlest long integer change of all is that the \function{str()}
of a long integer no longer has a trailing 'L' character, though
\function{repr()} still includes it. The 'L' annoyed many people who
wanted to print long integers that looked just like regular integers,
since they had to go out of their way to chop off the character. This
is no longer a problem in 2.0, but code which does
\code{str(longval)[:-1]} and assumes the 'L' is there will now lose
the final digit.

Taking the \function{repr()} of a float now uses a different
formatting precision than \function{str()}. \function{repr()} uses
the \code{\%.17g} format string for C's \function{sprintf()}, while
\function{str()} uses \code{\%.12g} as before. The effect is that
\function{repr()} may occasionally show more decimal places than
\function{str()}, for certain numbers.
For example, the number 8.1 can't be represented exactly in binary, so
\code{repr(8.1)} is \code{'8.0999999999999996'}, while \code{str(8.1)} is
\code{'8.1'}.
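
The two precisions can be compared directly with the \code{\%}
formatting operator and the same format strings. This sketch
deliberately avoids calling \function{repr()} itself, whose output
depends on the interpreter version in use; the variable names are
illustrative.

```python
value = 8.1

# The two format strings named above, applied with the % operator.
as_repr = "%.17g" % value
as_str = "%.12g" % value
```

With these precisions \code{as_repr} is \code{'8.0999999999999996'} and
\code{as_str} is \code{'8.1'}.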

The \code{-X} command-line option, which turned all standard
exceptions into strings instead of classes, has been removed; the
standard exceptions will now always be classes. The
\module{exceptions} module containing the standard exceptions was
translated from Python to a built-in C module, written by Barry Warsaw
and Fredrik Lundh.

% Commented out for now -- I don't think anyone will care.
%The pattern and match objects provided by SRE are C types, not Python
%class instances as in 1.5. This means you can no longer inherit from
%\class{RegexObject} or \class{MatchObject}, but that shouldn't be much
%of a problem since no one should have been doing that in the first
%place.

% ======================================================================
\section{Extending/Embedding Changes}

Some of the changes are under the covers, and will only be apparent to
people writing C extension modules or embedding a Python interpreter
in a larger application. If you aren't dealing with Python's C API,
you can safely skip this section.

The version number of the Python C API was incremented, so C
extensions compiled for 1.5.2 must be recompiled in order to work with
2.0. On Windows, attempting to import a third party extension built
for Python 1.5.x usually results in an immediate crash; there's not
much we can do about this. (Here's Mark Hammond's explanation of the
reasons for the crash. The 1.5 module is linked against
\file{Python15.dll}. When \file{Python.exe}, linked against
\file{Python16.dll}, starts up, it initializes the Python data
structures in \file{Python16.dll}. When Python then imports the
module \file{foo.pyd} linked against \file{Python15.dll}, it
immediately tries to call the functions in that DLL. As Python has
not been initialized in that DLL, the program immediately crashes.)

Users of Jim Fulton's ExtensionClass module will be pleased to find
out that hooks have been added so that ExtensionClasses are now
supported by \function{isinstance()} and \function{issubclass()}.
This means you no longer have to remember to write code such as
\code{if type(obj) == myExtensionClass}, but can use the more natural
\code{if isinstance(obj, myExtensionClass)}.
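
The general advantage of \function{isinstance()} over an exact type
comparison can be sketched with ordinary Python classes. The names
\class{Base} and \class{Derived} are hypothetical and merely stand in
for an ExtensionClass and a subclass of it; no ExtensionClass code is
involved.

```python
class Base:
    pass

class Derived(Base):
    pass

obj = Derived()

# An exact type comparison does not recognise instances of subclasses,
exact_match = (type(obj) == Base)

# while isinstance() follows the inheritance chain.
is_base_instance = isinstance(obj, Base)
```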

The \file{Python/importdl.c} file, which was a mass of \#ifdefs to
support dynamic loading on many different platforms, was cleaned up
and reorganised by Greg Stein. \file{importdl.c} is now quite small,
and platform-specific code has been moved into a bunch of
\file{Python/dynload_*.c} files. Another cleanup: there were also a
number of \file{my*.h} files in the Include/ directory that held
various portability hacks; they've been merged into a single file,
\file{Include/pyport.h}.

Vladimir Marangozov's long-awaited malloc restructuring was completed,
to make it easy to have the Python interpreter use a custom allocator
instead of C's standard \function{malloc()}. For documentation, read
the comments in \file{Include/pymem.h} and
\file{Include/objimpl.h}. For the lengthy discussions during which
the interface was hammered out, see the Web archives of the 'patches'
and 'python-dev' lists at python.org.

Recent versions of the GUSI development environment for MacOS support
POSIX threads. Therefore, Python's POSIX threading support now works
on the Macintosh. Threading support using the user-space GNU \texttt{pth}
library was also contributed.

Threading support on Windows was enhanced, too. Windows supports
thread locks that use kernel objects only in case of contention; in
the common case when there's no contention, they use simpler functions
which are an order of magnitude faster. A threaded version of Python
1.5.2 on NT is twice as slow as an unthreaded version; with the 2.0
changes, the difference is only 10\%. These improvements were
contributed by Yakov Markovitch.

Python 2.0's source now uses only ANSI C prototypes, so compiling Python now
requires an ANSI C compiler, and can no longer be done using a compiler that
only supports K\&R C.

Previously the Python virtual machine used 16-bit numbers in its
bytecode, limiting the size of source files. In particular, this
affected the maximum size of literal lists and dictionaries in Python
source; occasionally people who are generating Python code would run
into this limit. A patch by Charles G. Waldman raises the limit from
\verb|2^16| to \verb|2^32|.

% ======================================================================
\section{Distutils: Making Modules Easy to Install}

Before Python 2.0, installing modules was a tedious affair -- there
was no way to figure out automatically where Python is installed, or
what compiler options to use for extension modules. Software authors
had to go through an arduous ritual of editing Makefiles and
configuration files, which only really work on Unix and leave Windows
and MacOS unsupported. Software users faced wildly differing
installation instructions.

The SIG for distribution utilities, shepherded by Greg Ward, has
created the Distutils, a system to make package installation much
easier. They form the \module{distutils} package, a new part of
Python's standard library. In the best case, installing a Python
module from source will require the same steps: first you simply
unpack the tarball or zip archive, and then run ``\code{python setup.py
install}''. The platform will be automatically detected, the compiler
will be recognized, C extension modules will be compiled, and the
distribution installed into the proper directory. Optional
command-line arguments provide more control over the installation
process; the distutils package offers many places to override defaults
-- separating the build from the install, building or installing in
non-default directories, and more.

In order to use the Distutils, you need to write a \file{setup.py}
script. For the simple case, when the software contains only .py
files, a minimal \file{setup.py} can be just a few lines long:

\begin{verbatim}
from distutils.core import setup
setup (name = "foo", version = "1.0",
       py_modules = ["module1", "module2"])
\end{verbatim}

The \file{setup.py} file isn't much more complicated if the software
consists of a few packages:

\begin{verbatim}
from distutils.core import setup
setup (name = "foo", version = "1.0",
       packages = ["package", "package.subpackage"])
\end{verbatim}

A C extension can be the most complicated case; here's an example taken from
the PyXML package:

\begin{verbatim}
from distutils.core import setup, Extension

expat_extension = Extension('xml.parsers.pyexpat',
     define_macros = [('XML_NS', None)],
     include_dirs = [ 'extensions/expat/xmltok',
                      'extensions/expat/xmlparse' ],
     sources = [ 'extensions/pyexpat.c',
                 'extensions/expat/xmltok/xmltok.c',
                 'extensions/expat/xmltok/xmlrole.c',
               ]
       )
setup (name = "PyXML", version = "0.5.4",
       ext_modules =[ expat_extension ] )
\end{verbatim}

The Distutils can also take care of creating source and binary
distributions. The ``sdist'' command, run by ``\code{python setup.py
sdist}'', builds a source distribution such as \file{foo-1.0.tar.gz}.
Adding new commands isn't difficult; ``bdist_rpm'' and
``bdist_wininst'' commands have already been contributed to create an
RPM distribution and a Windows installer for the software,
respectively. Commands to create other distribution formats such as
Debian packages and Solaris \file{.pkg} files are in various stages of
development.

All this is documented in a new manual, \textit{Distributing Python
Modules}, that joins the basic set of Python documentation.

% ======================================================================
%\section{New XML Code}

%XXX write this section...

% ======================================================================
\section{Module changes}

Lots of improvements and bugfixes were made to Python's extensive
standard library; some of the affected modules include
\module{readline}, \module{ConfigParser}, \module{cgi},
\module{calendar}, \module{posix}, \module{xmllib},
\module{aifc}, \module{chunk}, \module{wave}, \module{random},
\module{shelve}, and \module{nntplib}. Consult the CVS logs for the exact
patch-by-patch details.

Brian Gallew contributed OpenSSL support for the \module{socket}
module. OpenSSL is an implementation of the Secure Sockets Layer,
which encrypts the data being sent over a socket. When compiling
Python, you can edit \file{Modules/Setup} to include SSL support,
which adds an additional function to the \module{socket} module:
\function{socket.ssl(\var{socket}, \var{keyfile}, \var{certfile})},
which takes a socket object and returns an SSL socket. The
\module{httplib} and \module{urllib} modules were also changed to
support ``https://'' URLs, though no one has implemented FTP or SMTP
over SSL.

The \module{httplib} module has been rewritten by Greg Stein to
support HTTP/1.1. Backward compatibility with the 1.5 version of
\module{httplib} is provided, though using HTTP/1.1 features such as
pipelining will require rewriting code to use a different set of
interfaces.

The \module{Tkinter} module now supports Tcl/Tk version 8.1, 8.2, or
8.3, and support for the older 7.x versions has been dropped. The
\module{Tkinter} module now supports displaying Unicode strings in Tk widgets.
Also, Fredrik Lundh contributed an optimization which makes operations
like \code{create_line} and \code{create_polygon} much faster,
especially when using lots of coordinates.

The \module{curses} module has been greatly extended, starting from
Oliver Andrich's enhanced version, to provide many additional
functions from ncurses and SYSV curses, such as colour, alternative
character set support, pads, and mouse support. This means the module
is no longer compatible with operating systems that only have BSD
curses, but there don't seem to be any currently maintained OSes that
fall into this category.

As mentioned in the earlier discussion of 2.0's Unicode support, the
underlying implementation of the regular expressions provided by the
\module{re} module has been changed. SRE, a new regular expression
engine written by Fredrik Lundh and partially funded by Hewlett
Packard, supports matching against both 8-bit strings and Unicode
strings.

% ======================================================================
\section{New modules}

A number of new modules were added. We'll simply list them with brief
descriptions; consult the 2.0 documentation for the details of a
particular module.

\begin{itemize}

\item{\module{atexit}:}
For registering functions to be called before the Python interpreter exits.
Code that currently sets
\code{sys.exitfunc} directly should be changed to
use the \module{atexit} module instead, importing \module{atexit}
and calling \function{atexit.register()} with
the function to be called on exit.
(Contributed by Skip Montanaro.)

\item{\module{codecs}, \module{encodings}, \module{unicodedata}:} Added as part of the new Unicode support.

\item{\module{filecmp}:} Supersedes the old \module{cmp}, \module{cmpcache} and
\module{dircmp} modules, which have now become deprecated.
(Contributed by Gordon MacMillan and Moshe Zadka.)

\item{\module{linuxaudiodev}:} Support for the \file{/dev/audio}
device on Linux, a twin to the existing \module{sunaudiodev} module.
(Contributed by Peter Bosch.)

\item{\module{mmap}:} An interface to memory-mapped files on both
Windows and Unix. A file's contents can be mapped directly into
memory, at which point it behaves like a mutable string, so its
contents can be read and modified. They can even be passed to
functions that expect ordinary strings, such as the \module{re}
module. (Contributed by Sam Rushing, with some extensions by
A.M. Kuchling.)

\item{\module{pyexpat}:} An interface to the Expat XML parser.
(Contributed by Paul Prescod.)

\item{\module{robotparser}:} Parse a \file{robots.txt} file, which is
used for writing Web spiders that politely avoid certain areas of a
Web site. The parser accepts the contents of a \file{robots.txt} file,
builds a set of rules from it, and can then answer questions about
the fetchability of a given URL. (Contributed by Skip Montanaro.)

\item{\module{tabnanny}:} A module/script to
check Python source code for ambiguous indentation.
(Contributed by Tim Peters.)

\item{\module{UserString}:} A base class useful for deriving objects that behave like strings.

\item{\module{webbrowser}:} A module that provides a platform-independent
way to launch a web browser on a specific URL. For each platform, various
browsers are tried in a specific order. The user can alter which browser
is launched by setting the \var{BROWSER} environment variable.
(Originally inspired by Eric S. Raymond's patch to \module{urllib}
which added similar functionality, but
the final module comes from code originally
implemented by Fred Drake as \file{Tools/idle/BrowserControl.py},
and adapted for the standard library by Fred.)

\item{\module{_winreg}:} An interface to the
Windows registry. \module{_winreg} is an adaptation of functions that
have been part of PythonWin since 1995, but has now been added to the core
distribution, and enhanced to support Unicode.
\module{_winreg} was written by Bill Tutt and Mark Hammond.

\item{\module{zipfile}:} A module for reading and writing ZIP-format
archives. These are archives produced by \program{PKZIP} on
DOS/Windows or \program{zip} on Unix, not to be confused with
\program{gzip}-format files (which are supported by the \module{gzip}
module).
(Contributed by James C. Ahlstrom.)

\item{\module{imputil}:} A module that provides a simpler way for
writing customised import hooks, in comparison to the existing
\module{ihooks} module. (Implemented by Greg Stein, with much
discussion on python-dev along the way.)

\end{itemize}
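
Two of the modules listed above, \module{atexit} and \module{zipfile},
can be sketched briefly. This is an illustrative example only: the
function \function{note\_exit}, the archive name \file{demo.zip} and its
contents are made up, and the calls are written against the interfaces
of later Python releases (for instance \function{tempfile.mkdtemp()}
and passing a plain name to \method{writestr()}), so details may differ
in 2.0 itself.

```python
import atexit
import os
import tempfile
import zipfile

events = []

def note_exit():
    # Called once, when the interpreter shuts down normally.
    events.append("exiting")

# Registering the function replaces a direct assignment to sys.exitfunc.
atexit.register(note_exit)

# Write a one-member ZIP archive into a scratch directory ...
scratch_dir = tempfile.mkdtemp()
archive_path = os.path.join(scratch_dir, "demo.zip")

archive = zipfile.ZipFile(archive_path, "w")
archive.writestr("hello.txt", "hello, world")
archive.close()

# ... then reopen it and list its members.
archive = zipfile.ZipFile(archive_path, "r")
names = archive.namelist()
archive.close()

# Clean up the scratch files.
os.remove(archive_path)
os.rmdir(scratch_dir)
```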

% ======================================================================
\section{IDLE Improvements}

IDLE is the official Python cross-platform IDE, written using Tkinter.
Python 2.0 includes IDLE 0.6, which adds a number of new features and
improvements. A partial list:

\begin{itemize}
\item UI improvements and optimizations,
especially in the area of syntax highlighting and auto-indentation.

\item The class browser now shows more information, such as the top
level functions in a module.

\item Tab width is now a user settable option. When opening an existing Python
file, IDLE automatically detects the indentation conventions, and adapts.

\item There is now support for calling browsers on various platforms,
used to open the Python documentation in a browser.

\item IDLE now has a command line, which is largely similar to
the vanilla Python interpreter.

\item Call tips were added in many places.

\item IDLE can now be installed as a package.

\item In the editor window, there is now a line/column bar at the bottom.

\item Three new keystroke commands: Check module (Alt-F5), Import
module (F5) and Run script (Ctrl-F5).

\end{itemize}

% ======================================================================
\section{Deleted and Deprecated Modules}

A few modules have been dropped because they're obsolete, or because
there are now better ways to do the same thing. The \module{stdwin}
module is gone; it was for a platform-independent windowing toolkit
that's no longer developed.

A number of modules have been moved to the
\file{lib-old} subdirectory:
\module{cmp}, \module{cmpcache}, \module{dircmp}, \module{dump},
\module{find}, \module{grep}, \module{packmail},
\module{poly}, \module{util}, \module{whatsound}, \module{zmod}.
If you have code which relies on a module that's been moved to
\file{lib-old}, you can simply add that directory to \code{sys.path}
to get them back, but you're encouraged to update any code that uses
these modules.

\section{Acknowledgements}

The authors would like to thank the following people for offering
suggestions on drafts of this article: Mark Hammond, Gregg Hauser,
Fredrik Lundh, Detlef Lannert, Skip Montanaro, Vladimir Marangozov,
Guido van Rossum, and Neil Schemenauer.

\end{document}