\documentclass{howto}

\title{What's New in Python 2.0}
\release{0.05}
\author{A.M. Kuchling and Moshe Zadka}
\authoraddress{\email{amk1@bigfoot.com}, \email{moshez@math.huji.ac.il} }
\begin{document}
\maketitle\tableofcontents

\section{Introduction}

{\large This is a draft document; please report inaccuracies and
omissions to the authors. This document should not be treated as
definitive; features described here might be removed or changed during
the beta cycle before the final release of Python 2.0.
}

A new release of Python, version 2.0, is due some time this
summer. Beta versions are already available from
\url{http://www.pythonlabs.com/products/python2.0/}. This article
covers the exciting new features in 2.0, highlights some other useful
changes, and points out a few incompatible changes that may require
rewriting code.

Python's development never completely stops between releases, and a
steady flow of bug fixes and improvements is always being submitted.
A host of minor fixes, a few optimizations, additional docstrings, and
better error messages went into 2.0; to list them all would be
impossible, but they're certainly significant. Consult the
publicly-available CVS logs if you want to see the full list.

% ======================================================================
\section{What About Python 1.6?}

Python 1.6 can be thought of as the Contractual Obligations Python
release. After the core development team left CNRI in May 2000, CNRI
requested that a 1.6 release be created, containing all the work on
Python that had been performed at CNRI. Python 1.6 therefore
represents the state of the CVS tree as of May 2000, with the most
significant new feature being Unicode support. Development continued
after May, of course, so the 1.6 tree received a few fixes to ensure
that it's forward-compatible with Python 2.0. 1.6 is therefore part
of Python's evolution, and not a side branch.

So, should you take much interest in Python 1.6? Probably not. The
1.6final and 2.0beta1 releases were made on the same day (September 5,
2000), the plan being to finalize Python 2.0 within a month or so. If
you have applications to maintain, there seems little point in
breaking things by moving to 1.6, fixing them, and then having another
round of breakage within a month by moving to 2.0; you're better off
just going straight to 2.0. Most of the really interesting features
described in this document are only in 2.0, because a lot of work was
done between May and September.

% ======================================================================
\section{Unicode}

The largest new feature in Python 2.0 is a new fundamental data type:
Unicode strings. Unicode uses 16-bit numbers to represent characters
instead of the 8-bit number used by ASCII, meaning that 65,536
distinct characters can be supported.

The final interface for Unicode support was arrived at through
countless often-stormy discussions on the python-dev mailing list, and
was mostly implemented by Marc-Andr\'e Lemburg, based on a Unicode string
type implementation by Fredrik Lundh. A detailed explanation of the
interface is in the file \file{Misc/unicode.txt} in the Python source
distribution; it's also available on the Web at
\url{http://starship.python.net/crew/lemburg/unicode-proposal.txt}.
This article will simply cover the most significant points from the
full interface.

In Python source code, Unicode strings are written as
\code{u"string"}. Arbitrary Unicode characters can be written using a
new escape sequence, \code{\e u\var{HHHH}}, where \var{HHHH} is a
4-digit hexadecimal number from 0000 to FFFF. The existing
\code{\e x\var{HHHH}} escape sequence can also be used, and octal
escapes can be used for characters up to U+01FF, which is represented
by \code{\e 777}.

Unicode strings, just like regular strings, are an immutable sequence
type. They can be indexed and sliced, but not modified in place.
Unicode strings have an \method{encode( \optional{encoding} )} method
that returns an 8-bit string in the desired encoding. Encodings are
named by strings, such as \code{'ascii'}, \code{'utf-8'},
\code{'iso-8859-1'}, and so on. A codec API is defined for
implementing and registering new encodings that are then available
throughout a Python program. If an encoding isn't specified, the
default encoding is usually 7-bit ASCII, though it can be changed for
your Python installation by calling the
\function{sys.setdefaultencoding(\var{encoding})} function in a
customised version of \file{site.py}.
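
As a brief sketch of \method{encode()} (the sample character and the
variable names are illustrative only), the same one-character Unicode
string produces 8-bit strings of different lengths depending on the
encoding, and an encoding that cannot represent the character raises
an exception:

\begin{verbatim}
# U+00E9 is LATIN SMALL LETTER E WITH ACUTE.
ustr = u'\u00e9'

utf8_bytes = ustr.encode('utf-8')         # two bytes in UTF-8
latin1_bytes = ustr.encode('iso-8859-1')  # one byte in Latin-1

# ASCII has no representation for U+00E9, so encoding fails.
try:
    ustr.encode('ascii')
    fits_ascii = 1
except UnicodeError:
    fits_ascii = 0

print(len(utf8_bytes))    # 2
print(len(latin1_bytes))  # 1
\end{verbatim}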

Combining 8-bit and Unicode strings always coerces to Unicode, using
the default ASCII encoding; the result of \code{'a' + u'bc'} is
\code{u'abc'}.

New built-in functions have been added, and existing built-ins
modified to support Unicode:

\begin{itemize}
\item \code{unichr(\var{ch})} returns a Unicode string 1 character
long, containing the character whose code is the integer \var{ch}.

\item \code{ord(\var{u})}, where \var{u} is a 1-character regular or
Unicode string, returns the number of the character as an integer.

\item \code{unicode(\var{string} \optional{, \var{encoding}}
\optional{, \var{errors}} ) } creates a Unicode string from an 8-bit
string. \code{encoding} is a string naming the encoding to use.
The \code{errors} parameter specifies the treatment of characters that
are invalid for the current encoding; passing \code{'strict'} as the
value causes an exception to be raised on any encoding error, while
\code{'ignore'} causes errors to be silently ignored and
\code{'replace'} uses U+FFFD, the official replacement character, in
case of any problems.

\end{itemize}

A new module, \module{unicodedata}, provides an interface to Unicode
character properties. For example, \code{unicodedata.category(u'A')}
returns the 2-character string \code{'Lu'}, the 'L' denoting it's a letter,
and 'u' meaning that it's uppercase.
\code{unicodedata.bidirectional(u'\e u0660')} returns \code{'AN'},
meaning that U+0660 is an Arabic number.
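
The two lookups described above can be tried directly. This is only a
small sketch; the variable names are illustrative, and the extra
\code{category()} call on U+0660 is an addition to the examples in the
text:

\begin{verbatim}
import unicodedata

# 'Lu': Letter, uppercase.
cat_upper = unicodedata.category(u'A')

# 'Nd': Number, decimal digit (U+0660 is ARABIC-INDIC DIGIT ZERO).
cat_digit = unicodedata.category(u'\u0660')

# 'AN': Arabic number, as used by the bidirectional algorithm.
bidi = unicodedata.bidirectional(u'\u0660')
\end{verbatim}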

The \module{codecs} module contains functions to look up existing encodings
and register new ones. Unless you want to implement a
new encoding, you'll most often use the
\function{codecs.lookup(\var{encoding})} function, which returns a
4-element tuple: \code{(\var{encode_func},
\var{decode_func}, \var{stream_reader}, \var{stream_writer})}.

\begin{itemize}
\item \var{encode_func} is a function that takes a Unicode string, and
returns a 2-tuple \code{(\var{string}, \var{length})}. \var{string}
is an 8-bit string containing a portion (perhaps all) of the Unicode
string converted into the given encoding, and \var{length} tells you
how much of the Unicode string was converted.

\item \var{decode_func} is the mirror of \var{encode_func},
taking an 8-bit string and
returning a 2-tuple \code{(\var{ustring}, \var{length})}, where
\var{ustring} is the resulting Unicode string
and \var{length} tells you how much of the 8-bit string was consumed.

\item \var{stream_reader} is a class that supports decoding input from
a stream. \var{stream_reader(\var{file_obj})} returns an object that
supports the \method{read()}, \method{readline()}, and
\method{readlines()} methods. These methods will all translate from
the given encoding and return Unicode strings.

\item \var{stream_writer}, similarly, is a class that supports
encoding output to a stream. \var{stream_writer(\var{file_obj})}
returns an object that supports the \method{write()} and
\method{writelines()} methods. These methods expect Unicode strings,
translating them to the given encoding on output.
\end{itemize}
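
The first two elements of the tuple can be used on their own. The
following sketch (the variable names and the sample string are
illustrative) round-trips a short Unicode string through the UTF-8
codec and inspects the \var{length} values that come back:

\begin{verbatim}
import codecs

(encode_func, decode_func,
 stream_reader, stream_writer) = codecs.lookup('utf-8')

# Encoding: Unicode string -> (8-bit string, characters consumed)
encoded, chars_used = encode_func(u'\u0660ab')

# Decoding: 8-bit string -> (Unicode string, bytes consumed)
decoded, bytes_used = decode_func(encoded)

print(chars_used)   # 3 characters were encoded
print(bytes_used)   # 4 bytes were decoded; U+0660 takes two in UTF-8
\end{verbatim}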

For example, the following code writes a Unicode string into a file,
encoding it as UTF-8:

\begin{verbatim}
import codecs

unistr = u'\u0660\u2000ab ...'

(UTF8_encode, UTF8_decode,
 UTF8_streamreader, UTF8_streamwriter) = codecs.lookup('UTF-8')

output = UTF8_streamwriter( open( '/tmp/output', 'wb') )
output.write( unistr )
output.close()
\end{verbatim}

The following code would then read UTF-8 input from the file:

\begin{verbatim}
input = UTF8_streamreader( open( '/tmp/output', 'rb') )
print repr(input.read())
input.close()
\end{verbatim}

Unicode-aware regular expressions are available through the
\module{re} module, which has a new underlying implementation called
SRE written by Fredrik Lundh of Secret Labs AB.

A \code{-U} command line option was added which causes the Python
compiler to interpret all string literals as Unicode string literals.
This is intended to be used in testing and future-proofing your Python
code, since some future version of Python may drop support for 8-bit
strings and provide only Unicode strings.

% ======================================================================
\section{List Comprehensions}

Lists are a workhorse data type in Python, and many programs
manipulate a list at some point. Two common operations on lists are
to loop over them, and either pick out the elements that meet a
certain criterion, or apply some function to each element. For
example, given a list of strings, you might want to pull out all the
strings containing a given substring, or strip off trailing whitespace
from each line.

The existing \function{map()} and \function{filter()} functions can be
used for this purpose, but they require a function as one of their
arguments. This is fine if there's an existing built-in function that
can be passed directly, but if there isn't, you have to create a
little function to do the required work, and Python's scoping rules
make the result ugly if the little function needs additional
information. Take the first example in the previous paragraph,
finding all the strings in the list containing a given substring. You
could write the following to do it:

\begin{verbatim}
import string

# Given the list L, make a list of all strings
# containing the substring S.
sublist = filter( lambda s, substring=S:
                     string.find(s, substring) != -1,
                  L)
\end{verbatim}

Because of Python's scoping rules, a default argument is used so that
the anonymous function created by the \keyword{lambda} expression knows
what substring is being searched for. List comprehensions make this
cleaner:

\begin{verbatim}
sublist = [ s for s in L if string.find(s, S) != -1 ]
\end{verbatim}

List comprehensions have the form:

\begin{verbatim}
[ expression for expr1 in sequence1
             for expr2 in sequence2 ...
             for exprN in sequenceN
             if condition ]
\end{verbatim}

The \keyword{for}...\keyword{in} clauses contain the sequences to be
iterated over. The sequences do not have to be the same length,
because they are \emph{not} iterated over in parallel, but
from left to right; this is explained more clearly in the following
paragraphs. The elements of the generated list will be the successive
values of \var{expression}. The final \keyword{if} clause is
optional; if present, \var{expression} is only evaluated and added to
the result if \var{condition} is true.

To make the semantics very clear, a list comprehension is equivalent
to the following Python code:

\begin{verbatim}
for expr1 in sequence1:
    for expr2 in sequence2:
        ...
            for exprN in sequenceN:
                if (condition):
                    # Append the value of
                    # the expression to the
                    # resulting list.
\end{verbatim}
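
As a concrete instance of this equivalence, here is a small sketch of
the other task mentioned at the start of this section, stripping
trailing whitespace from each line. The sample data and variable names
are made up for illustration, and blank lines are skipped to show the
\keyword{if} clause at work:

\begin{verbatim}
lines = ['spam  \n', 'eggs\n', '   \n', 'ham \t\n']

# Comprehension: strip trailing whitespace, skipping blank lines.
stripped = [ line.rstrip() for line in lines if line.strip() ]

# The equivalent explicit loop.
result = []
for line in lines:
    if line.strip():
        result.append(line.rstrip())

print(stripped)   # ['spam', 'eggs', 'ham']
\end{verbatim}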

This means that when there are multiple \keyword{for}...\keyword{in}
clauses and no \keyword{if} clause, the length of the resulting list
will be equal to the product of the lengths of all
the sequences. If you have two sequences of length 3, the output list is
9 elements long:

\begin{verbatim}
seq1 = 'abc'
seq2 = (1,2,3)
>>> [ (x,y) for x in seq1 for y in seq2]
[('a', 1), ('a', 2), ('a', 3), ('b', 1), ('b', 2), ('b', 3), ('c', 1),
('c', 2), ('c', 3)]
\end{verbatim}

To avoid introducing an ambiguity into Python's grammar, if
\var{expression} is creating a tuple, it must be surrounded with
parentheses. The first list comprehension below is a syntax error,
while the second one is correct:

\begin{verbatim}
# Syntax error
[ x,y for x in seq1 for y in seq2]
# Correct
[ (x,y) for x in seq1 for y in seq2]
\end{verbatim}

The idea of list comprehensions originally comes from the functional
programming language Haskell (\url{http://www.haskell.org}). Greg
Ewing argued most effectively for adding them to Python and wrote the
initial list comprehension patch, which was then discussed for a
seemingly endless time on the python-dev mailing list and kept
up-to-date by Skip Montanaro.

% ======================================================================
\section{Augmented Assignment}

Augmented assignment operators, another long-requested feature, have
been added to Python 2.0. Augmented assignment operators include
\code{+=}, \code{-=}, \code{*=}, and so forth. For example, the
statement \code{a += 2} increments the value of the variable
\code{a} by 2, equivalent to the slightly lengthier \code{a = a + 2}.

The full list of supported assignment operators is \code{+=},
\code{-=}, \code{*=}, \code{/=}, \code{\%=}, \code{**=}, \code{\&=},
\code{|=}, \verb|^=|, \code{>>=}, and \code{<<=}. Python classes can
override the augmented assignment operators by defining methods named
\method{__iadd__}, \method{__isub__}, etc. For example, the following
\class{Number} class stores a number and supports using += to create a
new instance with an incremented value.

\begin{verbatim}
class Number:
    def __init__(self, value):
        self.value = value
    def __iadd__(self, increment):
        return Number( self.value + increment)

n = Number(5)
n += 3
print n.value
\end{verbatim}

The \method{__iadd__} special method is called with the value of the
increment, and should return an instance with an appropriately
modified value (a new instance, as here, or \code{self} after
modifying it in place); this return value is bound as the new value of the
variable on the left-hand side.
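
Returning a new instance, as \class{Number} does, is one option; a
mutable class can instead update itself and return \code{self}. The
following sketch uses a hypothetical \class{Accumulator} class (not
part of Python) to show that, in that case, every name bound to the
object sees the change:

\begin{verbatim}
class Accumulator:
    def __init__(self):
        self.items = []
    def __iadd__(self, item):
        # Modify the existing object in place ...
        self.items.append(item)
        # ... and return it, so the name on the left-hand
        # side is rebound to the very same object.
        return self

acc = Accumulator()
alias = acc
acc += 'spam'
acc += 'eggs'

print(alias.items)   # ['spam', 'eggs']
\end{verbatim}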

Augmented assignment operators were first introduced in the C
programming language, and most C-derived languages, such as
\program{awk}, C++, Java, Perl, and PHP, also support them. The augmented
assignment patch was implemented by Thomas Wouters.

% ======================================================================
\section{String Methods}

Until now, string-manipulation functionality was in the \module{string}
module, which was usually a front-end for the \module{strop}
module written in C. The addition of Unicode posed a difficulty for
the \module{strop} module, because the functions would all need to be
rewritten in order to accept either 8-bit or Unicode strings. For
functions such as \function{string.replace()}, which takes 3 string
arguments, that means eight possible permutations, and correspondingly
complicated code.

Instead, Python 2.0 pushes the problem onto the string type, making
string manipulation functionality available through methods on both
8-bit strings and Unicode strings.

\begin{verbatim}
>>> 'andrew'.capitalize()
'Andrew'
>>> 'hostname'.replace('os', 'linux')
'hlinuxtname'
>>> 'moshe'.find('sh')
2
\end{verbatim}

One thing that hasn't changed, a noteworthy April Fools' joke
notwithstanding, is that Python strings are immutable. Thus, the
string methods return new strings, and do not modify the string on
which they operate.

The old \module{string} module is still around for backwards
compatibility, but it mostly acts as a front-end to the new string
methods.

Two methods which have no parallel in pre-2.0 versions, although they
did exist in JPython for quite some time, are \method{startswith()}
and \method{endswith()}. \code{s.startswith(t)} is equivalent to \code{s[:len(t)]
== t}, while \code{s.endswith(t)} is equivalent to \code{s[-len(t):] == t}.
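
A short sketch of the two methods next to their slice equivalents
(the file name is just sample data):

\begin{verbatim}
s = 'python-2.0.tar.gz'

has_prefix = s.startswith('python')
has_suffix = s.endswith('.gz')

# The slice spellings give the same answers.
prefix_by_slice = (s[:len('python')] == 'python')
suffix_by_slice = (s[-len('.gz'):] == '.gz')
\end{verbatim}

One caveat about the slice form: when \code{t} is the empty string,
\code{s[-len(t):]} is the whole of \code{s}, so the
\method{endswith()} equivalence should be read as assuming a non-empty
\code{t}.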

One other method which deserves special mention is \method{join}. The
\method{join} method of a string receives one parameter, a sequence of
strings, and is equivalent to the \function{string.join} function from
the old \module{string} module, with the arguments reversed. In other
words, \code{s.join(seq)} is equivalent to the old
\code{string.join(seq, s)}.
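
For instance, in this small sketch with made-up data, the separator is
the string whose method is called and the sequence is the argument:

\begin{verbatim}
words = ['usr', 'local', 'bin']

# Old spelling: string.join(words, '/')
path = '/'.join(words)
listing = ', '.join(words)

print(path)      # usr/local/bin
print(listing)   # usr, local, bin
\end{verbatim}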

% ======================================================================
\section{Optional Collection of Cycles}

The C implementation of Python uses reference counting to implement
garbage collection. Every Python object maintains a count of the
number of references pointing to itself, and adjusts the count as
references are created or destroyed. Once the reference count reaches
zero, the object is no longer accessible, since you need to have a
reference to an object to access it, and if the count is zero, no
references exist any longer.

Reference counting has some pleasant properties: it's easy to
understand and implement, and the resulting implementation is
portable, fairly fast, and interacts well with other libraries that
implement their own memory handling schemes. The major problem with
reference counting is that it sometimes doesn't realise that objects
are no longer accessible, resulting in a memory leak. This happens
when there are cycles of references.

Consider the simplest possible cycle,
a class instance which has a reference to itself:

\begin{verbatim}
instance = SomeClass()
instance.myself = instance
\end{verbatim}

After the above two lines of code have been executed, the reference
count of \code{instance} is 2; one reference is from the variable
named \samp{instance}, and the other is from the \samp{myself}
attribute of the instance.

If the next line of code is \code{del instance}, what happens? The
reference count of \code{instance} is decreased by 1, so it has a
reference count of 1; the reference in the \samp{myself} attribute
still exists. Yet the instance is no longer accessible through Python
code, and it could be deleted. Several objects can participate in a
cycle if they have references to each other, causing all of the
objects to be leaked.

An experimental step has been made toward fixing this problem. When
compiling Python, the \verb|--with-cycle-gc| option can be specified.
This causes a cycle detection algorithm to be periodically executed,
which looks for inaccessible cycles and deletes the objects involved.
A new \module{gc} module provides functions to perform a garbage
collection, obtain debugging statistics, and tune the collector's parameters.
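
The self-referencing instance from earlier in this section can be used
to watch the collector work. This is a sketch that assumes an
interpreter built with cycle collection; the \class{Node} class is
made up, and the sketch relies on \function{gc.collect()} returning
the number of unreachable objects it found, which is how current
CPython documents it:

\begin{verbatim}
import gc

class Node:
    pass

gc.collect()            # clear out any garbage that already exists

node = Node()
node.myself = node      # a one-object reference cycle
del node                # the cycle is now unreachable

# Force a full collection instead of waiting for a periodic one.
found = gc.collect()
print(found >= 1)
\end{verbatim}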

Why isn't cycle detection enabled by default? Running the cycle detection
algorithm takes some time, and some tuning will be required to
minimize the overhead cost. It's not yet obvious how much performance
is lost, because benchmarking this is tricky and depends crucially
on how often the program creates and destroys objects.

Several people tackled this problem and contributed to a solution. An
early implementation of the cycle detection approach was written by
Toby Kelsey. The current algorithm was suggested by Eric Tiedemann
during a visit to CNRI, and Guido van Rossum and Neil Schemenauer
wrote two different implementations, which were later integrated by
Neil. Lots of other people offered suggestions along the way; the
March 2000 archives of the python-dev mailing list contain most of the
relevant discussion, especially in the threads titled ``Reference
cycle collection for Python'' and ``Finalization again''.

% ======================================================================
\section{Other Core Changes}

Various minor changes have been made to Python's syntax and built-in
functions. None of the changes are very far-reaching, but they're
handy conveniences.

\subsection{Minor Language Changes}

A new syntax makes it more convenient to call a given function
with a tuple of arguments and/or a dictionary of keyword arguments.
In Python 1.5 and earlier, you'd use the \function{apply()}
built-in function: \code{apply(f, \var{args}, \var{kw})} calls the
function \function{f()} with the argument tuple \var{args} and the
keyword arguments in the dictionary \var{kw}. \function{apply()}
is the same in 2.0, but thanks to a patch from
Greg Ewing, \code{f(*\var{args}, **\var{kw})} is a shorter
and clearer way to achieve the same effect. This syntax is
symmetrical with the syntax for defining functions:

\begin{verbatim}
def f(*args, **kw):
    # args is a tuple of positional args,
    # kw is a dictionary of keyword args
    ...
\end{verbatim}
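
A minimal sketch of the calling side, using a made-up function
\function{describe()} that simply hands back what it received:

\begin{verbatim}
def describe(*args, **kw):
    # Return the arguments unchanged, for inspection.
    return args, kw

positional = (1, 2)
keywords = {'colour': 'red', 'size': 3}

# Python 1.5 spelling: apply(describe, positional, keywords)
received_args, received_kw = describe(*positional, **keywords)
\end{verbatim}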

The \keyword{print} statement can now have its output directed to a
file-like object by following the \keyword{print} with
\verb|>> file|, similar to the redirection operator in Unix shells.
Previously you'd either have to use the \method{write()} method of the
file-like object, which lacks the convenience and simplicity of
\keyword{print}, or you could assign a new value to
\code{sys.stdout} and then restore the old value. For sending output to standard error,
it's much easier to write this:

\begin{verbatim}
print >> sys.stderr, "Warning: action field not supplied"
\end{verbatim}

Modules can now be renamed on importing them, using the syntax
\code{import \var{module} as \var{name}} or \code{from \var{module}
import \var{name} as \var{othername}}. The patch was submitted by
Thomas Wouters.
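
Both forms are shown in this short sketch; the local names \code{m}
and \code{root} are arbitrary choices made for the example:

\begin{verbatim}
import math as m
from math import sqrt as root

# Both names refer to the same function object.
same_function = (m.sqrt is root)
side = root(16.0)
\end{verbatim}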

A new format style is available when using the \code{\%} operator;
'\%r' will insert the \function{repr()} of its argument. This was
also added from symmetry considerations, this time for symmetry with
the existing '\%s' format style, which inserts the \function{str()} of
its argument. For example, \code{'\%r \%s' \% ('abc', 'abc')} returns a
string containing \verb|'abc' abc|.
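
The difference is easiest to see with a value whose \function{repr()}
and \function{str()} differ, such as a string (a small sketch; the
variable names are illustrative):

\begin{verbatim}
value = 'abc'

shown = '%r %s' % (value, value)
print(shown)   # 'abc' abc

# The two halves match repr() and str() of the argument.
same_as_calls = (shown == repr(value) + ' ' + str(value))
\end{verbatim}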

Previously there was no way to implement a class that overrode
Python's built-in \keyword{in} operator and implemented a custom
version. \code{\var{obj} in \var{seq}} returns true if \var{obj} is
present in the sequence \var{seq}; Python computes this by simply
trying every index of the sequence until either \var{obj} is found or
an \exception{IndexError} is encountered. Moshe Zadka contributed a
patch which adds a \method{__contains__} magic method for providing a
custom implementation for \keyword{in}. Additionally, new built-in
objects written in C can define what \keyword{in} means for them via a
new slot in the sequence protocol.
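
As an illustration, a class can answer membership questions without
being indexable at all. The \class{Interval} class below is a
hypothetical example, not something shipped with Python:

\begin{verbatim}
class Interval:
    def __init__(self, low, high):
        self.low = low
        self.high = high
    def __contains__(self, value):
        # Called for 'value in interval'; no indexing takes place.
        return self.low <= value <= self.high

unit = Interval(0, 1)
inside = 0.5 in unit
outside = 2 in unit
\end{verbatim}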

Earlier versions of Python used a recursive algorithm for deleting
objects. Deeply nested data structures could cause the interpreter to
fill up the C stack and crash; Christian Tismer rewrote the deletion
logic to fix this problem. On a related note, comparing recursive
objects recursed infinitely and crashed; Jeremy Hylton rewrote the
code to no longer crash, producing a useful result instead. For
example, after this code:

\begin{verbatim}
a = []
b = []
a.append(a)
b.append(b)
\end{verbatim}

the comparison \code{a==b} returns true, because the two recursive
data structures are isomorphic.\footnote{See the thread ``trashcan
and PR\#7'' in the April 2000 archives of the python-dev mailing list
for the discussion leading up to this implementation, and some useful
relevant links.
%http://www.python.org/pipermail/python-dev/2000-April/004834.html
}

Work has been done on porting Python to 64-bit Windows on the Itanium
processor, mostly by Trent Mick of ActiveState. (Confusingly,
\code{sys.platform} is still \code{'win32'} on Win64 because it seems
that for ease of porting, MS Visual C++ treats code as 32-bit on Itanium.)
PythonWin also supports Windows CE; see the Python CE page at
\url{http://starship.python.net/crew/mhammond/ce/} for more
information.

An attempt has been made to alleviate one of Python's warts, the
often-confusing \exception{NameError} exception when code refers to a
local variable before the variable has been assigned a value. For
example, the following code raises an exception on the \keyword{print}
statement in both 1.5.2 and 2.0; in 1.5.2 a \exception{NameError}
exception is raised, while 2.0 raises a new
\exception{UnboundLocalError} exception.
\exception{UnboundLocalError} is a subclass of \exception{NameError},
so any existing code that expects \exception{NameError} to be raised
should still work.

\begin{verbatim}
def f():
    print "i=",i
    i = i + 1
f()
\end{verbatim}
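
The compatibility claim can be checked with a handler written for
\exception{NameError}. In this sketch the function is a variation on
the one above, and the exception object is fetched through
\function{sys.exc_info()} so that the same code runs on old and
current interpreters:

\begin{verbatim}
import sys

def f():
    # 'i' is local to f() because it is assigned to below, so
    # this read happens before the local variable has a value.
    value = i
    i = value + 1

try:
    f()
except NameError:
    # An old-style handler still catches the new exception.
    exc = sys.exc_info()[1]

caught_name = exc.__class__.__name__
print(caught_name)   # UnboundLocalError
\end{verbatim}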

Two new exceptions, \exception{TabError} and
\exception{IndentationError}, have been introduced. They're both
subclasses of \exception{SyntaxError}, and are raised when Python code
is found to be improperly indented.

\subsection{Changes to Built-in Functions}

A new built-in, \function{zip(\var{seq1}, \var{seq2}, ...)}, has been
added. \function{zip()} returns a list of tuples where each tuple
contains the i-th element from each of the argument sequences. The
difference between \function{zip()} and \code{map(None, \var{seq1},
\var{seq2})} is that \function{map()} pads the sequences with
\code{None} if the sequences aren't all of the same length, while
\function{zip()} truncates the returned list to the length of the
shortest argument sequence.
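
A short sketch of the truncation behaviour, with made-up sequences of
unequal length. The \function{list()} call is redundant in 2.0, where
\function{zip()} already returns a list; it is there only so that the
example also runs on later interpreters where \function{zip()} is
lazy:

\begin{verbatim}
letters = 'abc'
numbers = (1, 2)

# The third letter is dropped because 'numbers' is shorter.
pairs = list(zip(letters, numbers))
print(pairs)   # [('a', 1), ('b', 2)]
\end{verbatim}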

The \function{int()} and \function{long()} functions now accept an
optional ``base'' parameter when the first argument is a string.
\code{int('123', 10)} returns 123, while \code{int('123', 16)} returns
291. \code{int(123, 16)} raises a \exception{TypeError} exception
with the message ``can't convert non-string with explicit base''.
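
The calls above, plus a binary conversion added for illustration, can
be sketched as follows (the variable names are arbitrary):

\begin{verbatim}
decimal = int('123', 10)
hexadecimal = int('123', 16)
binary = int('101', 2)

# A non-string first argument is rejected when a base is given.
try:
    int(123, 16)
    rejected = 0
except TypeError:
    rejected = 1
\end{verbatim}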

A new variable holding more detailed version information has been
added to the \module{sys} module. \code{sys.version_info} is a tuple
\code{(\var{major}, \var{minor}, \var{micro}, \var{level},
\var{serial})}. For example, in a hypothetical 2.0.1beta1,
\code{sys.version_info} would be \code{(2, 0, 1, 'beta', 1)}.
\var{level} is a string such as \code{"alpha"}, \code{"beta"}, or
\code{"final"} for a final release.
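
Because tuples compare element by element, version checks become
simple comparisons. A minimal sketch (the variable names are
illustrative):

\begin{verbatim}
import sys

major, minor = sys.version_info[:2]
level = sys.version_info[3]

# True on 2.0 and on every later release.
has_2_0_features = (sys.version_info[:2] >= (2, 0))
\end{verbatim}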

Dictionaries have an odd new method, \method{setdefault(\var{key},
\var{default})}, which behaves similarly to the existing
\method{get()} method. However, if the key is missing,
\method{setdefault()} both returns the value of \var{default} as
\method{get()} would do, and also inserts it into the dictionary as
the value for \var{key}. Thus, the following lines of code:

\begin{verbatim}
if dict.has_key( key ): return dict[key]
else:
    dict[key] = []
    return dict[key]
\end{verbatim}

can be reduced to a single \code{return dict.setdefault(key, [])} statement.
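
As a further sketch of the same idiom (the word list and the
\code{index} name are invented for this example),
\method{setdefault()} makes it easy to build a dictionary of lists
without testing for the key first:

\begin{verbatim}
words = ["apple", "avocado", "banana"]

index = {}
for word in words:
    # Creates the list the first time a letter is seen, then appends.
    index.setdefault(word[0], []).append(word)
\end{verbatim}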
599
Andrew M. Kuchling4d46d382000-09-06 17:58:49 +0000600The interpreter sets a maximum recursion depth in order to catch
601runaway recursion before filling the C stack and causing a core dump
602or GPF.. Previously this limit was fixed when you compiled Python,
603but in 2.0 the maximum recursion depth can be read and modified using
604\function{sys.getrecursionlimit} and \function{sys.setrecursionlimit}.
605The default value is 1000, and a rough maximum value for a given
606platform can be found by running a new script,
607\file{Misc/find_recursionlimit.py}.
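
A short sketch of reading and changing the limit; the increment of 500
is an arbitrary figure chosen for illustration, not a recommendation:

\begin{verbatim}
import sys

old_limit = sys.getrecursionlimit()

# Raise the limit for a deeply recursive task; 500 is an arbitrary
# figure chosen for this sketch.
sys.setrecursionlimit(old_limit + 500)
raised_limit = sys.getrecursionlimit()

# Restore the previous value afterwards.
sys.setrecursionlimit(old_limit)
\end{verbatim}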

% ======================================================================
\section{Porting to 2.0}

New Python releases try hard to be compatible with previous releases,
and the record has been pretty good. However, some changes are
considered useful enough, usually because they fix initial design
decisions that turned out to be actively mistaken, that breaking
backward compatibility can't always be avoided. This section lists the
changes in Python 2.0 that may cause old Python code to break.

The change which will probably break the most code is tightening up
the arguments accepted by some methods. Some methods would take
multiple arguments and treat them as a tuple, particularly various
list methods such as \method{.append()} and \method{.insert()}.
In earlier versions of Python, if \code{L} is a list, \code{L.append(
1,2 )} appends the tuple \code{(1,2)} to the list. In Python 2.0 this
causes a \exception{TypeError} exception to be raised, with the
message: 'append requires exactly 1 argument; 2 given'. The fix is to
simply add an extra set of parentheses to pass both values as a tuple:
\code{L.append( (1,2) )}.
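
The two call forms can be compared directly. In this sketch the
\code{rejected} flag is just an illustrative name:

\begin{verbatim}
L = []
L.append((1, 2))        # one argument: the tuple (1, 2)

rejected = 0
try:
    L.append(1, 2)      # two arguments: a TypeError in 2.0
except TypeError:
    rejected = 1
\end{verbatim}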

The earlier versions of these methods were more forgiving because they
used an old function in Python's C interface to parse their arguments;
2.0 modernizes them to use \function{PyArg_ParseTuple}, the current
argument parsing function, which provides more helpful error messages
and treats multi-argument calls as errors. If you absolutely must use
2.0 but can't fix your code, you can edit \file{Objects/listobject.c}
and define the preprocessor symbol \code{NO_STRICT_LIST_APPEND} to
preserve the old behaviour; this isn't recommended.

Some of the functions in the \module{socket} module are still
forgiving in this way. For example, \function{socket.connect(
('hostname', 25) )} is the correct form, passing a tuple representing
an IP address, but \function{socket.connect( 'hostname', 25 )} also
works. \function{socket.connect_ex()} and \function{socket.bind()} are
similarly easy-going. 2.0alpha1 tightened these functions up, but
because the documentation actually used the erroneous multiple
argument form, many people wrote code which would break with the
stricter checking. GvR backed out the changes in the face of public
reaction, so for the \module{socket} module, the documentation was
fixed and the multiple argument form is simply marked as deprecated;
it \emph{will} be tightened up again in a future Python version.

The \code{\e x} escape in string literals now takes exactly 2 hex
digits. Previously it would consume all the hex digits following the
'x' and take the lowest 8 bits of the result, so \code{\e x123456} was
equivalent to \code{\e x56}.
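
For instance, in the sketch below the escape stops after two digits, so
the trailing \code{BC} are ordinary characters even though both are
valid hex digits; under the old rule all four digits would have been
consumed:

\begin{verbatim}
# \x41 is "A"; the following "BC" are ordinary characters in 2.0.
s = "\x41BC"
length = len(s)
\end{verbatim}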

The \exception{AttributeError} exception has a more friendly error
message, whose text will be something like \code{'Spam' instance has no
attribute 'eggs'}. Previously the error message was just the missing
attribute name \code{eggs}, and code written to take advantage of this
fact will break in 2.0.

Some work has been done to make integers and long integers a bit more
interchangeable. In 1.5.2, large-file support was added for Solaris,
to allow reading files larger than 2GB; this made the \method{tell()}
method of file objects return a long integer instead of a regular
integer. Some code would subtract two file offsets and attempt to use
the result to multiply a sequence or slice a string, but this raised a
\exception{TypeError}. In 2.0, long integers can be used to multiply
or slice a sequence, and it'll behave as you'd intuitively expect it
to; \code{3L * 'abc'} produces \code{'abcabcabc'}, and \code{
(0,1,2,3)[2L:4L]} produces \code{(2,3)}. Long integers can also be used in
various new places where previously only integers were accepted, such
as in the \method{seek()} method of file objects.

The subtlest long integer change of all is that the \function{str()}
of a long integer no longer has a trailing 'L' character, though
\function{repr()} still includes it. The 'L' annoyed many people who
wanted to print long integers that looked just like regular integers,
since they had to go out of their way to chop off the character. This
is no longer a problem in 2.0, but code which does
\code{str(longval)[:-1]} and assumes the 'L' is there will now lose
the final digit.

Taking the \function{repr()} of a float now uses a different
formatting precision than \function{str()}. \function{repr()} uses the
\code{\%.17g} format string for C's \function{sprintf()}, while
\function{str()} uses \code{\%.12g} as before. The effect is that
\function{repr()} may occasionally show more decimal places than
\function{str()}, for certain numbers.
For example, the number 8.1 can't be represented exactly in binary, so
\code{repr(8.1)} is \code{'8.0999999999999996'}, while \code{str(8.1)} is
\code{'8.1'}.

The \code{-X} command-line option, which turned all standard
exceptions into strings instead of classes, has been removed; the
standard exceptions will now always be classes. The
\module{exceptions} module containing the standard exceptions was
translated from Python to a built-in C module, written by Barry Warsaw
and Fredrik Lundh.

% Commented out for now -- I don't think anyone will care.
%The pattern and match objects provided by SRE are C types, not Python
%class instances as in 1.5. This means you can no longer inherit from
%\class{RegexObject} or \class{MatchObject}, but that shouldn't be much
%of a problem since no one should have been doing that in the first
%place.

% ======================================================================
\section{Extending/Embedding Changes}

Some of the changes are under the covers, and will only be apparent to
people writing C extension modules or embedding a Python interpreter
in a larger application. If you aren't dealing with Python's C API,
you can safely skip this section.

The version number of the Python C API was incremented, so C
extensions compiled for 1.5.2 must be recompiled in order to work with
2.0. On Windows, attempting to import a third-party extension built
for Python 1.5.x usually results in an immediate crash; there's not
much we can do about this. (Here's Mark Hammond's explanation of the
reasons for the crash. The 1.5 module is linked against
\file{Python15.dll}. When \file{Python.exe}, linked against
\file{Python16.dll}, starts up, it initializes the Python data
structures in \file{Python16.dll}. When Python then imports the
module \file{foo.pyd} linked against \file{Python15.dll}, it
immediately tries to call the functions in that DLL. As Python has
not been initialized in that DLL, the program immediately crashes.)

Users of Jim Fulton's ExtensionClass module will be pleased to find
out that hooks have been added so that ExtensionClasses are now
supported by \function{isinstance()} and \function{issubclass()}.
This means you no longer have to remember to write code such as
\code{if type(obj) == myExtensionClass}, but can use the more natural
\code{if isinstance(obj, myExtensionClass)}.

The \file{Python/importdl.c} file, which was a mass of \#ifdefs to
support dynamic loading on many different platforms, was cleaned up
and reorganised by Greg Stein. \file{importdl.c} is now quite small,
and platform-specific code has been moved into a bunch of
\file{Python/dynload_*.c} files. Another cleanup: there were also a
number of \file{my*.h} files in the \file{Include/} directory that held
various portability hacks; they've been merged into a single file,
\file{Include/pyport.h}.

Vladimir Marangozov's long-awaited malloc restructuring was completed,
to make it easy to have the Python interpreter use a custom allocator
instead of C's standard \function{malloc()}. For documentation, read
the comments in \file{Include/pymem.h} and
\file{Include/objimpl.h}. For the lengthy discussions during which
the interface was hammered out, see the Web archives of the 'patches'
and 'python-dev' lists at python.org.

Recent versions of the GUSI development environment for MacOS support
POSIX threads. Therefore, Python's POSIX threading support now works
on the Macintosh. Threading support using the user-space GNU \texttt{pth}
library was also contributed.

Threading support on Windows was enhanced, too. Windows supports
thread locks that use kernel objects only in case of contention; in
the common case when there's no contention, they use simpler functions
which are an order of magnitude faster. A threaded version of Python
1.5.2 on NT is twice as slow as an unthreaded version; with the 2.0
changes, the difference is only 10\%. These improvements were
contributed by Yakov Markovitch.

Python 2.0's source now uses only ANSI C prototypes, so compiling Python now
requires an ANSI C compiler, and can no longer be done using a compiler that
only supports K\&R C.

Previously the Python virtual machine used 16-bit numbers in its
bytecode, limiting the size of source files. In particular, this
affected the maximum size of literal lists and dictionaries in Python
source; occasionally people who are generating Python code would run
into this limit. A patch by Charles G. Waldman raises the limit from
$2^{16}$ to $2^{32}$.

% ======================================================================
\section{Distutils: Making Modules Easy to Install}

Before Python 2.0, installing modules was a tedious affair -- there
was no way to figure out automatically where Python is installed, or
what compiler options to use for extension modules. Software authors
had to go through an arduous ritual of editing Makefiles and
configuration files, which only really work on Unix and leave Windows
and MacOS unsupported. Software users faced wildly differing
installation instructions.

The SIG for distribution utilities, shepherded by Greg Ward, has
created the Distutils, a system to make package installation much
easier. They form the \module{distutils} package, a new part of
Python's standard library. In the best case, installing a Python
module from source will require the same steps: first you simply
unpack the tarball or zip archive, and then run ``\code{python setup.py
install}''. The platform will be automatically detected, the compiler
will be recognized, C extension modules will be compiled, and the
distribution installed into the proper directory. Optional
command-line arguments provide more control over the installation
process; the \module{distutils} package offers many places to override
defaults -- separating the build from the install, building or
installing in non-default directories, and more.

In order to use the Distutils, you need to write a \file{setup.py}
script. For the simple case, when the software contains only .py
files, a minimal \file{setup.py} can be just a few lines long:

\begin{verbatim}
from distutils.core import setup
setup (name = "foo", version = "1.0",
       py_modules = ["module1", "module2"])
\end{verbatim}

The \file{setup.py} file isn't much more complicated if the software
consists of a few packages:

\begin{verbatim}
from distutils.core import setup
setup (name = "foo", version = "1.0",
       packages = ["package", "package.subpackage"])
\end{verbatim}

A C extension can be the most complicated case; here's an example taken from
the PyXML package:

\begin{verbatim}
from distutils.core import setup, Extension

expat_extension = Extension('xml.parsers.pyexpat',
     define_macros = [('XML_NS', None)],
     include_dirs = [ 'extensions/expat/xmltok',
                      'extensions/expat/xmlparse' ],
     sources = [ 'extensions/pyexpat.c',
                 'extensions/expat/xmltok/xmltok.c',
                 'extensions/expat/xmltok/xmlrole.c',
               ]
       )
setup (name = "PyXML", version = "0.5.4",
       ext_modules =[ expat_extension ] )
\end{verbatim}

The Distutils can also take care of creating source and binary
distributions. The ``sdist'' command, run by ``\code{python setup.py
sdist}'', builds a source distribution such as \file{foo-1.0.tar.gz}.
Adding new commands isn't difficult; ``bdist\_rpm'' and
``bdist\_wininst'' commands have already been contributed to create an
RPM distribution and a Windows installer for the software,
respectively. Commands to create other distribution formats such as
Debian packages and Solaris \file{.pkg} files are in various stages of
development.

All this is documented in a new manual, \textit{Distributing Python
Modules}, that joins the basic set of Python documentation.

% ======================================================================
%\section{New XML Code}

%XXX write this section...

% ======================================================================
\section{Module changes}

Lots of improvements and bugfixes were made to Python's extensive
standard library; some of the affected modules include
\module{readline}, \module{ConfigParser}, \module{cgi},
\module{calendar}, \module{posix}, \module{xmllib},
\module{aifc}, \module{chunk}, \module{wave}, \module{random},
\module{shelve}, and \module{nntplib}. Consult the CVS logs for the
exact patch-by-patch details.

Brian Gallew contributed OpenSSL support for the \module{socket}
module. OpenSSL is an implementation of the Secure Socket Layer,
which encrypts the data being sent over a socket. When compiling
Python, you can edit \file{Modules/Setup} to include SSL support,
which adds an additional function to the \module{socket} module:
\function{socket.ssl(\var{socket}, \var{keyfile}, \var{certfile})},
which takes a socket object and returns an SSL socket. The
\module{httplib} and \module{urllib} modules were also changed to
support ``https://'' URLs, though no one has implemented FTP or SMTP
over SSL.

The \module{httplib} module has been rewritten by Greg Stein to
support HTTP/1.1. Backward compatibility with the 1.5 version of
\module{httplib} is provided, though using HTTP/1.1 features such as
pipelining will require rewriting code to use a different set of
interfaces.

The \module{Tkinter} module now supports Tcl/Tk versions 8.1, 8.2, or
8.3, and support for the older 7.x versions has been dropped. The
\module{Tkinter} module now supports displaying Unicode strings in Tk
widgets. Also, Fredrik Lundh contributed an optimization which makes
operations like \code{create_line} and \code{create_polygon} much
faster, especially when using lots of coordinates.

The \module{curses} module has been greatly extended, starting from
Oliver Andrich's enhanced version, to provide many additional
functions from ncurses and SYSV curses, such as colour, alternative
character set support, pads, and mouse support. This means the module
is no longer compatible with operating systems that only have BSD
curses, but there don't seem to be any currently maintained OSes that
fall into this category.

As mentioned in the earlier discussion of 2.0's Unicode support, the
underlying implementation of the regular expressions provided by the
\module{re} module has been changed. SRE, a new regular expression
engine written by Fredrik Lundh and partially funded by Hewlett
Packard, supports matching against both 8-bit strings and Unicode
strings.

% ======================================================================
\section{New modules}

A number of new modules were added. We'll simply list them with brief
descriptions; consult the 2.0 documentation for the details of a
particular module.

\begin{itemize}

\item{\module{atexit}:}
For registering functions to be called before the Python interpreter exits.
Code that currently sets
\code{sys.exitfunc} directly should be changed to
use the \module{atexit} module instead, importing \module{atexit}
and calling \function{atexit.register()} with
the function to be called on exit.
(Contributed by Skip Montanaro.)

\item{\module{codecs}, \module{encodings}, \module{unicodedata}:} Added as part of the new Unicode support.

\item{\module{filecmp}:} Supersedes the old \module{cmp}, \module{cmpcache} and
\module{dircmp} modules, which have now become deprecated.
(Contributed by Gordon MacMillan and Moshe Zadka.)

\item{\module{linuxaudiodev}:} Support for the \file{/dev/audio}
device on Linux, a twin to the existing \module{sunaudiodev} module.
(Contributed by Peter Bosch.)

\item{\module{mmap}:} An interface to memory-mapped files on both
Windows and Unix. A file's contents can be mapped directly into
memory, at which point it behaves like a mutable string, so its
contents can be read and modified. They can even be passed to
functions that expect ordinary strings, such as the \module{re}
module. (Contributed by Sam Rushing, with some extensions by
A.M. Kuchling.)

\item{\module{pyexpat}:} An interface to the Expat XML parser.
(Contributed by Paul Prescod.)

\item{\module{robotparser}:} Parse a \file{robots.txt} file, which is
used for writing Web spiders that politely avoid certain areas of a
Web site. The parser accepts the contents of a \file{robots.txt} file,
builds a set of rules from it, and can then answer questions about
the fetchability of a given URL. (Contributed by Skip Montanaro.)

\item{\module{tabnanny}:} A module/script to
check Python source code for ambiguous indentation.
(Contributed by Tim Peters.)

\item{\module{UserString}:} A base class useful for deriving objects that behave like strings.

\item{\module{webbrowser}:} A module that provides a platform-independent
way to launch a web browser on a specific URL. For each platform, various
browsers are tried in a specific order. The user can alter which browser
is launched by setting the \var{BROWSER} environment variable.
(Originally inspired by Eric S. Raymond's patch to \module{urllib}
which added similar functionality, but
the final module comes from code originally
implemented by Fred Drake as \file{Tools/idle/BrowserControl.py},
and adapted for the standard library by Fred.)

\item{\module{_winreg}:} An interface to the
Windows registry. \module{_winreg} is an adaptation of functions that
have been part of PythonWin since 1995, but has now been added to the core
distribution, and enhanced to support Unicode.
\module{_winreg} was written by Bill Tutt and Mark Hammond.

\item{\module{zipfile}:} A module for reading and writing ZIP-format
archives. These are archives produced by \program{PKZIP} on
DOS/Windows or \program{zip} on Unix, not to be confused with
\program{gzip}-format files (which are supported by the \module{gzip}
module).
(Contributed by James C. Ahlstrom.)

\item{\module{imputil}:} A module that provides a simpler way for
writing customised import hooks, in comparison to the existing
\module{ihooks} module. (Implemented by Greg Stein, with much
discussion on python-dev along the way.)

\end{itemize}
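
As a sketch of the \module{atexit} advice in the list above (the
\function{close_log()} function and the \code{log} list are
hypothetical), registration might look like this:

\begin{verbatim}
import atexit

log = []

def close_log():
    # Hypothetical cleanup step; a real program might flush files here.
    log.append("closed")

# Instead of assigning to sys.exitfunc, register the function with
# atexit; it will be called when the interpreter exits.
atexit.register(close_log)
\end{verbatim}

Because registration goes through one function rather than a single
shared variable, separate modules can each register their own cleanup
code without overwriting one another.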

% ======================================================================
\section{IDLE Improvements}

IDLE is the official Python cross-platform IDE, written using Tkinter.
Python 2.0 includes IDLE 0.6, which adds a number of new features and
improvements. A partial list:

\begin{itemize}
\item UI improvements and optimizations,
especially in the area of syntax highlighting and auto-indentation.

\item The class browser now shows more information, such as the
top-level functions in a module.

\item Tab width is now a user-settable option. When opening an existing Python
file, IDLE automatically detects the indentation conventions, and adapts.

\item There is now support for calling browsers on various platforms,
used to open the Python documentation in a browser.

\item IDLE now has a command line, which is largely similar to
the vanilla Python interpreter.

\item Call tips were added in many places.

\item IDLE can now be installed as a package.

\item In the editor window, there is now a line/column bar at the bottom.

\item Three new keystroke commands: Check module (Alt-F5), Import
module (F5) and Run script (Ctrl-F5).

\end{itemize}

% ======================================================================
\section{Deleted and Deprecated Modules}

A few modules have been dropped because they're obsolete, or because
there are now better ways to do the same thing. The \module{stdwin}
module is gone; it was for a platform-independent windowing toolkit
that's no longer developed.

A number of modules have been moved to the
\file{lib-old} subdirectory:
\module{cmp}, \module{cmpcache}, \module{dircmp}, \module{dump},
\module{find}, \module{grep}, \module{packmail},
\module{poly}, \module{util}, \module{whatsound}, \module{zmod}.
If you have code which relies on a module that's been moved to
\file{lib-old}, you can simply add that directory to \code{sys.path}
to get it back, but you're encouraged to update any code that uses
these modules.
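
A sketch of restoring access to the old modules follows. The directory
layout shown is an assumption based on a typical Unix installation; the
actual location of \file{lib-old} may differ on your platform:

\begin{verbatim}
import os
import sys

# Assumed Unix layout; adjust the path to match the local installation.
lib_old = os.path.join(sys.prefix, "lib", "python2.0", "lib-old")

# Append rather than prepend, so current modules are still found first.
if lib_old not in sys.path:
    sys.path.append(lib_old)
\end{verbatim}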

\section{Acknowledgements}

The authors would like to thank the following people for offering
suggestions on drafts of this article: Mark Hammond, Fredrik Lundh,
Detlef Lannert, Skip Montanaro, Vladimir Marangozov, Guido van Rossum,
and Neil Schemenauer.

\end{document}