\documentclass{howto}

% $Id$

\title{What's New in Python 2.0}
\release{0.05}
\author{A.M. Kuchling and Moshe Zadka}
\authoraddress{\email{amk1@bigfoot.com}, \email{moshez@math.huji.ac.il} }
\begin{document}
\maketitle\tableofcontents

\section{Introduction}

{\large This is a draft document; please report inaccuracies and
omissions to the authors. This document should not be treated as
definitive; features described here might be removed or changed during
the beta cycle before the final release of Python 2.0.
}

A new release of Python, version 2.0, will be released some time this
autumn. Beta versions are already available from
\url{http://www.pythonlabs.com/products/python2.0/}. This article
covers the exciting new features in 2.0, highlights some other useful
changes, and points out a few incompatible changes that may require
rewriting code.

Python's development never completely stops between releases, and a
steady flow of bug fixes and improvements is always being submitted.
A host of minor fixes, a few optimizations, additional docstrings, and
better error messages went into 2.0; to list them all would be
impossible, but they're certainly significant. Consult the
publicly-available CVS logs if you want to see the full list. This
progress is due to the five developers working for PythonLabs, who
are now getting paid to spend their days fixing bugs, and also due to
the improved communication resulting from moving to SourceForge.

% ======================================================================
\section{What About Python 1.6?}

Python 1.6 can be thought of as the Contractual Obligations Python
release. After the core development team left CNRI in May 2000, CNRI
requested that a 1.6 release be created, containing all the work on
Python that had been performed at CNRI. Python 1.6 therefore
represents the state of the CVS tree as of May 2000, with the most
significant new feature being Unicode support. Development continued
after May, of course, so the 1.6 tree received a few fixes to ensure
that it's forward-compatible with Python 2.0. 1.6 is therefore part
of Python's evolution, and not a side branch.

So, should you take much interest in Python 1.6? Probably not. The
1.6final and 2.0beta1 releases were made on the same day (September 5,
2000), the plan being to finalize Python 2.0 within a month or so. If
you have applications to maintain, there seems little point in
breaking things by moving to 1.6, fixing them, and then having another
round of breakage within a month by moving to 2.0; you're better off
just going straight to 2.0. Most of the really interesting features
described in this document are only in 2.0, because a lot of work was
done between May and September.

% ======================================================================
\section{New Development Process}

The most important change in Python 2.0 may not be to the code at all,
but to how Python is developed: in May 2000 the Python developers
began using the tools made available by SourceForge for storing
source code, tracking bug reports, and managing the queue of patch
submissions. To report bugs or submit patches for Python 2.0, use the
bug tracking and patch manager tools available from Python's project
page, located at \url{http://sourceforge.net/projects/python/}.

The most important of the services now hosted at SourceForge is the
Python CVS tree, the version-controlled repository containing the
source code for Python. Previously, roughly 7 people had write
access to the CVS tree, and all patches had to be
inspected and checked in by one of the people on this short list.
Obviously, this wasn't very scalable. By moving the CVS tree to
SourceForge, it became possible to grant write access to more people;
as of September 2000 there were 27 people able to check in changes, a
fourfold increase. This makes possible large-scale changes that
wouldn't be attempted if they had to be filtered through the small
group of core developers. For example, one day Peter Schneider-Kamp
took it into his head to drop K\&R C compatibility and convert the C
source for Python to ANSI C. After getting approval on the python-dev
mailing list, he launched into a flurry of checkins that lasted about
a week, other developers joined in to help, and the job was done. If
there were only 5 people with write access, that task would probably
have been viewed as ``nice, but not worth the time and effort needed''
and it would never have gotten done.

The shift to using SourceForge's services has resulted in a remarkable
increase in the speed of development. Patches now get submitted,
commented on, revised by people other than the original submitter, and
bounced back and forth between people until the patch is deemed worth
checking in. Bugs are tracked in one central location and can be
assigned to a specific person for fixing, and we can count the number
of open bugs to measure progress. This didn't come without a cost:
developers now have more e-mail to deal with, more mailing lists to
follow, and special tools had to be written for the new environment.
For example, SourceForge sends default patch and bug notification
e-mail messages that are completely unhelpful, so Ka-Ping Yee wrote an
HTML screen-scraper that sends more useful messages.

The ease of adding code caused a few initial growing pains, such as
code being checked in before it was ready or without getting clear
agreement from the developer group. The approval process that has
emerged is somewhat similar to that used by the Apache group.
Developers can vote +1, +0, -0, or -1 on a patch; +1 and -1 denote
acceptance or rejection, while +0 and -0 mean the developer is mostly
indifferent to the change, though with a slight positive or negative
slant. The most significant change from the Apache model is that the
voting is essentially advisory, letting Guido van Rossum, who has
Benevolent Dictator For Life status, know what the general opinion is.
He can still ignore the result of a vote, and approve or
reject a change even if the community disagrees with him.

Producing an actual patch is the last step in adding a new feature,
and is usually easy compared to the earlier task of coming up with a
good design. Discussions of new features can often explode into
lengthy mailing list threads, making the discussion hard to follow,
and no one can read every posting to python-dev. Therefore, a
relatively formal process has been set up to write Python Enhancement
Proposals (PEPs), modelled on the Internet RFC process. PEPs are
draft documents that describe a proposed new feature, and are
continually revised until the community reaches a consensus, either
accepting or rejecting the proposal. Quoting from the introduction to
PEP 1, ``PEP Purpose and Guidelines'':

\begin{quotation}
 PEP stands for Python Enhancement Proposal. A PEP is a design
 document providing information to the Python community, or
 describing a new feature for Python. The PEP should provide a
 concise technical specification of the feature and a rationale for
 the feature.

 We intend PEPs to be the primary mechanisms for proposing new
 features, for collecting community input on an issue, and for
 documenting the design decisions that have gone into Python. The
 PEP author is responsible for building consensus within the
 community and documenting dissenting opinions.
\end{quotation}

Read the rest of PEP 1 for the details of the PEP editorial process,
style, and format. PEPs are kept in the Python CVS tree on
SourceForge, though they're not part of the Python 2.0 distribution,
and are also available in HTML form from
\url{http://python.sourceforge.net/peps/}. As of September 2000,
there are 25 PEPs, ranging from PEP 201, ``Lockstep Iteration'', to
PEP 225, ``Elementwise/Objectwise Operators''.

% ======================================================================
\section{Unicode}

The largest new feature in Python 2.0 is a new fundamental data type:
Unicode strings. Unicode uses 16-bit numbers to represent characters
instead of the 8-bit number used by ASCII, meaning that 65,536
distinct characters can be supported.

The final interface for Unicode support was arrived at through
countless often-stormy discussions on the python-dev mailing list, and
mostly implemented by Marc-Andr\'e Lemburg, based on a Unicode string
type implementation by Fredrik Lundh. A detailed explanation of the
interface is in the file \file{Misc/unicode.txt} in the Python source
distribution; it's also available on the Web at
\url{http://starship.python.net/crew/lemburg/unicode-proposal.txt}.
This article will simply cover the most significant points about the Unicode
interfaces.

In Python source code, Unicode strings are written as
\code{u"string"}. Arbitrary Unicode characters can be written using a
new escape sequence, \code{\e u\var{HHHH}}, where \var{HHHH} is a
4-digit hexadecimal number from 0000 to FFFF. The existing
\code{\e x\var{HHHH}} escape sequence can also be used, and octal
escapes can be used for characters up to U+01FF, which is represented
by \code{\e 777}.

Unicode strings, just like regular strings, are an immutable sequence
type. They can be indexed and sliced, but not modified in place.
Unicode strings have an \method{encode( \optional{encoding} )} method
that returns an 8-bit string in the desired encoding. Encodings are
named by strings, such as \code{'ascii'}, \code{'utf-8'},
\code{'iso-8859-1'}, and so on. A codec API is defined for
implementing and registering new encodings that are then available
throughout a Python program. If an encoding isn't specified, the
default encoding is usually 7-bit ASCII, though it can be changed for
your Python installation by calling the
\function{sys.setdefaultencoding(\var{encoding})} function in a
customised version of \file{site.py}.

Combining 8-bit and Unicode strings always coerces to Unicode, using
the default ASCII encoding; the result of \code{'a' + u'bc'} is
\code{u'abc'}.
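
As a small illustration of the points above, the following sketch
writes a Unicode string using a \code{\e u} escape, encodes it in two
different encodings, and mixes a regular string with a Unicode string.
The variable names are only illustrative, and the code is written so
that it should also run unchanged on later Python versions.

\begin{verbatim}
# A Unicode string containing U+00E9 (e with an acute accent).
ustr = u'caf\u00e9'

# encode() returns an 8-bit string in the requested encoding.
utf8 = ustr.encode('utf-8')          # 5 bytes: U+00E9 needs 2 bytes
latin1 = ustr.encode('iso-8859-1')   # 4 bytes: 1 byte per character

# Mixing an ASCII-only regular string with a Unicode string
# gives a Unicode string.
combined = 'a' + u'bc'
\end{verbatim}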

New built-in functions have been added, and existing built-ins
modified to support Unicode:

\begin{itemize}
\item \code{unichr(\var{ch})} returns a Unicode string 1 character
long, containing the character \var{ch}.

\item \code{ord(\var{u})}, where \var{u} is a 1-character regular or
Unicode string, returns the number of the character as an integer.

\item \code{unicode(\var{string} \optional{, \var{encoding}}
\optional{, \var{errors}} ) } creates a Unicode string from an 8-bit
string. \code{encoding} is a string naming the encoding to use.
The \code{errors} parameter specifies the treatment of characters that
are invalid for the current encoding; passing \code{'strict'} as the
value causes an exception to be raised on any encoding error, while
\code{'ignore'} causes errors to be silently ignored and
\code{'replace'} uses U+FFFD, the official replacement character, in
case of any problems.

\item The \keyword{exec} statement, and various built-ins such as
\code{eval()}, \code{getattr()}, and \code{setattr()} now
accept Unicode strings as well as regular strings. (It's possible
that the process of fixing this missed some built-ins; if you find a
built-in function that accepts strings but doesn't accept Unicode
strings at all, please report it as a bug.)

\end{itemize}
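
The built-ins listed above can be tried out with a sketch like the
following. Two things in it are assumptions made only so that the
sketch also runs on later Python versions, and are not part of 2.0:
the \keyword{try} block at the top, which falls back to the names
that later versions use for \function{unichr()} and
\function{unicode()}, and the \code{b} prefix on the 8-bit string
literal.

\begin{verbatim}
# Compatibility shim (an assumption of this sketch, not part of 2.0):
# later Python versions renamed unichr() to chr() and unicode() to str().
try:
    unichr, unicode
except NameError:
    unichr, unicode = chr, str

ch = unichr(0x0660)      # 1-character Unicode string
code = ord(ch)           # back to the integer 0x0660

# An 8-bit string; the byte 0xFF is not valid ASCII.  (In 2.0 itself
# a plain string literal, without the b prefix, is already 8-bit.)
raw = b'abc\xff'

replaced = unicode(raw, 'ascii', 'replace')  # bad byte becomes U+FFFD
dropped = unicode(raw, 'ascii', 'ignore')    # bad byte is left out

# 'strict' raises an exception on the invalid byte.
try:
    unicode(raw, 'ascii', 'strict')
    strict_failed = 0
except UnicodeError:
    strict_failed = 1
\end{verbatim}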

A new module, \module{unicodedata}, provides an interface to Unicode
character properties. For example, \code{unicodedata.category(u'A')}
returns the 2-character string 'Lu', the 'L' denoting it's a letter,
and 'u' meaning that it's uppercase.
\code{unicodedata.bidirectional(u'\e u0660')} returns 'AN', meaning
that U+0660 is an Arabic number.
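
A few more lookups in the same style are sketched below; the
variable names are illustrative only, and the values in the comments
are the standard Unicode property codes for these characters.

\begin{verbatim}
import unicodedata

upper = unicodedata.category(u'A')       # 'Lu': letter, uppercase
lower = unicodedata.category(u'a')       # 'Ll': letter, lowercase
digit = unicodedata.category(u'\u0660')  # 'Nd': number, decimal digit

# U+0660 is ARABIC-INDIC DIGIT ZERO.
bidi = unicodedata.bidirectional(u'\u0660')  # 'AN': Arabic number
value = unicodedata.decimal(u'\u0660')       # its decimal value, 0
\end{verbatim}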

The \module{codecs} module contains functions to look up existing encodings
and register new ones. Unless you want to implement a
new encoding, you'll most often use the
\function{codecs.lookup(\var{encoding})} function, which returns a
4-element tuple: \code{(\var{encode_func},
\var{decode_func}, \var{stream_reader}, \var{stream_writer})}.

\begin{itemize}
\item \var{encode_func} is a function that takes a Unicode string, and
returns a 2-tuple \code{(\var{string}, \var{length})}. \var{string}
is an 8-bit string containing a portion (perhaps all) of the Unicode
string converted into the given encoding, and \var{length} tells you
how much of the Unicode string was converted.

\item \var{decode_func} is the opposite of \var{encode_func}, taking
an 8-bit string and returning a 2-tuple \code{(\var{ustring},
\var{length})}, consisting of the resulting Unicode string
\var{ustring} and the integer \var{length} telling how much of the
8-bit string was consumed.

\item \var{stream_reader} is a class that supports decoding input from
a stream. \code{\var{stream_reader}(\var{file_obj})} returns an object that
supports the \method{read()}, \method{readline()}, and
\method{readlines()} methods. These methods will all translate from
the given encoding and return Unicode strings.

\item \var{stream_writer}, similarly, is a class that supports
encoding output to a stream. \code{\var{stream_writer}(\var{file_obj})}
returns an object that supports the \method{write()} and
\method{writelines()} methods. These methods expect Unicode strings,
translating them to the given encoding on output.
\end{itemize}

For example, the following code writes a Unicode string into a file,
encoding it as UTF-8:

\begin{verbatim}
import codecs

unistr = u'\u0660\u2000ab ...'

(UTF8_encode, UTF8_decode,
 UTF8_streamreader, UTF8_streamwriter) = codecs.lookup('UTF-8')

output = UTF8_streamwriter( open( '/tmp/output', 'wb') )
output.write( unistr )
output.close()
\end{verbatim}

The following code would then read UTF-8 input from the file:

\begin{verbatim}
input = UTF8_streamreader( open( '/tmp/output', 'rb') )
print repr(input.read())
input.close()
\end{verbatim}
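
The encoding and decoding functions returned by
\function{codecs.lookup()} can also be called directly, without going
through a stream. The following is a minimal sketch of the 2-tuples
they return; the variable names are illustrative, and the code is
written so that it should also run on later Python versions.

\begin{verbatim}
import codecs

(encode, decode,
 reader_class, writer_class) = codecs.lookup('utf-8')

# encode() returns the 8-bit string and the number of
# Unicode characters that were converted.
data, consumed = encode(u'\u0660ab')

# decode() returns the Unicode string and the number of
# bytes of the 8-bit string that were consumed.
text, used = decode(data)
\end{verbatim}

Here U+0660 takes two bytes in UTF-8, so \code{consumed} is 3 (three
characters) while \code{used} is 4 (four bytes).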

Unicode-aware regular expressions are available through the
\module{re} module, which has a new underlying implementation called
SRE written by Fredrik Lundh of Secret Labs AB.

A \code{-U} command line option was added which causes the Python
compiler to interpret all string literals as Unicode string literals.
This is intended to be used in testing and future-proofing your Python
code, since some future version of Python may drop support for 8-bit
strings and provide only Unicode strings.

% ======================================================================
\section{List Comprehensions}

Lists are a workhorse data type in Python, and many programs
manipulate a list at some point. Two common operations on lists are
to loop over them, and either pick out the elements that meet a
certain criterion, or apply some function to each element. For
example, given a list of strings, you might want to pull out all the
strings containing a given substring, or strip off trailing whitespace
from each line.

The existing \function{map()} and \function{filter()} functions can be
used for this purpose, but they require a function as one of their
arguments. This is fine if there's an existing built-in function that
can be passed directly, but if there isn't, you have to create a
little function to do the required work, and Python's scoping rules
make the result ugly if the little function needs additional
information. Take the first example in the previous paragraph,
finding all the strings in the list containing a given substring. You
could write the following to do it:

\begin{verbatim}
# Given the list L, make a list of all strings
# containing the substring S.
sublist = filter( lambda s, substring=S:
                     string.find(s, substring) != -1,
                  L)
\end{verbatim}

Because of Python's scoping rules, a default argument is used so that
the anonymous function created by the \keyword{lambda} expression knows
what substring is being searched for. List comprehensions make this
cleaner:

\begin{verbatim}
sublist = [ s for s in L if string.find(s, S) != -1 ]
\end{verbatim}
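
The second task mentioned earlier, stripping trailing whitespace from
each line, can be sketched the same way. The sample data and names
below are made up for illustration; the call to \function{list()}
around \function{map()} is only there so that the sketch also runs on
later Python versions, where \function{map()} no longer returns a
list.

\begin{verbatim}
lines = ['first line  \n', 'second\t\n', 'third']

# With map(), a small helper function has to be written first.
def strip_trailing(s):
    return s.rstrip()

stripped_map = list(map(strip_trailing, lines))

# With a list comprehension, the expression is written in place.
stripped_lc = [s.rstrip() for s in lines]

# Selecting and transforming can be combined in one comprehension.
S = 'ir'
matching = [s.rstrip() for s in lines if s.find(S) != -1]
\end{verbatim}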

List comprehensions have the form:

\begin{verbatim}
[ expression for expr1 in sequence1
             for expr2 in sequence2 ...
             for exprN in sequenceN
             if condition ]
\end{verbatim}

The \keyword{for}...\keyword{in} clauses contain the sequences to be
iterated over. The sequences do not have to be the same length,
because they are \emph{not} iterated over in parallel, but
from left to right; this is explained more clearly in the following
paragraphs. The elements of the generated list will be the successive
values of \var{expression}. The final \keyword{if} clause is
optional; if present, \var{expression} is only evaluated and added to
the result if \var{condition} is true.

To make the semantics very clear, a list comprehension is equivalent
to the following Python code:

\begin{verbatim}
for expr1 in sequence1:
    for expr2 in sequence2:
    ...
        for exprN in sequenceN:
             if (condition):
                  # Append the value of
                  # the expression to the
                  # resulting list.
\end{verbatim}

This means that when there are multiple \keyword{for}...\keyword{in}
clauses and no \keyword{if} clause, the length of the resulting list
will be equal to the product of the lengths of all
the sequences. If you have two lists of length 3, the output list is
9 elements long:

\begin{verbatim}
>>> seq1 = 'abc'
>>> seq2 = (1,2,3)
>>> [ (x,y) for x in seq1 for y in seq2]
[('a', 1), ('a', 2), ('a', 3), ('b', 1), ('b', 2), ('b', 3), ('c', 1),
('c', 2), ('c', 3)]
\end{verbatim}
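
The equivalence with nested loops can be checked directly. The
following sketch adds an \keyword{if} clause to the example and
builds the same list both ways; the condition is arbitrary and chosen
only for illustration.

\begin{verbatim}
seq1 = 'abc'
seq2 = (1, 2, 3)

# List comprehension with two for clauses and a condition.
pairs = [(x, y) for x in seq1 for y in seq2 if y != 2]

# The same list built with explicit nested loops.
expected = []
for x in seq1:
    for y in seq2:
        if y != 2:
            expected.append((x, y))
\end{verbatim}

Both lists contain the same 6 tuples in the same order, since the
condition removes one of the three values of \code{y} for each value
of \code{x}.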

To avoid introducing an ambiguity into Python's grammar, if
\var{expression} is creating a tuple, it must be surrounded with
parentheses. The first list comprehension below is a syntax error,
while the second one is correct:

\begin{verbatim}
# Syntax error
[ x,y for x in seq1 for y in seq2]
# Correct
[ (x,y) for x in seq1 for y in seq2]
\end{verbatim}

The idea of list comprehensions originally comes from the functional
programming language Haskell (\url{http://www.haskell.org}). Greg
Ewing argued most effectively for adding them to Python and wrote the
initial list comprehension patch, which was then discussed for a
seemingly endless time on the python-dev mailing list and kept
up-to-date by Skip Montanaro.

% ======================================================================
\section{Augmented Assignment}

Augmented assignment operators, another long-requested feature, have
been added to Python 2.0. Augmented assignment operators include
\code{+=}, \code{-=}, \code{*=}, and so forth. For example, the
statement \code{a += 2} increments the value of the variable
\code{a} by 2, equivalent to the slightly lengthier \code{a = a + 2}.

The full list of supported assignment operators is \code{+=},
\code{-=}, \code{*=}, \code{/=}, \code{\%=}, \code{**=}, \code{\&=},
\code{|=}, \verb|^=|, \code{>>=}, and \code{<<=}. Python classes can
override the augmented assignment operators by defining methods named
\method{__iadd__}, \method{__isub__}, etc. For example, the following
\class{Number} class stores a number and supports using += to create a
new instance with an incremented value.

\begin{verbatim}
class Number:
    def __init__(self, value):
        self.value = value
    def __iadd__(self, increment):
        return Number( self.value + increment)

n = Number(5)
n += 3
print n.value
\end{verbatim}

The \method{__iadd__} special method is called with the value of the
increment, and should return a new instance with an appropriately
modified value; this return value is bound as the new value of the
variable on the left-hand side.
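
The value returned by \method{__iadd__} does not have to be a newly
created object. As a hedged variation on the example above (the
\class{Counter} class is hypothetical), the method can instead update
the instance in place and return \code{self}, in which case other
references to the same object see the change:

\begin{verbatim}
class Counter:
    # Hypothetical class, used only for this illustration.
    def __init__(self, value):
        self.value = value
    def __iadd__(self, increment):
        # Update the existing object and return it, so the name
        # on the left-hand side stays bound to the same instance.
        self.value = self.value + increment
        return self

c = Counter(5)
alias = c
c += 3
same_object = (c is alias)
\end{verbatim}

After this runs, \code{c.value} and \code{alias.value} are both 8,
because \code{c} and \code{alias} still refer to one object.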

Augmented assignment operators were first introduced in the C
programming language, and most C-derived languages, such as
\program{awk}, C++, Java, Perl, and PHP also support them. The augmented
assignment patch was implemented by Thomas Wouters.

% ======================================================================
\section{String Methods}

Until now string-manipulation functionality was in the \module{string}
module, which was usually a front-end for the \module{strop}
module written in C. The addition of Unicode posed a difficulty for
the \module{strop} module, because the functions would all need to be
rewritten in order to accept either 8-bit or Unicode strings. For
functions such as \function{string.replace()}, which takes 3 string
arguments, that means eight possible permutations, and correspondingly
complicated code.

Instead, Python 2.0 pushes the problem onto the string type, making
string manipulation functionality available through methods on both
8-bit strings and Unicode strings.

\begin{verbatim}
>>> 'andrew'.capitalize()
'Andrew'
>>> 'hostname'.replace('os', 'linux')
'hlinuxtname'
>>> 'moshe'.find('sh')
2
\end{verbatim}

One thing that hasn't changed, a noteworthy April Fools' joke
notwithstanding, is that Python strings are immutable. Thus, the
string methods return new strings, and do not modify the string on
which they operate.

The old \module{string} module is still around for backwards
compatibility, but it mostly acts as a front-end to the new string
methods.

Two methods which have no parallel in pre-2.0 versions, although they
did exist in JPython for quite some time, are \method{startswith()}
and \method{endswith()}. \code{s.startswith(t)} is equivalent to \code{s[:len(t)]
== t}, while \code{s.endswith(t)} is equivalent to \code{s[-len(t):] == t}.

One other method which deserves special mention is \method{join()}. The
\method{join()} method of a string receives one parameter, a sequence of
strings, and is equivalent to the \function{string.join()} function from
the old \module{string} module, with the arguments reversed. In other
words, \code{s.join(seq)} is equivalent to the old
\code{string.join(seq, s)}.
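
A short sketch of these three methods follows; the sample strings
are made up for illustration.

\begin{verbatim}
s = 'python-2.0.tar.gz'

starts = s.startswith('python')   # same test as s[:6] == 'python'
ends = s.endswith('.gz')          # same test as s[-3:] == '.gz'

# join() is called on the separator, with the sequence as argument.
words = ['What', 'is', 'new']
joined = ' '.join(words)          # old spelling: string.join(words, ' ')
csv = ','.join(['a', 'b', 'c'])
\end{verbatim}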

% ======================================================================
\section{Garbage Collection of Cycles}

The C implementation of Python uses reference counting to implement
garbage collection. Every Python object maintains a count of the
number of references pointing to itself, and adjusts the count as
references are created or destroyed. Once the reference count reaches
zero, the object is no longer accessible, since you need to have a
reference to an object to access it, and if the count is zero, no
references exist any longer.

Reference counting has some pleasant properties: it's easy to
understand and implement, and the resulting implementation is
portable, fairly fast, and interacts well with other libraries that
implement their own memory handling schemes. The major problem with
reference counting is that it sometimes doesn't realise that objects
are no longer accessible, resulting in a memory leak. This happens
when there are cycles of references.

Consider the simplest possible cycle,
a class instance which has a reference to itself:

\begin{verbatim}
instance = SomeClass()
instance.myself = instance
\end{verbatim}

After the above two lines of code have been executed, the reference
count of \code{instance} is 2; one reference is from the variable
named \samp{'instance'}, and the other is from the \samp{myself}
attribute of the instance.

If the next line of code is \code{del instance}, what happens? The
reference count of \code{instance} is decreased by 1, so it has a
reference count of 1; the reference in the \samp{myself} attribute
still exists. Yet the instance is no longer accessible through Python
code, and it could be deleted. Several objects can participate in a
cycle if they have references to each other, causing all of the
objects to be leaked.

Python 2.0 fixes this problem by periodically executing a cycle
detection algorithm which looks for inaccessible cycles and deletes
the objects involved. A new \module{gc} module provides functions to
perform a garbage collection, obtain debugging statistics, and tune
the collector's parameters.
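
The self-referencing instance described above can be used to watch
the collector at work. The following is a minimal sketch that uses
\function{gc.collect()}, \function{gc.disable()} and
\function{gc.enable()}; automatic collection is switched off
around the experiment only so that the explicit call is the one that
finds the cycle. The exact number returned by
\function{gc.collect()} depends on the interpreter version, so the
sketch only assumes that it is at least 1.

\begin{verbatim}
import gc

class SomeClass:
    pass

gc.collect()                 # clear out any garbage that already exists
gc.disable()                 # no automatic collections for now

instance = SomeClass()
instance.myself = instance   # create the reference cycle
del instance                 # the cycle is now unreachable

found = gc.collect()         # run the cycle detector explicitly;
                             # returns the number of unreachable
                             # objects it found
gc.enable()                  # turn automatic collection back on
\end{verbatim}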

Running the cycle detection algorithm takes some time, and therefore
will result in some additional overhead. It is hoped that after we've
gotten experience with the cycle collection from using 2.0, Python 2.1
will be able to minimize the overhead with careful tuning. It's not
yet obvious how much performance is lost, because benchmarking this is
tricky and depends crucially on how often the program creates and
destroys objects. The detection of cycles can be disabled when Python
is compiled, if you can't afford even a tiny speed penalty or suspect
that the cycle collection is buggy, by specifying the
\samp{--without-cycle-gc} switch when running the \file{configure}
script.

Several people tackled this problem and contributed to a solution. An
early implementation of the cycle detection approach was written by
Toby Kelsey. The current algorithm was suggested by Eric Tiedemann
during a visit to CNRI, and Guido van Rossum and Neil Schemenauer
wrote two different implementations, which were later integrated by
Neil. Lots of other people offered suggestions along the way; the
March 2000 archives of the python-dev mailing list contain most of the
relevant discussion, especially in the threads titled ``Reference
cycle collection for Python'' and ``Finalization again''.

% ======================================================================
\section{Other Core Changes}

Various minor changes have been made to Python's syntax and built-in
functions. None of the changes are very far-reaching, but they're
handy conveniences.

\subsection{Minor Language Changes}

A new syntax makes it more convenient to call a given function
with a tuple of arguments and/or a dictionary of keyword arguments.
In Python 1.5 and earlier, you'd use the \function{apply()}
built-in function: \code{apply(f, \var{args}, \var{kw})} calls the
function \function{f()} with the argument tuple \var{args} and the
keyword arguments in the dictionary \var{kw}. \function{apply()}
is the same in 2.0, but thanks to a patch from
Greg Ewing, \code{f(*\var{args}, **\var{kw})} is a shorter
and clearer way to achieve the same effect. This syntax is
symmetrical with the syntax for defining functions:

\begin{verbatim}
def f(*args, **kw):
    # args is a tuple of positional args,
    # kw is a dictionary of keyword args
    ...
\end{verbatim}
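To see the calling side of this symmetry, here is a small sketch. The
function \function{describe()} and the sample values are made up for
illustration; the point is only that the tuple and the dictionary are
unpacked at the call site.

```python
def describe(*args, **kw):
    # Return what was received so the call can be inspected.
    return args, kw

args = (1, 2)
kw = {'sep': '-'}

# The new call syntax unpacks the tuple and the dictionary at the call
# site; it passes the same arguments as describe(1, 2, sep='-').
result = describe(*args, **kw)
```

Afterwards \code{result} holds the pair \code{((1, 2), \{'sep': '-'\})},
exactly what a direct call with those arguments would have produced.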

The \keyword{print} statement can now have its output directed to a
file-like object by following the \keyword{print} with
\verb|>> file|, similar to the redirection operator in Unix shells.
Previously you'd either have to use the \method{write()} method of the
file-like object, which lacks the convenience and simplicity of
\keyword{print}, or you could assign a new value to
\code{sys.stdout} and then restore the old value. For sending output
to standard error, it's much easier to write this:

\begin{verbatim}
print >> sys.stderr, "Warning: action field not supplied"
\end{verbatim}

Modules can now be renamed on importing them, using the syntax
\code{import \var{module} as \var{name}} or \code{from \var{module}
import \var{name} as \var{othername}}. The patch was submitted by
Thomas Wouters.

A new format style is available when using the \code{\%} operator;
'\%r' will insert the \function{repr()} of its argument. This was
also added from symmetry considerations, this time for symmetry with
the existing '\%s' format style, which inserts the \function{str()} of
its argument. For example, \code{'\%r \%s' \% ('abc', 'abc')} returns a
string containing \verb|'abc' abc|.
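The difference between the two format styles is easiest to see side by
side. The variable names in this short sketch are arbitrary.

```python
text = 'abc'

with_repr = '%r' % text            # repr() keeps the quotes
with_str = '%s' % text             # str() drops them
combined = '%r %s' % (text, text)
```

Here \code{with_repr} is the five-character string \verb|'abc'|,
quotes included, while \code{with_str} is just \verb|abc|.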

Previously there was no way to implement a class that overrode
Python's built-in \keyword{in} operator and implemented a custom
version. \code{\var{obj} in \var{seq}} returns true if \var{obj} is
present in the sequence \var{seq}; Python computes this by simply
trying every index of the sequence until either \var{obj} is found or
an \exception{IndexError} is encountered. Moshe Zadka contributed a
patch which adds a \method{__contains__} magic method for providing a
custom implementation for \keyword{in}. Additionally, new built-in
objects written in C can define what \keyword{in} means for them via a
new slot in the sequence protocol.
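A minimal sketch of a class that supplies its own membership test
follows. The \class{EvenNumbers} class is hypothetical; it stands for
a collection that could never be searched by trying every index in
turn.

```python
class EvenNumbers:
    # A conceptually infinite collection, so trying every index in turn
    # would never finish for a value that isn't present.
    def __contains__(self, value):
        return value % 2 == 0

evens = EvenNumbers()
has_four = 4 in evens      # calls evens.__contains__(4)
has_seven = 7 in evens     # calls evens.__contains__(7)
```

The \keyword{in} operator simply reports whatever truth value
\method{__contains__} returns, so \code{has_four} is true and
\code{has_seven} is false.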

Earlier versions of Python used a recursive algorithm for deleting
objects. Deeply nested data structures could cause the interpreter to
fill up the C stack and crash; Christian Tismer rewrote the deletion
logic to fix this problem. On a related note, comparing recursive
objects recursed infinitely and crashed; Jeremy Hylton rewrote the
code to no longer crash, producing a useful result instead. For
example, after this code:

\begin{verbatim}
a = []
b = []
a.append(a)
b.append(b)
\end{verbatim}

the comparison \code{a==b} returns true, because the two recursive
data structures are isomorphic. See the thread ``trashcan
and PR\#7'' in the April 2000 archives of the python-dev mailing list
for the discussion leading up to this implementation, and some useful
relevant links.
% Starting URL:
% http://www.python.org/pipermail/python-dev/2000-April/004834.html

Note that comparisons can now also raise exceptions. In earlier
versions of Python, a comparison operation such as \code{cmp(a,b)}
would always produce an answer, even if a user-defined
\method{__cmp__} method encountered an error, since the resulting
exception would simply be silently swallowed.

Work has been done on porting Python to 64-bit Windows on the Itanium
processor, mostly by Trent Mick of ActiveState. (Confusingly,
\code{sys.platform} is still \code{'win32'} on Win64 because it seems
that for ease of porting, MS Visual C++ treats code as 32 bit on Itanium.)
PythonWin also supports Windows CE; see the Python CE page at
\url{http://starship.python.net/crew/mhammond/ce/} for more
information.

Another new platform is Darwin/MacOS X; initial support for it is in
Python 2.0. Dynamic loading works, if you specify ``configure
--with-dyld --with-suffix=.x''. Consult the README in the Python
source distribution for more instructions.

An attempt has been made to alleviate one of Python's warts, the
often-confusing \exception{NameError} exception when code refers to a
local variable before the variable has been assigned a value. For
example, the following code raises an exception on the \keyword{print}
statement in both 1.5.2 and 2.0; in 1.5.2 a \exception{NameError}
exception is raised, while 2.0 raises a new
\exception{UnboundLocalError} exception.
\exception{UnboundLocalError} is a subclass of \exception{NameError},
so any existing code that expects \exception{NameError} to be raised
should still work.

\begin{verbatim}
def f():
    print "i=",i
    i = i + 1
f()
\end{verbatim}
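The compatibility claim can be checked with a handler written for the
old exception. This sketch uses a variant of the function above that
reads the variable without printing it; the name \code{caught} is just
for illustration.

```python
import sys

def f():
    value = i        # i is local here because of the assignment below
    i = value + 1

try:
    f()
except NameError:
    # Existing handlers written for NameError still catch the new exception.
    caught = sys.exc_info()[0]
```

The handler runs even though it names only \exception{NameError}, and
\code{caught} turns out to be the more specific
\exception{UnboundLocalError} class.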

Two new exceptions, \exception{TabError} and
\exception{IndentationError}, have been introduced. They're both
subclasses of \exception{SyntaxError}, and are raised when Python code
is found to be improperly indented.

\subsection{Changes to Built-in Functions}

A new built-in, \function{zip(\var{seq1}, \var{seq2}, ...)}, has been
added. \function{zip()} returns a list of tuples where each tuple
contains the i-th element from each of the argument sequences. The
difference between \function{zip()} and \code{map(None, \var{seq1},
\var{seq2})} is that \function{map()} pads the sequences with
\code{None} if the sequences aren't all of the same length, while
\function{zip()} truncates the returned list to the length of the
shortest argument sequence.
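A short sketch of the truncating behaviour, using two made-up
sequences of unequal length:

```python
letters = ['a', 'b', 'c']
numbers = [1, 2]

# zip() stops at the end of the shorter sequence, so 'c' is dropped.
# list() only guarantees that the result is a list object.
pairs = list(zip(letters, numbers))
```

The result is \code{[('a', 1), ('b', 2)]}; no padding value is
supplied for the unmatched \code{'c'}.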

The \function{int()} and \function{long()} functions now accept an
optional ``base'' parameter when the first argument is a string.
\code{int('123', 10)} returns 123, while \code{int('123', 16)} returns
291. \code{int(123, 16)} raises a \exception{TypeError} exception
with the message ``can't convert non-string with explicit base''.
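The same conversions, written out as a sketch; the binary example and
the variable names are additions for illustration.

```python
decimal_value = int('123', 10)
hex_value = int('123', 16)
binary_value = int('1010', 2)

try:
    int(123, 16)          # a base is only meaningful for strings
    base_rejected = 0
except TypeError:
    base_rejected = 1
```

Only the string forms succeed; the last call raises
\exception{TypeError}, so \code{base_rejected} ends up set to 1.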

A new variable holding more detailed version information has been
added to the \module{sys} module. \code{sys.version_info} is a tuple
\code{(\var{major}, \var{minor}, \var{micro}, \var{level},
\var{serial})}. For example, in a hypothetical 2.0.1beta1,
\code{sys.version_info} would be \code{(2, 0, 1, 'beta', 1)}.
\var{level} is a string such as \code{"alpha"}, \code{"beta"}, or
\code{"final"} for a final release.
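One likely use is testing for a minimum interpreter version without
parsing the \code{sys.version} string. A sketch, with an arbitrary
variable name for the result:

```python
import sys

major, minor = sys.version_info[0], sys.version_info[1]

# Sequences compare element by element, so a version test doesn't
# require parsing the sys.version string.
is_2_0_or_later = sys.version_info[:2] >= (2, 0)
```

Code that must also run on 1.5.2 would first have to check that the
attribute exists, since \code{sys.version_info} is new in 2.0.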

Dictionaries have an odd new method, \method{setdefault(\var{key},
\var{default})}, which behaves similarly to the existing
\method{get()} method. However, if the key is missing,
\method{setdefault()} both returns the value of \var{default} as
\method{get()} would do, and also inserts it into the dictionary as
the value for \var{key}. Thus, the following lines of code:

\begin{verbatim}
if dict.has_key( key ): return dict[key]
else:
    dict[key] = []
    return dict[key]
\end{verbatim}

can be reduced to a single \code{return dict.setdefault(key, [])} statement.
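A common place where this pays off is building a dictionary of lists.
The word list and the name \code{index} in this sketch are invented
for illustration.

```python
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']
index = {}

for word in words:
    # Create the list on first use, then append to whichever list is stored.
    index.setdefault(word[0], []).append(word)
```

Each word ends up in the list stored under its first letter, with no
explicit test for a missing key.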

The interpreter sets a maximum recursion depth in order to catch
runaway recursion before filling the C stack and causing a core dump
or GPF. Previously this limit was fixed when you compiled Python,
but in 2.0 the maximum recursion depth can be read and modified using
\function{sys.getrecursionlimit()} and \function{sys.setrecursionlimit()}.
The default value is 1000, and a rough maximum value for a given
platform can be found by running a new script,
\file{Misc/find_recursionlimit.py}.
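A sketch of reading and adjusting the limit; the increase of 500 is an
arbitrary figure chosen for illustration, not a recommended setting.

```python
import sys

old_limit = sys.getrecursionlimit()
sys.setrecursionlimit(old_limit + 500)   # 500 is an arbitrary increase
raised_limit = sys.getrecursionlimit()
sys.setrecursionlimit(old_limit)         # restore the original setting
```

Setting the limit higher than the platform's C stack can support
brings back the crash the limit exists to prevent, so raise it with
care.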

% ======================================================================
\section{Porting to 2.0}

New Python releases try hard to be compatible with previous releases,
and the record has been pretty good. However, some changes are
considered useful enough, usually because they fix initial design
decisions that turned out to be actively mistaken, that breaking
backward compatibility can't always be avoided. This section lists
the changes in Python 2.0 that may cause old Python code to break.

The change which will probably break the most code is tightening up
the arguments accepted by some methods. Some methods would take
multiple arguments and treat them as a tuple, particularly various
list methods such as \method{.append()} and \method{.insert()}.
In earlier versions of Python, if \code{L} is a list, \code{L.append(
1,2 )} appends the tuple \code{(1,2)} to the list. In Python 2.0 this
causes a \exception{TypeError} exception to be raised, with the
message: 'append requires exactly 1 argument; 2 given'. The fix is to
simply add an extra set of parentheses to pass both values as a tuple:
\code{L.append( (1,2) )}.
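The stricter behaviour can be seen in a few lines. This sketch records
in a flag, \code{rejected}, whether the two-argument call was refused.

```python
L = []
L.append((1, 2))      # one argument: the tuple (1, 2)

try:
    L.append(1, 2)    # two arguments: rejected in 2.0
    rejected = 0
except TypeError:
    rejected = 1
```

Only the parenthesized form changes the list, which ends up holding
the single tuple \code{(1, 2)}.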

The earlier versions of these methods were more forgiving because they
used an old function in Python's C interface to parse their arguments;
2.0 modernizes them to use \function{PyArg_ParseTuple}, the current
argument parsing function, which provides more helpful error messages
and treats multi-argument calls as errors. If you absolutely must use
2.0 but can't fix your code, you can edit \file{Objects/listobject.c}
and define the preprocessor symbol \code{NO_STRICT_LIST_APPEND} to
preserve the old behaviour; this isn't recommended.

Some of the functions in the \module{socket} module are still
forgiving in this way. For example, \function{socket.connect(
('hostname', 25) )} is the correct form, passing a tuple representing
an IP address, but \function{socket.connect( 'hostname', 25 )} also
works. \function{socket.connect_ex()} and \function{socket.bind()} are
similarly easy-going. 2.0alpha1 tightened these functions up, but
because the documentation actually used the erroneous multiple
argument form, many people wrote code which would break with the
stricter checking. GvR backed out the changes in the face of public
reaction, so for the \module{socket} module, the documentation was
fixed and the multiple argument form is simply marked as deprecated;
it \emph{will} be tightened up again in a future Python version.

The \code{\e x} escape in string literals now takes exactly 2 hex
digits. Previously it would consume all the hex digits following the
'x' and take the lowest 8 bits of the result, so \code{\e x123456} was
equivalent to \code{\e x56}.
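Under the new rule the same literal is read quite differently, as this
sketch shows; the variable names are arbitrary.

```python
s = '\x123456'

# Only the two digits "12" belong to the escape; "3456" are ordinary
# characters that follow it.
first = ord(s[0])
rest = s[1:]
```

The string is now five characters long: the character with code 0x12
followed by the four characters \code{3456}.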

The \exception{AttributeError} exception has a more friendly error message,
whose text will be something like \code{'Spam' instance has no attribute 'eggs'}.
Previously the error message was just the missing attribute name \code{eggs}, and
code written to take advantage of this fact will break in 2.0.

Some work has been done to make integers and long integers a bit more
interchangeable. In 1.5.2, large-file support was added for Solaris,
to allow reading files larger than 2GB; this made the \method{tell()}
method of file objects return a long integer instead of a regular
integer. Some code would subtract two file offsets and attempt to use
the result to multiply a sequence or slice a string, but this raised a
\exception{TypeError}. In 2.0, long integers can be used to multiply
or slice a sequence, and it'll behave as you'd intuitively expect it
to; \code{3L * 'abc'} produces 'abcabcabc', and \code{
(0,1,2,3)[2L:4L]} produces (2,3). Long integers can also be used in
various contexts where previously only integers were accepted, such
as in the \method{seek()} method of file objects, and in the formats
supported by the \verb|%| operator (\verb|%d|, \verb|%i|, \verb|%x|,
etc.). For example, \code{"\%d" \% 2L**64} will produce the string
\samp{18446744073709551616}.

The subtlest long integer change of all is that the \function{str()}
of a long integer no longer has a trailing 'L' character, though
\function{repr()} still includes it. The 'L' annoyed many people who
wanted to print long integers that looked just like regular integers,
since they had to go out of their way to chop off the character. This
is no longer a problem in 2.0, but code which does
\code{str(longval)[:-1]} and assumes the 'L' is there will now lose
the final digit.

Taking the \function{repr()} of a float now uses a different
formatting precision than \function{str()}. \function{repr()} uses the
\code{\%.17g} format string for C's \function{sprintf()}, while
\function{str()} uses \code{\%.12g} as before. The effect is that
\function{repr()} may occasionally show more decimal places than
\function{str()}, for certain numbers.
For example, the number 8.1 can't be represented exactly in binary, so
\code{repr(8.1)} is \code{'8.0999999999999996'}, while \code{str(8.1)} is
\code{'8.1'}.

The \code{-X} command-line option, which turned all standard
exceptions into strings instead of classes, has been removed; the
standard exceptions will now always be classes. The
\module{exceptions} module containing the standard exceptions was
translated from Python to a built-in C module, written by Barry Warsaw
and Fredrik Lundh.

% Commented out for now -- I don't think anyone will care.
%The pattern and match objects provided by SRE are C types, not Python
%class instances as in 1.5. This means you can no longer inherit from
%\class{RegexObject} or \class{MatchObject}, but that shouldn't be much
%of a problem since no one should have been doing that in the first
%place.

% ======================================================================
\section{Extending/Embedding Changes}

Some of the changes are under the covers, and will only be apparent to
people writing C extension modules or embedding a Python interpreter
in a larger application. If you aren't dealing with Python's C API,
you can safely skip this section.

The version number of the Python C API was incremented, so C
extensions compiled for 1.5.2 must be recompiled in order to work with
2.0. On Windows, attempting to import a third-party extension built
for Python 1.5.x usually results in an immediate crash; there's not
much we can do about this. (Here's Mark Hammond's explanation of the
reasons for the crash. The 1.5 module is linked against
\file{Python15.dll}. When \file{Python.exe}, linked against
\file{Python16.dll}, starts up, it initializes the Python data
structures in \file{Python16.dll}. When Python then imports the
module \file{foo.pyd} linked against \file{Python15.dll}, it
immediately tries to call the functions in that DLL. As Python has
not been initialized in that DLL, the program immediately crashes.)

Users of Jim Fulton's ExtensionClass module will be pleased to find
out that hooks have been added so that ExtensionClasses are now
supported by \function{isinstance()} and \function{issubclass()}.
This means you no longer have to remember to write code such as
\code{if type(obj) == myExtensionClass}, but can use the more natural
\code{if isinstance(obj, myExtensionClass)}.

The \file{Python/importdl.c} file, which was a mass of \#ifdefs to
support dynamic loading on many different platforms, was cleaned up
and reorganised by Greg Stein. \file{importdl.c} is now quite small,
and platform-specific code has been moved into a bunch of
\file{Python/dynload_*.c} files. Another cleanup: there were also a
number of \file{my*.h} files in the Include/ directory that held
various portability hacks; they've been merged into a single file,
\file{Include/pyport.h}.

Vladimir Marangozov's long-awaited malloc restructuring was completed,
to make it easy to have the Python interpreter use a custom allocator
instead of C's standard \function{malloc()}. For documentation, read
the comments in \file{Include/pymem.h} and
\file{Include/objimpl.h}. For the lengthy discussions during which
the interface was hammered out, see the Web archives of the 'patches'
and 'python-dev' lists at python.org.

Recent versions of the GUSI development environment for MacOS support
POSIX threads. Therefore, Python's POSIX threading support now works
on the Macintosh. Threading support using the user-space GNU \texttt{pth}
library was also contributed.

Threading support on Windows was enhanced, too. Windows supports
thread locks that use kernel objects only in case of contention; in
the common case when there's no contention, they use simpler functions
which are an order of magnitude faster. A threaded version of Python
1.5.2 on NT is twice as slow as an unthreaded version; with the 2.0
changes, the difference is only 10\%. These improvements were
contributed by Yakov Markovitch.

Python 2.0's source now uses only ANSI C prototypes, so compiling Python now
requires an ANSI C compiler, and can no longer be done using a compiler that
only supports K\&R C.

Previously the Python virtual machine used 16-bit numbers in its
bytecode, limiting the size of source files. In particular, this
affected the maximum size of literal lists and dictionaries in Python
source; occasionally people who are generating Python code would run
into this limit. A patch by Charles G. Waldman raises the limit from
\verb|2^16| to \verb|2^32|.

Three new convenience functions intended for adding constants to a
module's dictionary at module initialization time were added:
\function{PyModule_AddObject()}, \function{PyModule_AddIntConstant()},
and \function{PyModule_AddStringConstant()}. Each of these functions
takes a module object, a null-terminated C string containing the name
to be added, and a third argument for the value to be assigned to the
name. This third argument is, respectively, a Python object, a C
long, or a C string.

A wrapper API was added for Unix-style signal handlers.
\function{PyOS_getsig()} gets a signal handler and
\function{PyOS_setsig()} will set a new handler.

% ======================================================================
\section{Distutils: Making Modules Easy to Install}

Before Python 2.0, installing modules was a tedious affair -- there
was no way to figure out automatically where Python is installed, or
what compiler options to use for extension modules. Software authors
had to go through an arduous ritual of editing Makefiles and
configuration files, which only really work on Unix and leave Windows
and MacOS unsupported. Python users faced wildly differing
installation instructions which varied between different extension
packages, which made administering a Python installation something of a
chore.

The SIG for distribution utilities, shepherded by Greg Ward, has
created the Distutils, a system to make package installation much
easier. They form the \module{distutils} package, a new part of
Python's standard library. In the best case, installing a Python
module from source will require the same steps: first you simply
unpack the tarball or zip archive, and then run ``\code{python setup.py
install}''. The platform will be automatically detected, the compiler
will be recognized, C extension modules will be compiled, and the
distribution installed into the proper directory. Optional
command-line arguments provide more control over the installation
process; the Distutils package offers many places to override defaults
-- separating the build from the install, building or installing in
non-default directories, and more.

In order to use the Distutils, you need to write a \file{setup.py}
script. For the simple case, when the software contains only .py
files, a minimal \file{setup.py} can be just a few lines long:

\begin{verbatim}
from distutils.core import setup
setup (name = "foo", version = "1.0",
       py_modules = ["module1", "module2"])
\end{verbatim}

The \file{setup.py} file isn't much more complicated if the software
consists of a few packages:

\begin{verbatim}
from distutils.core import setup
setup (name = "foo", version = "1.0",
       packages = ["package", "package.subpackage"])
\end{verbatim}

A C extension can be the most complicated case; here's an example taken from
the PyXML package:

\begin{verbatim}
from distutils.core import setup, Extension

expat_extension = Extension('xml.parsers.pyexpat',
    define_macros = [('XML_NS', None)],
    include_dirs = [ 'extensions/expat/xmltok',
                     'extensions/expat/xmlparse' ],
    sources = [ 'extensions/pyexpat.c',
                'extensions/expat/xmltok/xmltok.c',
                'extensions/expat/xmltok/xmlrole.c',
              ]
    )
setup (name = "PyXML", version = "0.5.4",
       ext_modules =[ expat_extension ] )
\end{verbatim}

The Distutils can also take care of creating source and binary
distributions. The ``sdist'' command, run by ``\code{python setup.py
sdist}'', builds a source distribution such as \file{foo-1.0.tar.gz}.
Adding new commands isn't difficult; ``bdist_rpm'' and
``bdist_wininst'' commands have already been contributed to create an
RPM distribution and a Windows installer for the software,
respectively. Commands to create other distribution formats such as
Debian packages and Solaris \file{.pkg} files are in various stages of
development.

All this is documented in a new manual, \textit{Distributing Python
Modules}, that joins the basic set of Python documentation.

% ======================================================================
\section{XML Modules}

Python 1.5.2 included a simple XML parser in the form of the
\module{xmllib} module, contributed by Sjoerd Mullender. Since
1.5.2's release, two different interfaces for processing XML have
become common: SAX2 (version 2 of the Simple API for XML) provides an
event-driven interface with some similarities to \module{xmllib}, and
the DOM (Document Object Model) provides a tree-based interface,
transforming an XML document into a tree of nodes that can be
traversed and modified. Python 2.0 includes a SAX2 interface and a
stripped-down DOM interface as part of the \module{xml} package.
Here we will give a brief overview of these new interfaces; consult
the Python documentation or the source code for complete details.
The Python XML SIG is also working on improved documentation.

\subsection{SAX2 Support}

SAX defines an event-driven interface for parsing XML. To use SAX,
you must write a SAX handler class. Handler classes inherit from
various classes provided by SAX, and override various methods that
will then be called by the XML parser. For example, the
\method{startElement()} and \method{endElement()} methods are called for
every starting and end tag encountered by the parser, the
\method{characters()} method is called for every chunk of character
data, and so forth.

The advantage of the event-driven approach is that the whole
document doesn't have to be resident in memory at any one time, which
matters if you are processing really huge documents. However, writing
the SAX handler class can get very complicated if you're trying to
modify the document structure in some elaborate way.

For example, this little program defines a handler that prints
a message for every starting and ending tag, and then parses the file
\file{hamlet.xml} using it:

\begin{verbatim}
from xml import sax

class SimpleHandler(sax.ContentHandler):
    def startElement(self, name, attrs):
        print 'Start of element:', name, attrs.keys()

    def endElement(self, name):
        print 'End of element:', name

# Create a parser object
parser = sax.make_parser()

# Tell it what handler to use
handler = SimpleHandler()
parser.setContentHandler( handler )

# Parse a file!
parser.parse( 'hamlet.xml' )
\end{verbatim}
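When no file is at hand, a handler can also be tried out on a string.
The following sketch assumes the \function{parseString()} convenience
function of the \module{xml.sax} package; the \class{TagCollector}
class and the tiny document are invented for illustration, and the
handler records events in a list instead of printing them.

```python
from xml import sax

class TagCollector(sax.ContentHandler):
    def __init__(self):
        sax.ContentHandler.__init__(self)
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(('start', name))

    def endElement(self, name):
        self.events.append(('end', name))

document = '<play><title>Hamlet</title></play>'
handler = TagCollector()

# parseString() creates a parser, attaches the handler, and feeds it
# the string in one step.
sax.parseString(document, handler)
```

After the call, \code{handler.events} lists the start and end events
in document order, with the events for \code{title} nested inside
those for \code{play}.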
1048
1049For more information, consult the Python documentation, or the XML
1050HOWTO at \url{http://www.python.org/doc/howto/xml/}.
1051
1052\subsection{DOM Support}
1053
1054The Document Object Model is a tree-based representation for an XML
1055document. A top-level \class{Document} instance is the root of the
1056tree, and has a single child which is the top-level \class{Element}
1057instance. This \class{Element} has children nodes representing
1058character data and any sub-elements, which may have further children
1059of their own, and so forth. Using the DOM you can traverse the
1060resulting tree any way you like, access element and attribute values,
1061insert and delete nodes, and convert the tree back into XML.
1062
1063The DOM is useful for modifying XML documents, because you can create
1064a DOM tree, modify it by adding new nodes or rearranging subtrees, and
1065then produce a new XML document as output. You can also construct a
1066DOM tree manually and convert it to XML, which can be a more flexible
1067way of producing XML output than simply writing
1068\code{<tag1>}...\code{</tag1>} to a file.

The DOM implementation included with Python lives in the
\module{xml.dom.minidom} module. It's a lightweight implementation of
the Level 1 DOM with support for XML namespaces. The
\function{parse()} and \function{parseString()} convenience
functions are provided for generating a DOM tree:

\begin{verbatim}
from xml.dom import minidom
doc = minidom.parse('hamlet.xml')
\end{verbatim}

\code{doc} is a \class{Document} instance. \class{Document}, like all
the other DOM classes such as \class{Element} and \class{Text}, is a
subclass of the \class{Node} base class. All the nodes in a DOM tree
therefore support certain common methods, such as \method{toxml()},
which returns a string containing the XML representation of the node
and its children. Each class also has special methods of its own; for
example, \class{Element} and \class{Document} instances have a
\method{getElementsByTagName()} method that finds all descendant
elements with a given tag name. Continuing from the
previous two-line example:

\begin{verbatim}
perslist = doc.getElementsByTagName( 'PERSONA' )
print perslist[0].toxml()
print perslist[1].toxml()
\end{verbatim}

For the \textit{Hamlet} XML file, the above few lines output:

\begin{verbatim}
<PERSONA>CLAUDIUS, king of Denmark. </PERSONA>
<PERSONA>HAMLET, son to the late, and nephew to the present king.</PERSONA>
\end{verbatim}
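
The same calls work on XML held in a string. The following sketch
uses \function{parseString()} on a small made-up fragment (a stand-in
for the real \file{hamlet.xml}, not its actual markup), so it can be
run without any input file:

\begin{verbatim}
from xml.dom import minidom

# Hypothetical two-entry cast list used only for this example.
source = ('<PERSONAE>'
          '<PERSONA>CLAUDIUS, king of Denmark.</PERSONA>'
          '<PERSONA>HAMLET, nephew to the present king.</PERSONA>'
          '</PERSONAE>')

castdoc = minidom.parseString(source)
cast = castdoc.getElementsByTagName('PERSONA')

# Each entry is an Element node; toxml() serializes it with its children.
for persona in cast:
    print(persona.toxml())
\end{verbatim}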

The root element of the document is available as
\code{doc.documentElement}, and its children can be easily modified
by removing, moving, or inserting nodes:

\begin{verbatim}
root = doc.documentElement

# Remove the first child
root.removeChild( root.childNodes[0] )

# Move the new first child to the end
root.appendChild( root.childNodes[0] )

# Insert the new first child (originally,
# the third child) before the node at index 20.
root.insertBefore( root.childNodes[0], root.childNodes[20] )
\end{verbatim}
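
To see the effect of these calls without a large input file, here is a
minimal sketch on a made-up four-element document; the element names
are arbitrary. It relies on the DOM rule that adding a node which is
already in the tree moves it rather than copying it.

\begin{verbatim}
from xml.dom import minidom

smalldoc = minidom.parseString('<r><a/><b/><c/><d/></r>')
root = smalldoc.documentElement

# Remove the first child: a is dropped, leaving b, c, d.
root.removeChild(root.childNodes[0])

# Move the new first child to the end: c, d, b.
root.appendChild(root.childNodes[0])

# Move the new first child in front of the last child: d, c, b.
root.insertBefore(root.childNodes[0], root.childNodes[2])

names = [node.tagName for node in root.childNodes]
print(names)
\end{verbatim}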

Again, I will refer you to the Python documentation for a complete
listing of the different \class{Node} classes and their various methods.

\subsection{Relationship to PyXML}

The XML Special Interest Group has been working on XML-related Python
code for a while. Its code distribution, called PyXML, is available
from the SIG's Web pages at \url{http://www.python.org/sigs/xml-sig/}.
The PyXML distribution also uses the package name \samp{xml}. If
you've written programs that use PyXML, you're probably wondering
about its compatibility with the 2.0 \module{xml} package.

The answer is that Python 2.0's \module{xml} package isn't compatible
with PyXML, but can be made compatible by installing a recent version
of PyXML. Many applications can get by with the XML support that is
included with Python 2.0, but more complicated applications will
require that the full PyXML package be installed. When
installed, PyXML versions 0.6.0 or greater will replace the
\module{xml} package shipped with Python, and will be a strict
superset of the standard package, adding a number of
features. Some of the additional features in PyXML include:

\begin{itemize}
\item 4DOM, a full DOM implementation
from FourThought, Inc.
\item The xmlproc validating parser, written by Lars Marius Garshol.
\item The \module{sgmlop} parser accelerator module, written by Fredrik Lundh.
\end{itemize}

% ======================================================================
\section{Module changes}

Lots of improvements and bugfixes were made to Python's extensive
standard library; some of the affected modules include
\module{readline}, \module{ConfigParser}, \module{cgi},
\module{calendar}, \module{posix}, \module{xmllib},
\module{aifc}, \module{chunk}, \module{wave}, \module{random}, \module{shelve},
and \module{nntplib}. Consult the CVS logs for the exact
patch-by-patch details.

Brian Gallew contributed OpenSSL support for the \module{socket}
module. OpenSSL is an implementation of the Secure Sockets Layer,
which encrypts the data being sent over a socket. When compiling
Python, you can edit \file{Modules/Setup} to include SSL support,
which adds a function to the \module{socket} module:
\function{socket.ssl(\var{socket}, \var{keyfile}, \var{certfile})},
which takes a socket object and returns an SSL socket. The
\module{httplib} and \module{urllib} modules were also changed to
support ``https://'' URLs, though no one has implemented FTP or SMTP
over SSL.

The \module{httplib} module has been rewritten by Greg Stein to
support HTTP/1.1. Backward compatibility with the 1.5 version of
\module{httplib} is provided, though using HTTP/1.1 features such as
pipelining will require rewriting code to use a different set of
interfaces.

The \module{Tkinter} module now supports Tcl/Tk version 8.1, 8.2, or
8.3, and support for the older 7.x versions has been dropped. The
\module{Tkinter} module can now also display Unicode strings in Tk widgets.
In addition, Fredrik Lundh contributed an optimization which makes operations
like \code{create_line} and \code{create_polygon} much faster,
especially when using lots of coordinates.

The \module{curses} module has been greatly extended, starting from
Oliver Andrich's enhanced version, to provide many additional
functions from ncurses and SYSV curses, such as colour, alternative
character set support, pads, and mouse support. This means the module
is no longer compatible with operating systems that only have BSD
curses, but there don't seem to be any currently maintained OSes that
fall into this category.

As mentioned in the earlier discussion of 2.0's Unicode support, the
underlying implementation of the regular expressions provided by the
\module{re} module has been changed. SRE, a new regular expression
engine written by Fredrik Lundh and partially funded by Hewlett
Packard, supports matching against both 8-bit strings and Unicode
strings.
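
As a small sketch of what this means in practice, the same compiled
pattern can be applied to an 8-bit string and to a Unicode string.
The sample text is invented for the example.

\begin{verbatim}
import re

digits = re.compile(r'\d+')

# An ordinary string and a Unicode string containing a non-ASCII
# character (e with a grave accent).
plain = digits.search('Act 3, Scene 1')
uni = digits.search(u'Acte 3, sc\u00e8ne 1')

print(plain.group())
print(uni.group())
\end{verbatim}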

% ======================================================================
\section{New modules}

A number of new modules were added. We'll simply list them with brief
descriptions; consult the 2.0 documentation for the details of a
particular module.

\begin{itemize}

\item{\module{atexit}:}
For registering functions to be called before the Python interpreter exits.
Code that currently sets
\code{sys.exitfunc} directly should be changed to
use the \module{atexit} module instead, importing \module{atexit}
and calling \function{atexit.register()} with
the function to be called on exit.
(Contributed by Skip Montanaro.)

\item{\module{codecs}, \module{encodings}, \module{unicodedata}:} Added as part of the new Unicode support.

\item{\module{filecmp}:} Supersedes the old \module{cmp}, \module{cmpcache} and
\module{dircmp} modules, which have now become deprecated.
(Contributed by Gordon MacMillan and Moshe Zadka.)

\item{\module{gettext}:} This module provides internationalization
(I18N) and localization (L10N) support for Python programs by
providing an interface to the GNU gettext message catalog library.
(Integrated by Barry Warsaw, from separate contributions by Martin von
Loewis, Peter Funk, and James Henstridge.)

\item{\module{linuxaudiodev}:} Support for the \file{/dev/audio}
device on Linux, a twin to the existing \module{sunaudiodev} module.
(Contributed by Peter Bosch, with fixes by Jeremy Hylton.)

\item{\module{mmap}:} An interface to memory-mapped files on both
Windows and Unix. A file's contents can be mapped directly into
memory, at which point it behaves like a mutable string, so its
contents can be read and modified. A mapped file can even be passed to
functions that expect ordinary strings, such as those in the \module{re}
module. (Contributed by Sam Rushing, with some extensions by
A.M. Kuchling.)

\item{\module{pyexpat}:} An interface to the Expat XML parser.
(Contributed by Paul Prescod.)

\item{\module{robotparser}:} Parses a \file{robots.txt} file; useful
for writing Web spiders that politely avoid certain areas of a
Web site. The parser accepts the contents of a \file{robots.txt} file,
builds a set of rules from it, and can then answer questions about
the fetchability of a given URL. (Contributed by Skip Montanaro.)

\item{\module{tabnanny}:} A module/script to
check Python source code for ambiguous indentation.
(Contributed by Tim Peters.)

\item{\module{UserString}:} A base class useful for deriving objects that behave like strings.

\item{\module{webbrowser}:} A module that provides a platform-independent
way to launch a web browser on a specific URL. For each platform, various
browsers are tried in a specific order. The user can alter which browser
is launched by setting the \var{BROWSER} environment variable.
(Originally inspired by Eric S. Raymond's patch to \module{urllib}
which added similar functionality, but
the final module comes from code originally
implemented by Fred Drake as \file{Tools/idle/BrowserControl.py},
and adapted for the standard library by Fred.)

\item{\module{_winreg}:} An interface to the
Windows registry. \module{_winreg} is an adaptation of functions that
have been part of PythonWin since 1995, but has now been added to the core
distribution, and enhanced to support Unicode.
\module{_winreg} was written by Bill Tutt and Mark Hammond.

\item{\module{zipfile}:} A module for reading and writing ZIP-format
archives. These are archives produced by \program{PKZIP} on
DOS/Windows or \program{zip} on Unix, not to be confused with
\program{gzip}-format files (which are supported by the \module{gzip}
module).
(Contributed by James C. Ahlstrom.)

\item{\module{imputil}:} A module that provides a simpler way for
writing customised import hooks, in comparison to the existing
\module{ihooks} module. (Implemented by Greg Stein, with much
discussion on python-dev along the way.)

\end{itemize}
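
Two of the modules listed above lend themselves to a short sketch:
\module{zipfile} writes a small archive and reads it back, and
\module{atexit} registers a cleanup function instead of assigning to
\code{sys.exitfunc}. The file names and contents are placeholders
chosen for the example.

\begin{verbatim}
import atexit
import os
import tempfile
import zipfile

# Placeholder scratch files in the system's temporary directory.
tmpdir = tempfile.gettempdir()
source = os.path.join(tmpdir, 'whatsnew-demo.txt')
archive = os.path.join(tmpdir, 'whatsnew-demo.zip')

def cleanup():
    # Removes the scratch files; registered to run at interpreter exit.
    for path in (source, archive):
        if os.path.exists(path):
            os.remove(path)

atexit.register(cleanup)

f = open(source, 'w')
f.write('Hello, ZIP')
f.close()

# Store the file in a new archive under the name hello.txt.
z = zipfile.ZipFile(archive, 'w')
z.write(source, 'hello.txt')
z.close()

# Reopen the archive and read the member back.
z = zipfile.ZipFile(archive, 'r')
names = z.namelist()
data = z.read('hello.txt')
z.close()

print(names)
\end{verbatim}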

% ======================================================================
\section{IDLE Improvements}

IDLE is the official Python cross-platform IDE, written using Tkinter.
Python 2.0 includes IDLE 0.6, which adds a number of new features and
improvements. A partial list:

\begin{itemize}
\item UI improvements and optimizations,
especially in the area of syntax highlighting and auto-indentation.

\item The class browser now shows more information, such as the
top-level functions in a module.

\item Tab width is now a user-settable option. When opening an existing Python
file, IDLE automatically detects the indentation conventions, and adapts.

\item There is now support for calling browsers on various platforms,
used to open the Python documentation in a browser.

\item IDLE now has a command line, which is largely similar to
the vanilla Python interpreter.

\item Call tips were added in many places.

\item IDLE can now be installed as a package.

\item In the editor window, there is now a line/column bar at the bottom.

\item Three new keystroke commands: Check module (Alt-F5), Import
module (F5) and Run script (Ctrl-F5).

\end{itemize}

% ======================================================================
\section{Deleted and Deprecated Modules}

A few modules have been dropped because they're obsolete, or because
there are now better ways to do the same thing. The \module{stdwin}
module is gone; it was for a platform-independent windowing toolkit
that's no longer developed.

A number of modules have been moved to the
\file{lib-old} subdirectory:
\module{cmp}, \module{cmpcache}, \module{dircmp}, \module{dump},
\module{find}, \module{grep}, \module{packmail},
\module{poly}, \module{util}, \module{whatsound}, \module{zmod}.
If you have code which relies on a module that's been moved to
\file{lib-old}, you can simply add that directory to \code{sys.path}
to get it back, but you're encouraged to update any code that uses
these modules.
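
For example, a program that still needs one of these modules could
extend the search path as follows. The directory layout shown is an
assumption about a typical Unix installation; check where
\file{lib-old} actually lives on your system.

\begin{verbatim}
import os
import sys

# Assumed location of lib-old under the installation prefix.
libold = os.path.join(sys.prefix, 'lib', 'python2.0', 'lib-old')

# Append it so that the regular library directories are searched first.
if libold not in sys.path:
    sys.path.append(libold)
\end{verbatim}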

\section{Acknowledgements}

The authors would like to thank the following people for offering
suggestions on drafts of this article: Mark Hammond, Gregg Hauser,
Jeremy Hylton, Fredrik Lundh, Detlef Lannert, Aahz Maruch, Skip
Montanaro, Vladimir Marangozov, Guido van Rossum, Neil Schemenauer,
and Russ Schmidt.

\end{document}