\documentclass{howto}

\title{What's New in Python 1.6}
\release{0.02}
\author{A.M. Kuchling and Moshe Zadka}
\authoraddress{\email{amk1@bigfoot.com}, \email{moshez@math.huji.ac.il} }
\begin{document}
\maketitle\tableofcontents

\section{Introduction}

{\large This is a draft document; please report inaccuracies and
omissions to the authors. \\
XXX marks locations where fact-checking or rewriting is still needed.
}

A new version of Python, 1.6, will be released some time this
summer. Alpha versions are already available from
\url{http://www.python.org/1.6/}. This article talks about the
exciting new features in 1.6, highlights some other useful changes,
and points out a few incompatible changes that may require rewriting
code.

Python's development never completely stops between releases, and a
steady flow of bug fixes and improvements is always being submitted.
A host of minor fixes, a few optimizations, additional docstrings, and
better error messages went into 1.6; to list them all would be
impossible, but they're certainly significant. Consult the
publicly-available CVS logs if you want to see the full list.

% ======================================================================
\section{Unicode}

The largest new feature in Python 1.6 is a new fundamental data type:
Unicode strings. Unicode uses 16-bit numbers to represent characters
instead of the 8-bit numbers used by ASCII, meaning that 65,536
distinct characters can be supported.

The final interface for Unicode support was arrived at through
countless often-stormy discussions on the python-dev mailing list, and
was mostly implemented by Marc-Andr\'e Lemburg. A detailed explanation of
the interface is in the file
\file{Misc/unicode.txt} in the Python source distribution; it's also
available on the Web at
\url{http://starship.python.net/crew/lemburg/unicode-proposal.txt}.
This article will simply cover the most significant points from the
full interface.

In Python source code, Unicode strings are written as
\code{u"string"}. Arbitrary Unicode characters can be written using a
new escape sequence, \code{\e u\var{HHHH}}, where \var{HHHH} is a
4-digit hexadecimal number from 0000 to FFFF. The existing
\code{\e x\var{HHHH}} escape sequence can also be used, and octal
escapes can be used for characters up to U+01FF, which is represented
by \code{\e 777}.

Unicode strings, just like regular strings, are an immutable sequence
type, so they can be indexed and sliced. They also have an
\method{encode( \optional{\var{encoding}} )} method that returns an
8-bit string in the desired encoding. Encodings are named by strings,
such as \code{'ascii'}, \code{'utf-8'}, or \code{'iso-8859-1'}.
A codec API is defined for implementing and registering new
encodings that are then available throughout a Python program. If an
encoding isn't specified, the default encoding is usually 7-bit ASCII,
though it can be changed for your Python installation by calling the
\function{sys.setdefaultencoding(\var{encoding})} function in a
customized version of \file{site.py}.
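
As a minimal sketch of \method{encode()}, the following encodes one
Unicode string in two different encodings. The sample string and the
variable names are invented for this illustration; the byte counts
differ because UTF-8 needs two bytes for U+00E9 while ISO-8859-1 needs
only one.

\begin{verbatim}
# Made-up sample string: 'caf' followed by e-acute (U+00E9).
s = u'caf\u00e9'

# Name the encoding explicitly rather than relying on the default.
utf8 = s.encode('utf-8')
latin1 = s.encode('iso-8859-1')

# Four characters become five bytes in UTF-8 but four in ISO-8859-1.
sizes = (len(s), len(utf8), len(latin1))
\end{verbatim}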

Combining 8-bit and Unicode strings always coerces to Unicode, using
the default ASCII encoding; the result of \code{'a' + u'bc'} is
\code{u'abc'}.

New built-in functions have been added, and existing built-ins
modified to support Unicode:

\begin{itemize}
\item \code{unichr(\var{ch})} returns a Unicode string 1 character
long, containing the character \var{ch}.

\item \code{ord(\var{u})}, where \var{u} is a 1-character regular or
Unicode string, returns the number of the character as an integer.

\item \code{unicode(\var{string}, \optional{\var{encoding},}
\optional{\var{errors}} ) } creates a Unicode string from an 8-bit
string. \code{encoding} is a string naming the encoding to use.
The \code{errors} parameter specifies the treatment of characters that
are invalid for the current encoding; passing \code{'strict'} as the
value causes an exception to be raised on any encoding error, while
\code{'ignore'} causes errors to be silently ignored and
\code{'replace'} uses U+FFFD, the official replacement character, in
case of any problems.

\end{itemize}

A new module, \module{unicodedata}, provides an interface to Unicode
character properties. For example, \code{unicodedata.category(u'A')}
returns the 2-character string 'Lu', the 'L' denoting it's a letter,
and 'u' meaning that it's uppercase.
\code{unicodedata.bidirectional(u'\e u0660')} returns 'AN', meaning that U+0660 is
an Arabic number.
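
The two lookups just described can be tried directly; this is only a
small sketch, and the variable names are chosen for the illustration.

\begin{verbatim}
import unicodedata

# General category: 'L' for letter, 'u' for uppercase.
cat = unicodedata.category(u'A')

# Bidirectional class of U+0660; 'AN' stands for Arabic number.
bidi = unicodedata.bidirectional(u'\u0660')
\end{verbatim}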

The \module{codecs} module contains functions to look up existing encodings
and register new ones. Unless you want to implement a
new encoding, you'll most often use the
\function{codecs.lookup(\var{encoding})} function, which returns a
4-element tuple: \code{(\var{encode_func},
\var{decode_func}, \var{stream_reader}, \var{stream_writer})}.

\begin{itemize}
\item \var{encode_func} is a function that takes a Unicode string, and
returns a 2-tuple \code{(\var{string}, \var{length})}. \var{string}
is an 8-bit string containing a portion (perhaps all) of the Unicode
string converted into the given encoding, and \var{length} tells you
how much of the Unicode string was converted.

\item \var{decode_func} is the mirror of \var{encode_func},
taking an 8-bit string and
returning a 2-tuple \code{(\var{ustring}, \var{length})}, where
\var{ustring} is the resulting Unicode string
and \var{length} tells you how much of the 8-bit string was consumed.

\item \var{stream_reader} is a class that supports decoding input from
a stream. \var{stream_reader(\var{file_obj})} returns an object that
supports the \method{read()}, \method{readline()}, and
\method{readlines()} methods. These methods will all translate from
the given encoding and return Unicode strings.

\item \var{stream_writer}, similarly, is a class that supports
encoding output to a stream. \var{stream_writer(\var{file_obj})}
returns an object that supports the \method{write()} and
\method{writelines()} methods. These methods expect Unicode strings,
translating them to the given encoding on output.
\end{itemize}
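
To make the 2-tuples concrete, here is a small round-trip sketch using
only the first two elements of the tuple. The sample text and variable
names are invented for the illustration.

\begin{verbatim}
import codecs

# The first two elements are the encode and decode functions.
lookup_result = codecs.lookup('utf-8')
encode_func = lookup_result[0]
decode_func = lookup_result[1]

text = u'\u0660ab'                  # made-up sample: 3 characters
data, n_chars = encode_func(text)   # n_chars: characters consumed
back, n_bytes = decode_func(data)   # n_bytes: bytes consumed
\end{verbatim}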

For example, the following code writes a Unicode string into a file,
encoding it as UTF-8:

\begin{verbatim}
import codecs

unistr = u'\u0660\u2000ab ...'

(UTF8_encode, UTF8_decode,
 UTF8_streamreader, UTF8_streamwriter) = codecs.lookup('UTF-8')

output = UTF8_streamwriter( open( '/tmp/output', 'wb') )
output.write( unistr )
output.close()
\end{verbatim}

The following code would then read UTF-8 input from the file:

\begin{verbatim}
input = UTF8_streamreader( open( '/tmp/output', 'rb') )
print repr(input.read())
input.close()
\end{verbatim}

Unicode-aware regular expressions are available through the
\module{re} module, which has a new underlying implementation called
SRE written by Fredrik Lundh of Secret Labs AB.

A \code{-U} command line option was added which causes the Python
compiler to interpret all string literals as Unicode string literals.
This is intended to be used in testing and future-proofing your Python
code, since some future version of Python may drop support for 8-bit
strings and provide only Unicode strings.

% ======================================================================
\section{Distutils: Making Modules Easy to Install}

Before Python 1.6, installing modules was a tedious affair -- there
was no way to figure out automatically where Python was installed, or
what compiler options to use for extension modules. Software authors
had to go through an arduous ritual of editing Makefiles and
configuration files, which only really work on Unix and leave Windows
and MacOS unsupported. Software users faced wildly differing
installation instructions.

The SIG for distribution utilities, shepherded by Greg Ward, has
created the Distutils, a system to make package installation much
easier. They form the \module{distutils} package, a new part of
Python's standard library. In the best case, installing a Python
module from source will always require the same steps: first you simply
unpack the tarball or zip archive, and then run ``\code{python setup.py
install}''. The platform will be automatically detected, the compiler
will be recognized, C extension modules will be compiled, and the
distribution installed into the proper directory. Optional
command-line arguments provide more control over the installation
process; the \module{distutils} package offers many places to override defaults
-- separating the build from the install, building or installing in
non-default directories, and more.

In order to use the Distutils, you need to write a \file{setup.py}
script. For the simple case, when the software contains only .py
files, a minimal \file{setup.py} can be just a few lines long:

\begin{verbatim}
from distutils.core import setup
setup (name = "foo", version = "1.0",
       py_modules = ["module1", "module2"])
\end{verbatim}

The \file{setup.py} file isn't much more complicated if the software
consists of a few packages:

\begin{verbatim}
from distutils.core import setup
setup (name = "foo", version = "1.0",
       packages = ["package", "package.subpackage"])
\end{verbatim}

A C extension can be the most complicated case; here's an example taken from
the PyXML package:

\begin{verbatim}
from distutils.core import setup, Extension

expat_extension = Extension('xml.parsers.pyexpat',
    define_macros = [('XML_NS', None)],
    include_dirs = [ 'extensions/expat/xmltok',
                     'extensions/expat/xmlparse' ],
    sources = [ 'extensions/pyexpat.c',
                'extensions/expat/xmltok/xmltok.c',
                'extensions/expat/xmltok/xmlrole.c',
              ]
    )
setup (name = "PyXML", version = "0.5.4",
       ext_modules =[ expat_extension ] )
\end{verbatim}

The Distutils can also take care of creating source and binary
distributions. The ``sdist'' command, run by ``\code{python setup.py
sdist}'', builds a source distribution such as \file{foo-1.0.tar.gz}.
Adding new commands isn't difficult, and a ``bdist_rpm'' command has
already been contributed to create an RPM distribution for the
software. Commands to create Windows installer programs, Debian
packages, and Solaris .pkg files have been discussed and are in
various stages of development.

All this is documented in a new manual, \textit{Distributing Python
Modules}, that joins the basic set of Python documentation.

% ======================================================================
\section{String Methods}

Until now, string-manipulation functionality was in the \module{string}
Python module, which was usually a front-end for the \module{strop}
module written in C. The addition of Unicode posed a difficulty for
the \module{strop} module, because the functions would all need to be
rewritten in order to accept either 8-bit or Unicode strings. For
functions such as \function{string.replace()}, which takes 3 string
arguments, that means eight possible permutations, and correspondingly
complicated code.

Instead, Python 1.6 pushes the problem onto the string type, making
string manipulation functionality available through methods on both
8-bit strings and Unicode strings.

\begin{verbatim}
>>> 'andrew'.capitalize()
'Andrew'
>>> 'hostname'.replace('os', 'linux')
'hlinuxtname'
>>> 'moshe'.find('sh')
2
\end{verbatim}

One thing that hasn't changed, April Fools' jokes notwithstanding, is
that Python strings are immutable. Thus, the string methods return new
strings, and do not modify the string on which they operate.

The old \module{string} module is still around for backwards
compatibility, but it mostly acts as a front-end to the new string
methods.

Two methods which have no parallel in pre-1.6 versions, although they
did exist in JPython for quite some time, are \method{startswith()}
and \method{endswith()}. \code{s.startswith(t)} is equivalent to \code{s[:len(t)]
== t}, while \code{s.endswith(t)} is equivalent to \code{s[-len(t):] == t}
for a non-empty \code{t}.
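
The equivalence can be checked with a short sketch; the file name and
suffix below are invented for the illustration.

\begin{verbatim}
s = 'python-1.6.tar.gz'    # made-up file name
t = '.tar.gz'

# Each method call agrees with the corresponding slice comparison.
checks = (s.startswith('python'), s[:len('python')] == 'python',
          s.endswith(t), s[-len(t):] == t)
\end{verbatim}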

(XXX what'll happen to join? is this even worth mentioning?) One
other method which deserves special mention is \method{join}. The
\method{join} method of a string receives one parameter, a sequence of
strings, and is equivalent to the \function{string.join} function from
the old \module{string} module, with the arguments reversed. In other
words, \code{s.join(seq)} is equivalent to the old
\code{string.join(seq, s)}.
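
A minimal sketch of the method form, with invented sample data: the
separator is the string whose method is called, and the sequence is
the argument.

\begin{verbatim}
words = ['spam', 'eggs', 'ham']   # made-up sample data

# The separator string is the object; the sequence is the argument.
joined = ', '.join(words)
\end{verbatim}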

% ======================================================================
\section{Porting to 1.6}

New Python releases try hard to be compatible with previous releases,
and the record has been pretty good. However, some changes are
considered useful enough, often fixing initial design decisions that
turned out to be actively mistaken, that breaking backward compatibility
can't always be avoided. This section lists the changes in Python 1.6
that may cause old Python code to break.

The change which will probably break the most code is tightening up
the arguments accepted by some methods. Some methods would take
multiple arguments and treat them as a tuple, particularly various
list methods such as \method{.append()} and \method{.insert()}.
In earlier versions of Python, if \code{L} is a list, \code{L.append(
1,2 )} appends the tuple \code{(1,2)} to the list. In Python 1.6 this
causes a \exception{TypeError} exception to be raised, with the
message: 'append requires exactly 1 argument; 2 given'. The fix is to
simply add an extra set of parentheses to pass both values as a tuple:
\code{L.append( (1,2) )}.
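
The stricter behaviour can be seen in a short sketch; the
\code{rejected} flag is a name invented here to record the outcome.

\begin{verbatim}
L = []
L.append((1, 2))          # one argument: a 2-tuple

try:
    L.append(3, 4)        # two arguments: rejected
    rejected = 0
except TypeError:
    rejected = 1
\end{verbatim}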

The earlier versions of these methods were more forgiving because they
used an old function in Python's C interface to parse their arguments;
1.6 modernizes them to use \function{PyArg_ParseTuple}, the current
argument parsing function, which provides more helpful error messages
and treats multi-argument calls as errors. If you absolutely must use
1.6 but can't fix your code, you can edit \file{Objects/listobject.c}
and define the preprocessor symbol \code{NO_STRICT_LIST_APPEND} to
preserve the old behaviour; this isn't recommended.

Some of the functions in the \module{socket} module are still
forgiving in this way. For example, \function{socket.connect(
('hostname', 25) )} is the correct form, passing a tuple representing
an IP address, but \function{socket.connect( 'hostname', 25 )} also
works. \function{socket.connect_ex()} and \function{socket.bind()} are
similarly easy-going. 1.6alpha1 tightened these functions up, but
because the documentation actually used the erroneous multiple
argument form, many people wrote code which would break with the
stricter checking. GvR backed out the changes in the face of public
reaction, so for the \module{socket} module, the documentation was
fixed and the multiple argument form is simply marked as deprecated;
it \emph{will} be tightened up again in a future Python version.

Some work has been done to make integers and long integers a bit more
interchangeable. In 1.5.2, large-file support was added for Solaris,
to allow reading files larger than 2Gb; this made the \method{tell()}
method of file objects return a long integer instead of a regular
integer. Some code would subtract two file offsets and attempt to use
the result to multiply a sequence or slice a string, but this raised a
\exception{TypeError}. In 1.6, long integers can be used to multiply
or slice a sequence, and it'll behave as you'd intuitively expect it
to; \code{3L * 'abc'} produces 'abcabcabc', and \code{
(0,1,2,3)[2L:4L]} produces (2,3). Long integers can also be used in
various new places where previously only integers were accepted, such
as in the \method{seek()} method of file objects.

The subtlest long integer change of all is that the \function{str()}
of a long integer no longer has a trailing 'L' character, though
\function{repr()} still includes it. The 'L' annoyed many people who
wanted to print long integers that looked just like regular integers,
since they had to go out of their way to chop off the character. This
is no longer a problem in 1.6, but code which assumes the 'L' is
there, and does \code{str(longval)[:-1]}, will now lose the final
digit.

Taking the \function{repr()} of a float now uses a different
formatting precision than \function{str()}. \function{repr()} uses a
``\%.17g'' format string for C's \function{sprintf()}, while
\function{str()} uses ``\%.12g'' as before. The effect is that
\function{repr()} may occasionally show more decimal places than
\function{str()}, for certain numbers.
For example, the number 8.1 can't be represented exactly in binary, so
\code{repr(8.1)} is \code{'8.0999999999999996'}, while \code{str(8.1)} is
\code{'8.1'}.

%The \code{-X} command-line option, which turns all standard exceptions
%into strings instead of classes, has been removed.

% ======================================================================
\section{Core Changes}

Various minor changes have been made to Python's syntax and built-in
functions. None of the changes are very far-reaching, but they're
handy conveniences.

A change to syntax makes it more convenient to call a given function
with a tuple of arguments and/or a dictionary of keyword arguments.
In Python 1.5 and earlier, you do this with the \function{apply()}
built-in function: \code{apply(f, \var{args}, \var{kw})} calls the
function \function{f()} with the argument tuple \var{args} and the
keyword arguments in the dictionary \var{kw}. Thanks to a patch from
Greg Ewing, 1.6 adds \code{f(*\var{args}, **\var{kw})} as a shorter
and clearer way to achieve the same effect. This syntax is
symmetrical with the syntax for defining functions:

\begin{verbatim}
def f(*args, **kw):
    # args is a tuple of positional args,
    # kw is a dictionary of keyword args
    ...
\end{verbatim}
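
A call using the new syntax might look like the following sketch. The
function body and the argument values are invented for the
illustration; the function simply hands back what it received.

\begin{verbatim}
def f(*args, **kw):
    # Return what was received so the call can be inspected.
    return args, kw

args = (1, 2)                 # made-up positional arguments
kw = {'sep': '-'}             # made-up keyword arguments

result = f(*args, **kw)       # 1.6 spelling of apply(f, args, kw)
\end{verbatim}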

A new format style is available when using the \code{\%} operator.
'\%r' will insert the \function{repr()} of its argument. This was
also added for the sake of symmetry, this time with
the existing '\%s' format style, which inserts the \function{str()} of
its argument. For example, \code{'\%r \%s' \% ('abc', 'abc')} returns a
string containing \verb|'abc' abc|.
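
The correspondence with \function{repr()} and \function{str()} can be
checked in a few lines; the variable names are invented for this
sketch.

\begin{verbatim}
value = 'abc'
text = '%r %s' % (value, value)

# '%r' formats like repr(), '%s' like str().
same = (text == repr(value) + ' ' + str(value))
\end{verbatim}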

The \function{int()} and \function{long()} functions now accept an
optional ``base'' parameter when the first argument is a string.
\code{int('123', 10)} returns 123, while \code{int('123', 16)} returns
291. \code{int(123, 16)} raises a \exception{TypeError} exception
with the message ``can't convert non-string with explicit base''.
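
As a brief sketch of the three cases just described (the
\code{rejected} flag is a name invented here to record the outcome):

\begin{verbatim}
decimal = int('123', 10)
hexadecimal = int('123', 16)     # 1*256 + 2*16 + 3

try:
    int(123, 16)                 # first argument is not a string
    rejected = 0
except TypeError:
    rejected = 1
\end{verbatim}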

Previously there was no way to implement a class that overrode
Python's built-in \keyword{in} operator and implemented a custom
version. \code{\var{obj} in \var{seq}} returns true if \var{obj} is
present in the sequence \var{seq}; Python computes this by simply
trying every index of the sequence until either \var{obj} is found or
an \exception{IndexError} is encountered. Moshe Zadka contributed a
patch which adds a \method{__contains__} magic method for providing a
custom implementation for \keyword{in}. Additionally, new built-in
objects written in C can define what \keyword{in} means for them via a
new slot in the sequence protocol.
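
A minimal sketch of a class using the new hook follows. The
\class{EvenNumbers} class is hypothetical, invented for this
illustration; it answers membership tests by computation, so no
index-by-index search is needed.

\begin{verbatim}
class EvenNumbers:
    # Hypothetical container: every even integer is "in" it.
    def __contains__(self, obj):
        return obj % 2 == 0

evens = EvenNumbers()
results = (4 in evens, 7 in evens)
\end{verbatim}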

Earlier versions of Python used a recursive algorithm for deleting
objects. Deeply nested data structures could cause the interpreter to
fill up the C stack and crash; Christian Tismer rewrote the deletion
logic to fix this problem. On a related note, comparing recursive
objects recursed infinitely and crashed; Jeremy Hylton rewrote the
code to no longer crash, producing a useful result instead. For
example, after this code:

\begin{verbatim}
a = []
b = []
a.append(a)
b.append(b)
\end{verbatim}

The comparison \code{a==b} returns true, because the two recursive
data structures are isomorphic.
\footnote{See the thread ``trashcan and PR\#7'' in the April 2000 archives of the python-dev mailing list for the discussion leading up to this implementation, and some useful relevant links.
%http://www.python.org/pipermail/python-dev/2000-April/004834.html
}

Work has been done on porting Python to 64-bit Windows on the Itanium
processor, mostly by Trent Mick of ActiveState. (Confusingly,
\code{sys.platform} is still \code{'win32'} on
Win64 because it seems that for ease of porting, MS Visual C++ treats code
as 32 bit.) PythonWin also supports Windows CE; see the Python CE page at
\url{http://www.python.net/crew/mhammond/ce/} for more information.

An attempt has been made to alleviate one of Python's warts, the
often-confusing \exception{NameError} exception when code refers to a
local variable before the variable has been assigned a value. For
example, the following code raises an exception on the \keyword{print}
statement in both 1.5.2 and 1.6; in 1.5.2 a \exception{NameError}
exception is raised, while 1.6 raises \exception{UnboundLocalError}.

\begin{verbatim}
def f():
    print "i=",i
    i = i + 1
f()
\end{verbatim}
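
Code that wants to handle the new exception specifically might be
sketched as follows. The \code{caught} and \code{is\_name\_error}
names are invented for the illustration, and the subclass check
reflects our understanding that \exception{UnboundLocalError} derives
from \exception{NameError}, so existing handlers should keep working.

\begin{verbatim}
def f():
    i = i + 1         # i is local here, but is read before assignment

try:
    f()
    caught = 0
except UnboundLocalError:
    caught = 1

# Assumption being checked: the new exception is a kind of NameError.
is_name_error = issubclass(UnboundLocalError, NameError)
\end{verbatim}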

A new variable holding more detailed version information has been
added to the \module{sys} module. \code{sys.version_info} is a tuple
\code{(\var{major}, \var{minor}, \var{micro}, \var{level},
\var{serial})}. For example, in 1.6a2 \code{sys.version_info} is
\code{(1, 6, 0, 'alpha', 2)}. \var{level} is a string such as
\code{"alpha"}, \code{"beta"}, or \code{"final"} for a final release.
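
Because the value is a tuple, version checks reduce to ordinary tuple
comparison. The following is only a sketch, and the variable names are
invented for the illustration.

\begin{verbatim}
import sys

# Tuples compare element by element, so a prefix is enough.
major, minor = sys.version_info[:2]
has_16_features = sys.version_info[:2] >= (1, 6)
\end{verbatim}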

% ======================================================================
\section{Extending/Embedding Changes}

Some of the changes are under the covers, and will only be apparent to
people writing C extension modules, or embedding a Python interpreter
in a larger application. If you aren't dealing with Python's C API,
you can safely skip this section.

The version number of the Python C API was incremented, so C
extensions compiled for 1.5.2 must be recompiled in order to work with
1.6. On Windows, attempting to import a third party extension built
for Python 1.5.x usually results in an immediate crash; there's not
much we can do about this. (XXX can anyone tell me why it crashes?)

Users of Jim Fulton's ExtensionClass module will be pleased to find
out that hooks have been added so that ExtensionClasses are now
supported by \function{isinstance()} and \function{issubclass()}.
This means you no longer have to remember to write code such as
\code{if type(obj) == myExtensionClass}, but can use the more natural
\code{if isinstance(obj, myExtensionClass)}.

The \file{Python/importdl.c} file, which was a mass of \#ifdefs to
support dynamic loading on many different platforms, was cleaned up
and reorganized by Greg Stein. \file{importdl.c} is now quite small,
and platform-specific code has been moved into a bunch of
\file{Python/dynload_*.c} files.

Vladimir Marangozov's long-awaited malloc restructuring was completed,
to make it easy to have the Python interpreter use a custom allocator
instead of C's standard \function{malloc()}. For documentation, read
the comments in \file{Include/mymalloc.h} and
\file{Include/objimpl.h}. For the lengthy discussions during which
the interface was hammered out, see the Web archives of the 'patches'
and 'python-dev' lists at python.org.

Recent versions of the GUSI development environment for MacOS support
POSIX threads. Therefore, Python's POSIX threading support now works
on the Macintosh. Threading support using the user-space GNU \texttt{pth}
library was also contributed.

Threading support on Windows was enhanced, too. Windows supports
thread locks that use kernel objects only in case of contention; in
the common case when there's no contention, they use simpler functions
which are an order of magnitude faster. A threaded version of Python
1.5.2 on NT is twice as slow as an unthreaded version; with the 1.6
changes, the difference is only 10\%. These improvements were
contributed by Yakov Markovitch.
510
511% ======================================================================
512\section{Module changes}
513
Andrew M. Kuchlingfa33a4e2000-06-03 02:52:40 +0000514Lots of improvements and bugfixes were made to Python's extensive
515standard library; some of the affected modules include
516\module{readline}, \module{ConfigParser}, \module{cgi},
517\module{calendar}, \module{posix}, \module{readline}, \module{xmllib},
518\module{aifc}, \module{chunk, wave}, \module{random}, \module{shelve},
519and \module{nntplib}. Consult the CVS logs for the exact
520patch-by-patch details.
Andrew M. Kuchling25bfd0e2000-05-27 11:28:26 +0000521
Andrew M. Kuchlingfa33a4e2000-06-03 02:52:40 +0000522Brian Gallew contributed OpenSSL support for the \module{socket}
Andrew M. Kuchling6c3cd8d2000-06-10 02:24:31 +0000523module. OpenSSL is an implementation of the Secure Socket Layer,
524which encrypts the data being sent over a socket. When compiling
525Python, you can edit \file{Modules/Setup} to include SSL support,
526which adds an additional function to the \module{socket} module:
Andrew M. Kuchlingfa33a4e2000-06-03 02:52:40 +0000527\function{socket.ssl(\var{socket}, \var{keyfile}, \var{certfile})},
Andrew M. Kuchling6c3cd8d2000-06-10 02:24:31 +0000528which takes a socket object and returns an SSL socket. The
529\module{httplib} and \module{urllib} modules were also changed to
530support ``https://'' URLs, though no one has implemented FTP or SMTP
531over SSL.
Andrew M. Kuchling25bfd0e2000-05-27 11:28:26 +0000532
Andrew M. Kuchlingfa33a4e2000-06-03 02:52:40 +0000533The \module{Tkinter} module now supports Tcl/Tk version 8.1, 8.2, or
5348.3, and support for the older 7.x versions has been dropped. The
535Tkinter module also supports displaying Unicode strings in Tk
536widgets.

The \module{curses} module has been greatly extended, starting from
Oliver Andrich's enhanced version, to provide many additional
functions from ncurses and SYSV curses, such as colour, alternative
character set support, pads, and other new features. This means the
module is no longer compatible with operating systems that only have
BSD curses, but there don't seem to be any currently maintained OSes
that fall into this category.

As mentioned in the earlier discussion of 1.6's Unicode support, the
underlying implementation of the regular expressions provided by the
\module{re} module has been changed. SRE, a new regular expression
engine written by Fredrik Lundh and partially funded by
Hewlett-Packard, supports matching against both 8-bit strings and
Unicode strings.
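
As a minimal sketch of what this means in practice, the following
applies the same pattern to a Unicode string and to an 8-bit string.
It assumes a current Python 3 interpreter, where \code{str} is the
Unicode type and \code{bytes} stands in for the 8-bit string; in 1.6
itself the Unicode literal would carry a \code{u} prefix. The sample
text is invented for illustration.

\begin{verbatim}
import re

# Unicode text (assumed sample data); \u00e9 is e-acute.
unicode_words = re.findall(r"\w+", "caf\u00e9 au lait")

# The same pattern, written as bytes, applied to an 8-bit string.
byte_words = re.findall(rb"\w+", b"cafe au lait")

print(unicode_words)
print(byte_words)
\end{verbatim}

Note that the pattern and the string being searched must be of the
same kind; mixing a Unicode pattern with an 8-bit string is an error
in the interpreter assumed here.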

% ======================================================================
\section{New modules}

A number of new modules were added. We'll simply list them with brief
descriptions; consult the 1.6 documentation for the details of a
particular module.

\begin{itemize}

\item{\module{codecs}, \module{encodings}, \module{unicodedata}:} Added as part of the new Unicode support.

\item{\module{filecmp}:} Supersedes the old \module{cmp} and
\module{dircmp} modules, which are now deprecated.
(Contributed by Gordon MacMillan and Moshe Zadka.)

\item{\module{linuxaudio}:} Support for the \file{/dev/audio} device on Linux,
a twin to the existing \module{sunaudiodev} module.
(Contributed by Peter Bosch.)

\item{\module{mmap}:} An interface to memory-mapped files on both
Windows and Unix. A file's contents can be mapped directly into
memory, at which point it behaves like a mutable string, so its
contents can be read and modified. The mapped object can even be
passed to functions that expect ordinary strings, such as those in
the \module{re} module. (Contributed by Sam Rushing, with some
extensions by A.M. Kuchling.)

\item{\module{PyExpat}:} An interface to the Expat XML parser.
(Contributed by Paul Prescod.)

\item{\module{robotparser}:} Parse a \file{robots.txt} file, which is
useful when writing Web spiders that politely avoid certain areas of a
Web site. The parser accepts the contents of a \file{robots.txt} file,
builds a set of rules from it, and can then answer questions about
the fetchability of a given URL. (Contributed by Skip Montanaro.)

\item{\module{tabnanny}:} A module/script to
check Python source code for ambiguous indentation.
(Contributed by Tim Peters.)

\item{\module{UserString}:} A base class useful for deriving objects that behave like strings.

\item{\module{winreg}:} An interface to the Windows registry.
\module{winreg} has been part of PythonWin since 1995, but has now
been added to the core distribution and enhanced to support Unicode.
(Contributed by Bill Tutt and Mark Hammond.)

\item{\module{zipfile}:} A module for reading and writing ZIP-format
archives. These are archives produced by \program{PKZIP} on
DOS/Windows or \program{zip} on Unix, not to be confused with
\program{gzip}-format files (which are supported by the \module{gzip}
module).
(Contributed by James C. Ahlstrom.)

\end{itemize}
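
To illustrate the \module{mmap} entry above, here is a minimal sketch
that maps a small scratch file, searches it with the \module{re}
module, and modifies it in place. It assumes a current Python 3
interpreter, so the mapped contents are handled as \code{bytes}, and
the exact \module{mmap} call signature in 1.6 may differ; the scratch
file and its contents are invented for illustration.

\begin{verbatim}
import mmap
import os
import re
import tempfile

# Create a small scratch file to map (its contents are sample data).
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"hello world, hello mmap\n")
    # Map the whole file; a length of 0 means "the entire file".
    m = mmap.mmap(fd, 0)
    try:
        # The mapping can be searched by the re module directly.
        matches = re.findall(b"hello", m)
        # Slice assignment modifies the file's contents in place.
        m[0:5] = b"HELLO"
        first_line = m[:].splitlines()[0]
    finally:
        m.close()
finally:
    os.close(fd)
    os.remove(path)

print(matches)
print(first_line)
\end{verbatim}

The replacement slice has to be the same length as the slice it
overwrites, because a mapping cannot grow or shrink through slice
assignment.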

% ======================================================================
\section{IDLE Improvements}

IDLE is the official Python cross-platform IDE, written using Tkinter.
Python 1.6 includes IDLE 0.6, which adds a number of new features and
improvements. A partial list:

\begin{itemize}
\item UI improvements and optimizations,
especially in the area of syntax highlighting and auto-indentation.

\item The class browser now shows more information, such as the
top-level functions in a module (XXX did I interpret that right?).

\item Tab width is now a user-settable option. When opening an existing Python
file, IDLE automatically detects the indentation conventions and adapts.

\item There is now support for calling browsers on various platforms,
used to open the Python documentation in a browser.

\item IDLE now has a command line, which is largely similar to
the vanilla Python interpreter.

\item Call tips were added in many places.

\item IDLE can now be installed as a package.

\item In the editor window, there is now a line/column bar at the bottom.

\item Three new keystroke commands: Check module (Alt-F5), Import
module (F5), and Run script (Ctrl-F5).

\end{itemize}

% ======================================================================
\section{Deleted and Deprecated Modules}

A few modules have been dropped because they're obsolete, or because
there are now better ways to do the same thing. The \module{stdwin}
module is gone; it was for a platform-independent windowing toolkit
that's no longer developed.

A number of modules have been moved to the
\file{lib-old} subdirectory:
\module{cmp}, \module{cmpcache}, \module{dircmp}, \module{dump},
\module{find}, \module{grep}, \module{packmail},
\module{poly}, \module{util}, \module{whatsound}, \module{zmod}.
If you have code which relies on a module that's been moved to
\file{lib-old}, you can simply add that directory to \code{sys.path}
to get it back, but you're encouraged to update any code that uses
these modules.
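
A minimal sketch of that workaround follows. The location of
\file{lib-old} is an assumption here: it is taken to sit inside the
directory that holds the rest of the standard library, so adjust the
path to match your installation.

\begin{verbatim}
import os
import sys

# Assumption: lib-old lives alongside the standard library modules,
# so locate it relative to where the os module was loaded from.
stdlib_dir = os.path.dirname(os.__file__)
lib_old = os.path.join(stdlib_dir, "lib-old")

# Append rather than prepend, so current modules take precedence.
if lib_old not in sys.path:
    sys.path.append(lib_old)
\end{verbatim}

Appending keeps the supported modules first in the search order, so a
moved module is only picked up when nothing newer provides the same
name.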

XXX any others deleted?

XXX Other candidates for deletion in 1.6: sgimodule.c, glmodule.c (and hence
cgenmodule.c), imgfile.c, svmodule.c, flmodule.c, fmmodule.c, almodule.c, clmodule.c,
knee.py.

\end{document}