
 \documentclass{howto}

 % $Id$

 \title{What's New in Python 2.2}
 \release{0.06}
 \author{A.M. Kuchling}
 \authoraddress{\email{akuchlin@mems-exchange.org}}
 \begin{document}
 \maketitle\tableofcontents

 \section{Introduction}

 {\large This document is a draft, and is subject to change until the
 final version of Python 2.2 is released.  Currently it's up to date
 for Python 2.2 alpha 4.  Please send any comments, bug reports, or
 questions, no matter how minor, to \email{akuchlin@mems-exchange.org}.
 }

 This article explains the new features in Python 2.2.

 Python 2.2 can be thought of as the ``cleanup release''.  There are some
 features such as generators and iterators that are completely new, but
 most of the changes, significant and far-reaching though they may be,
 are aimed at cleaning up irregularities and dark corners of the
 language design.

 This article doesn't attempt to provide a complete specification of
 the new features, but instead provides a convenient overview.  For
 full details, you should refer to the documentation for Python 2.2,
 such as the
 \citetitle[http://python.sourceforge.net/devel-docs/lib/lib.html]{Python
 Library Reference} and the
 \citetitle[http://python.sourceforge.net/devel-docs/ref/ref.html]{Python
 Reference Manual}.
 % XXX These \citetitle marks should get the python.org URLs for the final
 % release, just as soon as the docs are published there.
 If you want to understand the complete implementation and design
 rationale for a change, refer to the PEP for a particular new feature.


 The final release of Python 2.2 is planned for October 2001.

 \begin{seealso}

 \seeurl{http://www.unixreview.com/documents/s=1356/urm0109h/0109h.htm}
 {``What's So Special About Python 2.2?'' is also about the new 2.2
 features, and was written by Cameron Laird and Kathryn Soraiz.}

 \end{seealso}


 %======================================================================
 \section{PEP 252: Type and Class Changes}

 XXX I need to read and digest the relevant PEPs.

 \begin{seealso}

 \seepep{252}{Making Types Look More Like Classes}{Written and implemented
 by Guido van Rossum.}

 \seeurl{http://www.python.org/2.2/descrintro.html}{A tutorial
 on the type/class changes in 2.2.}

 \end{seealso}


 %======================================================================
 \section{PEP 234: Iterators}

 A significant addition to 2.2 is an iteration interface at both the C
 and Python levels.  Objects can define how they can be looped over by
 callers.

 In Python versions up to 2.1, the usual way to make \code{for item in
 obj} work is to define a \method{__getitem__()} method that looks
 something like this:

 \begin{verbatim}
     def __getitem__(self, index):
         return <next item>
 \end{verbatim}

 \method{__getitem__()} is more properly used to define an indexing
 operation on an object so that you can write \code{obj[5]} to retrieve
 the sixth element.  It's a bit misleading when you're using this only
 to support \keyword{for} loops.  Consider some file-like object that
 wants to be looped over; the \var{index} parameter is essentially
 meaningless, as the class probably assumes that a series of
 \method{__getitem__()} calls will be made, with \var{index}
 incrementing by one each time.  In other words, the presence of the
 \method{__getitem__()} method doesn't mean that \code{file[5]} will
 work, though it really should.

 In Python 2.2, iteration can be implemented separately, and
 \method{__getitem__()} methods can be limited to classes that really
 do support random access.  The basic idea of iterators is quite
 simple.  A new built-in function, \function{iter(obj)}, returns an
 iterator for the object \var{obj}.  (It can also take two arguments:
 \code{iter(\var{C}, \var{sentinel})} will call the callable \var{C}
 repeatedly until it returns \var{sentinel}, which signals that the
 iterator is done.  This form probably won't be used very often.)
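
 As a rough sketch of the two-argument form, the following loop keeps
 calling a function until it returns the sentinel \code{None}.  The
 \function{take()} function and the \code{pending} list are made up
 for this example.

 \begin{verbatim}
# Hypothetical work queue; None marks the end of the useful items.
pending = ['spam', 'eggs', 'ham', None, 'never reached']

def take():
    # Remove and return the item at the front of the queue.
    return pending.pop(0)

items = []
for item in iter(take, None):
    # The loop stops as soon as take() returns the sentinel.
    items.append(item)

# items == ['spam', 'eggs', 'ham']
 \end{verbatim}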

 Python classes can define an \method{__iter__()} method, which should
 create and return a new iterator for the object; if the object is its
 own iterator, this method can just return \code{self}.  In particular,
 iterators will usually be their own iterators.  Extension types
 implemented in C can implement a \code{tp_iter} function in order to
 return an iterator, and extension types that want to behave as
 iterators can define a \code{tp_iternext} function.
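
 Here is a minimal sketch of a class that supports looping by defining
 \method{__iter__()}.  The \class{Playlist} class is invented for this
 example; it simply hands back an iterator over an internal list, so
 each loop gets its own independent iterator.

 \begin{verbatim}
class Playlist:
    "Hypothetical container class, used only for illustration."
    def __init__(self, songs):
        self.songs = songs
    def __iter__(self):
        # Return a new iterator each time, so that separate loops
        # over the same Playlist don't interfere with each other.
        return iter(self.songs)

p = Playlist(['intro', 'verse', 'chorus'])

first = []
for song in p:
    first.append(song)

second = []
for song in p:
    second.append(song)

# first and second both hold all three songs
 \end{verbatim}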

 So what do iterators do?  They have one required method,
 \method{next()}, which takes no arguments and returns the next value.
 When there are no more values to be returned, calling \method{next()}
 should raise the \exception{StopIteration} exception.

 \begin{verbatim}
 >>> L = [1,2,3]
 >>> i = iter(L)
 >>> print i
 <iterator object at 0x8116870>
 >>> i.next()
 1
 >>> i.next()
 2
 >>> i.next()
 3
 >>> i.next()
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
 StopIteration
 >>>
 \end{verbatim}

 In 2.2, Python's \keyword{for} statement no longer expects a sequence;
 it expects something for which \function{iter()} will return an iterator.
 For backward compatibility, and convenience, an iterator is
 automatically constructed for sequences that don't implement
 \method{__iter__()} or a \code{tp_iter} slot, so \code{for i in
 [1,2,3]} will still work.  Wherever the Python interpreter loops over
 a sequence, it's been changed to use the iterator protocol.  This
 means you can do things like this:

 \begin{verbatim}
 >>> i = iter(L)
 >>> a,b,c = i
 >>> a,b,c
 (1, 2, 3)
 >>>
 \end{verbatim}

 Iterator support has been added to some of Python's basic types.
 Calling \function{iter()} on a dictionary will return an iterator
 which loops over its keys:

 \begin{verbatim}
 >>> m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
 ...      'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
 >>> for key in m: print key, m[key]
 ...
 Mar 3
 Feb 2
 Aug 8
 Sep 9
 May 5
 Jun 6
 Jul 7
 Jan 1
 Apr 4
 Nov 11
 Dec 12
 Oct 10
 >>>
 \end{verbatim}

 That's just the default behaviour.  If you want to iterate over keys,
 values, or key/value pairs, you can explicitly call the
 \method{iterkeys()}, \method{itervalues()}, or \method{iteritems()}
 methods to get an appropriate iterator.  In a minor related change,
 the \keyword{in} operator now works on dictionaries, so
 \code{\var{key} in dict} is now equivalent to
 \code{dict.has_key(\var{key})}.
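
 A short sketch of both dictionary changes, using a made-up
 three-entry dictionary:

 \begin{verbatim}
m = {'Jan': 1, 'Feb': 2, 'Mar': 3}

# Membership tests look at the keys, just as has_key() does.
has_jan = 'Jan' in m
has_dec = 'Dec' in m

# Looping over the dictionary visits each key once, in arbitrary order.
total = 0
for key in m:
    total = total + m[key]

# has_jan is true, has_dec is false, total == 6
 \end{verbatim}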


 Files also provide an iterator, which calls the \method{readline()}
 method until there are no more lines in the file.  This means you can
 now read each line of a file using code like this:

 \begin{verbatim}
 for line in file:
     # do something for each line
 \end{verbatim}

 Note that you can only go forward in an iterator; there's no way to
 get the previous element, reset the iterator, or make a copy of it.
 An iterator object could provide such additional capabilities, but the
 iterator protocol only requires a \method{next()} method.
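
 The following sketch illustrates the forward-only behaviour: once an
 iterator has been consumed, the only way to loop again is to request
 a fresh iterator from the underlying object.

 \begin{verbatim}
L = [1, 2, 3]
i = iter(L)

first_pass = list(i)     # consumes every element
second_pass = list(i)    # nothing is left, and i cannot be rewound

# To loop again, ask the underlying object for a fresh iterator.
third_pass = list(iter(L))

# first_pass == [1, 2, 3], second_pass == [], third_pass == [1, 2, 3]
 \end{verbatim}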

 \begin{seealso}

 \seepep{234}{Iterators}{Written by Ka-Ping Yee and GvR; implemented
 by the Python Labs crew, mostly by GvR and Tim Peters.}

 \end{seealso}


 %======================================================================
 \section{PEP 255: Simple Generators}

 Generators are another new feature, one that interacts with the
 introduction of iterators.

 You're doubtless familiar with how function calls work in Python or
 C.  When you call a function, it gets a private area where its local
 variables are created.  When the function reaches a \keyword{return}
 statement, the local variables are destroyed and the resulting value
 is returned to the caller.  A later call to the same function will get
 a fresh new set of local variables.  But, what if the local variables
 weren't destroyed on exiting a function?  What if you could later
 resume the function where it left off?  This is what generators
 provide; they can be thought of as resumable functions.

 Here's the simplest example of a generator function:

 \begin{verbatim}
 def generate_ints(N):
     for i in range(N):
         yield i
 \end{verbatim}

 A new keyword, \keyword{yield}, was introduced for generators.  Any
 function containing a \keyword{yield} statement is a generator
 function; this is detected by Python's bytecode compiler, which
 compiles the function specially.  Because a new keyword was
 introduced, generators must be explicitly enabled in a module by
 including a \code{from __future__ import generators} statement near
 the top of the module's source code.  In Python 2.3 this statement
 will become unnecessary.

 When you call a generator function, it doesn't return a single value;
 instead it returns a generator object that supports the iterator
 interface.  On executing the \keyword{yield} statement, the generator
 outputs the value of \code{i}, similar to a \keyword{return}
 statement.  The big difference between \keyword{yield} and a
 \keyword{return} statement is that, on reaching a \keyword{yield} the
 generator's state of execution is suspended and local variables are
 preserved.  On the next call to the generator's \code{.next()} method,
 the function will resume executing immediately after the
 \keyword{yield} statement.  (For complicated reasons, the
 \keyword{yield} statement isn't allowed inside the \keyword{try} block
 of a \code{try...finally} statement; read PEP 255 for a full
 explanation of the interaction between \keyword{yield} and
 exceptions.)

 Here's a sample usage of the \function{generate_ints} generator:

 \begin{verbatim}
 >>> gen = generate_ints(3)
 >>> gen
 <generator object at 0x8117f90>
 >>> gen.next()
 0
 >>> gen.next()
 1
 >>> gen.next()
 2
 >>> gen.next()
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
   File "<stdin>", line 2, in generate_ints
 StopIteration
 >>>
 \end{verbatim}

 You could equally write \code{for i in generate_ints(5)}, or
 \code{a,b,c = generate_ints(3)}.

 Inside a generator function, the \keyword{return} statement can only
 be used without a value, and signals the end of the procession of
 values; afterwards the generator cannot return any further values.
 \keyword{return} with a value, such as \code{return 5}, is a syntax
 error inside a generator function.  The end of the generator's results
 can also be indicated by raising \exception{StopIteration} manually,
 or by just letting the flow of execution fall off the bottom of the
 function.
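
 As an illustration, here is a sketch of a generator that uses a bare
 \keyword{return} to stop early.  The \function{until_negative()}
 function is a made-up example, not part of the standard library.

 \begin{verbatim}
# In Python 2.2 the enclosing module must first enable generators
# with the __future__ statement described earlier.

def until_negative(values):
    "Hypothetical helper: yield values up to the first negative one."
    for v in values:
        if v < 0:
            return      # a bare return ends the generator early
        yield v

result = []
for v in until_negative([3, 1, 4, -1, 5]):
    result.append(v)

# result == [3, 1, 4]
 \end{verbatim}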

 You could achieve the effect of generators manually by writing your
 own class and storing all the local variables of the generator as
 instance variables.  For example, returning a list of integers could
 be done by setting \code{self.count} to 0, and having the
 \method{next()} method increment \code{self.count} and return it.
 However, for a moderately complicated generator, writing a
 corresponding class would be much messier.
 \file{Lib/test/test_generators.py} contains a number of more
 interesting examples.  The simplest one implements an in-order
 traversal of a tree using generators recursively.

 \begin{verbatim}
 # A recursive generator that generates Tree leaves in in-order.
 def inorder(t):
     if t:
         for x in inorder(t.left):
             yield x
         yield t.label
         for x in inorder(t.right):
             yield x
 \end{verbatim}

 Two other examples in \file{Lib/test/test_generators.py} produce
 solutions for the N-Queens problem (placing $N$ queens on an $N\times N$
 chess board so that no queen threatens another) and the Knight's Tour
 (a route that takes a knight to every square of an $N\times N$ chessboard
 without visiting any square twice).
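
 To make the comparison with a hand-written class concrete, here is a
 sketch of what \function{generate_ints()} might look like when its
 state is kept in instance variables.  The class name
 \class{IntSequence} is made up for illustration, and
 \method{next()} is called directly to show the protocol at work.

 \begin{verbatim}
class IntSequence:
    "Hand-written stand-in for generate_ints(N); the name is made up."
    def __init__(self, N):
        self.N = N
        self.count = 0      # plays the role of the local variable i
    def __iter__(self):
        return self
    def next(self):
        if self.count >= self.N:
            raise StopIteration
        value = self.count
        self.count = self.count + 1
        return value

seq = IntSequence(3)
result = [seq.next(), seq.next(), seq.next()]

# A further call signals that the values are used up.
try:
    seq.next()
    finished = 0
except StopIteration:
    finished = 1

# result == [0, 1, 2] and finished == 1
 \end{verbatim}

 Even for this trivial case the class needs explicit bookkeeping that
 the generator version gets for free from its suspended local
 variables.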

 The idea of generators comes from other programming languages,
 especially Icon (\url{http://www.cs.arizona.edu/icon/}), where the
 idea of generators is central to the language.  In Icon, every
 expression and function call behaves like a generator.  One example
 from ``An Overview of the Icon Programming Language'' at
 \url{http://www.cs.arizona.edu/icon/docs/ipd266.htm} gives an idea of
 what this looks like:

 \begin{verbatim}
 sentence := "Store it in the neighboring harbor"
 if (i := find("or", sentence)) > 5 then write(i)
 \end{verbatim}

 The \function{find()} function returns the indexes at which the
 substring ``or'' is found: 3, 23, 33.  In the \keyword{if} statement,
 \code{i} is first assigned a value of 3, but 3 is less than 5, so the
 comparison fails, and Icon retries it with the second value of 23.  23
 is greater than 5, so the comparison now succeeds, and the code prints
 the value 23 to the screen.

 Python doesn't go nearly as far as Icon in adopting generators as a
 central concept.  Generators are considered a new part of the core
 Python language, but learning or using them isn't compulsory; if they
 don't solve any problems that you have, feel free to ignore them.
 This is different from Icon where the idea of generators is a basic
 concept.  One novel feature of Python's interface as compared to
 Icon's is that a generator's state is represented as a concrete object
 that can be passed around to other functions or stored in a data
 structure.
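
 The sketch below shows what it means for a generator's state to be a
 concrete object: the same generator is handed to a function twice and
 then stored in a dictionary, and it resumes where it left off each
 time.  The \function{take()} helper is hypothetical.

 \begin{verbatim}
def generate_ints(N):
    for i in range(N):
        yield i

def take(n, iterator):
    "Hypothetical helper: return up to n values pulled from iterator."
    result = []
    if n <= 0:
        return result
    for value in iterator:
        result.append(value)
        if len(result) == n:
            break
    return result

gen = generate_ints(10)
head = take(3, gen)         # [0, 1, 2]
more = take(2, gen)         # [3, 4]; gen remembers where it stopped
jobs = {'numbers': gen}     # the same object can sit in a dictionary
rest = take(100, jobs['numbers'])    # [5, 6, 7, 8, 9]
 \end{verbatim}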

 \begin{seealso}

 \seepep{255}{Simple Generators}{Written by Neil Schemenauer, Tim
 Peters, Magnus Lie Hetland.  Implemented mostly by Neil Schemenauer
 and Tim Peters, with other fixes from the Python Labs crew.}

 \end{seealso}


 %======================================================================
 \section{PEP 237: Unifying Long Integers and Integers}

 In recent versions, the distinction between regular integers, which
 are 32-bit values on most machines, and long integers, which can be of
 arbitrary size, was becoming an annoyance.  For example, on platforms
 that support large files (files larger than \code{2**32} bytes), the
 \method{tell()} method of file objects has to return a long integer.
 However, there were various bits of Python that expected plain
 integers and would raise an error if a long integer was provided
 instead.  For example, in Python 1.5, only regular integers
 could be used as a slice index, and \code{'abc'[1L:]} would raise a
 \exception{TypeError} exception with the message 'slice index must be
 int'.

 Python 2.2 will shift values from short to long integers as required.
 The 'L' suffix is no longer needed to indicate a long integer literal,
 as now the compiler will choose the appropriate type.  (Using the 'L'
 suffix will be discouraged in future 2.x versions of Python,
 triggering a warning in Python 2.4, and probably dropped in Python
 3.0.)  Many operations that used to raise an \exception{OverflowError}
 will now return a long integer as their result.  For example:

 \begin{verbatim}
 >>> 1234567890123
 1234567890123L
 >>> 2 ** 64
 18446744073709551616L
 \end{verbatim}

 In most cases, integers and long integers will now be treated
 identically.  You can still distinguish them with the
 \function{type()} built-in function, but that's rarely needed.  The
 \function{int()} function will now return a long integer if the value
 is large enough.

 \begin{seealso}

 \seepep{237}{Unifying Long Integers and Integers}{Written by
 Moshe Zadka and Guido van Rossum.  Implemented mostly by Guido van Rossum.}

 \end{seealso}


 %======================================================================
 \section{PEP 238: Changing the Division Operator}

 The most controversial change in Python 2.2 is the start of an effort
 to fix an old design flaw that's been in Python from the beginning.
 Currently Python's division operator, \code{/}, behaves like C's
 division operator when presented with two integer arguments.  It
 returns an integer result that's truncated down when there would be
 a fractional part.  For example, \code{3/2} is 1, not 1.5, and
 \code{(-1)/2} is -1, not -0.5.  This means that the results of division
 can vary unexpectedly depending on the type of the two operands, and
 because Python is dynamically typed, it can be difficult to determine
 the possible types of the operands.

 (The controversy is over whether this is \emph{really} a design flaw,
 and whether it's worth breaking existing code to fix this.  It's
 caused endless discussions on python-dev and in July erupted into a
 storm of acidly sarcastic postings on \newsgroup{comp.lang.python}. I
 won't argue for either side here; read PEP 238 for a summary of
 arguments and counter-arguments.)

 Because this change might break code, it's being introduced very
 gradually.  Python 2.2 begins the transition, but the switch won't be
 complete until Python 3.0.

 First, some terminology from PEP 238.  ``True division'' is the
 division that most non-programmers are familiar with: 3/2 is 1.5, 1/4
 is 0.25, and so forth.  ``Floor division'' is what Python's \code{/}
 operator currently does when given integer operands; the result is the
 floor of the value returned by true division.  ``Classic division'' is
 the current mixed behaviour of \code{/}; it returns the result of
 floor division when the operands are integers, and returns the result
 of true division when one of the operands is a floating-point number.

 Here are the changes 2.2 introduces:

 \begin{itemize}

 \item A new operator, \code{//}, is the floor division operator.
 (Yes, we know it looks like \Cpp's comment symbol.)  \code{//}
 \emph{always} returns the floor division no matter what the types of
 its operands are, so \code{1 // 2} is 0 and \code{1.0 // 2.0} is
 0.0.

 \code{//} is always available in Python 2.2; you don't need to enable
 it using a \code{__future__} statement.

 \item By including a \code{from __future__ import division} in a
 module, the \code{/} operator will be changed to return the result of
 true division, so \code{1/2} is 0.5.  Without the \code{__future__}
 statement, \code{/} still means classic division.  The default meaning
 of \code{/} will not change until Python 3.0.

 \item Classes can define methods called \method{__truediv__} and
 \method{__floordiv__} to overload the two division operators.  At the
 C level, there are also slots in the \code{PyNumberMethods} structure
 so extension types can define the two operators.

 % XXX a warning someday?

 \end{itemize}
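
 The following sketch exercises the floor division operator and the
 two new special methods.  The \class{Metres} class is invented for
 the example, and \function{operator.truediv()} is used to request
 true division explicitly, so the result doesn't depend on which
 meaning of \code{/} is in effect for the module.

 \begin{verbatim}
import operator

# Floor division gives the floor of the true quotient for any operand types.
a = 7 // 2          # 3
b = (-7) // 2       # -4, rounding toward negative infinity
c = 7.0 // 2        # 3.0, the result stays a float

class Metres:
    "Hypothetical wrapper around a number, for illustration only."
    def __init__(self, value):
        self.value = value
    def __floordiv__(self, other):
        return Metres(self.value // other)
    def __truediv__(self, other):
        # Convert to float so the quotient keeps its fractional part.
        return Metres(float(self.value) / other)

whole = (Metres(7) // 2).value                  # 3
exact = operator.truediv(Metres(7), 2).value    # 3.5
 \end{verbatim}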

 \begin{seealso}

 \seepep{238}{Changing the Division Operator}{Written by Moshe Zadka and
 Guido van Rossum.  Implemented by Guido van Rossum.}

 \end{seealso}


 %======================================================================
 \section{Unicode Changes}

 Python's Unicode support has been enhanced a bit in 2.2.  Unicode
 strings are usually stored as UCS-2, as 16-bit unsigned integers.
 Python 2.2 can also be compiled to use UCS-4, 32-bit unsigned
 integers, as its internal encoding by supplying
 \longprogramopt{enable-unicode=ucs4} to the configure script.  When
 built to use UCS-4 (a ``wide Python''), the interpreter can natively
 handle Unicode characters from U+000000 to U+10FFFF, so the range of
 legal values for the \function{unichr()} function is expanded
 accordingly.  Using an interpreter compiled to use UCS-2 (a ``narrow
 Python''), values greater than 65535 will still cause
 \function{unichr()} to raise a \exception{ValueError} exception.

 All this is the province of the still-unimplemented PEP 261, ``Support
 for `wide' Unicode characters''; consult it for further details, and
 please offer comments on the PEP and on your experiences with the
 2.2 alpha releases.
 % XXX update previous line once 2.2 reaches beta.

 Another change is much simpler to explain. Since their introduction,
 Unicode strings have supported an \method{encode()} method to convert
 the string to a selected encoding such as UTF-8 or Latin-1.  A
 symmetric \method{decode(\optional{\var{encoding}})} method has been
 added to 8-bit strings (though not to Unicode strings) in 2.2.
 \method{decode()} assumes that the string is in the specified encoding
 and decodes it, returning whatever is returned by the codec.

 Using this new feature, codecs have been added for tasks not directly
 related to Unicode.  For example, codecs have been added for
 uu-encoding, MIME's base64 encoding, and compression with the
 \module{zlib} module:

 \begin{verbatim}
 >>> s = """Here is a lengthy piece of redundant, overly verbose,
 ... and repetitive text.
 ... """
 >>> data = s.encode('zlib')
 >>> data
 'x\x9c\r\xc9\xc1\r\x80 \x10\x04\xc0?Ul...'
 >>> data.decode('zlib')
 'Here is a lengthy piece of redundant, overly verbose,\nand repetitive text.\n'
 >>> print s.encode('uu')
 begin 666 <data>
 M2&5R92!I<R!A(&QE;F=T:'D@<&EE8V4@;V8@<F5D=6YD86YT+"!O=F5R;'D@
 >=F5R8F]S92P*86YD(')E<&5T:71I=F4@=&5X="X*

 end
 >>> "sheesh".encode('rot-13')
 'furrfu'
 \end{verbatim}

 \method{encode()} and \method{decode()} were implemented by
 Marc-Andr\'e Lemburg.  The changes to support using UCS-4 internally
 were implemented by Fredrik Lundh and Martin von L\"owis.

 \begin{seealso}

 \seepep{261}{Support for `wide' Unicode characters}{PEP written by
 Paul Prescod.  Not yet accepted or fully implemented.}

 \end{seealso}

 %======================================================================
 \section{PEP 227: Nested Scopes}

 In Python 2.1, statically nested scopes were added as an optional
 feature, to be enabled by a \code{from __future__ import
 nested_scopes} directive.  In 2.2 nested scopes no longer need to be
 specially enabled, but are always enabled.  The rest of this section
 is a copy of the description of nested scopes from my ``What's New in
 Python 2.1'' document; if you read it when 2.1 came out, you can skip
 the rest of this section.

 The largest change introduced in Python 2.1, and made complete in 2.2,
 is to Python's scoping rules.  In Python 2.0, at any given time there
 are at most three namespaces used to look up variable names: local,
 module-level, and the built-in namespace.  This often surprised people
 because it didn't match their intuitive expectations.  For example, a
 nested recursive function definition doesn't work:

 \begin{verbatim}
 def f():
     ...
     def g(value):
         ...
         return g(value-1) + 1
     ...
 \end{verbatim}

 The function \function{g()} will always raise a \exception{NameError}
 exception, because the binding of the name \samp{g} isn't in either
 its local namespace or in the module-level namespace.  This isn't much
 of a problem in practice (how often do you recursively define interior
 functions like this?), but this also made using the \keyword{lambda}
 expression clumsier, and this was a problem in practice.  In code which
 uses \keyword{lambda} you can often find local variables being copied
 by passing them as the default values of arguments.

 \begin{verbatim}
 def find(self, name):
     "Return list of any entries equal to 'name'"
     L = filter(lambda x, name=name: x == name,
                self.list_attribute)
     return L
 \end{verbatim}

 The readability of Python code written in a strongly functional style
 suffers greatly as a result.

 The most significant change to Python 2.2 is that static scoping has
 been added to the language to fix this problem.  As a first effect,
 the \code{name=name} default argument is now unnecessary in the above
 example.  Put simply, when a given variable name is not assigned a
 value within a function (by an assignment, or the \keyword{def},
 \keyword{class}, or \keyword{import} statements), references to the
 variable will be looked up in the local namespace of the enclosing
 scope.  A more detailed explanation of the rules, and a dissection of
 the implementation, can be found in the PEP.
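
 A small sketch of both effects follows; the function names are made
 up for illustration.  The \keyword{lambda} picks up \code{n} from the
 enclosing function, and the nested function \function{g()} can find
 its own name in order to recurse.

 \begin{verbatim}
def make_adder(n):
    # The lambda refers to n from the enclosing function; no
    # n=n default argument is needed.
    return lambda x: x + n

def count_calls(start):
    # g can now see the name 'g' in the enclosing scope and recurse.
    def g(value):
        if value == 0:
            return 0
        return g(value - 1) + 1
    return g(start)

add3 = make_adder(3)
seven = add3(4)             # 7
steps = count_calls(5)      # 5
 \end{verbatim}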

 This change may cause some compatibility problems for code where the
 same variable name is used both at the module level and as a local
 variable within a function that contains further function definitions.
 This seems rather unlikely though, since such code would have been
 pretty confusing to read in the first place.

 One side effect of the change is that the \code{from \var{module}
 import *} and \keyword{exec} statements have been made illegal inside
 a function scope under certain conditions.  The Python reference
 manual has said all along that \code{from \var{module} import *} is
 only legal at the top level of a module, but the CPython interpreter
 has never enforced this before.  As part of the implementation of
 nested scopes, the compiler which turns Python source into bytecodes
 has to generate different code to access variables in a containing
 scope.  \code{from \var{module} import *} and \keyword{exec} make it
 impossible for the compiler to figure this out, because they add names
 to the local namespace that are unknowable at compile time.
 Therefore, if a function contains function definitions or
 \keyword{lambda} expressions with free variables, the compiler will
 flag this by raising a \exception{SyntaxError} exception.

 To make the preceding explanation a bit clearer, here's an example:

 \begin{verbatim}
 x = 1
 def f():
     # The next line is a syntax error
     exec 'x=2'
     def g():
         return x
 \end{verbatim}

 Line 4 containing the \keyword{exec} statement is a syntax error,
 since \keyword{exec} would define a new local variable named \samp{x}
 whose value should be accessed by \function{g()}.

 This shouldn't be much of a limitation, since \keyword{exec} is rarely
 used in most Python code (and when it is used, it's often a sign of a
 poor design anyway).

 \begin{seealso}

 \seepep{227}{Statically Nested Scopes}{Written and implemented by
 Jeremy Hylton.}

 \end{seealso}


 %======================================================================
 \section{New and Improved Modules}

 \begin{itemize}

   \item The \module{xmlrpclib} module was contributed to the standard
   library by Fredrik Lundh.  It provides support for writing XML-RPC
   clients; XML-RPC is a simple remote procedure call protocol built on
   top of HTTP and XML. For example, the following snippet retrieves a
   list of RSS channels from the O'Reilly Network, and then retrieves a
   list of the recent headlines for one channel:

 \begin{verbatim}
 import xmlrpclib
 s = xmlrpclib.Server(
       'http://www.oreillynet.com/meerkat/xml-rpc/server.php')
 channels = s.meerkat.getChannels()
 # channels is a list of dictionaries, like this:
 # [{'id': 4, 'title': 'Freshmeat Daily News'},
 #  {'id': 190, 'title': '32Bits Online'},
 #  {'id': 4549, 'title': '3DGamers'}, ... ]

 # Get the items for one channel
 items = s.meerkat.getItems( {'channel': 4} )

 # 'items' is another list of dictionaries, like this:
 # [{'link': 'http://freshmeat.net/releases/52719/',
 #   'description': 'A utility which converts HTML to XSL FO.',
 #   'title': 'html2fo 0.3 (Default)'}, ... ]
 \end{verbatim}

 The \module{SimpleXMLRPCServer} module makes it easy to create
 straightforward XML-RPC servers.  See \url{http://www.xmlrpc.com/} for
 more information about XML-RPC.

   \item The new \module{hmac} module implements the HMAC
   algorithm described by \rfc{2104}.

   \item The \module{socket} module can be compiled to support IPv6;
   specify the \longprogramopt{enable-ipv6} option to Python's configure
   script.  (Contributed by Jun-ichiro ``itojun'' Hagino.)

   \item Two new format characters were added to the \module{struct}
   module for 64-bit integers on platforms that support the C
   \ctype{long long} type.  \samp{q} is for a signed 64-bit integer,
   and \samp{Q} is for an unsigned one.  The value is returned in
   Python's long integer type.  (Contributed by Tim Peters.)

   \item In the interpreter's interactive mode, there's a new built-in
   function \function{help()} that uses the \module{pydoc} module
   introduced in Python 2.1 to provide interactive help.
   \code{help(\var{object})} displays any available help text about
   \var{object}.  \code{help()} with no argument puts you in an online
   help utility, where you can enter the names of functions, classes,
   or modules to read their help text.
   (Contributed by Guido van Rossum, using Ka-Ping Yee's \module{pydoc} module.)

   \item Various bugfixes and performance improvements have been made
   to the SRE engine underlying the \module{re} module.  For example,
   \function{re.sub()} will now use \function{string.replace()}
   automatically when the pattern and its replacement are both just
   literal strings without regex metacharacters.  Another contributed
   patch speeds up certain Unicode character ranges by a factor of
   two. (SRE is maintained by Fredrik Lundh.  The BIGCHARSET patch was
   contributed by Martin von L\"owis.)

   \item The \module{smtplib} module now supports \rfc{2487}, ``Secure
   SMTP over TLS'', so it's now possible to encrypt the SMTP traffic
   between a Python program and the mail transport agent being handed a
   message.  (Contributed by Gerhard H\"aring.)

   \item The \module{imaplib} module, maintained by Piers Lauder, has
   support for several new extensions: the NAMESPACE extension defined
   in \rfc{2342}, SORT, GETACL and SETACL.  (Contributed by Anthony
   Baxter and Michel Pelletier.)

   \item The \module{rfc822} module's parsing of email addresses is now
   compliant with \rfc{2822}, an update to \rfc{822}.  (The module's
   name is \emph{not} going to be changed to \samp{rfc2822}.)  A new
   package, \module{email}, has also been added for parsing and
   generating e-mail messages.  (Contributed by Barry Warsaw, and
   arising out of his work on Mailman.)

   \item New constants \constant{ascii_letters},
   \constant{ascii_lowercase}, and \constant{ascii_uppercase} were
   added to the \module{string} module.  There were several modules in
   the standard library that used \constant{string.letters} to mean the
   ranges A-Za-z, but that assumption is incorrect when locales are in
   use, because \constant{string.letters} varies depending on the set
   of legal characters defined by the current locale.  The buggy
   modules have all been fixed to use \constant{ascii_letters} instead.
   (Reported by an unknown person; fixed by Fred L. Drake, Jr.)

   \item The \module{mimetypes} module now makes it easier to use
   alternative MIME-type databases by the addition of a
   \class{MimeTypes} class, which takes a list of filenames to be
   parsed.  (Contributed by Fred L. Drake, Jr.)

   \item A \class{Timer} class was added to the \module{threading}
   module that allows scheduling an activity to happen at some future
   time.  (Contributed by Itamar Shtull-Trauring.)

 \end{itemize}
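
 A brief sketch of three of the additions listed above: the new
 \module{struct} format characters, the \module{string} constants, and
 the \class{Timer} class.  The values and the \function{remember()}
 function are arbitrary choices made for the example.

 \begin{verbatim}
import string
import struct
import threading

# struct: 'q' and 'Q' pack signed and unsigned 64-bit integers.
# The '<' prefix asks for the standard little-endian layout.
big = 2 ** 40
packed = struct.pack('<qQ', -big, big)
unpacked = struct.unpack('<qQ', packed)     # (-big, big)

# string: the ascii_* constants do not change with the locale.
is_ascii_letter = 'x' in string.ascii_letters
digit_is_letter = '7' in string.ascii_letters

# threading: run a function once, a short time from now.
fired = []
def remember():
    fired.append('done')

timer = threading.Timer(0.05, remember)
timer.start()
timer.join()        # wait until the timer's thread has finished
 \end{verbatim}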


 %======================================================================
 \section{Interpreter Changes and Fixes}

 Some of the changes only affect people who deal with the Python
 interpreter at the C level, writing Python extension modules,
 embedding the interpreter, or just hacking on the interpreter itself.
 If you only write Python code, none of the changes described here will
 affect you very much.

 \begin{itemize}

   \item Profiling and tracing functions can now be implemented in C.
   C functions can operate at much higher speeds than Python-based ones
   and should reduce the overhead of enabling profiling and tracing,
   so this will be of interest to authors of development environments
   for Python.  Two new C functions were added to Python's API,
   \cfunction{PyEval_SetProfile()} and \cfunction{PyEval_SetTrace()}.
   The existing \function{sys.setprofile()} and
   \function{sys.settrace()} functions still exist, and have simply
   been changed to use the new C-level interface.  (Contributed by Fred
   L. Drake, Jr.)

   \item Another low-level API, primarily of interest to implementors
   of Python debuggers and development tools, was added.
   \cfunction{PyInterpreterState_Head()} and
   \cfunction{PyInterpreterState_Next()} let a caller walk through all
   the existing interpreter objects;
   \cfunction{PyInterpreterState_ThreadHead()} and
   \cfunction{PyThreadState_Next()} allow looping over all the thread
   states for a given interpreter.  (Contributed by David Beazley.)

   \item A new \samp{et} format sequence was added to
   \cfunction{PyArg_ParseTuple()}; \samp{et} takes both a parameter and
   an encoding name, and converts the parameter to the given encoding
   if the parameter turns out to be a Unicode string, or leaves it
   alone if it's an 8-bit string, assuming it to already be in the
   desired encoding.  This differs from the \samp{es} format character,
   which assumes that 8-bit strings are in Python's default ASCII
   encoding and converts them to the specified new encoding.
   (Contributed by M.-A. Lemburg.)

   \item Two new flags \constant{METH_NOARGS} and \constant{METH_O} are
    available in method definition tables to simplify implementation of
    methods with no arguments or a single untyped argument. Calling
    such methods is more efficient than calling a corresponding method
    that uses \constant{METH_VARARGS}.
    Also, the old \constant{METH_OLDARGS} style of writing C methods is
    now officially deprecated.

 \item
    Two new wrapper functions, \cfunction{PyOS_snprintf()} and
    \cfunction{PyOS_vsnprintf()}, were added.  They provide
    cross-platform implementations of the relatively new
    \cfunction{snprintf()} and \cfunction{vsnprintf()} C library APIs.  In
    contrast to the standard \cfunction{sprintf()} and
    \cfunction{vsprintf()} functions, the Python versions check the
    bounds of the buffer used to protect against buffer overruns.
    (Contributed by M.-A. Lemburg.)

 \end{itemize}
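
 Code that uses the Python-level hooks does not need to change.  As a
 rough sketch, a profile function installed with
 \function{sys.setprofile()} receives a frame, an event name, and an
 event-specific argument.  The names \function{profiler()},
 \function{helper()}, and \function{work()} below are made up for this
 illustration.

\begin{verbatim}
import sys

calls = []

def profiler(frame, event, arg):
    # Record the name of every Python function that is entered.
    if event == "call":
        calls.append(frame.f_code.co_name)

def helper():
    return 1

def work():
    return helper() + helper()

sys.setprofile(profiler)
try:
    work()
finally:
    # Always remove the hook so later code runs unprofiled.
    sys.setprofile(None)
\end{verbatim}

 After this runs, \code{calls} holds the names of the Python functions
 entered while the hook was active.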


 %======================================================================
 \section{Other Changes and Fixes}

 % XXX update the patch and bug figures as we go
 As usual there were a bunch of other improvements and bugfixes
 scattered throughout the source tree.  A search through the CVS change
 logs finds there were 119 patches applied, and 179 bugs fixed; both
 figures are likely to be underestimates.  Some of the more notable
 changes are:

 \begin{itemize}

   \item The code for the MacOS port for Python, maintained by Jack
   Jansen, is now kept in the main Python CVS tree, and many changes
   have been made to support MacOS X.

 The most significant change is the ability to build Python as a
 framework, enabled by supplying the \longprogramopt{enable-framework}
 option to the configure script when compiling Python.  According to
 Jack Jansen, ``This installs a self-contained Python installation plus
 the OSX framework `glue' into
 \file{/Library/Frameworks/Python.framework} (or another location of
 choice).  For now there is little immediate added benefit to this
 (actually, there is the disadvantage that you have to change your PATH
 to be able to find Python), but it is the basis for creating a
 full-blown Python application, porting the MacPython IDE, possibly
 using Python as a standard OSA scripting language and much more.''

 Most of the MacPython toolbox modules, which interface to MacOS APIs
 such as windowing, QuickTime, scripting, etc., have been ported to OS
 X, but they've been left commented out in \file{setup.py}.  People who want
 to experiment with these modules can uncomment them manually.

 % Jack's original comments:
 %The main change is the possibility to build Python as a
 %framework. This installs a self-contained Python installation plus the
 %OSX framework "glue" into /Library/Frameworks/Python.framework (or
 %another location of choice). For now there is little immedeate added
 %benefit to this (actually, there is the disadvantage that you have to
 %change your PATH to be able to find Python), but it is the basis for
 %creating a fullblown Python application, porting the MacPython IDE,
 %possibly using Python as a standard OSA scripting language and much
 %more. You enable this with "configure --enable-framework".

 %The other change is that most MacPython toolbox modules, which
 %interface to all the MacOS APIs such as windowing, quicktime,
 %scripting, etc. have been ported. Again, most of these are not of
 %immedeate use, as they need a full application to be really useful, so
 %they have been commented out in setup.py. People wanting to experiment
 %can uncomment them. Gestalt and Internet Config modules are enabled by
 %default.


   \item Keyword arguments passed to built-in functions that don't take them
   now cause a \exception{TypeError} exception to be raised, with the
   message ``\var{function} takes no keyword arguments''.

   \item A new script, \file{Tools/scripts/cleanfuture.py} by Tim
   Peters, automatically removes obsolete \code{__future__} statements
   from Python source code.

   \item The new license introduced with Python 1.6 wasn't
   GPL-compatible.  This is fixed by some minor textual changes to the
   2.2 license, so Python can now be embedded inside a GPLed program
   again.  The license changes were also applied to the Python 2.0.1
   and 2.1.1 releases.

   \item When presented with a Unicode filename on Windows, Python will
   now convert it to an MBCS encoded string, as used by the Microsoft
   file APIs.  As MBCS is explicitly used by the file APIs, Python's
   choice of ASCII as the default encoding turns out to be an
   annoyance.
   (Contributed by Mark Hammond with assistance from Marc-Andr\'e
   Lemburg.)

   \item Large file support is now enabled on Windows.  (Contributed by
   Tim Peters.)

   \item The \file{Tools/scripts/ftpmirror.py} script
   now parses a \file{.netrc} file, if you have one.
   (Contributed by Mike Romberg.)

   \item Some features of the object returned by the
   \function{xrange()} function are now deprecated, and trigger
   warnings when they're accessed; they'll disappear in Python 2.3.
   \class{xrange} objects tried to pretend they were full sequence
   types by supporting slicing, sequence multiplication, and the
   \keyword{in} operator, but these features were rarely used and
   therefore buggy.  The \method{tolist()} method and the
   \member{start}, \member{stop}, and \member{step} attributes are also
   being deprecated.  At the C level, the fourth argument to the
   \cfunction{PyRange_New()} function, \samp{repeat}, has also been
   deprecated.

   \item There were a bunch of patches to the dictionary
   implementation, mostly to fix potential core dumps if a dictionary
   contained objects that sneakily changed their hash value, or mutated
   the dictionary they were contained in.  For a while python-dev fell
   into a gentle rhythm of Michael Hudson finding a case that dumped
   core, Tim Peters fixing it, Michael finding another case, and round
   and round it went.

   \item On Windows, Python can now be compiled with Borland C thanks
   to a number of patches contributed by Stephen Hansen, though the
   result isn't fully functional yet.  (But this \emph{is} progress...)

   \item Another Windows enhancement: Wise Solutions generously offered
   PythonLabs use of their InstallerMaster 8.1 system.  Earlier
   PythonLabs Windows installers used Wise 5.0a, which was beginning to
   show its age.  (Packaged up by Tim Peters.)

   \item Files ending in \samp{.pyw} can now be imported on Windows.
   \samp{.pyw} is a Windows-only thing, used to indicate that a script
   needs to be run using PYTHONW.EXE instead of PYTHON.EXE in order to
   prevent a DOS console from popping up to display the output.  This
   patch makes it possible to import such scripts, in case they're also
   usable as modules.  (Implemented by David Bolen.)

   \item On platforms where Python uses the C \cfunction{dlopen()} function
   to load extension modules, it's now possible to set the flags used
   by \cfunction{dlopen()} using the \function{sys.getdlopenflags()} and
   \function{sys.setdlopenflags()} functions.    (Contributed by Bram Stolk.)

   \item The \function{pow()} built-in function no longer supports 3
   arguments when floating-point numbers are supplied.
   \code{pow(\var{x}, \var{y}, \var{z})} returns \code{(x**y) \% z}, but
   this is never useful for floating point numbers, and the final
   result varies unpredictably depending on the platform.  A call such
   as \code{pow(2.0, 8.0, 7.0)} will now raise a \exception{TypeError}
   exception.

 \end{itemize}
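
 The \function{pow()} change described above can be checked with a
 short sketch.  The variable names \code{int_result} and
 \code{outcome} are chosen for this illustration only.

\begin{verbatim}
# With integers, three-argument pow() computes (x**y) % z.
int_result = pow(2, 8, 7)

# With floating-point arguments, the three-argument form is rejected.
try:
    pow(2.0, 8.0, 7.0)
    outcome = "no error"
except TypeError:
    outcome = "TypeError"
\end{verbatim}

 Here \code{int_result} is 4, since 256 modulo 7 is 4, and
 \code{outcome} records that the floating-point call raised
 \exception{TypeError}.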


 %======================================================================
 \section{Acknowledgements}

 The author would like to thank the following people for offering
 suggestions and corrections to various drafts of this article: Fred
 Bremmer, Keith Briggs, Fred L. Drake, Jr., Carel Fellinger, Mark
 Hammond, Stephen Hansen, Jack Jansen, Marc-Andr\'e Lemburg, Tim Peters, Neil
 Schemenauer, and Guido van Rossum.

 \end{document}