% Revision 25bfd0e (Andrew M. Kuchling, 2000-05-27 11:28:26 +0000)
\documentclass{howto}

\title{What's New in Python 1.6}
\release{0.01}
\author{A.M. Kuchling}
\authoraddress{\email{amk1@bigfoot.com}}
\begin{document}
\maketitle\tableofcontents

\section{Introduction}

Python 1.6, the next release of Python, is expected some time this
summer. Alpha versions are already available from
\url{http://www.python.org/1.6/}. This article describes the
exciting new features in 1.6, highlights some other useful changes, and
points out a few incompatible changes that may require rewriting code.

Python's development never ceases, and a steady flow of bug fixes and
improvements is always being submitted. A host of minor bug fixes, a
few optimizations, additional docstrings, and better error messages
went into 1.6; listing them all would be impossible, but they're
certainly significant. Consult the publicly available CVS logs if you
want to see the full list.

% ======================================================================
\section{Unicode}

XXX

Unicode support: Unicode strings are written as \code{u"string"}, and
there is support for arbitrary encoders and decoders.

A \code{-U} command-line option has been added. With this option
enabled, the Python compiler interprets all \code{"..."} strings as
\code{u"..."} (and likewise treats \code{r"..."} as \code{ur"..."}).
(Is this just for experimenting?)

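As a rough sketch of how a Unicode literal and an encoder fit together,
the fragment below encodes a one-character Unicode string as UTF-8.
The variable names are illustrative, and the sketch assumes that a
\code{'utf-8'} codec is reachable through the string's
\method{encode()} method.

\begin{verbatim}
# Hypothetical names; assumes a 'utf-8' codec is registered.
euro = u'\u20ac'                 # a one-character Unicode string
encoded = euro.encode('utf-8')   # the encoded form, one item per byte
length = len(encoded)            # UTF-8 needs three bytes for U+20AC
\end{verbatim}
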
% ======================================================================
\section{Distribution Utilities}

XXX

% ======================================================================
\section{String Methods}

% ======================================================================
\section{Porting to 1.6}

New Python releases try hard to be compatible with previous releases,
and the record has been pretty good. However, some changes are
considered useful enough (often fixing design decisions that were
initially bad) that breaking backward compatibility in subtle ways
can't always be avoided. This section lists the changes in Python 1.6
that may cause old Python code to break.

The change that will probably break the most code is the tightening up
of the arguments accepted by some methods. Some methods would take
multiple arguments and treat them as a tuple, particularly various
list methods such as \method{append()}, \method{insert()},
\method{remove()}, and \method{count()}.
%
% XXX did anyone ever call the last 2 methods with multiple args?
%
In earlier versions of Python, if \code{L} is a list,
\code{L.append(1, 2)} appends the tuple \code{(1, 2)} to the list. In
Python 1.6 this raises a \exception{TypeError} exception with the
message ``append requires exactly 1 argument; 2 given''. The fix is
simply to add an extra set of parentheses to pass both values as a
tuple: \code{L.append((1, 2))}.

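A minimal sketch of the fix follows. The list name \code{L} comes
from the text above; \code{rejected} is a name made up for this
example, and the check relies only on a \exception{TypeError} being
raised, not on the exact wording of the message.

\begin{verbatim}
L = []
L.append((1, 2))        # one argument: the tuple (1, 2)

# The multi-argument form is the one that 1.6 rejects.
rejected = 0
try:
    L.append(1, 2)
except TypeError:
    rejected = 1
\end{verbatim}
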
The earlier versions of these methods were more forgiving because they
used an old function in Python's C interface to parse their arguments;
1.6 modernizes them to use \function{PyArg_ParseTuple}, the current
argument parsing function, which provides more helpful error messages
and treats multi-argument calls as errors. If you absolutely must use
1.6 but can't fix your code, you can edit \file{Objects/listobject.c}
and define the preprocessor symbol \code{NO_STRICT_LIST_APPEND} to
preserve the old behaviour; this isn't recommended.

Some of the functions in the \module{socket} module are still
forgiving in this way. For example,
\function{socket.connect(('hostname', 25))} is the correct form,
passing a tuple representing an IP address and port, but
\function{socket.connect('hostname', 25)} also works.
\function{socket.connect_ex()} and \function{socket.bind()} are
similarly easy-going. 1.6alpha1 tightened these functions up, but
because the documentation actually used the erroneous multiple-argument
form, many people had written code that would break. So for the
\module{socket} module, the documentation was fixed and the
multiple-argument form is simply marked as deprecated; it'll be
removed in a future Python version.

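The single-tuple form can be sketched as follows. This is only an
illustration: the loopback address and port 0 (which asks the
operating system to pick a free port) are choices made for this
example so that it doesn't depend on a remote host, and \code{s},
\code{address}, and \code{port} are made-up names.

\begin{verbatim}
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
address = ('127.0.0.1', 0)    # (host, port) packed into one tuple
s.bind(address)               # correct form: a single tuple argument
port = s.getsockname()[1]     # the port the system actually assigned
s.close()
\end{verbatim}
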
Some work has been done to make integers and long integers a bit more
interchangeable. In 1.5.2, large-file support was added for Solaris,
to allow reading files larger than 2GB; this made the \method{tell()}
method of file objects return a long integer instead of a regular
integer. Some code would subtract two file offsets and attempt to use
the result to multiply a sequence or slice a string, but this raised a
\exception{TypeError}. In 1.6, long integers can be used to multiply
or slice a sequence, and they behave as you'd intuitively expect:
\code{3L * 'abc'} produces \code{'abcabcabc'}, and
\code{(0,1,2,3)[2L:4L]} produces \code{(2, 3)}. Long integers can also be
used in various new places where previously only integers were
accepted, such as in the \method{seek()} method of file objects.

The subtlest long integer change of all is that the \function{str()}
of a long integer no longer has a trailing 'L' character, though
\function{repr()} still includes it. The 'L' annoyed many people who
wanted to print long integers that looked just like regular integers,
since they had to go out of their way to chop off the character. This
is no longer a problem in 1.6, but code which assumes the 'L' is
there and does \code{str(longval)[:-1]} will now lose the final
digit.

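Code that has to run under both old and new interpreters can check for
the suffix rather than assume it is there. The helper below is a
sketch of that idea; \function{long_digits()} is a hypothetical name,
not a library function.

\begin{verbatim}
def long_digits(longval):
    # Hypothetical helper: return the decimal digits of an integer,
    # dropping a trailing 'L' only if one is actually present.
    s = repr(longval)
    if s[-1:] == 'L':
        s = s[:-1]
    return s
\end{verbatim}

Because the helper tests the last character instead of always
slicing it off, it never discards a digit.
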
Taking the \function{repr()} of a float now uses a different
formatting precision than \function{str()}. \function{repr()} uses
the ``\%.17g'' format string for C's \function{sprintf()}, while
\function{str()} uses ``\%.12g'' as before. The effect is that
\function{repr()} may occasionally show more decimal places than
\function{str()}, for numbers that can't be represented exactly in
binary floating point.
% Revision 5b8311e (Andrew M. Kuchling, 2000-05-31 03:28:42 +0000)

XXX need example value here to demonstrate problem.

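One way to see the difference without depending on a particular
interpreter's \function{repr()} is to apply the two format strings
directly with the \code{\%} operator. The value 0.1 is chosen here
only as an illustration; it has no exact binary representation, so the
extra digits expose the rounding error.

\begin{verbatim}
value = 0.1
short_form = '%.12g' % value    # precision used by str()
long_form = '%.17g' % value     # precision used by repr() in 1.6
# short_form is '0.1'
# long_form is '0.10000000000000001'
\end{verbatim}
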
% ======================================================================
\section{Core Changes}

Various minor changes have been made to Python's syntax and built-in
functions. None of the changes are very far-reaching, but they're
handy conveniences.

A change to syntax makes it more convenient to call a given function
with a tuple of arguments and/or a dictionary of keyword arguments.
In Python 1.5 and earlier, you do this with the \builtin{apply()}
built-in function: \code{apply(f, \var{args}, \var{kw})} calls the
function \function{f()} with the argument tuple \var{args} and the
keyword arguments in the dictionary \var{kw}. Thanks to a patch from
Greg Ewing, 1.6 adds \code{f(*\var{args}, **\var{kw})} as a shorter
and clearer way to achieve the same effect. This syntax is
symmetrical with the syntax for defining functions:

\begin{verbatim}
def f(*args, **kw):
    # args is a tuple of positional args,
    # kw is a dictionary of keyword args
    ...
\end{verbatim}

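The calling side can be sketched like this; \function{describe()} and
the sample values are invented for the example.

\begin{verbatim}
def describe(*args, **kw):
    # Return what was received, so the call can be inspected.
    return args, kw

args = (1, 2)
kw = {'sep': '-'}

# 1.6 call syntax; in 1.5 this was written apply(describe, args, kw).
received = describe(*args, **kw)
# received is ((1, 2), {'sep': '-'})
\end{verbatim}
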
A new format style is available when using the \operator{\%} operator.
'\%r' inserts the \function{repr()} of its argument. This was also
added for the sake of symmetry, this time with the existing '\%s'
format style, which inserts the \function{str()} of its argument. For
example, \code{'\%r \%s' \% ('abc', 'abc')} returns a string containing
\verb|'abc' abc|.

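A slightly longer sketch shows where the two styles differ and where
they agree; the variable names are arbitrary.

\begin{verbatim}
text = 'abc'
with_repr = '%r' % text     # repr() keeps the quotes: "'abc'"
with_str = '%s' % text      # str() gives the bare text: 'abc'

# For a small integer the two styles produce the same result.
same = ('%r' % 42) == ('%s' % 42)
\end{verbatim}
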
The \builtin{int()} and \builtin{long()} functions now accept an
optional ``base'' parameter when the first argument is a string.
\code{int('123', 10)} returns 123, while \code{int('123', 16)} returns
291. \code{int(123, 16)} raises a \exception{TypeError} exception
with the message ``can't convert non-string with explicit base''.

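These cases can be checked directly. The sketch below uses made-up
variable names and relies only on the exception type, since the exact
message text may vary.

\begin{verbatim}
decimal_value = int('123', 10)    # 123
hex_value = int('123', 16)        # 1*256 + 2*16 + 3 = 291

# A non-string first argument can't be combined with a base.
refused = 0
try:
    int(123, 16)
except TypeError:
    refused = 1
\end{verbatim}
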
Previously there was no way for a class to override Python's built-in
\operator{in} operator and supply a custom implementation.
\code{\var{obj} in \var{seq}} returns true if \var{obj} is
present in the sequence \var{seq}; Python computes this by simply
trying every index of the sequence until either \var{obj} is found or
an \exception{IndexError} is encountered. Moshe Zadka contributed a
patch which adds a \method{__contains__} magic method for providing a
custom implementation for \operator{in}.

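As an illustration, a class can now answer membership tests without
storing or scanning any elements. \class{EvenNumbers} is a
hypothetical class written for this example.

\begin{verbatim}
class EvenNumbers:
    # Conceptually holds every even integer; nothing is stored.
    def __contains__(self, obj):
        return obj % 2 == 0

evens = EvenNumbers()
has_four = 4 in evens       # calls evens.__contains__(4)
has_seven = 7 in evens      # calls evens.__contains__(7)
\end{verbatim}
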
Earlier versions of Python used a recursive algorithm for deleting
objects. Deeply nested data structures could cause the interpreter to
fill up the C stack and crash; Christian Tismer rewrote the deletion
logic to fix this problem. On a related note, comparing recursive
objects recursed infinitely and crashed; Jeremy Hylton rewrote the
code to no longer crash, producing a useful result instead. For
example, after this code:

\begin{verbatim}
a = []
b = []
a.append(a)
b.append(b)
\end{verbatim}

the comparison \code{a==b} returns true, because the two recursive
data structures are isomorphic.\footnote{See the thread ``trashcan
and PR\#7'' in the April 2000 archives of the python-dev mailing list
for the discussion leading up to this implementation, and some useful
relevant links.
%http://www.python.org/pipermail/python-dev/2000-April/004834.html
}

Work has been done on porting Python to 64-bit Windows on the Itanium
processor, mostly by Trent Mick of ActiveState. (Confusingly, for
complicated reasons \code{sys.platform} is still \code{'win32'} on
Win64.) PythonWin also supports Windows CE; see the Python CE page at
\url{http://www.python.net/crew/mhammond/ce/} for more information.

XXX UnboundLocalError is raised when a local variable is undefined

A new variable holding more detailed version information has been
added to the \module{sys} module. \code{sys.version_info} is a tuple
\code{(\var{major}, \var{minor}, \var{micro}, \var{level},
\var{serial})}. For example, in 1.6a2 \code{sys.version_info} is
\code{(1, 6, 0, 'alpha', 2)}. \var{level} is a string such as
\code{'alpha'}, \code{'beta'}, or \code{''} for a final release.

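Because the value is a tuple, version checks become ordinary tuple
comparisons. In this sketch the threshold \code{(1, 6)} and the
variable names are illustrative.

\begin{verbatim}
import sys

# Compare only the major and minor fields.
major_minor = sys.version_info[:2]
at_least_1_6 = major_minor >= (1, 6)

# The release level is the fourth field.
level = sys.version_info[3]
\end{verbatim}
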
% ======================================================================
\section{Extending/Embedding Changes}

Some of the changes are under the covers, and will only be apparent to
people writing C extension modules, or embedding a Python interpreter
in a larger application. If you aren't dealing with Python's C API,
you can safely skip this section.

Users of Jim Fulton's ExtensionClass module will be pleased to find
out that hooks have been added so that ExtensionClasses are now
supported by \function{isinstance()} and \function{issubclass()}.
This means you no longer have to remember to write code such as
\code{if type(obj) == myExtensionClass}, but can use the more natural
\code{if isinstance(obj, myExtensionClass)}.

The \file{Python/importdl.c} file, which was a mass of \code{\#ifdef}s to
support dynamic loading on many different platforms, was cleaned up
and reorganized by Greg Stein. \file{importdl.c} is now quite small,
and platform-specific code has been moved into a bunch of
\file{Python/dynload_*.c} files.

Vladimir Marangozov's long-awaited malloc restructuring was completed,
to make it easy to have the Python interpreter use a custom allocator
instead of C's standard \function{malloc()}. For documentation, read
the comments in \file{Include/mymalloc.h} and
\file{Include/objimpl.h}. For the lengthy discussions during which
the interface was hammered out, see the Web archives of the 'patches'
and 'python-dev' lists at python.org.

Recent versions of the GUSI % XXX what is GUSI?
development environment for MacOS support POSIX threads. Therefore,
POSIX threads are now supported on the Macintosh too. Threading
support using the user-space GNU pth library was also contributed.

Threading support on Windows was enhanced, too. Windows supports
thread locks that use kernel objects only in case of contention; in
the common case when there's no contention, they use simpler functions
which are an order of magnitude faster. A threaded version of Python
1.5.2 on NT is twice as slow as an unthreaded version; with the 1.6
changes, the difference is only 10\%. These improvements were
contributed by Yakov Markovitch.

% ======================================================================
\section{Module Changes}

\begin{itemize}
\item \module{re}: changed to be a frontend to \module{sre}.
\item \module{readline}, \module{ConfigParser}, \module{cgi},
\module{calendar}, \module{posix}, \module{xmllib}, \module{aifc},
\module{chunk}, \module{wave}, \module{random}, \module{shelve},
\module{nntplib}: minor enhancements.
\item \module{socket}, \module{httplib}, \module{urllib}: optional
OpenSSL support.
\item \module{_tkinter}: support for Tcl/Tk 8.1, 8.2, and 8.3 (support
for versions older than 8.0 has been dropped). Supports Unicode
(\file{Lib/lib-tk/Tkinter.py} has a test).
\item \module{curses}: changed to use ncurses.
\end{itemize}

% ======================================================================
\section{New Modules}

\begin{itemize}
\item \module{winreg}: Windows registry interface.
\item Distutils: tools for distributing Python modules.
\item PyExpat: interface to the Expat XML parser.
\item \module{robotparser}: parse a \file{robots.txt} file (for
writing web spiders).
\item \module{linuxaudio}: audio for Linux.
\item \module{mmap}: treat a file as a memory buffer.
\item \module{filecmp}: supersedes the old \file{cmp.py} and
\file{dircmp.py} modules.
\item \module{tabnanny}: check Python sources for tab-width dependence.
\item \module{sre}: regular expressions (fast, supports Unicode).
\item \module{unicode}: support for Unicode.
\item \module{codecs}: support for Unicode encoders/decoders.
\end{itemize}

% ======================================================================
\section{IDLE Improvements}

XXX IDLE -- complete overhaul; what are the changes?

% ======================================================================
\section{Deleted and Deprecated Modules}

stdwin

\end{document}