\documentclass{howto}

% $Id$

\title{What's New in Python 2.0}
\release{1.02}
\author{A.M. Kuchling and Moshe Zadka}
\authoraddress{\email{amk@amk.ca}, \email{moshez@twistedmatrix.com} }
\begin{document}
\maketitle\tableofcontents

\section{Introduction}

Python 2.0 was released on October 16, 2000. This
article covers the exciting new features in 2.0, highlights some other
useful changes, and points out a few incompatible changes that may require
rewriting code.

Python's development never completely stops between releases, and a
steady flow of bug fixes and improvements is always being submitted.
A host of minor fixes, a few optimizations, additional docstrings, and
better error messages went into 2.0; to list them all would be
impossible, but they're certainly significant. Consult the
publicly-available CVS logs if you want to see the full list. This
progress is due to the five developers working for
PythonLabs now getting paid to spend their days fixing bugs,
and also due to the improved communication resulting
from moving to SourceForge.

% ======================================================================
\section{What About Python 1.6?}

Python 1.6 can be thought of as the Contractual Obligations Python
release. After the core development team left CNRI in May 2000, CNRI
requested that a 1.6 release be created, containing all the work on
Python that had been performed at CNRI. Python 1.6 therefore
represents the state of the CVS tree as of May 2000, with the most
significant new feature being Unicode support. Development continued
after May, of course, so the 1.6 tree received a few fixes to ensure
that it's forward-compatible with Python 2.0. 1.6 is therefore part
of Python's evolution, and not a side branch.

So, should you take much interest in Python 1.6? Probably not. The
1.6final and 2.0beta1 releases were made on the same day (September 5,
2000), the plan being to finalize Python 2.0 within a month or so. If
you have applications to maintain, there seems little point in
breaking things by moving to 1.6, fixing them, and then having another
round of breakage within a month by moving to 2.0; you're better off
just going straight to 2.0. Most of the really interesting features
described in this document are only in 2.0, because a lot of work was
done between May and September.

% ======================================================================
\section{New Development Process}

The most important change in Python 2.0 may not be to the code at all,
but to how Python is developed: in May 2000 the Python developers
began using the tools made available by SourceForge for storing
source code, tracking bug reports, and managing the queue of patch
submissions. To report bugs or submit patches for Python 2.0, use the
bug tracking and patch manager tools available from Python's project
page, located at \url{http://sourceforge.net/projects/python/}.

The most important of the services now hosted at SourceForge is the
Python CVS tree, the version-controlled repository containing the
source code for Python. Previously, there were roughly 7 people
who had write access to the CVS tree, and all patches had to be
inspected and checked in by one of the people on this short list.
Obviously, this wasn't very scalable. By moving the CVS tree to
SourceForge, it became possible to grant write access to more people;
as of September 2000 there were 27 people able to check in changes, a
fourfold increase. This makes possible large-scale changes that
wouldn't be attempted if they'd have to be filtered through the small
group of core developers. For example, one day Peter Schneider-Kamp
took it into his head to drop K\&R C compatibility and convert the C
source for Python to ANSI C. After getting approval on the python-dev
mailing list, he launched into a flurry of checkins that lasted about
a week, other developers joined in to help, and the job was done. If
there had been only 5 people with write access, that task would
probably have been viewed as ``nice, but not worth the time and effort
needed'' and it would never have gotten done.

The shift to using SourceForge's services has resulted in a remarkable
increase in the speed of development. Patches now get submitted,
commented on, revised by people other than the original submitter, and
bounced back and forth between people until the patch is deemed worth
checking in. Bugs are tracked in one central location and can be
assigned to a specific person for fixing, and we can count the number
of open bugs to measure progress. This didn't come without a cost:
developers now have more e-mail to deal with, more mailing lists to
follow, and special tools had to be written for the new environment.
For example, SourceForge sends default patch and bug notification
e-mail messages that are completely unhelpful, so Ka-Ping Yee wrote an
HTML screen-scraper that sends more useful messages.

The ease of adding code caused a few initial growing pains, such as
code being checked in before it was ready or without getting clear
agreement from the developer group. The approval process that has
emerged is somewhat similar to that used by the Apache group.
Developers can vote +1, +0, -0, or -1 on a patch; +1 and -1 denote
acceptance or rejection, while +0 and -0 mean the developer is mostly
indifferent to the change, though with a slight positive or negative
slant. The most significant change from the Apache model is that the
voting is essentially advisory, letting Guido van Rossum, who has
Benevolent Dictator For Life status, know what the general opinion is.
He can still ignore the result of a vote, and approve or
reject a change even if the community disagrees with him.

Producing an actual patch is the last step in adding a new feature,
and is usually easy compared to the earlier task of coming up with a
good design. Discussions of new features can often explode into
lengthy mailing list threads, making the discussion hard to follow,
and no one can read every posting to python-dev. Therefore, a
relatively formal process has been set up to write Python Enhancement
Proposals (PEPs), modelled on the Internet RFC process. PEPs are
draft documents that describe a proposed new feature, and are
continually revised until the community reaches a consensus, either
accepting or rejecting the proposal. Quoting from the introduction to
PEP 1, ``PEP Purpose and Guidelines'':

\begin{quotation}
PEP stands for Python Enhancement Proposal. A PEP is a design
document providing information to the Python community, or
describing a new feature for Python. The PEP should provide a
concise technical specification of the feature and a rationale for
the feature.

We intend PEPs to be the primary mechanisms for proposing new
features, for collecting community input on an issue, and for
documenting the design decisions that have gone into Python. The
PEP author is responsible for building consensus within the
community and documenting dissenting opinions.
\end{quotation}

Read the rest of PEP 1 for the details of the PEP editorial process,
style, and format. PEPs are kept in the Python CVS tree on
SourceForge, though they're not part of the Python 2.0 distribution,
and are also available in HTML form from
\url{http://www.python.org/peps/}. As of September 2000,
there are 25 PEPs, ranging from PEP 201, ``Lockstep Iteration'', to
PEP 225, ``Elementwise/Objectwise Operators''.

% ======================================================================
\section{Unicode}

The largest new feature in Python 2.0 is a new fundamental data type:
Unicode strings. Unicode uses 16-bit numbers to represent characters
instead of the 8-bit number used by ASCII, meaning that 65,536
distinct characters can be supported.

The final interface for Unicode support was arrived at through
countless often-stormy discussions on the python-dev mailing list, and
mostly implemented by Marc-Andr\'e Lemburg, based on a Unicode string
type implementation by Fredrik Lundh. A detailed explanation of the
interface was written up as \pep{100}, ``Python Unicode Integration''.
This article will simply cover the most significant points about the
Unicode interfaces.

In Python source code, Unicode strings are written as
\code{u"string"}. Arbitrary Unicode characters can be written using a
new escape sequence, \code{\e u\var{HHHH}}, where \var{HHHH} is a
4-digit hexadecimal number from 0000 to FFFF. The existing
\code{\e x\var{HH}} escape sequence, which in 2.0 takes exactly two
hexadecimal digits, can also be used for characters up to U+00FF, and
octal escapes can be used for characters up to U+01FF, which is
represented by \code{\e 777}.

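As a small illustration of these escapes, the following sketch builds
two Unicode strings and inspects them. The variable names and sample
text are made up for this example, and it is written so that it should
also run on much later Python versions, which is an assumption that
goes beyond this article.

```python
# Hypothetical sample values; only the u"..." literal syntax and the
# \uHHHH escape are taken from the text above.
word = u"caf\u00e9"            # U+00E9 is LATIN SMALL LETTER E WITH ACUTE
digit = u"\u0660"              # U+0660 is ARABIC-INDIC DIGIT ZERO

word_length = len(word)        # the escape counts as a single character
last_code = ord(word[-1])      # numeric value of the final character
digit_code = ord(digit)

# The largest value a three-digit octal escape can express.
highest_octal = int("777", 8)

print(word_length)
print(hex(last_code))
print(hex(digit_code))
print(hex(highest_octal))
```
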
Unicode strings, just like regular strings, are an immutable sequence
type. They can be indexed and sliced, but not modified in place.
Unicode strings have an \method{encode( \optional{encoding} )} method
that returns an 8-bit string in the desired encoding. Encodings are
named by strings, such as \code{'ascii'}, \code{'utf-8'},
\code{'iso-8859-1'}, or whatever. A codec API is defined for
implementing and registering new encodings that are then available
throughout a Python program. If an encoding isn't specified, the
default encoding is usually 7-bit ASCII, though it can be changed for
your Python installation by calling the
\function{sys.setdefaultencoding(\var{encoding})} function in a
customised version of \file{site.py}.

Combining 8-bit and Unicode strings always coerces to Unicode, using
the default ASCII encoding; the result of \code{'a' + u'bc'} is
\code{u'abc'}.

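To make the \method{encode()} method more concrete, here is a minimal
sketch that encodes one Unicode string three ways. The sample string
and variable names are invented for illustration, and an explicit
encoding is always passed so that the result does not depend on the
installation's default.

```python
text = u"caf\u00e9"   # hypothetical sample text containing one non-ASCII character

# UTF-8 needs two bytes for U+00E9, so the result is five bytes long.
utf8_bytes = text.encode("utf-8")

# ISO-8859-1 (Latin-1) stores each of these characters in a single byte.
latin1_bytes = text.encode("iso-8859-1")

# 7-bit ASCII cannot represent U+00E9, so encoding raises an error.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeError:
    ascii_failed = True

print(len(utf8_bytes))
print(len(latin1_bytes))
print(ascii_failed)
```
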
New built-in functions have been added, and existing built-ins
modified to support Unicode:

\begin{itemize}
\item \code{unichr(\var{ch})} returns a Unicode string 1 character
long, containing the character \var{ch}.

\item \code{ord(\var{u})}, where \var{u} is a 1-character regular or Unicode string, returns the number of the character as an integer.

\item \code{unicode(\var{string} \optional{, \var{encoding}}
\optional{, \var{errors}} ) } creates a Unicode string from an 8-bit
string. \code{encoding} is a string naming the encoding to use.
The \code{errors} parameter specifies the treatment of characters that
are invalid for the current encoding; passing \code{'strict'} as the
value causes an exception to be raised on any encoding error, while
\code{'ignore'} causes errors to be silently ignored and
\code{'replace'} uses U+FFFD, the official replacement character, in
case of any problems.

\item The \keyword{exec} statement, and various built-ins such as
\code{eval()}, \code{getattr()}, and \code{setattr()} will now
accept Unicode strings as well as regular strings. (It's possible
that the process of fixing this missed some built-ins; if you find a
built-in function that accepts strings but doesn't accept Unicode
strings at all, please report it as a bug.)

\end{itemize}

A new module, \module{unicodedata}, provides an interface to Unicode
character properties. For example, \code{unicodedata.category(u'A')}
returns the 2-character string 'Lu', the 'L' denoting it's a letter,
and 'u' meaning that it's uppercase.
\code{unicodedata.bidirectional(u'\e u0660')} returns 'AN', meaning that
U+0660 is an Arabic number.

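The two lookups just described can be tried directly. This is only a
sketch: the variable names are made up, and the call to
\function{unicodedata.name()} is an extra from the same module that
the text above does not mention.

```python
import unicodedata

# General category: 'L' for letter, 'u' for uppercase.
category_of_a = unicodedata.category(u"A")

# Bidirectional class of U+0660: 'AN' stands for Arabic Number.
bidi_of_0660 = unicodedata.bidirectional(u"\u0660")

# Official character name, looked up with unicodedata.name().
name_of_0660 = unicodedata.name(u"\u0660")

print(category_of_a)
print(bidi_of_0660)
print(name_of_0660)
```
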
The \module{codecs} module contains functions to look up existing encodings
and register new ones. Unless you want to implement a
new encoding, you'll most often use the
\function{codecs.lookup(\var{encoding})} function, which returns a
4-element tuple: \code{(\var{encode_func},
\var{decode_func}, \var{stream_reader}, \var{stream_writer})}.

\begin{itemize}
\item \var{encode_func} is a function that takes a Unicode string, and
returns a 2-tuple \code{(\var{string}, \var{length})}. \var{string}
is an 8-bit string containing a portion (perhaps all) of the Unicode
string converted into the given encoding, and \var{length} tells you
how much of the Unicode string was converted.

\item \var{decode_func} is the opposite of \var{encode_func}, taking
an 8-bit string and returning a 2-tuple \code{(\var{ustring},
\var{length})}, consisting of the resulting Unicode string
\var{ustring} and the integer \var{length} telling how much of the
8-bit string was consumed.

\item \var{stream_reader} is a class that supports decoding input from
a stream. \code{\var{stream_reader}(\var{file_obj})} returns an object that
supports the \method{read()}, \method{readline()}, and
\method{readlines()} methods. These methods will all translate from
the given encoding and return Unicode strings.

\item \var{stream_writer}, similarly, is a class that supports
encoding output to a stream. \code{\var{stream_writer}(\var{file_obj})}
returns an object that supports the \method{write()} and
\method{writelines()} methods. These methods expect Unicode strings,
translating them to the given encoding on output.
\end{itemize}

For example, the following code writes a Unicode string into a file,
encoding it as UTF-8:

\begin{verbatim}
import codecs

unistr = u'\u0660\u2000ab ...'

(UTF8_encode, UTF8_decode,
 UTF8_streamreader, UTF8_streamwriter) = codecs.lookup('UTF-8')

output = UTF8_streamwriter( open( '/tmp/output', 'wb') )
output.write( unistr )
output.close()
\end{verbatim}

The following code would then read UTF-8 input from the file:

\begin{verbatim}
input = UTF8_streamreader( open( '/tmp/output', 'rb') )
print repr(input.read())
input.close()
\end{verbatim}

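The encode and decode functions returned by
\function{codecs.lookup()} can also be called directly, without going
through a file. The sketch below does that, and then repeats the
stream round trip using an in-memory buffer instead of
\file{/tmp/output}. The use of \code{io.BytesIO} and all variable
names are assumptions made for this example; the \module{io} module
only appeared in later Python versions.

```python
import codecs
import io

encode, decode, stream_reader, stream_writer = codecs.lookup("utf-8")

# encode returns (8-bit string, number of Unicode characters consumed).
encoded, chars_used = encode(u"\u0660ab")

# decode returns (Unicode string, number of bytes consumed).
decoded, bytes_used = decode(encoded)

# Same round trip through the stream classes, with an in-memory buffer
# standing in for a real file.
buffer = io.BytesIO()
writer = stream_writer(buffer)
writer.write(u"\u0660ab")

reader = stream_reader(io.BytesIO(buffer.getvalue()))
round_trip = reader.read()

print(repr(encoded))
print(chars_used)
print(bytes_used)
print(round_trip == decoded)
```

Note that the two lengths differ: three characters go in, but four
bytes come out, because U+0660 occupies two bytes in UTF-8.
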
Unicode-aware regular expressions are available through the
\module{re} module, which has a new underlying implementation called
SRE written by Fredrik Lundh of Secret Labs AB.

A \code{-U} command line option was added which causes the Python
compiler to interpret all string literals as Unicode string literals.
This is intended to be used in testing and future-proofing your Python
code, since some future version of Python may drop support for 8-bit
strings and provide only Unicode strings.

% ======================================================================
\section{List Comprehensions}

Lists are a workhorse data type in Python, and many programs
manipulate a list at some point. Two common operations on lists are
to loop over them, and either pick out the elements that meet a
certain criterion, or apply some function to each element. For
example, given a list of strings, you might want to pull out all the
strings containing a given substring, or strip off trailing whitespace
from each line.

The existing \function{map()} and \function{filter()} functions can be
used for this purpose, but they require a function as one of their
arguments. This is fine if there's an existing built-in function that
can be passed directly, but if there isn't, you have to create a
little function to do the required work, and Python's scoping rules
make the result ugly if the little function needs additional
information. Take the first example in the previous paragraph,
finding all the strings in the list containing a given substring. You
could write the following to do it:

\begin{verbatim}
import string

# Given the list L, make a list of all strings
# containing the substring S.
sublist = filter( lambda s, substring=S:
                     string.find(s, substring) != -1,
                  L)
\end{verbatim}

Because of Python's scoping rules, a default argument is used so that
the anonymous function created by the \keyword{lambda} expression knows
what substring is being searched for. List comprehensions make this
cleaner:

\begin{verbatim}
sublist = [ s for s in L if string.find(s, S) != -1 ]
\end{verbatim}

List comprehensions have the form:

\begin{verbatim}
[ expression for expr1 in sequence1
             for expr2 in sequence2 ...
             for exprN in sequenceN
             if condition ]
\end{verbatim}

The \keyword{for}...\keyword{in} clauses contain the sequences to be
iterated over. The sequences do not have to be the same length,
because they are \emph{not} iterated over in parallel, but
from left to right; this is explained more clearly in the following
paragraphs. The elements of the generated list will be the successive
values of \var{expression}. The final \keyword{if} clause is
optional; if present, \var{expression} is only evaluated and added to
the result if \var{condition} is true.

To make the semantics very clear, a list comprehension is equivalent
to the following Python code:

\begin{verbatim}
for expr1 in sequence1:
    for expr2 in sequence2:
    ...
        for exprN in sequenceN:
            if (condition):
                # Append the value of
                # the expression to the
                # resulting list.
\end{verbatim}

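The equivalence can be checked on a concrete case. In the sketch
below, the sequences and the condition \code{y != 2} are arbitrary
choices made for illustration; the nested loops and the list
comprehension should build the same list.

```python
seq1 = "abc"
seq2 = (1, 2, 3)

# Spelled out as nested loops: the leftmost 'for' is the outermost loop.
pairs_from_loops = []
for x in seq1:
    for y in seq2:
        if y != 2:
            pairs_from_loops.append((x, y))

# The same thing written as a single list comprehension.
pairs_from_comprehension = [(x, y) for x in seq1 for y in seq2 if y != 2]

print(pairs_from_loops == pairs_from_comprehension)
print(len(pairs_from_comprehension))
```
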
This means that when there are multiple \keyword{for}...\keyword{in}
clauses and no \keyword{if} clause, the length of the resulting list
will be equal to the product of the lengths of all the sequences. If
you have two sequences of length 3, the output list is 9 elements long:

\begin{verbatim}
>>> seq1 = 'abc'
>>> seq2 = (1,2,3)
>>> [ (x,y) for x in seq1 for y in seq2]
[('a', 1), ('a', 2), ('a', 3), ('b', 1), ('b', 2), ('b', 3), ('c', 1),
('c', 2), ('c', 3)]
\end{verbatim}

To avoid introducing an ambiguity into Python's grammar, if
\var{expression} is creating a tuple, it must be surrounded with
parentheses. The first list comprehension below is a syntax error,
while the second one is correct:

\begin{verbatim}
# Syntax error
[ x,y for x in seq1 for y in seq2]
# Correct
[ (x,y) for x in seq1 for y in seq2]
\end{verbatim}

The idea of list comprehensions originally comes from the functional
programming language Haskell (\url{http://www.haskell.org}). Greg
Ewing argued most effectively for adding them to Python and wrote the
initial list comprehension patch, which was then discussed for a
seemingly endless time on the python-dev mailing list and kept
up-to-date by Skip Montanaro.

% ======================================================================
\section{Augmented Assignment}

Augmented assignment operators, another long-requested feature, have
been added to Python 2.0. Augmented assignment operators include
\code{+=}, \code{-=}, \code{*=}, and so forth. For example, the
statement \code{a += 2} increments the value of the variable
\code{a} by 2, equivalent to the slightly lengthier \code{a = a + 2}.

The full list of supported assignment operators is \code{+=},
\code{-=}, \code{*=}, \code{/=}, \code{\%=}, \code{**=}, \code{\&=},
\code{|=}, \verb|^=|, \code{>>=}, and \code{<<=}. Python classes can
override the augmented assignment operators by defining methods named
\method{__iadd__}, \method{__isub__}, etc. For example, the following
\class{Number} class stores a number and supports using += to create a
new instance with an incremented value.

\begin{verbatim}
class Number:
    def __init__(self, value):
        self.value = value
    def __iadd__(self, increment):
        return Number( self.value + increment)

n = Number(5)
n += 3
print n.value
\end{verbatim}

The \method{__iadd__} special method is called with the value of the
increment, and should return a new instance with an appropriately
modified value; this return value is bound as the new value of the
variable on the left-hand side.

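Because the return value is simply rebound to the left-hand name, a
class may also choose to update itself and return \code{self}. The
following is a variation on the example above rather than something
taken from it: the \class{Counter} class and its names are
hypothetical, and whether in-place mutation is appropriate depends on
the class.

```python
class Counter:
    """Hypothetical mutable counterpart to the Number class above."""

    def __init__(self, value):
        self.value = value

    def __iadd__(self, increment):
        # Update this instance in place and hand it back, so the
        # left-hand name stays bound to the same object.
        self.value = self.value + increment
        return self


c = Counter(5)
alias = c          # a second name for the same object
c += 3

print(c.value)
print(alias is c)
```

With this design every other reference to the object sees the change,
whereas the \class{Number} version leaves existing references pointing
at the old, unmodified instance.
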
Augmented assignment operators were first introduced in the C
programming language, and most C-derived languages, such as
\program{awk}, C++, Java, Perl, and PHP, also support them. The augmented
assignment patch was implemented by Thomas Wouters.

% ======================================================================
| 428 | \section{String Methods} |
| 429 | |
Andrew M. Kuchling | fa33a4e | 2000-06-03 02:52:40 +0000 | [diff] [blame] | 430 | Until now string-manipulation functionality was in the \module{string} |
Andrew M. Kuchling | 5e08a01 | 2000-09-04 17:59:27 +0000 | [diff] [blame] | 431 | module, which was usually a front-end for the \module{strop} |
Andrew M. Kuchling | fa33a4e | 2000-06-03 02:52:40 +0000 | [diff] [blame] | 432 | module written in C. The addition of Unicode posed a difficulty for |
| 433 | the \module{strop} module, because the functions would all need to be |
| 434 | rewritten in order to accept either 8-bit or Unicode strings. For |
| 435 | functions such as \function{string.replace()}, which takes 3 string |
| 436 | arguments, that means eight possible permutations, and correspondingly |
| 437 | complicated code. |
| 438 | |
Andrew M. Kuchling | 730067e | 2000-06-30 01:44:05 +0000 | [diff] [blame] | 439 | Instead, Python 2.0 pushes the problem onto the string type, making |
Andrew M. Kuchling | fa33a4e | 2000-06-03 02:52:40 +0000 | [diff] [blame] | 440 | string manipulation functionality available through methods on both |
| 441 | 8-bit strings and Unicode strings. |
| 442 | |
| 443 | \begin{verbatim} |
| 444 | >>> 'andrew'.capitalize() |
| 445 | 'Andrew' |
| 446 | >>> 'hostname'.replace('os', 'linux') |
| 447 | 'hlinuxtname' |
| 448 | >>> 'moshe'.find('sh') |
| 449 | 2 |
| 450 | \end{verbatim} |
| 451 | |
Andrew M. Kuchling | 4373764 | 2000-08-30 00:51:02 +0000 | [diff] [blame] | 452 | One thing that hasn't changed, a noteworthy April Fools' joke |
| 453 | notwithstanding, is that Python strings are immutable. Thus, the |
| 454 | string methods return new strings, and do not modify the string on |
| 455 | which they operate. |

The old \module{string} module is still around for backwards
compatibility, but it mostly acts as a front-end to the new string
methods.

Two methods which have no parallel in pre-2.0 versions, although they
did exist in JPython for quite some time, are \method{startswith()}
and \method{endswith()}. \code{s.startswith(t)} is equivalent to \code{s[:len(t)]
== t}, while \code{s.endswith(t)} is equivalent to \code{s[-len(t):] == t}
for a non-empty \code{t}.
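
The slice equivalences can be checked directly. This is only a
sketch, with example strings chosen for illustration; it also shows
the one case where the slice form and \method{endswith()} disagree,
namely an empty suffix:

```python
s = 'hostname'
t = 'host'

starts = s.startswith(t)
starts_slice = (s[:len(t)] == t)          # same answer via slicing

ends = s.endswith('name')
ends_slice = (s[-len('name'):] == 'name')

# Edge case: for an empty suffix the slice s[-0:] is the whole string,
# so the slice comparison is false even though endswith('') is true.
empty_method = s.endswith('')
empty_slice = (s[-0:] == '')
```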

One other method which deserves special mention is \method{join()}. The
\method{join()} method of a string receives one parameter, a sequence of
strings, and is equivalent to the \function{string.join()} function from
the old \module{string} module, with the arguments reversed. In other
words, \code{s.join(seq)} is equivalent to the old
\code{string.join(seq, s)}.
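
A short sketch of the argument order, using made-up example data: the
separator is the string whose method is called, and the sequence is
the argument.

```python
words = ['usr', 'local', 'lib']

path = '/'.join(words)                  # separator first, sequence as argument
csv_line = ', '.join(['a', 'b', 'c'])
empty_sep = ''.join(words)              # an empty separator just concatenates
```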

% ======================================================================
\section{Garbage Collection of Cycles}

The C implementation of Python uses reference counting to implement
garbage collection. Every Python object maintains a count of the
number of references pointing to itself, and adjusts the count as
references are created or destroyed. Once the reference count reaches
zero, the object is no longer accessible, since you need to have a
reference to an object to access it, and if the count is zero, no
references exist any longer.

Reference counting has some pleasant properties: it's easy to
understand and implement, and the resulting implementation is
portable, fairly fast, and interacts well with other libraries that
implement their own memory handling schemes. The major problem with
reference counting is that it sometimes doesn't realise that objects
are no longer accessible, resulting in a memory leak. This happens
when there are cycles of references.

Consider the simplest possible cycle,
a class instance which has a reference to itself:

\begin{verbatim}
instance = SomeClass()
instance.myself = instance
\end{verbatim}

After the above two lines of code have been executed, the reference
count of \code{instance} is 2; one reference is from the variable
named \samp{'instance'}, and the other is from the \samp{myself}
attribute of the instance.

If the next line of code is \code{del instance}, what happens? The
reference count of \code{instance} is decreased by 1, so it has a
reference count of 1; the reference in the \samp{myself} attribute
still exists. Yet the instance is no longer accessible through Python
code, and it could be deleted. Several objects can participate in a
cycle if they have references to each other, causing all of the
objects to be leaked.

Python 2.0 fixes this problem by periodically executing a cycle
detection algorithm which looks for inaccessible cycles and deletes
the objects involved. A new \module{gc} module provides functions to
perform a garbage collection, obtain debugging statistics, and tune
the collector's parameters.
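
As a rough sketch, the self-referencing instance from the example can
be created, orphaned, and then reclaimed by an explicit call to
\function{gc.collect()}. \class{SomeClass} is an empty placeholder
class defined here for illustration, and the exact number returned is
not guaranteed, only that it is not zero:

```python
import gc

class SomeClass:
    pass

gc.collect()                 # clear out any garbage left by earlier code

instance = SomeClass()
instance.myself = instance   # the instance now refers to itself
del instance                 # unreachable, yet its reference count is not zero

# gc.collect() runs a full collection and returns the number of
# unreachable objects it found, so this should be at least 1.
found = gc.collect()
```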

Running the cycle detection algorithm takes some time, and therefore
will result in some additional overhead. It is hoped that after we've
gotten experience with the cycle collection from using 2.0, Python 2.1
will be able to minimize the overhead with careful tuning. It's not
yet obvious how much performance is lost, because benchmarking this is
tricky and depends crucially on how often the program creates and
destroys objects. The detection of cycles can be disabled when Python
is compiled, if you can't afford even a tiny speed penalty or suspect
that the cycle collection is buggy, by specifying the
\samp{--without-cycle-gc} switch when running the \file{configure}
script.
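
Independently of the \file{configure} switch, the \module{gc} module
also lets a running program turn automatic collection off and on
again. The sketch below uses only \function{gc.isenabled()},
\function{gc.disable()}, \function{gc.enable()},
\function{gc.collect()} and \function{gc.get_threshold()}; the
variable names are illustrative:

```python
import gc

was_enabled = gc.isenabled()      # automatic collection is normally on

gc.disable()                      # stop automatic cycle detection
disabled_state = gc.isenabled()   # false while collection is disabled
# ... allocation-heavy code that is known not to create cycles ...
gc.collect()                      # an explicit collection still works

if was_enabled:
    gc.enable()                   # restore the previous setting

thresholds = gc.get_threshold()   # the collector's three tuning thresholds
```

Whether disabling the collector is worthwhile depends on the program;
measure before relying on it.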

Several people tackled this problem and contributed to a solution. An
early implementation of the cycle detection approach was written by
Toby Kelsey. The current algorithm was suggested by Eric Tiedemann
during a visit to CNRI, and Guido van Rossum and Neil Schemenauer
wrote two different implementations, which were later integrated by
Neil. Lots of other people offered suggestions along the way; the
March 2000 archives of the python-dev mailing list contain most of the
relevant discussion, especially in the threads titled ``Reference
cycle collection for Python'' and ``Finalization again''.

% ======================================================================
\section{Other Core Changes}

Various minor changes have been made to Python's syntax and built-in
functions. None of the changes are very far-reaching, but they're
handy conveniences.

\subsection{Minor Language Changes}

A new syntax makes it more convenient to call a given function
with a tuple of arguments and/or a dictionary of keyword arguments.
In Python 1.5 and earlier, you'd use the \function{apply()}
built-in function: \code{apply(f, \var{args}, \var{kw})} calls the
function \function{f()} with the argument tuple \var{args} and the
keyword arguments in the dictionary \var{kw}. \function{apply()}
is the same in 2.0, but thanks to a patch from
Greg Ewing, \code{f(*\var{args}, **\var{kw})} is available as a shorter
and clearer way to achieve the same effect. This syntax is
symmetrical with the syntax for defining functions:

\begin{verbatim}
def f(*args, **kw):
    # args is a tuple of positional args,
    # kw is a dictionary of keyword args
    ...
\end{verbatim}
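
A small sketch of both halves of that symmetry follows. The functions
\function{describe()} and \function{collect()} and their arguments are
invented for this example:

```python
def describe(name, greeting='hello', punctuation='.'):
    return greeting + ', ' + name + punctuation

args = ('world',)               # positional arguments as a tuple
kw = {'punctuation': '!'}       # keyword arguments as a dictionary

# Python 1.5 spelling: apply(describe, args, kw)
result = describe(*args, **kw)

def collect(*args, **kw):
    # args is a tuple of positional args,
    # kw is a dictionary of keyword args
    return args, kw

received = collect(1, 2, colour='red')
```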

The \keyword{print} statement can now have its output directed to a
file-like object by following the \keyword{print} with
\verb|>> file|, similar to the redirection operator in Unix shells.
Previously you'd either have to use the \method{write()} method of the
file-like object, which lacks the convenience and simplicity of
\keyword{print}, or you could assign a new value to
\code{sys.stdout} and then restore the old value. For sending output to
standard error, it's much easier to write this:

\begin{verbatim}
print >> sys.stderr, "Warning: action field not supplied"
\end{verbatim}

Modules can now be renamed on importing them, using the syntax
\code{import \var{module} as \var{name}} or \code{from \var{module}
import \var{name} as \var{othername}}. The patch was submitted by
Thomas Wouters.
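
Both forms are sketched below, using \module{os.path} purely as a
convenient standard module; the short names \code{osp} and
\code{pjoin} are arbitrary choices made for this example.

```python
import os.path as osp                # bind the module os.path to the name osp
from os.path import join as pjoin    # bind os.path.join to the name pjoin

same_function = (pjoin is osp.join)  # both names reach the same function
example = pjoin('usr', 'lib')        # the separator depends on the platform
```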

A new format style is available when using the \code{\%} operator;
'\%r' will insert the \function{repr()} of its argument. This was
also added from symmetry considerations, this time for symmetry with
the existing '\%s' format style, which inserts the \function{str()} of
its argument. For example, \code{'\%r \%s' \% ('abc', 'abc')} returns a
string containing \verb|'abc' abc|.
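
The difference shows up most clearly with a string argument, since
\function{repr()} adds quotes and \function{str()} does not. A brief
sketch, with an arbitrary example value:

```python
text = "it's"

with_str = '%s' % text       # uses str(text)
with_repr = '%r' % text      # uses repr(text), so quotes are included

pair = '%r %s' % ('abc', 'abc')
```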

Previously there was no way to implement a class that overrode
Python's built-in \keyword{in} operator and implemented a custom
version. \code{\var{obj} in \var{seq}} returns true if \var{obj} is
present in the sequence \var{seq}; Python computes this by simply
trying every index of the sequence until either \var{obj} is found or
an \exception{IndexError} is encountered. Moshe Zadka contributed a
patch which adds a \method{__contains__} magic method for providing a
custom implementation for \keyword{in}. Additionally, new built-in
objects written in C can define what \keyword{in} means for them via a
new slot in the sequence protocol.
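
As an illustration, a class can now answer membership tests without
storing any elements at all. \class{EvenNumbers} is a hypothetical
class written for this sketch:

```python
class EvenNumbers:
    """Hypothetical container: every even integer counts as a member."""

    def __contains__(self, item):
        # Called for "item in instance"; no indexing is attempted.
        return item % 2 == 0

evens = EvenNumbers()
has_four = 4 in evens
has_five = 5 in evens
```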

Earlier versions of Python used a recursive algorithm for deleting
objects. Deeply nested data structures could cause the interpreter to
fill up the C stack and crash; Christian Tismer rewrote the deletion
logic to fix this problem. On a related note, comparing recursive
objects recursed infinitely and crashed; Jeremy Hylton rewrote the
code to no longer crash, producing a useful result instead. For
example, after this code:

\begin{verbatim}
a = []
b = []
a.append(a)
b.append(b)
\end{verbatim}

the comparison \code{a==b} returns true, because the two recursive
data structures are isomorphic. See the thread ``trashcan
and PR\#7'' in the April 2000 archives of the python-dev mailing list
for the discussion leading up to this implementation, and some useful
relevant links.
% Starting URL:
% http://www.python.org/pipermail/python-dev/2000-April/004834.html

Note that comparisons can now also raise exceptions. In earlier
versions of Python, a comparison operation such as \code{cmp(a,b)}
would always produce an answer, even if a user-defined
\method{__cmp__} method encountered an error, since the resulting
exception would simply be silently swallowed.

Work has been done on porting Python to 64-bit Windows on the Itanium
processor, mostly by Trent Mick of ActiveState. (Confusingly,
\code{sys.platform} is still \code{'win32'} on Win64 because it seems
that for ease of porting, MS Visual C++ treats code as 32-bit on Itanium.)
PythonWin also supports Windows CE; see the Python CE page at
\url{http://starship.python.net/crew/mhammond/ce/} for more
information.

Another new platform is Darwin/MacOS X; initial support for it is in
Python 2.0. Dynamic loading works, if you specify ``configure
--with-dyld --with-suffix=.x''. Consult the README in the Python
source distribution for more instructions.

An attempt has been made to alleviate one of Python's warts, the
often-confusing \exception{NameError} exception when code refers to a
local variable before the variable has been assigned a value. For
example, the following code raises an exception on the \keyword{print}
statement in both 1.5.2 and 2.0; in 1.5.2 a \exception{NameError}
exception is raised, while 2.0 raises a new
\exception{UnboundLocalError} exception.
\exception{UnboundLocalError} is a subclass of \exception{NameError},
so any existing code that expects \exception{NameError} to be raised
should still work.

\begin{verbatim}
def f():
    print "i=",i
    i = i + 1
f()
\end{verbatim}
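
The subclass relationship can be seen in a handler written for
\exception{NameError}. This is a sketch only; it drops the
\keyword{print} statement and uses \function{sys.exc_info()} to look
at the exception class, so that the same lines also run on later
interpreters:

```python
import sys

def f():
    try:
        i = i + 1          # i is local to f() but has no value yet
    except NameError:      # also catches the UnboundLocalError subclass
        return sys.exc_info()[0]

caught = f()               # the class of the exception that was raised
```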

Two new exceptions, \exception{TabError} and
\exception{IndentationError}, have been introduced. They're both
subclasses of \exception{SyntaxError}, and are raised when Python code
is found to be improperly indented.
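
One way to observe the new exception is to hand badly indented source
text to the built-in \function{compile()}. The source string and the
\code{'<example>'} file name below are made up for this sketch:

```python
import sys

source = "def f():\nreturn 1\n"    # the body of f() is not indented

try:
    compile(source, '<example>', 'exec')
    raised = None
except SyntaxError:                # catches IndentationError as well
    raised = sys.exc_info()[0]
```

Because the handler names \exception{SyntaxError}, code written before
these exceptions existed keeps working.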

\subsection{Changes to Built-in Functions}

A new built-in, \function{zip(\var{seq1}, \var{seq2}, ...)}, has been
added. \function{zip()} returns a list of tuples where each tuple
contains the i-th element from each of the argument sequences. The
difference between \function{zip()} and \code{map(None, \var{seq1},
\var{seq2})} is that \function{map()} pads the sequences with
\code{None} if the sequences aren't all of the same length, while
\function{zip()} truncates the returned list to the length of the
shortest argument sequence.
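
A short sketch of the truncation, with example lists of unequal
length. The call to \function{list()} is an addition made here so the
result can be inspected on any interpreter; it changes nothing where
\function{zip()} already returns a list:

```python
letters = ['a', 'b', 'c']
numbers = [1, 2]

# The third letter has no partner, so it is dropped from the result.
pairs = list(zip(letters, numbers))
```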

The \function{int()} and \function{long()} functions now accept an
optional ``base'' parameter when the first argument is a string.
\code{int('123', 10)} returns 123, while \code{int('123', 16)} returns
291. \code{int(123, 16)} raises a \exception{TypeError} exception
with the message ``can't convert non-string with explicit base''.
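
The arithmetic behind those results, plus the error case, can be
sketched as follows; the binary example is an extra illustration not
taken from the text:

```python
decimal = int('123', 10)        # 1*100 + 2*10 + 3 = 123
hexadecimal = int('123', 16)    # 1*256 + 2*16 + 3 = 291
binary = int('1010', 2)         # 8 + 2 = 10

try:
    int(123, 16)                # a base was given, but 123 is not a string
    rejected = 0
except TypeError:
    rejected = 1
```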

A new variable holding more detailed version information has been
added to the \module{sys} module. \code{sys.version_info} is a tuple
\code{(\var{major}, \var{minor}, \var{micro}, \var{level},
\var{serial})}. For example, in a hypothetical 2.0.1beta1,
\code{sys.version_info} would be \code{(2, 0, 1, 'beta', 1)}.
\var{level} is a string such as \code{"alpha"}, \code{"beta"}, or
\code{"final"} for a final release.
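
Because tuples compare element by element, a version test becomes a
single comparison. The sketch below is illustrative; the name
\code{has_string_methods} is invented here:

```python
import sys

major, minor = sys.version_info[0], sys.version_info[1]

# Compare only the (major, minor) prefix against the required version.
has_string_methods = sys.version_info[:2] >= (2, 0)

level = sys.version_info[3]     # the release level string
```

This avoids parsing the free-form \code{sys.version} string.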

Dictionaries have an odd new method, \method{setdefault(\var{key},
\var{default})}, which behaves similarly to the existing
\method{get()} method. However, if the key is missing,
\method{setdefault()} both returns the value of \var{default} as
\method{get()} would do, and also inserts it into the dictionary as
the value for \var{key}. Thus, the following lines of code:

\begin{verbatim}
if dict.has_key( key ): return dict[key]
else:
    dict[key] = []
    return dict[key]
\end{verbatim}

can be reduced to a single \code{return dict.setdefault(key, [])} statement.
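
A common use is grouping items into lists without testing for the key
first. The data and the name \code{index} in this sketch are made up:

```python
index = {}
for word in ['apple', 'avocado', 'banana']:
    # Create the list for this initial letter on first use, then append.
    index.setdefault(word[0], []).append(word)

missing = index.setdefault('c', [])   # key absent: inserts and returns []
present = index.setdefault('a', [])   # key present: returns the existing list
```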

The interpreter sets a maximum recursion depth in order to catch
runaway recursion before filling the C stack and causing a core dump
or GPF. Previously this limit was fixed when you compiled Python,
but in 2.0 the maximum recursion depth can be read and modified using
\function{sys.getrecursionlimit()} and \function{sys.setrecursionlimit()}.
The default value is 1000, and a rough maximum value for a given
platform can be found by running a new script,
\file{Misc/find_recursionlimit.py}.
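
A cautious pattern is to raise the limit only around the code that
needs it and then put the old value back. The increment of 500 in
this sketch is an arbitrary figure chosen for illustration:

```python
import sys

old_limit = sys.getrecursionlimit()       # remember the current setting
sys.setrecursionlimit(old_limit + 500)    # raise the ceiling temporarily
new_limit = sys.getrecursionlimit()
# ... deeply recursive work would go here ...
sys.setrecursionlimit(old_limit)          # restore the previous value
```

Setting the limit higher than the platform's C stack can support
brings back the crash that the limit exists to prevent.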

% ======================================================================
\section{Porting to 2.0}

New Python releases try hard to be compatible with previous releases,
and the record has been pretty good. However, some changes are
considered useful enough, usually because they fix initial design decisions that
turned out to be actively mistaken, that breaking backward compatibility
can't always be avoided. This section lists the changes in Python 2.0
that may cause old Python code to break.

The change which will probably break the most code is tightening up
the arguments accepted by some methods. Some methods would take
multiple arguments and treat them as a tuple, particularly various
list methods such as \method{.append()} and \method{.insert()}.
In earlier versions of Python, if \code{L} is a list, \code{L.append(
1,2 )} appends the tuple \code{(1,2)} to the list. In Python 2.0 this
causes a \exception{TypeError} exception to be raised, with the
message: 'append requires exactly 1 argument; 2 given'. The fix is to
simply add an extra set of parentheses to pass both values as a tuple:
\code{L.append( (1,2) )}.
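
The two spellings side by side, as a sketch; the flag name
\code{accepted} is invented for this example:

```python
L = []
L.append((1, 2))        # one argument: the tuple (1, 2)

try:
    L.append(1, 2)      # two arguments: rejected from 2.0 onwards
    accepted = 1
except TypeError:
    accepted = 0
```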

The earlier versions of these methods were more forgiving because they
used an old function in Python's C interface to parse their arguments;
2.0 modernizes them to use \function{PyArg_ParseTuple}, the current
argument parsing function, which provides more helpful error messages
and treats multi-argument calls as errors. If you absolutely must use
2.0 but can't fix your code, you can edit \file{Objects/listobject.c}
and define the preprocessor symbol \code{NO_STRICT_LIST_APPEND} to
preserve the old behaviour; this isn't recommended.

Some of the functions in the \module{socket} module are still
forgiving in this way. For example, \function{socket.connect(
('hostname', 25) )} is the correct form, passing a tuple holding
the host and the port, but \function{socket.connect( 'hostname', 25 )} also
works. \function{socket.connect_ex()} and \function{socket.bind()} are
similarly easy-going. 2.0alpha1 tightened these functions up, but
because the documentation actually used the erroneous multiple
argument form, many people wrote code which would break with the
stricter checking. GvR backed out the changes in the face of public
reaction, so for the \module{socket} module, the documentation was
fixed and the multiple argument form is simply marked as deprecated;
it \emph{will} be tightened up again in a future Python version.

The \code{\e x} escape in string literals now takes exactly 2 hex
digits. Previously it would consume all the hex digits following the
'x' and take the lowest 8 bits of the result, so \code{\e x123456} was
equivalent to \code{\e x56}.
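
Under the new rule the same literal is read as one escaped character
followed by four ordinary ones. A sketch, with illustrative variable
names:

```python
s = '\x123456'          # '\x12' followed by the four characters '3456'

length = len(s)         # one escaped character plus four digits
first = ord(s[0])       # 0x12, which is 18
rest = s[1:]            # the digits that are no longer part of the escape
```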

The \exception{AttributeError} and \exception{NameError} exceptions
have a more friendly error message, whose text will be something like
\code{'Spam' instance has no attribute 'eggs'} or \code{name 'eggs' is
not defined}. Previously the error message was just the missing
attribute name \code{eggs}, and code written to take advantage of this
fact will break in 2.0.

Some work has been done to make integers and long integers a bit more
interchangeable. In 1.5.2, large-file support was added for Solaris,
to allow reading files larger than 2GB; this made the \method{tell()}
method of file objects return a long integer instead of a regular
integer. Some code would subtract two file offsets and attempt to use
the result to multiply a sequence or slice a string, but this raised a
\exception{TypeError}. In 2.0, long integers can be used to multiply
or slice a sequence, and it'll behave as you'd intuitively expect it
to; \code{3L * 'abc'} produces 'abcabcabc', and \code{
(0,1,2,3)[2L:4L]} produces (2,3). Long integers can also be used in
various contexts where previously only integers were accepted, such
as in the \method{seek()} method of file objects, and in the formats
supported by the \verb|%| operator (\verb|%d|, \verb|%i|, \verb|%x|,
etc.). For example, \code{"\%d" \% 2L**64} will produce the string
\samp{18446744073709551616}.

The subtlest long integer change of all is that the \function{str()}
of a long integer no longer has a trailing 'L' character, though
\function{repr()} still includes it. The 'L' annoyed many people who
wanted to print long integers that looked just like regular integers,
since they had to go out of their way to chop off the character. This
is no longer a problem in 2.0, but code which does
\code{str(longval)[:-1]} and assumes the 'L' is there will now lose
the final digit.

Taking the \function{repr()} of a float now uses a different
formatting precision than \function{str()}. \function{repr()} uses
the \code{\%.17g} format string for C's \function{sprintf()}, while
\function{str()} uses \code{\%.12g} as before. The effect is that
\function{repr()} may occasionally show more decimal places than
\function{str()}, for certain numbers.
For example, the number 8.1 can't be represented exactly in binary, so
\code{repr(8.1)} is \code{'8.0999999999999996'}, while \code{str(8.1)} is
\code{'8.1'}.
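
The two precisions can be compared directly with the \code{\%}
operator, which accepts the same format strings. This sketch
deliberately does not call \function{repr()} or \function{str()}
themselves, because later interpreters may format floats differently;
the variable names are illustrative:

```python
x = 8.1

seventeen = '%.17g' % x     # the precision the text gives for repr()
twelve = '%.12g' % x        # the precision the text gives for str()

# 17 significant digits are enough to recover the same float again.
round_trips = (float(seventeen) == x)
```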

The \code{-X} command-line option, which turned all standard
exceptions into strings instead of classes, has been removed; the
standard exceptions will now always be classes. The
\module{exceptions} module containing the standard exceptions was
translated from Python to a built-in C module, written by Barry Warsaw
and Fredrik Lundh.

% Commented out for now -- I don't think anyone will care.
%The pattern and match objects provided by SRE are C types, not Python
%class instances as in 1.5. This means you can no longer inherit from
%\class{RegexObject} or \class{MatchObject}, but that shouldn't be much
%of a problem since no one should have been doing that in the first
%place.

% ======================================================================
\section{Extending/Embedding Changes}

Some of the changes are under the covers, and will only be apparent to
people writing C extension modules or embedding a Python interpreter
in a larger application. If you aren't dealing with Python's C API,
you can safely skip this section.

The version number of the Python C API was incremented, so C
extensions compiled for 1.5.2 must be recompiled in order to work with
2.0. On Windows, it's not possible for Python 2.0 to import a third
party extension built for Python 1.5.x due to how Windows DLLs work,
so Python will raise an exception and the import will fail.

Users of Jim Fulton's ExtensionClass module will be pleased to find
out that hooks have been added so that ExtensionClasses are now
supported by \function{isinstance()} and \function{issubclass()}.
This means you no longer have to remember to write code such as
\code{if type(obj) == myExtensionClass}, but can use the more natural
\code{if isinstance(obj, myExtensionClass)}.

The \file{Python/importdl.c} file, which was a mass of \#ifdefs to
support dynamic loading on many different platforms, was cleaned up
and reorganised by Greg Stein. \file{importdl.c} is now quite small,
and platform-specific code has been moved into a bunch of
\file{Python/dynload_*.c} files. Another cleanup: there were also a
number of \file{my*.h} files in the \file{Include/} directory that held
various portability hacks; they've been merged into a single file,
\file{Include/pyport.h}.

Vladimir Marangozov's long-awaited malloc restructuring was completed,
to make it easy to have the Python interpreter use a custom allocator
instead of C's standard \function{malloc()}. For documentation, read
the comments in \file{Include/pymem.h} and
\file{Include/objimpl.h}. For the lengthy discussions during which
the interface was hammered out, see the Web archives of the 'patches'
and 'python-dev' lists at python.org.
| 859 | |
Andrew M. Kuchling | 6c3cd8d | 2000-06-10 02:24:31 +0000 | [diff] [blame] | 860 | Recent versions of the GUSI development environment for MacOS support |
| 861 | POSIX threads. Therefore, Python's POSIX threading support now works |
| 862 | on the Macintosh. Threading support using the user-space GNU \texttt{pth} |
| 863 | library was also contributed. |
Andrew M. Kuchling | 25bfd0e | 2000-05-27 11:28:26 +0000 | [diff] [blame] | 864 | |
| 865 | Threading support on Windows was enhanced, too. Windows supports |
| 866 | thread locks that use kernel objects only in case of contention; in |
| 867 | the common case when there's no contention, they use simpler functions |
| 868 | which are an order of magnitude faster. A threaded version of Python |
Andrew M. Kuchling | 730067e | 2000-06-30 01:44:05 +0000 | [diff] [blame] | 869 | 1.5.2 on NT is twice as slow as an unthreaded version; with the 2.0 |
Andrew M. Kuchling | 25bfd0e | 2000-05-27 11:28:26 +0000 | [diff] [blame] | 870 | changes, the difference is only 10\%. These improvements were |
| 871 | contributed by Yakov Markovitch. |
| 872 | |
Python 2.0's source now uses only ANSI C prototypes, so compiling
Python requires an ANSI C compiler and can no longer be done using a
compiler that only supports K\&R C.
| 876 | |
Previously the Python virtual machine used 16-bit numbers in its
bytecode, limiting the size of source files.  In particular, this
affected the maximum size of literal lists and dictionaries in Python
source; occasionally people generating Python code ran into this
limit.  A patch by Charles G. Waldman raises the limit from
$2^{16}$ to $2^{32}$.
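
As a rough illustration of the kind of machine-generated code that
could run into the old limit, the following sketch builds the source
text of a list literal with more than $2^{16}$ elements and evaluates
it.  The element count and the variable names are arbitrary choices
made for this example; they are not taken from the patch.

\begin{verbatim}
# Sketch: generate a list literal with more than 2**16 elements.
# The count (70000) is an arbitrary value chosen for illustration.
count = 70000
source = "[" + "0, " * count + "]"

# Compiling and evaluating the literal works once the limit is raised
big_list = eval(source)
assert len(big_list) == count
\end{verbatim}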
Andrew M. Kuchling | 4d46d38 | 2000-09-06 17:58:49 +0000 | [diff] [blame] | 883 | |
Andrew M. Kuchling | 3ad4e74 | 2000-09-27 01:33:41 +0000 | [diff] [blame] | 884 | Three new convenience functions intended for adding constants to a |
| 885 | module's dictionary at module initialization time were added: |
| 886 | \function{PyModule_AddObject()}, \function{PyModule_AddIntConstant()}, |
| 887 | and \function{PyModule_AddStringConstant()}. Each of these functions |
| 888 | takes a module object, a null-terminated C string containing the name |
| 889 | to be added, and a third argument for the value to be assigned to the |
| 890 | name. This third argument is, respectively, a Python object, a C |
| 891 | long, or a C string. |
| 892 | |
| 893 | A wrapper API was added for Unix-style signal handlers. |
| 894 | \function{PyOS_getsig()} gets a signal handler and |
| 895 | \function{PyOS_setsig()} will set a new handler. |
Andrew M. Kuchling | 4d46d38 | 2000-09-06 17:58:49 +0000 | [diff] [blame] | 896 | |
Andrew M. Kuchling | 25bfd0e | 2000-05-27 11:28:26 +0000 | [diff] [blame] | 897 | % ====================================================================== |
Andrew M. Kuchling | 4373764 | 2000-08-30 00:51:02 +0000 | [diff] [blame] | 898 | \section{Distutils: Making Modules Easy to Install} |
| 899 | |
Before Python 2.0, installing modules was a tedious affair -- there
was no way to figure out automatically where Python was installed, or
what compiler options to use for extension modules.  Software authors
had to go through an arduous ritual of editing Makefiles and
configuration files, which only really worked on Unix and left Windows
and MacOS unsupported.  Python users faced wildly differing
installation instructions from one extension package to the next,
which made administering a Python installation something of a
chore.
Andrew M. Kuchling | 4373764 | 2000-08-30 00:51:02 +0000 | [diff] [blame] | 909 | |
The SIG for distribution utilities, shepherded by Greg Ward, has
created the Distutils, a system to make package installation much
easier.  They form the \module{distutils} package, a new part of
Python's standard library.  In the best case, installing a Python
module from source will require the same steps: first you simply
unpack the tarball or zip archive, and then run ``\code{python setup.py
install}''.  The platform will be automatically detected, the compiler
will be recognized, C extension modules will be compiled, and the
distribution installed into the proper directory.  Optional
command-line arguments provide more control over the installation
process; the \module{distutils} package offers many places to override
defaults -- separating the build from the install, building or
installing in non-default directories, and more.
| 923 | |
| 924 | In order to use the Distutils, you need to write a \file{setup.py} |
| 925 | script. For the simple case, when the software contains only .py |
| 926 | files, a minimal \file{setup.py} can be just a few lines long: |
| 927 | |
\begin{verbatim}
from distutils.core import setup
setup (name = "foo", version = "1.0",
       py_modules = ["module1", "module2"])
\end{verbatim}
| 933 | |
| 934 | The \file{setup.py} file isn't much more complicated if the software |
| 935 | consists of a few packages: |
| 936 | |
\begin{verbatim}
from distutils.core import setup
setup (name = "foo", version = "1.0",
       packages = ["package", "package.subpackage"])
\end{verbatim}
| 942 | |
| 943 | A C extension can be the most complicated case; here's an example taken from |
| 944 | the PyXML package: |
| 945 | |
| 946 | |
\begin{verbatim}
from distutils.core import setup, Extension

expat_extension = Extension('xml.parsers.pyexpat',
     define_macros = [('XML_NS', None)],
     include_dirs = [ 'extensions/expat/xmltok',
                      'extensions/expat/xmlparse' ],
     sources = [ 'extensions/pyexpat.c',
                 'extensions/expat/xmltok/xmltok.c',
                 'extensions/expat/xmltok/xmlrole.c',
               ]
       )
setup (name = "PyXML", version = "0.5.4",
       ext_modules =[ expat_extension ] )
\end{verbatim}
| 962 | |
The Distutils can also take care of creating source and binary
distributions.  The ``sdist'' command, run by ``\code{python setup.py
sdist}'', builds a source distribution such as \file{foo-1.0.tar.gz}.
Adding new commands isn't difficult; ``bdist_rpm'' and
``bdist_wininst'' commands have already been contributed to create an
RPM distribution and a Windows installer for the software,
respectively.  Commands to create other distribution formats such as
Debian packages and Solaris \file{.pkg} files are in various stages of
development.
| 972 | |
| 973 | All this is documented in a new manual, \textit{Distributing Python |
| 974 | Modules}, that joins the basic set of Python documentation. |
| 975 | |
Fred Drake | 7486c6b | 2000-10-12 02:49:12 +0000 | [diff] [blame] | 976 | % ====================================================================== |
Andrew M. Kuchling | 6032c48 | 2000-10-12 02:37:14 +0000 | [diff] [blame] | 977 | \section{XML Modules} |
Andrew M. Kuchling | 4373764 | 2000-08-30 00:51:02 +0000 | [diff] [blame] | 978 | |
Andrew M. Kuchling | 6032c48 | 2000-10-12 02:37:14 +0000 | [diff] [blame] | 979 | Python 1.5.2 included a simple XML parser in the form of the |
| 980 | \module{xmllib} module, contributed by Sjoerd Mullender. Since |
| 981 | 1.5.2's release, two different interfaces for processing XML have |
| 982 | become common: SAX2 (version 2 of the Simple API for XML) provides an |
| 983 | event-driven interface with some similarities to \module{xmllib}, and |
| 984 | the DOM (Document Object Model) provides a tree-based interface, |
| 985 | transforming an XML document into a tree of nodes that can be |
| 986 | traversed and modified. Python 2.0 includes a SAX2 interface and a |
| 987 | stripped-down DOM interface as part of the \module{xml} package. |
| 988 | Here we will give a brief overview of these new interfaces; consult |
| 989 | the Python documentation or the source code for complete details. |
| 990 | The Python XML SIG is also working on improved documentation. |
| 991 | |
| 992 | \subsection{SAX2 Support} |
| 993 | |
SAX defines an event-driven interface for parsing XML.  To use SAX,
you must write a SAX handler class.  Handler classes inherit from
various classes provided by SAX, and override various methods that
will then be called by the XML parser.  For example, the
\method{startElement()} and \method{endElement()} methods are called
for every start and end tag encountered by the parser, the
\method{characters()} method is called for every chunk of character
data, and so forth.
| 1002 | |
The advantage of the event-driven approach is that the whole
document doesn't have to be resident in memory at any one time, which
matters if you are processing really huge documents.  However, writing
the SAX handler class can get very complicated if you're trying to
modify the document structure in some elaborate way.

For example, this little program defines a handler that prints
a message for every starting and ending tag, and then parses the file
\file{hamlet.xml} using it:
| 1012 | |
\begin{verbatim}
from xml import sax

class SimpleHandler(sax.ContentHandler):
    def startElement(self, name, attrs):
        print 'Start of element:', name, attrs.keys()

    def endElement(self, name):
        print 'End of element:', name

# Create a parser object
parser = sax.make_parser()

# Tell it what handler to use
handler = SimpleHandler()
parser.setContentHandler( handler )

# Parse a file!
parser.parse( 'hamlet.xml' )
\end{verbatim}
| 1033 | |
| 1034 | For more information, consult the Python documentation, or the XML |
Andrew M. Kuchling | 9546772 | 2002-05-02 14:48:26 +0000 | [diff] [blame] | 1035 | HOWTO at \url{http://pyxml.sourceforge.net/topics/howto/xml-howto.html}. |
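
The \method{characters()} method mentioned earlier can be sketched in
the same way.  The handler below gathers character data and counts
start tags; the class name, its attributes, and the tiny document are
all made up for this illustration.  It parses an in-memory string with
\function{sax.parseString()} and sticks to calls found in current
Python releases, so details may differ under 2.0 itself.

\begin{verbatim}
from xml import sax

class TextCollector(sax.ContentHandler):
    """Hypothetical handler: collects text and counts start tags."""

    def __init__(self):
        sax.ContentHandler.__init__(self)
        self.chunks = []
        self.tag_count = 0

    def startElement(self, name, attrs):
        self.tag_count = self.tag_count + 1

    def characters(self, content):
        # The parser may deliver one run of text in several chunks
        self.chunks.append(content)

# A made-up document, used instead of a file on disk
document = "<play><title>Hamlet</title><act>Act I</act></play>"

handler = TextCollector()
sax.parseString(document, handler)

# Join the chunks to recover all of the character data
text = "".join(handler.chunks)
\end{verbatim}

Because the handler only keeps what it needs, memory use stays
independent of how large the parsed document is.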
Andrew M. Kuchling | 6032c48 | 2000-10-12 02:37:14 +0000 | [diff] [blame] | 1036 | |
| 1037 | \subsection{DOM Support} |
| 1038 | |
The Document Object Model is a tree-based representation for an XML
document.  A top-level \class{Document} instance is the root of the
tree, and has a single child which is the top-level \class{Element}
instance.  This \class{Element} has child nodes representing
character data and any sub-elements, which may have further children
of their own, and so forth.  Using the DOM you can traverse the
resulting tree any way you like, access element and attribute values,
insert and delete nodes, and convert the tree back into XML.
| 1047 | |
| 1048 | The DOM is useful for modifying XML documents, because you can create |
| 1049 | a DOM tree, modify it by adding new nodes or rearranging subtrees, and |
| 1050 | then produce a new XML document as output. You can also construct a |
| 1051 | DOM tree manually and convert it to XML, which can be a more flexible |
| 1052 | way of producing XML output than simply writing |
| 1053 | \code{<tag1>}...\code{</tag1>} to a file. |
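
Building a tree by hand might look like the following minimal sketch.
The element names and text are invented for the example, and the code
relies on the \module{xml.dom.minidom} interface as found in current
Python releases, which may differ in details from the 2.0 version.

\begin{verbatim}
from xml.dom import minidom

# Element names and text below are invented for this example
new_doc = minidom.Document()
root = new_doc.createElement("cast")
new_doc.appendChild(root)

for name in ["CLAUDIUS", "HAMLET"]:
    persona = new_doc.createElement("PERSONA")
    persona.appendChild(new_doc.createTextNode(name))
    root.appendChild(persona)

# Serialize the root element and everything below it
xml_text = root.toxml()
\end{verbatim}

Letting the DOM produce the markup means that tags are always properly
nested and closed, which is easy to get wrong when writing strings
directly.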
| 1054 | |
| 1055 | The DOM implementation included with Python lives in the |
| 1056 | \module{xml.dom.minidom} module. It's a lightweight implementation of |
| 1057 | the Level 1 DOM with support for XML namespaces. The |
| 1058 | \function{parse()} and \function{parseString()} convenience |
| 1059 | functions are provided for generating a DOM tree: |
| 1060 | |
\begin{verbatim}
from xml.dom import minidom
doc = minidom.parse('hamlet.xml')
\end{verbatim}
| 1065 | |
\code{doc} is a \class{Document} instance.  \class{Document}, like all
the other DOM classes such as \class{Element} and \class{Text}, is a
subclass of the \class{Node} base class.  All the nodes in a DOM tree
therefore support certain common methods, such as \method{toxml()}
which returns a string containing the XML representation of the node
and its children.  Each class also has special methods of its own; for
example, \class{Element} and \class{Document} instances have a
\method{getElementsByTagName()} method that finds all descendant
elements with a given tag name.  Continuing from the previous 2-line
example:
| 1075 | |
\begin{verbatim}
perslist = doc.getElementsByTagName( 'PERSONA' )
print perslist[0].toxml()
print perslist[1].toxml()
\end{verbatim}
| 1081 | |
| 1082 | For the \textit{Hamlet} XML file, the above few lines output: |
| 1083 | |
\begin{verbatim}
<PERSONA>CLAUDIUS, king of Denmark. </PERSONA>
<PERSONA>HAMLET, son to the late, and nephew to the present king.</PERSONA>
\end{verbatim}
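
The same calls can be tried without the \textit{Hamlet} file by
parsing a string with \function{parseString()}.  The sketch below uses
a tiny document whose layout is our own invention, and it also shows
that \method{getElementsByTagName()} finds matching elements at any
depth, not only direct children.  It is written against the minidom
interface of current Python releases.

\begin{verbatim}
from xml.dom import minidom

# A tiny made-up stand-in for the Hamlet file
text = ("<PLAY><PERSONA>CLAUDIUS</PERSONA>"
        "<GROUP><PERSONA>HAMLET</PERSONA></GROUP></PLAY>")
small_doc = minidom.parseString(text)

# The second PERSONA is nested inside GROUP but is still found
perslist = small_doc.getElementsByTagName("PERSONA")
names = [node.firstChild.data for node in perslist]
# names is now ['CLAUDIUS', 'HAMLET']
\end{verbatim}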
| 1088 | |
The root element of the document is available as
\code{doc.documentElement}, and its children can be easily modified
by deleting, adding, or moving nodes:
| 1092 | |
\begin{verbatim}
root = doc.documentElement

# Remove the first child
root.removeChild( root.childNodes[0] )

# Move the new first child to the end
root.appendChild( root.childNodes[0] )

# Insert the new first child (originally,
# the third child) before the child at index 20.
root.insertBefore( root.childNodes[0], root.childNodes[20] )
\end{verbatim}
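
One detail worth seeing in action is that \method{appendChild()} and
\method{insertBefore()} move a node that is already in the tree rather
than copying it.  The following sketch repeats the three operations on
a four-element document invented for the purpose, small enough to
follow by hand.  The \function{child_names()} helper is ours, not part
of minidom, and the behaviour shown is that of minidom in current
Python releases.

\begin{verbatim}
from xml.dom import minidom

small = minidom.parseString("<r><a/><b/><c/><d/></r>")
top = small.documentElement

def child_names(node):
    # Hypothetical helper: tag names of the direct children
    return [child.tagName for child in node.childNodes]

top.removeChild(top.childNodes[0])    # drop a; children: b c d
top.appendChild(top.childNodes[0])    # move b to end; children: c d b
top.insertBefore(top.childNodes[0],   # move c before the child at
                 top.childNodes[2])   # index 2 (b); children: d c b

order = child_names(top)
\end{verbatim}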
| 1106 | |
Again, we refer you to the Python documentation for a complete
listing of the different \class{Node} classes and their various methods.
| 1109 | |
| 1110 | \subsection{Relationship to PyXML} |
| 1111 | |
| 1112 | The XML Special Interest Group has been working on XML-related Python |
| 1113 | code for a while. Its code distribution, called PyXML, is available |
| 1114 | from the SIG's Web pages at \url{http://www.python.org/sigs/xml-sig/}. |
| 1115 | The PyXML distribution also used the package name \samp{xml}. If |
| 1116 | you've written programs that used PyXML, you're probably wondering |
| 1117 | about its compatibility with the 2.0 \module{xml} package. |
| 1118 | |
The answer is that Python 2.0's \module{xml} package isn't compatible
with PyXML, but can be made compatible by installing a recent version
of PyXML.  Many applications can get by with the XML support that is
included with Python 2.0, but more complicated applications will
require that the full PyXML package be installed.  When
installed, PyXML versions 0.6.0 or greater will replace the
\module{xml} package shipped with Python, and will be a strict
superset of the standard package, adding a bunch of additional
features.  Some of the additional features in PyXML include:
| 1128 | |
| 1129 | \begin{itemize} |
| 1130 | \item 4DOM, a full DOM implementation |
Andrew M. Kuchling | f155170 | 2000-10-16 14:19:21 +0000 | [diff] [blame] | 1131 | from FourThought, Inc. |
Andrew M. Kuchling | 6032c48 | 2000-10-12 02:37:14 +0000 | [diff] [blame] | 1132 | \item The xmlproc validating parser, written by Lars Marius Garshol. |
| 1133 | \item The \module{sgmlop} parser accelerator module, written by Fredrik Lundh. |
| 1134 | \end{itemize} |
Andrew M. Kuchling | 4373764 | 2000-08-30 00:51:02 +0000 | [diff] [blame] | 1135 | |
| 1136 | % ====================================================================== |
Andrew M. Kuchling | 25bfd0e | 2000-05-27 11:28:26 +0000 | [diff] [blame] | 1137 | \section{Module changes} |
| 1138 | |
Lots of improvements and bugfixes were made to Python's extensive
standard library; some of the affected modules include
\module{readline}, \module{ConfigParser}, \module{cgi},
\module{calendar}, \module{posix}, \module{xmllib},
\module{aifc}, \module{chunk}, \module{wave}, \module{random},
\module{shelve}, and \module{nntplib}.  Consult the CVS logs for the
exact patch-by-patch details.
Andrew M. Kuchling | 25bfd0e | 2000-05-27 11:28:26 +0000 | [diff] [blame] | 1146 | |
Brian Gallew contributed OpenSSL support for the \module{socket}
module.  OpenSSL is an implementation of the Secure Sockets Layer,
which encrypts the data being sent over a socket.  When compiling
Python, you can edit \file{Modules/Setup} to include SSL support,
which adds an additional function to the \module{socket} module:
\function{socket.ssl(\var{socket}, \var{keyfile}, \var{certfile})},
which takes a socket object and returns an SSL socket.  The
\module{httplib} and \module{urllib} modules were also changed to
support ``https://'' URLs, though no one has implemented FTP or SMTP
over SSL.
Andrew M. Kuchling | 25bfd0e | 2000-05-27 11:28:26 +0000 | [diff] [blame] | 1157 | |
Andrew M. Kuchling | 69db0e4 | 2000-06-28 02:16:00 +0000 | [diff] [blame] | 1158 | The \module{httplib} module has been rewritten by Greg Stein to |
| 1159 | support HTTP/1.1. Backward compatibility with the 1.5 version of |
| 1160 | \module{httplib} is provided, though using HTTP/1.1 features such as |
| 1161 | pipelining will require rewriting code to use a different set of |
| 1162 | interfaces. |
| 1163 | |
Andrew M. Kuchling | fa33a4e | 2000-06-03 02:52:40 +0000 | [diff] [blame] | 1164 | The \module{Tkinter} module now supports Tcl/Tk version 8.1, 8.2, or |
| 1165 | 8.3, and support for the older 7.x versions has been dropped. The |
Andrew M. Kuchling | 791b366 | 2000-07-01 15:04:18 +0000 | [diff] [blame] | 1166 | Tkinter module now supports displaying Unicode strings in Tk widgets. |
Andrew M. Kuchling | 5e08a01 | 2000-09-04 17:59:27 +0000 | [diff] [blame] | 1167 | Also, Fredrik Lundh contributed an optimization which makes operations |
| 1168 | like \code{create_line} and \code{create_polygon} much faster, |
Andrew M. Kuchling | 791b366 | 2000-07-01 15:04:18 +0000 | [diff] [blame] | 1169 | especially when using lots of coordinates. |
Andrew M. Kuchling | 25bfd0e | 2000-05-27 11:28:26 +0000 | [diff] [blame] | 1170 | |
Andrew M. Kuchling | fa33a4e | 2000-06-03 02:52:40 +0000 | [diff] [blame] | 1171 | The \module{curses} module has been greatly extended, starting from |
| 1172 | Oliver Andrich's enhanced version, to provide many additional |
| 1173 | functions from ncurses and SYSV curses, such as colour, alternative |
Andrew M. Kuchling | 69db0e4 | 2000-06-28 02:16:00 +0000 | [diff] [blame] | 1174 | character set support, pads, and mouse support. This means the module |
| 1175 | is no longer compatible with operating systems that only have BSD |
| 1176 | curses, but there don't seem to be any currently maintained OSes that |
| 1177 | fall into this category. |
Andrew M. Kuchling | 25bfd0e | 2000-05-27 11:28:26 +0000 | [diff] [blame] | 1178 | |
Andrew M. Kuchling | 730067e | 2000-06-30 01:44:05 +0000 | [diff] [blame] | 1179 | As mentioned in the earlier discussion of 2.0's Unicode support, the |
Andrew M. Kuchling | 6c3cd8d | 2000-06-10 02:24:31 +0000 | [diff] [blame] | 1180 | underlying implementation of the regular expressions provided by the |
| 1181 | \module{re} module has been changed. SRE, a new regular expression |
| 1182 | engine written by Fredrik Lundh and partially funded by Hewlett |
| 1183 | Packard, supports matching against both 8-bit strings and Unicode |
| 1184 | strings. |
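
A short sketch of matching against both kinds of string follows.  The
sample text is made up, and the code is written for current Python
releases, where the \code{b} prefix marks an 8-bit string; under 2.0 a
plain string literal plays that role.  The \code{re.UNICODE} flag is
passed explicitly so that \code{\e w} matches the accented letter.

\begin{verbatim}
import re

# Unicode pattern applied to a Unicode string
unicode_words = re.findall(u"\\w+", u"caf\u00e9 au lait", re.UNICODE)

# The same kind of search on an 8-bit string
byte_words = re.findall(b"[a-z]+", b"cafe au lait")
\end{verbatim}

Both calls return a list of three words, of the same string type as
the text that was searched.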
Andrew M. Kuchling | 25bfd0e | 2000-05-27 11:28:26 +0000 | [diff] [blame] | 1185 | |
| 1186 | % ====================================================================== |
| 1187 | \section{New modules} |
| 1188 | |
Andrew M. Kuchling | 6c3cd8d | 2000-06-10 02:24:31 +0000 | [diff] [blame] | 1189 | A number of new modules were added. We'll simply list them with brief |
Andrew M. Kuchling | 730067e | 2000-06-30 01:44:05 +0000 | [diff] [blame] | 1190 | descriptions; consult the 2.0 documentation for the details of a |
Andrew M. Kuchling | 6c3cd8d | 2000-06-10 02:24:31 +0000 | [diff] [blame] | 1191 | particular module. |
| 1192 | |
| 1193 | \begin{itemize} |
| 1194 | |
\item{\module{atexit}:}
For registering functions to be called before the Python interpreter
exits.  Code that currently sets \code{sys.exitfunc} directly should
be changed to use the \module{atexit} module instead, importing
\module{atexit} and calling \function{atexit.register()} with the
function to be called on exit.
(Contributed by Skip Montanaro.)
| 1203 | |
Andrew M. Kuchling | 6c3cd8d | 2000-06-10 02:24:31 +0000 | [diff] [blame] | 1204 | \item{\module{codecs}, \module{encodings}, \module{unicodedata}:} Added as part of the new Unicode support. |
| 1205 | |
Andrew M. Kuchling | fed4f1e | 2000-07-01 12:33:43 +0000 | [diff] [blame] | 1206 | \item{\module{filecmp}:} Supersedes the old \module{cmp}, \module{cmpcache} and |
Andrew M. Kuchling | 6c3cd8d | 2000-06-10 02:24:31 +0000 | [diff] [blame] | 1207 | \module{dircmp} modules, which have now become deprecated. |
Andrew M. Kuchling | c0328f0 | 2000-06-10 15:11:20 +0000 | [diff] [blame] | 1208 | (Contributed by Gordon MacMillan and Moshe Zadka.) |
Andrew M. Kuchling | 6c3cd8d | 2000-06-10 02:24:31 +0000 | [diff] [blame] | 1209 | |
Andrew M. Kuchling | ec1722e | 2000-10-12 03:04:22 +0000 | [diff] [blame] | 1210 | \item{\module{gettext}:} This module provides internationalization |
| 1211 | (I18N) and localization (L10N) support for Python programs by |
| 1212 | providing an interface to the GNU gettext message catalog library. |
| 1213 | (Integrated by Barry Warsaw, from separate contributions by Martin von |
| 1214 | Loewis, Peter Funk, and James Henstridge.) |
| 1215 | |
Andrew M. Kuchling | 35e8afb | 2000-07-08 12:06:31 +0000 | [diff] [blame] | 1216 | \item{\module{linuxaudiodev}:} Support for the \file{/dev/audio} |
| 1217 | device on Linux, a twin to the existing \module{sunaudiodev} module. |
Andrew M. Kuchling | ec1722e | 2000-10-12 03:04:22 +0000 | [diff] [blame] | 1218 | (Contributed by Peter Bosch, with fixes by Jeremy Hylton.) |
Andrew M. Kuchling | 6c3cd8d | 2000-06-10 02:24:31 +0000 | [diff] [blame] | 1219 | |
\item{\module{mmap}:} An interface to memory-mapped files on both
Windows and Unix.  A file's contents can be mapped directly into
memory, at which point it behaves like a mutable string, so its
contents can be read and modified.  Mapped files can even be passed
to functions that expect ordinary strings, such as those in the
\module{re} module. (Contributed by Sam Rushing, with some extensions
by A.M. Kuchling.)
| 1227 | |
Andrew M. Kuchling | 35e8afb | 2000-07-08 12:06:31 +0000 | [diff] [blame] | 1228 | \item{\module{pyexpat}:} An interface to the Expat XML parser. |
Andrew M. Kuchling | 6c3cd8d | 2000-06-10 02:24:31 +0000 | [diff] [blame] | 1229 | (Contributed by Paul Prescod.) |
| 1230 | |
\item{\module{robotparser}:} Parses a \file{robots.txt} file, which is
useful when writing Web spiders that politely avoid certain areas of a
Web site.  The parser accepts the contents of a \file{robots.txt} file,
builds a set of rules from it, and can then answer questions about
the fetchability of a given URL.  (Contributed by Skip Montanaro.)
| 1236 | |
| 1237 | \item{\module{tabnanny}:} A module/script to |
Andrew M. Kuchling | 5e08a01 | 2000-09-04 17:59:27 +0000 | [diff] [blame] | 1238 | check Python source code for ambiguous indentation. |
Andrew M. Kuchling | 6c3cd8d | 2000-06-10 02:24:31 +0000 | [diff] [blame] | 1239 | (Contributed by Tim Peters.) |
| 1240 | |
Andrew M. Kuchling | a5bbb00 | 2000-06-10 02:41:46 +0000 | [diff] [blame] | 1241 | \item{\module{UserString}:} A base class useful for deriving objects that behave like strings. |
| 1242 | |
Andrew M. Kuchling | 08d87c6 | 2000-07-09 15:05:15 +0000 | [diff] [blame] | 1243 | \item{\module{webbrowser}:} A module that provides a platform independent |
| 1244 | way to launch a web browser on a specific URL. For each platform, various |
| 1245 | browsers are tried in a specific order. The user can alter which browser |
| 1246 | is launched by setting the \var{BROWSER} environment variable. |
| 1247 | (Originally inspired by Eric S. Raymond's patch to \module{urllib} |
| 1248 | which added similar functionality, but |
| 1249 | the final module comes from code originally |
| 1250 | implemented by Fred Drake as \file{Tools/idle/BrowserControl.py}, |
| 1251 | and adapted for the standard library by Fred.) |
| 1252 | |
Andrew M. Kuchling | d500e44 | 2000-09-06 12:30:25 +0000 | [diff] [blame] | 1253 | \item{\module{_winreg}:} An interface to the |
Andrew M. Kuchling | fed4f1e | 2000-07-01 12:33:43 +0000 | [diff] [blame] | 1254 | Windows registry. \module{_winreg} is an adaptation of functions that |
| 1255 | have been part of PythonWin since 1995, but has now been added to the core |
Andrew M. Kuchling | d500e44 | 2000-09-06 12:30:25 +0000 | [diff] [blame] | 1256 | distribution, and enhanced to support Unicode. |
| 1257 | \module{_winreg} was written by Bill Tutt and Mark Hammond. |
Andrew M. Kuchling | 6c3cd8d | 2000-06-10 02:24:31 +0000 | [diff] [blame] | 1258 | |
\item{\module{zipfile}:} A module for reading and writing ZIP-format
archives.  These are archives produced by \program{PKZIP} on
DOS/Windows or \program{zip} on Unix, not to be confused with
\program{gzip}-format files (which are supported by the \module{gzip}
module).
(Contributed by James C. Ahlstrom.)
| 1265 | |
Andrew M. Kuchling | 69db0e4 | 2000-06-28 02:16:00 +0000 | [diff] [blame] | 1266 | \item{\module{imputil}:} A module that provides a simpler way for |
| 1267 | writing customised import hooks, in comparison to the existing |
| 1268 | \module{ihooks} module. (Implemented by Greg Stein, with much |
| 1269 | discussion on python-dev along the way.) |
| 1270 | |
Andrew M. Kuchling | 6c3cd8d | 2000-06-10 02:24:31 +0000 | [diff] [blame] | 1271 | \end{itemize} |
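
As a brief illustration of the \module{atexit} entry in the list
above, the following sketch registers a function to run at interpreter
exit.  The function, its argument, and the \code{messages} list are
invented for the example.

\begin{verbatim}
import atexit

messages = []

def goodbye(name):
    # Runs when the interpreter exits normally
    messages.append("Goodbye, " + name)
    print(messages[-1])

# Extra arguments are passed on to the function at exit time.
# This replaces assigning a function to sys.exitfunc directly.
atexit.register(goodbye, "world")
\end{verbatim}

Several functions can be registered this way without one overwriting
another, which is the main drawback of setting \code{sys.exitfunc}
by hand.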
Andrew M. Kuchling | 25bfd0e | 2000-05-27 11:28:26 +0000 | [diff] [blame] | 1272 | |
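
The \module{zipfile} module from the same list can be sketched with a
small round trip: write one archive member from a string, then read it
back.  The file names and contents are made up, and the code uses
\module{tempfile} helpers from later Python releases to obtain a
scratch directory, so it is a sketch for current interpreters rather
than for 2.0 itself.

\begin{verbatim}
import os
import tempfile
import zipfile

# Scratch location; the names here are arbitrary
workdir = tempfile.mkdtemp()
archive_name = os.path.join(workdir, "example.zip")

# Write one member from a string
archive = zipfile.ZipFile(archive_name, "w")
archive.writestr("hello.txt", "Hello from zipfile")
archive.close()

# Reopen the archive and read the member back
archive = zipfile.ZipFile(archive_name, "r")
member_names = archive.namelist()
data = archive.read("hello.txt")
archive.close()
\end{verbatim}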
| 1273 | % ====================================================================== |
| 1274 | \section{IDLE Improvements} |
| 1275 | |
Andrew M. Kuchling | c0328f0 | 2000-06-10 15:11:20 +0000 | [diff] [blame] | 1276 | IDLE is the official Python cross-platform IDE, written using Tkinter. |
Andrew M. Kuchling | 730067e | 2000-06-30 01:44:05 +0000 | [diff] [blame] | 1277 | Python 2.0 includes IDLE 0.6, which adds a number of new features and |
Andrew M. Kuchling | c0328f0 | 2000-06-10 15:11:20 +0000 | [diff] [blame] | 1278 | improvements. A partial list: |
| 1279 | |
| 1280 | \begin{itemize} |
| 1281 | \item UI improvements and optimizations, |
| 1282 | especially in the area of syntax highlighting and auto-indentation. |
| 1283 | |
\item The class browser now shows more information, such as the
top-level functions in a module.

\item Tab width is now a user-settable option. When opening an existing Python
file, IDLE automatically detects the indentation conventions, and adapts.
| 1289 | |
| 1290 | \item There is now support for calling browsers on various platforms, |
| 1291 | used to open the Python documentation in a browser. |
| 1292 | |
| 1293 | \item IDLE now has a command line, which is largely similar to |
| 1294 | the vanilla Python interpreter. |
| 1295 | |
| 1296 | \item Call tips were added in many places. |
| 1297 | |
| 1298 | \item IDLE can now be installed as a package. |
| 1299 | |
| 1300 | \item In the editor window, there is now a line/column bar at the bottom. |
| 1301 | |
| 1302 | \item Three new keystroke commands: Check module (Alt-F5), Import |
Andrew M. Kuchling | 730067e | 2000-06-30 01:44:05 +0000 | [diff] [blame] | 1303 | module (F5) and Run script (Ctrl-F5). |
Andrew M. Kuchling | c0328f0 | 2000-06-10 15:11:20 +0000 | [diff] [blame] | 1304 | |
| 1305 | \end{itemize} |
Andrew M. Kuchling | 25bfd0e | 2000-05-27 11:28:26 +0000 | [diff] [blame] | 1306 | |
| 1307 | % ====================================================================== |
| 1308 | \section{Deleted and Deprecated Modules} |
| 1309 | |
Andrew M. Kuchling | 6c3cd8d | 2000-06-10 02:24:31 +0000 | [diff] [blame] | 1310 | A few modules have been dropped because they're obsolete, or because |
| 1311 | there are now better ways to do the same thing. The \module{stdwin} |
| 1312 | module is gone; it was for a platform-independent windowing toolkit |
| 1313 | that's no longer developed. |
Andrew M. Kuchling | 6c3cd8d | 2000-06-10 02:24:31 +0000 | [diff] [blame] | 1314 | |
A number of modules have been moved to the
\file{lib-old} subdirectory:
\module{cmp}, \module{cmpcache}, \module{dircmp}, \module{dump},
\module{find}, \module{grep}, \module{packmail},
\module{poly}, \module{util}, \module{whatsound}, \module{zmod}.
If you have code which relies on a module that's been moved to
\file{lib-old}, you can simply add that directory to \code{sys.path}
to get it back, but you're encouraged to update any code that uses
these modules.
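
Adding the directory to \code{sys.path} might look like the sketch
below.  The location of \file{lib-old} is an assumption based on a
typical Unix layout; adjust the path to match the actual installation.

\begin{verbatim}
import os
import sys

# Assumed location of lib-old; adjust for the real installation
lib_old = os.path.join(sys.prefix, "lib", "python2.0", "lib-old")

# Append rather than prepend, so current modules are found first
if lib_old not in sys.path:
    sys.path.append(lib_old)
\end{verbatim}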
Andrew M. Kuchling | 6c3cd8d | 2000-06-10 02:24:31 +0000 | [diff] [blame] | 1324 | |
Andrew M. Kuchling | 730067e | 2000-06-30 01:44:05 +0000 | [diff] [blame] | 1325 | \section{Acknowledgements} |
Andrew M. Kuchling | 6c3cd8d | 2000-06-10 02:24:31 +0000 | [diff] [blame] | 1326 | |
Andrew M. Kuchling | a6161ed | 2000-07-01 00:23:02 +0000 | [diff] [blame] | 1327 | The authors would like to thank the following people for offering |
Andrew M. Kuchling | 2a15980 | 2002-05-02 14:37:14 +0000 | [diff] [blame] | 1328 | suggestions on various drafts of this article: David Bolen, Mark |
| 1329 | Hammond, Gregg Hauser, Jeremy Hylton, Fredrik Lundh, Detlef Lannert, |
| 1330 | Aahz Maruch, Skip Montanaro, Vladimir Marangozov, Tobias Polzin, Guido |
| 1331 | van Rossum, Neil Schemenauer, and Russ Schmidt. |
Andrew M. Kuchling | 25bfd0e | 2000-05-27 11:28:26 +0000 | [diff] [blame] | 1332 | |
| 1333 | \end{document} |