\documentclass{howto}
\usepackage{distutils}
% $Id$

% Don't write extensive text for new sections; I'll do that.
% Feel free to add commented-out reminders of things that need
% to be covered. --amk

% XXX pydoc can display links to module docs -- but when?
%

\title{What's New in Python 2.4}
\release{0.9}
\author{A.M.\ Kuchling}
\authoraddress{
	\strong{Python Software Foundation}\\
	Email: \email{amk@amk.ca}
}

\begin{document}
\maketitle
\tableofcontents

This article explains the new features in Python 2.4, scheduled for
release in December 2004.

Python 2.4 is a medium-sized release. It doesn't introduce as many
changes as the radical Python 2.2, but introduces more features than
the conservative 2.3 release. The most significant new language
features are function decorators and generator expressions; most other
changes are to the standard library.

% XXX update these figures as we go
According to the CVS change logs, there were 421 patches applied and
413 bugs fixed between Python 2.3 and 2.4. Both figures are likely to
be underestimates.

This article doesn't attempt to provide a complete specification of
every single new feature, but instead provides a brief introduction to
each feature. For full details, you should refer to the documentation
for Python 2.4, such as the \citetitle[../lib/lib.html]{Python Library
Reference} and the \citetitle[../ref/ref.html]{Python Reference
Manual}. Often you will be referred to the PEP for a particular new
feature for explanations of the implementation and design rationale.


%======================================================================
\section{PEP 218: Built-In Set Objects}

Python 2.3 introduced the \module{sets} module. C implementations of
set data types have now been added to the Python core as two new
built-in types, \function{set(\var{iterable})} and
\function{frozenset(\var{iterable})}. They provide high-speed
operations for membership testing, for eliminating duplicates from
sequences, and for mathematical operations like unions, intersections,
differences, and symmetric differences.

\begin{verbatim}
>>> a = set('abracadabra')              # form a set from a string
>>> 'z' in a                            # fast membership testing
False
>>> a                                   # unique letters in a
set(['a', 'r', 'b', 'c', 'd'])
>>> ''.join(a)                          # convert back into a string
'arbcd'

>>> b = set('alacazam')                 # form a second set
>>> a - b                               # letters in a but not in b
set(['r', 'd', 'b'])
>>> a | b                               # letters in either a or b
set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])
>>> a & b                               # letters in both a and b
set(['a', 'c'])
>>> a ^ b                               # letters in a or b but not both
set(['r', 'd', 'b', 'm', 'z', 'l'])

>>> a.add('z')                          # add a new element
>>> a.update('wxy')                     # add multiple new elements
>>> a
set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'x', 'z'])
>>> a.remove('x')                       # take one element out
>>> a
set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'z'])
\end{verbatim}

The \function{frozenset} type is an immutable version of \function{set}.
Since it is immutable and hashable, it may be used as a dictionary key or
as a member of another set.
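
As a minimal sketch of that last point, the following shows a
\function{frozenset} used as a dictionary key and as a member of
another set. The names \code{ingredients}, \code{recipes}, and
\code{groups} are invented for this illustration.

```python
# Frozensets are hashable, so they can serve as dictionary keys.
ingredients = frozenset(['flour', 'water', 'salt'])
recipes = {ingredients: 'bread'}

# Element order is irrelevant when the key is looked up again.
name = recipes[frozenset(['salt', 'flour', 'water'])]

# An ordinary (mutable) set is unhashable and is rejected as a key.
try:
    {set(['flour']): 'oops'}
    set_is_hashable = True
except TypeError:
    set_is_hashable = False

# A set of sets therefore has to be built from frozensets.
# frozenset('ab') and frozenset('ba') are equal, so only one is kept.
groups = set([frozenset('ab'), frozenset('ba'), frozenset('c')])
```

Here \code{name} is \code{'bread'}, \code{set_is_hashable} is false,
and \code{groups} ends up with two members.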

The \module{sets} module remains in the standard library, and may be
useful if you wish to subclass the \class{Set} or \class{ImmutableSet}
classes. There are currently no plans to deprecate the module.

\begin{seealso}
\seepep{218}{Adding a Built-In Set Object Type}{Originally proposed by
Greg Wilson and ultimately implemented by Raymond Hettinger.}
\end{seealso}


%======================================================================
\section{PEP 237: Unifying Long Integers and Integers}

The lengthy transition process for this PEP, begun in Python 2.2,
takes another step forward in Python 2.4. In 2.3, certain integer
operations that would behave differently after int/long unification
triggered \exception{FutureWarning} warnings and returned values
limited to 32 or 64 bits (depending on your platform). In 2.4, these
expressions no longer produce a warning and instead produce a
different result that's usually a long integer.

The problematic expressions are primarily left shifts and lengthy
hexadecimal and octal constants. For example,
\code{2 \textless{}\textless{} 32} results
in a warning in 2.3, evaluating to 0 on 32-bit platforms. In Python
2.4, this expression now returns the correct answer, 8589934592.
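
The new behaviour can be checked with a few expressions. This is only
an illustrative sketch; the variable names are invented here, and the
values shown are the mathematically exact results that 2.4 produces.

```python
# A left shift no longer discards the bits that overflow a machine word.
shifted = 2 << 32

# A lengthy hexadecimal constant is read as a positive number.
big_hex = 0xFFFFFFFF

# Shifts well past 64 bits are exact as well.
huge = 1 << 64
```

After this runs, \code{shifted} is 8589934592, \code{big_hex} is
4294967295, and \code{huge} is 18446744073709551616.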

\begin{seealso}
\seepep{237}{Unifying Long Integers and Integers}{Original PEP
written by Moshe Zadka and GvR. The changes for 2.4 were implemented by
Kalle Svensson.}
\end{seealso}


%======================================================================
\section{PEP 289: Generator Expressions}

The iterator feature introduced in Python 2.2 and the
\module{itertools} module make it easier to write programs that loop
through large data sets without having the entire data set in memory
at one time. List comprehensions don't fit into this picture very
well because they produce a Python list object containing all of the
items. This unavoidably pulls all of the objects into memory, which
can be a problem if your data set is very large. When trying to write
a functionally-styled program, it would be natural to write something
like:

\begin{verbatim}
links = [link for link in get_all_links() if not link.followed]
for link in links:
    ...
\end{verbatim}

instead of

\begin{verbatim}
for link in get_all_links():
    if link.followed:
        continue
    ...
\end{verbatim}

The first form is more concise and perhaps more readable, but if
you're dealing with a large number of link objects you'd have to write
the second form to avoid having all link objects in memory at the same
time.

Generator expressions work similarly to list comprehensions but don't
materialize the entire list; instead they create a generator that will
return elements one by one. The above example could be written as:

\begin{verbatim}
links = (link for link in get_all_links() if not link.followed)
for link in links:
    ...
\end{verbatim}

Generator expressions always have to be written inside parentheses, as
in the above example. The parentheses signalling a function call also
count, so if you want to create an iterator that will be immediately
passed to a function you could write:

\begin{verbatim}
print sum(obj.count for obj in list_all_objects())
\end{verbatim}

Generator expressions differ from list comprehensions in various small
ways. Most notably, the loop variable (\var{obj} in the above
example) is not accessible outside of the generator expression. List
comprehensions leave the variable assigned to its last value; future
versions of Python will change this, making list comprehensions match
generator expressions in this respect.
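
The one-at-a-time behaviour can be observed directly. In the following
sketch, \function{numbers()} and the \code{log} list are hypothetical
helpers invented for the illustration; the generator records each value
at the moment it is produced.

```python
def numbers(log):
    """Yield 1 through 5, recording each value as it is produced."""
    for n in range(1, 6):
        log.append(n)
        yield n

log = []
squares = (n * n for n in numbers(log))

# Creating the generator expression has not produced any values yet.
produced_before = list(log)

# Pull a single element; only one value has been generated so far.
for first in squares:
    break
produced_after_one = list(log)

# sum() drains the remaining elements one by one.
total = first + sum(squares)
```

\code{produced_before} is empty, \code{produced_after_one} is
\code{[1]}, and \code{total} is 55. A list comprehension in the same
place would have filled \code{log} with all five values before the loop
started.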

\begin{seealso}
\seepep{289}{Generator Expressions}{Proposed by Raymond Hettinger and
implemented by Jiwon Seo with early efforts steered by Hye-Shik Chang.}
\end{seealso}


%======================================================================
\section{PEP 292: Simpler String Substitutions}

Some new classes in the standard library provide an alternative
mechanism for substituting variables into strings; this style of
substitution may be better for applications where untrained
users need to edit templates.

The usual way of substituting variables by name is the \code{\%}
operator:

\begin{verbatim}
>>> '%(page)i: %(title)s' % {'page':2, 'title': 'The Best of Times'}
'2: The Best of Times'
\end{verbatim}

When writing the template string, it can be easy to forget the
\samp{i} or \samp{s} after the closing parenthesis. This isn't a big
problem if the template is in a Python module, because you run the
code, get an ``Unsupported format character'' \exception{ValueError},
and fix the problem. However, consider an application such as Mailman
where template strings or translations are being edited by users who
aren't aware of the Python language. The format string's syntax is
complicated to explain to such users, and if they make a mistake, it's
difficult to provide helpful feedback to them.

PEP 292 adds a \class{Template} class to the \module{string} module
that uses \samp{\$} to indicate a substitution:

\begin{verbatim}
>>> import string
>>> t = string.Template('$page: $title')
>>> t.substitute({'page':2, 'title': 'The Best of Times'})
'2: The Best of Times'
\end{verbatim}

% $ Terminate $-mode for Emacs

If a key is missing from the dictionary, the \method{substitute} method
will raise a \exception{KeyError}. There's also a \method{safe_substitute}
method that ignores missing keys, leaving their placeholders in the
result:

\begin{verbatim}
>>> t = string.Template('$page: $title')
>>> t.safe_substitute({'page':3})
'3: $title'
\end{verbatim}
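
The difference between the two methods can be put side by side in a
short script. This is a sketch rather than a recommended pattern; the
flag \code{missing_key_raised} and the other variable names are
invented for the illustration.

```python
import string

t = string.Template('$page: $title')

# Every placeholder has a value: substitute() fills them all in.
full = t.substitute({'page': 2, 'title': 'The Best of Times'})

# A missing key makes substitute() raise KeyError ...
try:
    t.substitute({'page': 3})
    missing_key_raised = False
except KeyError:
    missing_key_raised = True

# ... while safe_substitute() leaves the placeholder in place.
partial = t.safe_substitute({'page': 3})

# A doubled dollar sign stands for a literal '$' in the output.
price = string.Template('Cost: $$$amount').substitute({'amount': 5})
```

\code{full} is \code{'2: The Best of Times'}, \code{partial} is
\code{'3: \$title'}, and \code{price} is \code{'Cost: \$5'}.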

% $ Terminate math-mode for Emacs


\begin{seealso}
\seepep{292}{Simpler String Substitutions}{Written and implemented
by Barry Warsaw.}
\end{seealso}


%======================================================================
\section{PEP 318: Decorators for Functions and Methods}

Python 2.2 extended Python's object model by adding static methods and
class methods, but it didn't extend Python's syntax to provide any new
way of defining static or class methods. Instead, you had to write a
\keyword{def} statement in the usual way, and pass the resulting
method to a \function{staticmethod()} or \function{classmethod()}
function that would wrap up the function as a method of the new type.
Your code would look like this:

\begin{verbatim}
class C:
    def meth (cls):
        ...

    meth = classmethod(meth)   # Rebind name to wrapped-up class method
\end{verbatim}

If the method was very long, it would be easy to miss or forget the
\function{classmethod()} invocation after the function body.

The intention was always to add some syntax to make such definitions
more readable, but at the time of 2.2's release a good syntax was not
obvious. Today a good syntax \emph{still} isn't obvious but users are
asking for easier access to the feature; a new syntactic feature has
been added to meet this need.

The new feature is called ``function decorators''. The name comes
from the idea that \function{classmethod}, \function{staticmethod},
and friends are storing additional information on a function object;
they're \emph{decorating} functions with more details.

The notation borrows from Java and uses the \character{@} character as an
indicator. Using the new syntax, the example above would be written:

\begin{verbatim}
class C:

    @classmethod
    def meth (cls):
        ...

\end{verbatim}

The \code{@classmethod} is shorthand for the
\code{meth=classmethod(meth)} assignment. More generally, if you have
the following:

\begin{verbatim}
@A
@B
@C
def f ():
    ...
\end{verbatim}

It's equivalent to the following pre-decorator code:

\begin{verbatim}
def f(): ...
f = A(B(C(f)))
\end{verbatim}

Decorators must come on the line before a function definition, and
can't be on the same line, meaning that \code{@A def f(): ...} is
illegal. You can only decorate function definitions, either at the
module level or inside a class; you can't decorate class definitions.

A decorator is just a function that takes the function to be decorated
as an argument and returns either the same function or some new
callable thing. It's easy to write your own decorators. The
following simple example just sets an attribute on the function
object:

\begin{verbatim}
>>> def deco(func):
...    func.attr = 'decorated'
...    return func
...
>>> @deco
... def f(): pass
...
>>> f
<function f at 0x402ef0d4>
>>> f.attr
'decorated'
>>>
\end{verbatim}

As a slightly more realistic example, the following decorator checks
that the supplied argument is an integer:

\begin{verbatim}
def require_int (func):
    def wrapper (arg):
        assert isinstance(arg, int)
        return func(arg)

    return wrapper

@require_int
def p1 (arg):
    print arg

@require_int
def p2(arg):
    print arg*2
\end{verbatim}

An example in \pep{318} contains a fancier version of this idea that
lets you both specify the required type and check the returned type.

Decorator functions can take arguments. If arguments are supplied,
your decorator function is called with only those arguments and must
return a new decorator function; this function must take a single
function and return a function, as previously described. In other
words, placing \code{@A}, \code{@B}, and \code{@C(args)} on successive
lines above \code{def f()} becomes:

\begin{verbatim}
def f(): ...
_deco = C(args)
f = A(B(_deco(f)))
\end{verbatim}

Getting this right can be slightly brain-bending, but it's not too
difficult.
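
To make the three levels of nesting concrete, here is a sketch of a
decorator that takes an argument. \function{require_type()} is a
hypothetical generalization of the \function{require_int()} example and
is not taken from \pep{318}; \function{double()} and
\function{triple()} are likewise invented for the illustration.

```python
def require_type(expected):
    """Called with the decorator's arguments; returns the real decorator."""
    def decorator(func):
        # Called with the function being decorated.
        def wrapper(arg):
            # Called in place of the original function.
            if not isinstance(arg, expected):
                raise TypeError('expected %s' % expected.__name__)
            return func(arg)
        return wrapper
    return decorator

@require_type(int)
def double(arg):
    return arg * 2

# The same thing spelled out without decorator syntax.
def triple(arg):
    return arg * 3
_deco = require_type(int)
triple = _deco(triple)
```

Calling \code{double(21)} returns 42, while \code{double('x')} raises
\exception{TypeError} before the original function body ever runs.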

A small related change makes the \member{func_name} attribute of
functions writable. This attribute is used to display function names
in tracebacks, so decorators should change the name of any new
function that's constructed and returned.
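
A wrapper-building decorator might copy the name across as in the
sketch below. It is written with the \member{__name__} spelling of the
attribute, on the assumption that it refers to the same value as
\member{func_name}; \function{logged()} and \function{greet()} are
names made up for the example.

```python
def logged(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    # Without this line the wrapper would report its own name,
    # 'wrapper', instead of the name of the decorated function.
    wrapper.__name__ = func.__name__
    return wrapper

@logged
def greet(name):
    return 'Hello, ' + name
```

After decoration, \code{greet.__name__} is \code{'greet'} rather than
\code{'wrapper'}, and the function still behaves as before.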

\begin{seealso}
\seepep{318}{Decorators for Functions, Methods and Classes}{Written
by Kevin D. Smith, Jim Jewett, and Skip Montanaro. Several people
wrote patches implementing function decorators, but the one that was
actually checked in was patch \#979728, written by Mark Russell.}
\end{seealso}

% XXX add link to decorators module in Wiki


%======================================================================
\section{PEP 322: Reverse Iteration}

A new built-in function, \function{reversed(\var{seq})}, takes a sequence
and returns an iterator that loops over the elements of the sequence
in reverse order.

\begin{verbatim}
>>> for i in reversed(xrange(1,4)):
...    print i
...
3
2
1
\end{verbatim}

Compared to extended slicing, such as \code{range(1,4)[::-1]},
\function{reversed()} is easier to read, runs faster, and uses
substantially less memory.

Note that \function{reversed()} only accepts sequences, not arbitrary
iterators. If you want to reverse an iterator, first convert it to
a list with \function{list()}.

\begin{verbatim}
>>> input = open('/etc/passwd', 'r')
>>> for line in reversed(list(input)):
...   print line
...
root:*:0:0:System Administrator:/var/root:/bin/tcsh
...
\end{verbatim}
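
The sequence-only restriction can be seen without touching the file
system. In this sketch, \function{countdown_source()} is a hypothetical
generator standing in for any iterator that has no length or indexing.

```python
letters = ['a', 'b', 'c']

# A list is a sequence, so reversed() accepts it directly.
backwards = list(reversed(letters))

def countdown_source():
    """A generator: iterable, but not a sequence."""
    for n in (1, 2, 3):
        yield n

# Passing the generator straight to reversed() is rejected.
try:
    reversed(countdown_source())
    accepted = True
except TypeError:
    accepted = False

# Converting to a list first makes it reversible.
via_list = list(reversed(list(countdown_source())))
```

\code{backwards} is \code{['c', 'b', 'a']}, \code{accepted} is false,
and \code{via_list} is \code{[3, 2, 1]}.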
Fred Drake | ed0fa3d | 2003-07-30 19:14:09 +0000 | [diff] [blame] | 421 | |
Andrew M. Kuchling | f7a6b67 | 2003-11-08 16:05:37 +0000 | [diff] [blame] | 422 | \begin{seealso} |
| 423 | \seepep{322}{Reverse Iteration}{Written and implemented by Raymond Hettinger.} |
| 424 | |
| 425 | \end{seealso} |
| 426 | |
Fred Drake | ed0fa3d | 2003-07-30 19:14:09 +0000 | [diff] [blame] | 427 | |
| 428 | %====================================================================== |
Andrew M. Kuchling | c9e7d77 | 2004-10-12 15:58:02 +0000 | [diff] [blame] | 429 | \section{PEP 324: New subprocess Module} |
| 430 | |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 431 | The standard library provides a number of ways to execute a |
| 432 | subprocess, offering different features and different levels of |
| 433 | complexity. \function{os.system(\var{command})} is easy to use, but |
| 434 | slow (it runs a shell process which executes the command) and |
| 435 | dangerous (you have to be careful about escaping the shell's |
| 436 | metacharacters). The \module{popen2} module offers classes that can |
| 437 | capture standard output and standard error from the subprocess, but |
| 438 | the naming is confusing. The \module{subprocess} module cleans |
| 439 | this up, providing a unified interface that offers all the features |
| 440 | you might need. |
Andrew M. Kuchling | c9e7d77 | 2004-10-12 15:58:02 +0000 | [diff] [blame] | 441 | |
Andrew M. Kuchling | b6ffc27 | 2004-10-12 16:36:57 +0000 | [diff] [blame] | 442 | Instead of \module{popen2}'s collection of classes, |
| 443 | \module{subprocess} contains a single class called \class{Popen} |
| 444 | whose constructor supports a number of different keyword arguments. |
Andrew M. Kuchling | c9e7d77 | 2004-10-12 15:58:02 +0000 | [diff] [blame] | 445 | |
Andrew M. Kuchling | b6ffc27 | 2004-10-12 16:36:57 +0000 | [diff] [blame] | 446 | \begin{verbatim} |
| 447 | class Popen(args, bufsize=0, executable=None, |
| 448 | stdin=None, stdout=None, stderr=None, |
| 449 | preexec_fn=None, close_fds=False, shell=False, |
| 450 | cwd=None, env=None, universal_newlines=False, |
| 451 | startupinfo=None, creationflags=0): |
| 452 | \end{verbatim} |
Andrew M. Kuchling | c9e7d77 | 2004-10-12 15:58:02 +0000 | [diff] [blame] | 453 | |
\var{args} is commonly a sequence of strings that will be the
arguments to the program executed as the subprocess.  (If the
\var{shell} argument is true, \var{args} can be a string which will
then be passed on to the shell for interpretation, just as
\function{os.system()} does.)

\var{stdin}, \var{stdout}, and \var{stderr} specify what the
subprocess's input, output, and error streams will be.  You can
provide a file object or a file descriptor, or you can use the
constant \code{subprocess.PIPE} to create a pipe between the
subprocess and the parent.

The constructor has a number of handy options:

\begin{itemize}
  \item \var{close_fds} requests that all file descriptors be closed
  before running the subprocess.

  \item \var{cwd} specifies the working directory in which the
  subprocess will be executed (defaulting to whatever the parent's
  working directory is).

  \item \var{env} is a dictionary specifying environment variables.

  \item \var{preexec_fn} is a function that gets called before the
  child is started.

  \item \var{universal_newlines} opens the child's input and output
  using Python's universal newline feature.

\end{itemize}

Once you've created the \class{Popen} instance,
you can call its \method{wait()} method to pause until the subprocess
has exited, \method{poll()} to check if it's exited without pausing,
or \method{communicate(\var{data})} to send the string \var{data} to
the subprocess's standard input.  \method{communicate(\var{data})}
then reads any data that the subprocess has sent to its standard output
or standard error, returning a tuple \code{(\var{stdout_data},
\var{stderr_data})}.
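
As a rough sketch of how these pieces fit together, the following
snippet starts a child Python interpreter, feeds it a string through
\method{communicate()}, and collects the result.  The child program
(\code{child_code}) and the variable names are made up for
illustration; \code{sys.executable} is used only so that the example
does not depend on any particular external program being installed.

```python
import subprocess
import sys

# Hypothetical child program: echo standard input back in upper case.
child_code = "import sys; sys.stdout.write(sys.stdin.read().upper())"

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],  # args as a sequence, no shell involved
    stdin=subprocess.PIPE,               # pipe from the parent to the child
    stdout=subprocess.PIPE,              # pipe from the child to the parent
    universal_newlines=True,             # exchange text with normalized newlines
)

# communicate() sends the data, reads all output, and waits for exit.
stdout_data, stderr_data = proc.communicate("hello\n")
status = proc.wait()                     # the child has already exited; returns its code

print(stdout_data)
```

Because \var{stderr} was not redirected to a pipe, the second element
of the returned tuple is \code{None}.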

\function{call()} is a shortcut that passes its arguments along to the
\class{Popen} constructor, waits for the command to complete, and
returns the status code of the subprocess.  It can serve as a safer
analog to \function{os.system()}:

\begin{verbatim}
sts = subprocess.call(['dpkg', '-i', '/tmp/new-package.deb'])
if sts == 0:
    # Success
    ...
else:
    # dpkg returned an error
    ...
\end{verbatim}

The command is invoked without use of the shell.  If you really do want to
use the shell, you can add \code{shell=True} as a keyword argument and provide
a string instead of a sequence:

\begin{verbatim}
sts = subprocess.call('dpkg -i /tmp/new-package.deb', shell=True)
\end{verbatim}

The PEP takes various examples of shell and Python code and shows how
they'd be translated into Python code that uses \module{subprocess}.
Reading this section of the PEP is highly recommended.

\begin{seealso}
\seepep{324}{subprocess - New process module}{Written and implemented by Peter {\AA}strand, with assistance from Fredrik Lundh and others.}
\end{seealso}


%======================================================================
\section{PEP 327: Decimal Data Type}

Python has always supported floating-point (FP) numbers, based on the
underlying C \ctype{double} type, as a data type.  However, while most
programming languages provide a floating-point type, most people (even
programmers) are unaware that computing with floating-point numbers
entails certain unavoidable inaccuracies.  The new decimal type
provides a way to avoid these inaccuracies.

\subsection{Why is Decimal needed?}

The limitations arise from the representation used for floating-point numbers.
FP numbers are made up of three components:

\begin{itemize}
\item The sign, which is positive or negative.
\item The mantissa, which is a single-digit binary number
followed by a fractional part.  For example, \code{1.01} in base-2 notation
is \code{1 + 0/2 + 1/4}, or 1.25 in decimal notation.
\item The exponent, which tells where the decimal point is located in the number represented.
\end{itemize}

For example, the number 1.25 has positive sign, a mantissa value of
1.01 (in binary), and an exponent of 0 (the decimal point doesn't need
to be shifted).  The number 5 has the same sign and mantissa, but the
exponent is 2 because the mantissa is multiplied by 4 (2 to the power
of the exponent 2); 1.25 * 4 equals 5.
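
One way to see the shared mantissa is to print both numbers in
hexadecimal floating-point notation.  This is only an illustrative
sketch: the \method{float.hex()} method used here was added in a later
Python release and is not part of Python 2.4.  In the output, the
digits after \code{0x} are the mantissa in base 16 (\code{1.4} in hex
is 1 + 4/16, the same value as binary \code{1.01}) and the number after
\code{p} is the power-of-two exponent.

```python
# 1.25 and 5.0 share a sign and a mantissa; only the exponent differs.
one_and_quarter = (1.25).hex()
five = (5.0).hex()

print(one_and_quarter)  # mantissa 1.4 (hex), exponent 0
print(five)             # mantissa 1.4 (hex), exponent 2
```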

Modern systems usually provide floating-point support that conforms to
a standard called IEEE 754.  C's \ctype{double} type is usually
implemented as a 64-bit IEEE 754 number, which uses 52 bits of space
for the mantissa.  This means that numbers can only be specified to 52
bits of precision.  If you're trying to represent numbers whose
expansion repeats endlessly, the expansion is cut off after 52 bits.
Unfortunately, most software needs to produce output in base 10, and
common fractions in base 10 are often repeating fractions in binary.
For example, 1.1 decimal is binary \code{1.0001100110011 ...}; .1 =
1/16 + 1/32 + 1/256 plus an infinite number of additional terms.  IEEE
754 has to chop off that infinitely repeating expansion after 52 binary
digits, so the representation is slightly inaccurate.

Sometimes you can see this inaccuracy when the number is printed:
\begin{verbatim}
>>> 1.1
1.1000000000000001
\end{verbatim}

The inaccuracy isn't always visible when you print the number because
the FP-to-decimal-string conversion is provided by the C library, and
most C libraries try to produce sensible output.  Even if it's not
displayed, however, the inaccuracy is still there and subsequent
operations can magnify the error.
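
A small sketch of the error becoming visible: formatting 1.1 with 17
significant digits exposes the stored value, and adding 0.1 ten times
does not give exactly 1.0.  The same sum done with \class{Decimal}
values created from strings is exact.  The variable names are chosen
for illustration only.

```python
from decimal import Decimal

# Ask for 17 significant digits to expose the value actually stored.
seventeen_digits = "%.17g" % 1.1

# Repeated addition lets the tiny representation error accumulate.
total = 0.0
for _ in range(10):
    total += 0.1

# The same sum with Decimal values built from strings stays exact.
exact_total = Decimal("0.0")
for _ in range(10):
    exact_total += Decimal("0.1")

print(seventeen_digits)
print(total == 1.0)
print(exact_total == Decimal("1.0"))
```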

For many applications this doesn't matter.  If I'm plotting points and
displaying them on my monitor, the difference between 1.1 and
1.1000000000000001 is too small to be visible.  Reports often limit
output to a certain number of decimal places, and if you round the
number to two or three or even eight decimal places, the error is
never apparent.  However, for applications where it does matter,
it's a lot of work to implement your own custom arithmetic routines.

Hence, the \class{Decimal} type was created.

\subsection{The \class{Decimal} type}

A new module, \module{decimal}, was added to Python's standard
library.  It contains two classes, \class{Decimal} and
\class{Context}.  \class{Decimal} instances represent numbers, and
\class{Context} instances are used to wrap up various settings such as
the precision and default rounding mode.

\class{Decimal} instances are immutable, like regular Python integers
and FP numbers; once it's been created, you can't change the value an
instance represents.  \class{Decimal} instances can be created from
integers or strings:

\begin{verbatim}
>>> import decimal
>>> decimal.Decimal(1972)
Decimal("1972")
>>> decimal.Decimal("1.1")
Decimal("1.1")
\end{verbatim}

You can also provide tuples containing the sign, the mantissa represented
as a tuple of decimal digits, and the exponent:

\begin{verbatim}
>>> decimal.Decimal((1, (1, 4, 7, 5), -2))
Decimal("-14.75")
\end{verbatim}

Cautionary note: the sign bit is a Boolean value, so 0 is positive and
1 is negative.

Converting from floating-point numbers poses a bit of a problem:
should the FP number representing 1.1 turn into the decimal number for
exactly 1.1, or for 1.1 plus whatever inaccuracies are introduced?
The decision was to dodge the issue and leave such a conversion out of
the API.  Instead, you should convert the floating-point number into a
string using the desired precision and pass the string to the
\class{Decimal} constructor:

\begin{verbatim}
>>> f = 1.1
>>> decimal.Decimal(str(f))
Decimal("1.1")
>>> decimal.Decimal('%.12f' % f)
Decimal("1.100000000000")
\end{verbatim}

Once you have \class{Decimal} instances, you can perform the usual
mathematical operations on them.  One limitation: exponentiation
requires an integer exponent:

\begin{verbatim}
>>> a = decimal.Decimal('35.72')
>>> b = decimal.Decimal('1.73')
>>> a+b
Decimal("37.45")
>>> a-b
Decimal("33.99")
>>> a*b
Decimal("61.7956")
>>> a/b
Decimal("20.64739884393063583815028902")
>>> a ** 2
Decimal("1275.9184")
>>> a**b
Traceback (most recent call last):
  ...
decimal.InvalidOperation: x ** (non-integer)
\end{verbatim}

You can combine \class{Decimal} instances with integers, but not with
floating-point numbers:

\begin{verbatim}
>>> a + 4
Decimal("39.72")
>>> a + 4.5
Traceback (most recent call last):
  ...
TypeError: You can interact Decimal only with int, long or Decimal data types.
>>>
\end{verbatim}

\class{Decimal} numbers can be used with the \module{math} and
\module{cmath} modules, but note that they'll be immediately converted to
floating-point numbers before the operation is performed, resulting in
a possible loss of precision and accuracy.  You'll also get back a
regular floating-point number and not a \class{Decimal}.

\begin{verbatim}
>>> import math, cmath
>>> d = decimal.Decimal('123456789012.345')
>>> math.sqrt(d)
351364.18288201344
>>> cmath.sqrt(-d)
351364.18288201344j
\end{verbatim}

\class{Decimal} instances have a \method{sqrt()} method that
returns a \class{Decimal}, but if you need other things such as
trigonometric functions you'll have to implement them.

\begin{verbatim}
>>> d.sqrt()
Decimal("351364.1828820134592177245001")
\end{verbatim}


\subsection{The \class{Context} type}

Instances of the \class{Context} class encapsulate several settings for
decimal operations:

\begin{itemize}
  \item \member{prec} is the precision, the number of significant
  decimal digits kept in results.
  \item \member{rounding} specifies the rounding mode.  The \module{decimal}
  module has constants for the various possibilities:
  \constant{ROUND_DOWN}, \constant{ROUND_CEILING},
  \constant{ROUND_HALF_EVEN}, and various others.
  \item \member{traps} is a dictionary specifying what happens on
  encountering certain error conditions: either an exception is raised or
  a value is returned.  Some examples of error conditions are
  division by zero, loss of precision, and overflow.
\end{itemize}
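
As an illustrative sketch, the settings can also be supplied when
constructing a separate \class{Context} instead of modifying the
default one.  The two contexts below (the names \code{truncating} and
\code{half_even} are invented for this example) differ only in rounding
mode, and the last division shows that \member{prec} counts significant
digits rather than digits after the decimal point.

```python
import decimal

two = decimal.Decimal(2)
three = decimal.Decimal(3)

# Two contexts with the same precision but different rounding modes.
truncating = decimal.Context(prec=4, rounding=decimal.ROUND_DOWN)
half_even = decimal.Context(prec=4, rounding=decimal.ROUND_HALF_EVEN)

down = truncating.divide(two, three)   # 0.666666... cut off after 4 digits
even = half_even.divide(two, three)    # 0.666666... rounded to 4 digits

# prec limits significant digits, so only one digit follows the point here.
large = truncating.divide(decimal.Decimal(2000), three)

print(down, even, large)
```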

There's a thread-local default context available by calling
\function{getcontext()}; you can change the properties of this context
to alter the default precision, rounding, or trap handling.  The
following example shows the effect of changing the precision of the default
context:

\begin{verbatim}
>>> decimal.getcontext().prec
28
>>> decimal.Decimal(1) / decimal.Decimal(7)
Decimal("0.1428571428571428571428571429")
>>> decimal.getcontext().prec = 9
>>> decimal.Decimal(1) / decimal.Decimal(7)
Decimal("0.142857143")
\end{verbatim}

The default action for error conditions is selectable; the module can
either return a special value such as infinity or not-a-number, or
exceptions can be raised:

\begin{verbatim}
>>> decimal.Decimal(1) / decimal.Decimal(0)
Traceback (most recent call last):
  ...
decimal.DivisionByZero: x / 0
>>> decimal.getcontext().traps[decimal.DivisionByZero] = False
>>> decimal.Decimal(1) / decimal.Decimal(0)
Decimal("Infinity")
>>>
\end{verbatim}

The \class{Context} instance also has various methods for formatting
numbers such as \method{to_eng_string()} and \method{to_sci_string()}.
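
A brief sketch of the difference between the two formatting methods,
using an arbitrary sample value: scientific notation keeps a single
digit before the decimal point, while engineering notation adjusts the
exponent to a multiple of three.

```python
import decimal

ctx = decimal.Context()
value = decimal.Decimal("1.23E+7")   # arbitrary sample value

sci = ctx.to_sci_string(value)       # one digit before the decimal point
eng = ctx.to_eng_string(value)       # exponent is a multiple of 3

print(sci)
print(eng)
```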

For more information, see the documentation for the \module{decimal}
module, which includes a quick-start tutorial and a reference.

\begin{seealso}
\seepep{327}{Decimal Data Type}{Written by Facundo Batista and implemented
  by Facundo Batista, Eric Price, Raymond Hettinger, Aahz, and Tim Peters.}

\seeurl{http://research.microsoft.com/\textasciitilde hollasch/cgindex/coding/ieeefloat.html}
{A more detailed overview of the IEEE-754 representation.}

\seeurl{http://www.lahey.com/float.htm}
{The article uses Fortran code to illustrate many of the problems
that floating-point inaccuracy can cause.}

\seeurl{http://www2.hursley.ibm.com/decimal/}
{A description of a decimal-based representation.  This representation
is being proposed as a standard, and underlies the new Python decimal
type.  Much of this material was written by Mike Cowlishaw, designer of the
Rexx language.}

\end{seealso}


%======================================================================
\section{PEP 328: Multi-line Imports}

One language change is a small syntactic tweak aimed at making it
easier to import many names from a module.  In a
\code{from \var{module} import \var{names}} statement,
\var{names} is a sequence of names separated by commas.  If the sequence is
very long, you can either write multiple imports from the same module,
or you can use backslashes to escape the line endings like this:

\begin{verbatim}
from SimpleXMLRPCServer import SimpleXMLRPCServer,\
            SimpleXMLRPCRequestHandler,\
            CGIXMLRPCRequestHandler,\
            resolve_dotted_attribute
\end{verbatim}

The syntactic change in Python 2.4 simply allows putting the names
within parentheses.  Python ignores newlines within a parenthesized
expression, so the backslashes are no longer needed:

\begin{verbatim}
from SimpleXMLRPCServer import (SimpleXMLRPCServer,
                                SimpleXMLRPCRequestHandler,
                                CGIXMLRPCRequestHandler,
                                resolve_dotted_attribute)
\end{verbatim}

The PEP also proposes that all \keyword{import} statements be absolute
imports, with a leading \samp{.} character to indicate a relative
import.  This part of the PEP is not yet implemented, and will have to
wait for Python 2.5 or some other future version.

\begin{seealso}
\seepep{328}{Imports: Multi-Line and Absolute/Relative}
  {Written by Aahz.  Multi-line imports were implemented by
   Dima Dorfman.}
\end{seealso}


%======================================================================
\section{PEP 331: Locale-Independent Float/String Conversions}

The \module{locale} module lets Python software select various
conversions and display conventions that are localized to a particular
country or language.  However, the module was careful to not change
the numeric locale because various functions in Python's
implementation required that the numeric locale remain set to the
\code{'C'} locale.  Often this was because the code was using the C library's
\cfunction{atof()} function.

Not setting the numeric locale caused trouble for extensions that used
third-party C libraries, however, because they wouldn't have the
% XXX is it GTK or GTk?
correct locale set.  The motivating example was GTK+, whose user
interface widgets weren't displaying numbers in the current locale.

The solution described in the PEP is to add three new functions to the
Python API that perform ASCII-only conversions, ignoring the locale
setting:

\begin{itemize}
  \item \cfunction{PyOS_ascii_strtod(\var{str}, \var{ptr})}
  and \cfunction{PyOS_ascii_atof(\var{str})}
  both convert a string to a C \ctype{double}.
  \item \cfunction{PyOS_ascii_formatd(\var{buffer}, \var{buf_len}, \var{format}, \var{d})} converts a \ctype{double} to an ASCII string.
\end{itemize}

The code for these functions came from the GLib library
(\url{http://developer.gnome.org/arch/gtk/glib.html}), whose
developers kindly relicensed the relevant functions and donated them
to the Python Software Foundation.  The \module{locale} module
can now change the numeric locale, letting extensions such as GTK+
produce the correct results.

\begin{seealso}
\seepep{331}{Locale-Independent Float/String Conversions}
{Written by Christian R. Reis, and implemented by Gustavo Carneiro.}
\end{seealso}

%======================================================================
\section{Other Language Changes}

Here are all of the changes that Python 2.4 makes to the core Python
language.

\begin{itemize}

\item Decorators for functions and methods were added (\pep{318}).

\item Built-in \function{set} and \function{frozenset} types were
added (\pep{218}).  Other new built-ins include the \function{reversed(\var{seq})} function (\pep{322}).

\item Generator expressions were added (\pep{289}).

\item Certain numeric expressions no longer return values restricted to 32 or 64 bits (\pep{237}).

\item You can now put parentheses around the list of names in a
\code{from \var{module} import \var{names}} statement (\pep{328}).

\item The \method{dict.update()} method now accepts the same
argument forms as the \class{dict} constructor.  This includes any
mapping, any iterable of key/value pairs, and keyword arguments.
(Contributed by Raymond Hettinger.)
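
A minimal sketch of the three argument forms, using a made-up
dictionary \code{d}:

```python
d = {'a': 1}

d.update({'b': 2})               # any mapping
d.update([('c', 3), ('d', 4)])   # any iterable of key/value pairs
d.update(e=5)                    # keyword arguments

print(sorted(d.items()))
```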

\item The string methods \method{ljust()}, \method{rjust()}, and
\method{center()} now take an optional argument for specifying a
fill character other than a space.
(Contributed by Raymond Hettinger.)
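
For illustration, here is the fill character applied to an arbitrary
sample string:

```python
name = 'spam'

left = name.ljust(8, '.')     # pad on the right with dots
right = name.rjust(8, '.')    # pad on the left with dots
middle = name.center(8, '*')  # pad both sides with asterisks

print(left, right, middle)
```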
Raymond Hettinger | d446230 | 2003-11-26 17:52:45 +0000 | [diff] [blame] | 883 | |
Andrew M. Kuchling | 35f2b05 | 2003-12-18 13:28:13 +0000 | [diff] [blame] | 884 | \item Strings also gained an \method{rsplit()} method that |
Raymond Hettinger | ed54d91 | 2003-12-31 01:59:18 +0000 | [diff] [blame] | 885 | works like the \method{split()} method but splits from the end of |
Andrew M. Kuchling | 44a31e1 | 2004-01-01 18:33:34 +0000 | [diff] [blame] | 886 | the string. |
Andrew M. Kuchling | 35f2b05 | 2003-12-18 13:28:13 +0000 | [diff] [blame] | 887 | |
| 888 | \begin{verbatim} |
Raymond Hettinger | 7a6d297 | 2004-02-13 19:00:07 +0000 | [diff] [blame] | 889 | >>> 'www.python.org'.split('.', 1) |
| 890 | ['www', 'python.org'] |
| 891 | 'www.python.org'.rsplit('.', 1) |
| 892 | ['www.python', 'org'] |
| 893 | \end{verbatim} |
Raymond Hettinger | 97ef8de | 2004-01-05 00:29:57 +0000 | [diff] [blame] | 894 | |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 895 | \item Three keyword parameters, \var{cmp}, \var{key}, and |
| 896 | \var{reverse}, were added to the \method{sort()} method of lists. |
| 897 | These parameters make some common usages of \method{sort()} simpler. |
| 898 | All of these parameters are optional. |
Andrew M. Kuchling | 2fb4d51 | 2003-10-21 12:31:16 +0000 | [diff] [blame] | 899 | |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 900 | For the \var{cmp} parameter, the value should be a comparison function |
| 901 | that takes two parameters and returns -1, 0, or +1 depending on how |
| 902 | the parameters compare. This function will then be used to sort the |
| 903 | list. Previously this was the only parameter that could be provided |
| 904 | to \method{sort()}. |
Andrew M. Kuchling | 2fb4d51 | 2003-10-21 12:31:16 +0000 | [diff] [blame] | 905 | |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 906 | \var{key} should be a single-parameter function that takes a list |
Andrew M. Kuchling | 2fb4d51 | 2003-10-21 12:31:16 +0000 | [diff] [blame] | 907 | element and returns a comparison key for the element. The list is |
Raymond Hettinger | 607c00f | 2003-11-12 16:27:50 +0000 | [diff] [blame] | 908 | then sorted using the comparison keys. The following example sorts a |
| 909 | list case-insensitively: |
Andrew M. Kuchling | 2fb4d51 | 2003-10-21 12:31:16 +0000 | [diff] [blame] | 910 | |
| 911 | \begin{verbatim} |
| 912 | >>> L = ['A', 'b', 'c', 'D'] |
| 913 | >>> L.sort() # Case-sensitive sort |
| 914 | >>> L |
| 915 | ['A', 'D', 'b', 'c'] |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 916 | >>> # Using 'key' parameter to sort list |
Andrew M. Kuchling | 2fb4d51 | 2003-10-21 12:31:16 +0000 | [diff] [blame] | 917 | >>> L.sort(key=lambda x: x.lower()) |
| 918 | >>> L |
| 919 | ['A', 'b', 'c', 'D'] |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 920 | >>> # Old-fashioned way |
Andrew M. Kuchling | 2fb4d51 | 2003-10-21 12:31:16 +0000 | [diff] [blame] | 921 | >>> L.sort(cmp=lambda x,y: cmp(x.lower(), y.lower())) |
| 922 | >>> L |
| 923 | ['A', 'b', 'c', 'D'] |
| 924 | \end{verbatim} |
| 925 | |
| 926 | The last example, which uses the \var{cmp} parameter, is the old way |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 927 | to perform a case-insensitive sort. It works but is slower than using |
| 928 | a \var{key} parameter. Using \var{key} calls \method{lower()} method |
| 929 | once for each element in the list while using \var{cmp} will call it |
| 930 | twice for each comparison, so using \var{key} saves on invocations of |
| 931 | the \method{lower()} method. |
Andrew M. Kuchling | 2fb4d51 | 2003-10-21 12:31:16 +0000 | [diff] [blame] | 932 | |
Andrew M. Kuchling | 981a918 | 2003-11-13 21:33:26 +0000 | [diff] [blame] | 933 | For simple key functions and comparison functions, it is often |
| 934 | possible to avoid a \keyword{lambda} expression by using an unbound |
Raymond Hettinger | 607c00f | 2003-11-12 16:27:50 +0000 | [diff] [blame] | 935 | method instead. For example, the above case-insensitive sort is best |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 936 | written as: |
Raymond Hettinger | 607c00f | 2003-11-12 16:27:50 +0000 | [diff] [blame] | 937 | |
| 938 | \begin{verbatim} |
| 939 | >>> L.sort(key=str.lower) |
| 940 | >>> L |
| 941 | ['A', 'b', 'c', 'D'] |
| 942 | \end{verbatim} |
| 943 | |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 944 | Finally, the \var{reverse} parameter takes a Boolean value. If the |
| 945 | value is true, the list will be sorted into reverse order. |
| 946 | Instead of \code{L.sort() ; L.reverse()}, you can now write |
| 947 | \code{L.sort(reverse=True)}. |
Fred Drake | ed0fa3d | 2003-07-30 19:14:09 +0000 | [diff] [blame] | 948 | |
Andrew M. Kuchling | 981a918 | 2003-11-13 21:33:26 +0000 | [diff] [blame] | 949 | The results of sorting are now guaranteed to be stable. This means |
| 950 | that two entries with equal keys will be returned in the same order as |
| 951 | they were input. For example, you can sort a list of people by name, |
| 952 | and then sort the list by age, resulting in a list sorted by age where |
| 953 | people with the same age are in name-sorted order. |
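
The two-pass idiom described above can be sketched as follows. The
list of \code{(name, age)} tuples is hypothetical sample data invented
for this illustration.

```python
# Hypothetical sample data: (name, age) tuples, invented for illustration.
people = [('Carol', 30), ('Alice', 25), ('Dave', 25), ('Bob', 30)]

# First pass: order everyone by name.
people.sort(key=lambda person: person[0])

# Second pass: order by age.  Because the sort is stable, people of the
# same age keep the name order produced by the first pass.
people.sort(key=lambda person: person[1])
```

After both passes, the two 25-year-olds come first in name order,
followed by the two 30-year-olds in name order.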
Raymond Hettinger | 607c00f | 2003-11-12 16:27:50 +0000 | [diff] [blame] | 954 | |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 955 | (All changes to \method{sort()} contributed by Raymond Hettinger.) |
| 956 | |
Fred Drake | 56fcc23 | 2004-05-06 02:55:35 +0000 | [diff] [blame] | 957 | \item There is a new built-in function |
| 958 | \function{sorted(\var{iterable})} that works like the in-place |
Andrew M. Kuchling | 3bf85f1 | 2004-07-05 01:37:07 +0000 | [diff] [blame] | 959 | \method{list.sort()} method but can be used in |
Fred Drake | 56fcc23 | 2004-05-06 02:55:35 +0000 | [diff] [blame] | 960 | expressions. The differences are: |
Raymond Hettinger | 607c00f | 2003-11-12 16:27:50 +0000 | [diff] [blame] | 961 | \begin{itemize} |
Raymond Hettinger | 7d1dd04 | 2003-11-12 16:42:10 +0000 | [diff] [blame] | 962 | \item the input may be any iterable; |
| 963 | \item a newly formed copy is sorted, leaving the original intact; and |
Raymond Hettinger | 607c00f | 2003-11-12 16:27:50 +0000 | [diff] [blame] | 964 | \item the expression returns the new sorted copy |
| 965 | \end{itemize} |
Andrew M. Kuchling | 1a42025 | 2003-11-08 15:58:49 +0000 | [diff] [blame] | 966 | |
\begin{verbatim}
>>> L = [9,7,8,3,2,4,1,6,5]
>>> [10+i for i in sorted(L)]       # usable in a list comprehension
[11, 12, 13, 14, 15, 16, 17, 18, 19]
>>> L                               # original is left unchanged
[9, 7, 8, 3, 2, 4, 1, 6, 5]
>>> sorted('Monty Python')          # any iterable may be an input
[' ', 'M', 'P', 'h', 'n', 'n', 'o', 'o', 't', 't', 'y', 'y']

>>> # List the contents of a dict sorted by key values
>>> colormap = dict(red=1, blue=2, green=3, black=4, yellow=5)
>>> for k, v in sorted(colormap.iteritems()):
...     print k, v
...
black 4
blue 2
green 3
red 1
yellow 5
\end{verbatim}
| 987 | |
Andrew M. Kuchling | 7642f7a | 2004-09-13 15:06:50 +0000 | [diff] [blame] | 988 | (Contributed by Raymond Hettinger.) |
| 989 | |
Andrew M. Kuchling | 87c98b2 | 2004-08-25 13:38:46 +0000 | [diff] [blame] | 990 | \item Integer operations will no longer trigger an \exception{OverflowWarning}. |
| 991 | The \exception{OverflowWarning} warning will disappear in Python 2.5. |
| 992 | |
Andrew M. Kuchling | 5e3f923 | 2004-10-07 12:00:33 +0000 | [diff] [blame] | 993 | \item The interpreter gained a new switch, \programopt{-m}, that |
| 994 | takes a name, searches for the corresponding module on \code{sys.path}, |
| 995 | and runs the module as a script. For example, |
| 996 | you can now run the Python profiler with \code{python -m profile}. |
| 997 | (Contributed by Nick Coghlan.) |
| 998 | |
Andrew M. Kuchling | d0b6d9d | 2004-07-04 15:35:00 +0000 | [diff] [blame] | 999 | \item The \function{eval(\var{expr}, \var{globals}, \var{locals})} |
Andrew M. Kuchling | 1455f79 | 2004-08-02 12:09:58 +0000 | [diff] [blame] | 1000 | and \function{execfile(\var{filename}, \var{globals}, \var{locals})} |
| 1001 | functions and the \keyword{exec} statement now accept any mapping type |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 1002 | for the \var{locals} parameter. Previously this had to be a regular |
Andrew M. Kuchling | 1455f79 | 2004-08-02 12:09:58 +0000 | [diff] [blame] | 1003 | Python dictionary. (Contributed by Raymond Hettinger.) |
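
As a rough illustration of what a non-dictionary \var{locals} argument
makes possible, the sketch below passes \function{eval()} an object
that only implements \method{__getitem__()}. The
\class{DefaultNamespace} class is hypothetical and exists only for
this example; it supplies 0 for any name it doesn't know about.

```python
class DefaultNamespace(object):
    """Hypothetical mapping that returns 0 for any unknown name."""

    def __init__(self, **known):
        self.known = known

    def __getitem__(self, name):
        # Called by the interpreter for every name looked up in locals.
        return self.known.get(name, 0)


# 'z' is not defined anywhere, so the mapping supplies 0 for it.
result = eval('x + y + z', {}, DefaultNamespace(x=1, y=2))
```

Because lookups go through \method{__getitem__()}, a mapping like this
can compute values on demand instead of storing them all up front.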
Andrew M. Kuchling | d0b6d9d | 2004-07-04 15:35:00 +0000 | [diff] [blame] | 1004 | |
\item The \function{zip()} built-in function and \function{itertools.izip()}
now return an empty list (an empty iterator, in the case of
\function{izip()}) if called with no arguments.
Previously they raised a \exception{TypeError}
exception. This makes them more
suitable for use with variable length argument lists:
| 1010 | |
\begin{verbatim}
>>> def transpose(array):
...     return zip(*array)
...
>>> transpose([(1,2,3), (4,5,6)])
[(1, 4), (2, 5), (3, 6)]
>>> transpose([])
[]
\end{verbatim}
Andrew M. Kuchling | 7642f7a | 2004-09-13 15:06:50 +0000 | [diff] [blame] | 1020 | (Contributed by Raymond Hettinger.) |
| 1021 | |
Andrew M. Kuchling | d91fcbe | 2004-08-02 12:44:28 +0000 | [diff] [blame] | 1022 | \item Encountering a failure while importing a module no longer leaves |
| 1023 | a partially-initialized module object in \code{sys.modules}. The |
| 1024 | incomplete module object left behind would fool further imports of the |
| 1025 | same module into succeeding, leading to confusing errors. |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 1026 | % (XXX contributed by Tim?) |
Andrew M. Kuchling | d91fcbe | 2004-08-02 12:44:28 +0000 | [diff] [blame] | 1027 | |
Andrew M. Kuchling | 65a3332 | 2004-07-21 12:41:38 +0000 | [diff] [blame] | 1028 | \item \constant{None} is now a constant; code that binds a new value to |
| 1029 | the name \samp{None} is now a syntax error. |
Andrew M. Kuchling | 7642f7a | 2004-09-13 15:06:50 +0000 | [diff] [blame] | 1030 | (Contributed by Raymond Hettinger.) |
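
One way to observe this without stopping a program is to hand the
offending statement to \function{compile()}, as in the sketch below.
The file name \code{'<example>'} is an arbitrary label chosen for this
illustration.

```python
# Compiling a statement that rebinds None should be refused outright.
try:
    compile('None = 1', '<example>', 'exec')
    rejected = False
except SyntaxError:
    rejected = True
```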
Andrew M. Kuchling | 65a3332 | 2004-07-21 12:41:38 +0000 | [diff] [blame] | 1031 | |
Fred Drake | ed0fa3d | 2003-07-30 19:14:09 +0000 | [diff] [blame] | 1032 | \end{itemize} |
| 1033 | |
| 1034 | |
| 1035 | %====================================================================== |
| 1036 | \subsection{Optimizations} |
| 1037 | |
| 1038 | \begin{itemize} |
| 1039 | |
\item The inner loops for list and tuple slicing
were optimized and now run about one-third faster. The inner loops
for dictionaries were also optimized, resulting in performance boosts for
\method{keys()}, \method{values()}, \method{items()},
\method{iterkeys()}, \method{itervalues()}, and \method{iteritems()}.
(Contributed by Raymond Hettinger.)
Raymond Hettinger | b7d05db | 2004-03-08 07:25:05 +0000 | [diff] [blame] | 1046 | |
Andrew M. Kuchling | 3bf85f1 | 2004-07-05 01:37:07 +0000 | [diff] [blame] | 1047 | \item The machinery for growing and shrinking lists was optimized for |
| 1048 | speed and for space efficiency. Appending and popping from lists now |
| 1049 | runs faster due to more efficient code paths and less frequent use of |
| 1050 | the underlying system \cfunction{realloc()}. List comprehensions |
| 1051 | also benefit. \method{list.extend()} was also optimized and no |
| 1052 | longer converts its argument into a temporary list before extending |
Andrew M. Kuchling | 7642f7a | 2004-09-13 15:06:50 +0000 | [diff] [blame] | 1053 | the base list. (Contributed by Raymond Hettinger.) |
Raymond Hettinger | 7a6d297 | 2004-02-13 19:00:07 +0000 | [diff] [blame] | 1054 | |
Raymond Hettinger | 97ef8de | 2004-01-05 00:29:57 +0000 | [diff] [blame] | 1055 | \item \function{list()}, \function{tuple()}, \function{map()}, |
| 1056 | \function{filter()}, and \function{zip()} now run several times |
| 1057 | faster with non-sequence arguments that supply a \method{__len__()} |
Andrew M. Kuchling | 7642f7a | 2004-09-13 15:06:50 +0000 | [diff] [blame] | 1058 | method. (Contributed by Raymond Hettinger.) |
Raymond Hettinger | 97ef8de | 2004-01-05 00:29:57 +0000 | [diff] [blame] | 1059 | |
\item The methods \method{list.__getitem__()},
\method{dict.__getitem__()}, and \method{dict.__contains__()}
are now implemented as \class{method_descriptor} objects rather
than \class{wrapper_descriptor} objects. This form of
access doubles their performance and makes them more suitable for
use as arguments to functionals:
\samp{map(mydict.__getitem__, keylist)}.
(Contributed by Raymond Hettinger.)
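
The \samp{map(mydict.__getitem__, keylist)} idiom can be sketched with
made-up data as follows. The dictionary contents and key list are
assumptions chosen for illustration; \function{list()} is wrapped
around \function{map()} only so the result is a list on any
interpreter.

```python
mydict = {'a': 1, 'b': 2, 'c': 3}
keylist = ['c', 'a', 'b']

# Look up every key in keylist without writing a lambda.
values = list(map(mydict.__getitem__, keylist))
```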
Fred Drake | ed0fa3d | 2003-07-30 19:14:09 +0000 | [diff] [blame] | 1068 | |
Fred Drake | d6d35d9 | 2004-06-03 13:31:22 +0000 | [diff] [blame] | 1069 | \item Added a new opcode, \code{LIST_APPEND}, that simplifies |
Raymond Hettinger | dd80f76 | 2004-03-07 07:31:06 +0000 | [diff] [blame] | 1070 | the generated bytecode for list comprehensions and speeds them up |
Andrew M. Kuchling | 7642f7a | 2004-09-13 15:06:50 +0000 | [diff] [blame] | 1071 | by about a third. (Contributed by Raymond Hettinger.) |
Raymond Hettinger | dd80f76 | 2004-03-07 07:31:06 +0000 | [diff] [blame] | 1072 | |
Andrew M. Kuchling | 0c78956 | 2004-09-23 20:15:41 +0000 | [diff] [blame] | 1073 | \item The peephole bytecode optimizer has been improved to |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 1074 | produce shorter, faster bytecode; remarkably, the resulting bytecode is |
Andrew M. Kuchling | 0c78956 | 2004-09-23 20:15:41 +0000 | [diff] [blame] | 1075 | more readable. (Enhanced by Raymond Hettinger.) |
| 1076 | |
Andrew M. Kuchling | ac64287 | 2004-08-07 13:13:31 +0000 | [diff] [blame] | 1077 | \item String concatenations in statements of the form \code{s = s + |
| 1078 | "abc"} and \code{s += "abc"} are now performed more efficiently in |
| 1079 | certain circumstances. This optimization won't be present in other |
| 1080 | Python implementations such as Jython, so you shouldn't rely on it; |
| 1081 | using the \method{join()} method of strings is still recommended when |
| 1082 | you want to efficiently glue a large number of strings together. |
Andrew M. Kuchling | 7642f7a | 2004-09-13 15:06:50 +0000 | [diff] [blame] | 1083 | (Contributed by Armin Rigo.) |
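
A minimal sketch of the recommended \method{join()} idiom: collect the
pieces in a list and glue them together once at the end. The loop
producing the pieces is a stand-in for whatever generates the strings
in a real program.

```python
pieces = []
for i in range(5):
    # Accumulate fragments instead of repeatedly writing s = s + fragment.
    pieces.append(str(i))

# A single join builds the final string in one step.
s = ''.join(pieces)
```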
Andrew M. Kuchling | ac64287 | 2004-08-07 13:13:31 +0000 | [diff] [blame] | 1084 | |
Fred Drake | ed0fa3d | 2003-07-30 19:14:09 +0000 | [diff] [blame] | 1085 | \end{itemize} |
| 1086 | |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 1087 | % XXX fill in these figures |
Raymond Hettinger | b2d5a8e | 2004-11-18 05:51:53 +0000 | [diff] [blame^] | 1088 | % pystone is almost useless for comparing different versions of Python; |
| 1089 | % instead, it excels at predicting relative Python performance on |
| 1090 | % different machines. |
| 1091 | % So, this section would be more informative if it used other tools |
| 1092 | % such as pybench and parrotbench. For a more application oriented |
| 1093 | % benchmark, try comparing the timings of test_decimal.py under 2.3 |
| 1094 | % and 2.4. |
| 1095 | |
Fred Drake | ed0fa3d | 2003-07-30 19:14:09 +0000 | [diff] [blame] | 1096 | The net result of the 2.4 optimizations is that Python 2.4 runs the |
| 1097 | pystone benchmark around XX\% faster than Python 2.3 and YY\% faster |
| 1098 | than Python 2.2. |
| 1099 | |
| 1100 | |
| 1101 | %====================================================================== |
| 1102 | \section{New, Improved, and Deprecated Modules} |
| 1103 | |
| 1104 | As usual, Python's standard library received a number of enhancements and |
| 1105 | bug fixes. Here's a partial list of the most notable changes, sorted |
| 1106 | alphabetically by module name. Consult the |
| 1107 | \file{Misc/NEWS} file in the source tree for a more |
| 1108 | complete list of changes, or look through the CVS logs for all the |
| 1109 | details. |
| 1110 | |
| 1111 | \begin{itemize} |
| 1112 | |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 1113 | \item The \module{asyncore} module's \function{loop()} function now |
| 1114 | has a \var{count} parameter that lets you perform a limited number |
Andrew M. Kuchling | d0b6d9d | 2004-07-04 15:35:00 +0000 | [diff] [blame] | 1115 | of passes through the polling loop. The default is still to loop |
| 1116 | forever. |
| 1117 | |
Andrew M. Kuchling | a331e86 | 2004-09-10 13:05:22 +0000 | [diff] [blame] | 1118 | \item The \module{base64} module now has more complete RFC 3548 support |
| 1119 | for Base64, Base32, and Base16 encoding and decoding, including |
| 1120 | optional case folding and optional alternative alphabets. |
| 1121 | (Contributed by Barry Warsaw.) |
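
A short sketch of the Base32 and Base16 functions, including optional
case folding on decoding. The sample text is arbitrary, and
\code{.encode('ascii')} is used only so the same lines also run on
interpreters that distinguish byte strings from text.

```python
import base64

data = 'hello'.encode('ascii')

# Base32 round trip.
encoded32 = base64.b32encode(data)
decoded32 = base64.b32decode(encoded32)

# Base16 produces uppercase hex digits; casefold=True lets the decoder
# accept lowercase input as well.
encoded16 = base64.b16encode(data)
decoded16 = base64.b16decode(encoded16.lower(), casefold=True)
```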
Andrew M. Kuchling | 6aedcfc | 2003-10-21 12:48:23 +0000 | [diff] [blame] | 1122 | |
Raymond Hettinger | 0c41027 | 2004-01-05 10:13:35 +0000 | [diff] [blame] | 1123 | \item The \module{bisect} module now has an underlying C implementation |
| 1124 | for improved performance. |
| 1125 | (Contributed by Dmitry Vasiliev.) |
| 1126 | |
\item The CJKCodecs collection of East Asian codecs, maintained
by Hye-Shik Chang, was integrated into 2.4.
The new encodings are:
| 1130 | |
| 1131 | \begin{itemize} |
Andrew M. Kuchling | 671c506 | 2004-07-28 15:29:39 +0000 | [diff] [blame] | 1132 | \item Chinese (PRC): gb2312, gbk, gb18030, big5hkscs, hz |
Andrew M. Kuchling | 5303a96 | 2004-01-18 15:55:51 +0000 | [diff] [blame] | 1133 | \item Chinese (ROC): big5, cp950 |
Andrew M. Kuchling | 671c506 | 2004-07-28 15:29:39 +0000 | [diff] [blame] | 1134 | \item Japanese: cp932, euc-jis-2004, euc-jp, |
Andrew M. Kuchling | 5303a96 | 2004-01-18 15:55:51 +0000 | [diff] [blame] | 1135 | euc-jisx0213, iso-2022-jp, iso-2022-jp-1, iso-2022-jp-2, |
Andrew M. Kuchling | 671c506 | 2004-07-28 15:29:39 +0000 | [diff] [blame] | 1136 | iso-2022-jp-3, iso-2022-jp-ext, iso-2022-jp-2004, |
| 1137 | shift-jis, shift-jisx0213, shift-jis-2004 |
Andrew M. Kuchling | 5303a96 | 2004-01-18 15:55:51 +0000 | [diff] [blame] | 1138 | \item Korean: cp949, euc-kr, johab, iso-2022-kr |
| 1139 | \end{itemize} |
| 1140 | |
\item Some other new encodings were added: HP Roman8,
ISO_8859-11, ISO_8859-16, PTCP-154, and TIS-620.
| 1143 | |
Andrew M. Kuchling | 579b3e2 | 2004-10-05 20:23:34 +0000 | [diff] [blame] | 1144 | \item The UTF-8 and UTF-16 codecs now cope better with receiving partial input. |
| 1145 | Previously the \class{StreamReader} class would try to read more data, |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 1146 | making it impossible to resume decoding from the stream. The |
Andrew M. Kuchling | 579b3e2 | 2004-10-05 20:23:34 +0000 | [diff] [blame] | 1147 | \method{read()} method will now return as much data as it can and future |
| 1148 | calls will resume decoding where previous ones left off. |
| 1149 | (Implemented by Walter D\"orwald.) |
| 1150 | |
Andrew M. Kuchling | fd0e494 | 2004-02-09 13:23:34 +0000 | [diff] [blame] | 1151 | \item There is a new \module{collections} module for |
| 1152 | various specialized collection datatypes. |
| 1153 | Currently it contains just one type, \class{deque}, |
| 1154 | a double-ended queue that supports efficiently adding and removing |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 1155 | elements from either end: |
Raymond Hettinger | 756b3f3 | 2004-01-29 06:37:52 +0000 | [diff] [blame] | 1156 | |
| 1157 | \begin{verbatim} |
| 1158 | >>> from collections import deque |
| 1159 | >>> d = deque('ghi') # make a new deque with three items |
| 1160 | >>> d.append('j') # add a new entry to the right side |
| 1161 | >>> d.appendleft('f') # add a new entry to the left side |
| 1162 | >>> d # show the representation of the deque |
| 1163 | deque(['f', 'g', 'h', 'i', 'j']) |
| 1164 | >>> d.pop() # return and remove the rightmost item |
| 1165 | 'j' |
| 1166 | >>> d.popleft() # return and remove the leftmost item |
| 1167 | 'f' |
| 1168 | >>> list(d) # list the contents of the deque |
| 1169 | ['g', 'h', 'i'] |
| 1170 | >>> 'h' in d # search the deque |
| 1171 | True |
| 1172 | \end{verbatim} |
| 1173 | |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 1174 | Several modules, such as the \module{Queue} and \module{threading} |
| 1175 | modules, now take advantage of \class{collections.deque} for improved |
| 1176 | performance. (Contributed by Raymond Hettinger.) |
Andrew M. Kuchling | 5303a96 | 2004-01-18 15:55:51 +0000 | [diff] [blame] | 1177 | |
Fred Drake | 9f15b5c | 2004-05-18 04:30:00 +0000 | [diff] [blame] | 1178 | \item The \module{ConfigParser} classes have been enhanced slightly. |
| 1179 | The \method{read()} method now returns a list of the files that |
| 1180 | were successfully parsed, and the \method{set()} method raises |
| 1181 | \exception{TypeError} if passed a \var{value} argument that isn't a |
| 1182 | string. |
| 1183 | |
Andrew M. Kuchling | a331e86 | 2004-09-10 13:05:22 +0000 | [diff] [blame] | 1184 | \item The \module{curses} module now supports the ncurses extension |
| 1185 | \function{use_default_colors()}. On platforms where the terminal |
| 1186 | supports transparency, this makes it possible to use a transparent |
| 1187 | background. (Contributed by J\"org Lehmann.) |
| 1188 | |
| 1189 | \item The \module{difflib} module now includes an \class{HtmlDiff} class |
| 1190 | that creates an HTML table showing a side by side comparison |
| 1191 | of two versions of a text. (Contributed by Dan Gass.) |
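
A minimal sketch of producing such a table with the
\method{make_table()} method. The two line lists and the column
descriptions \code{'old'} and \code{'new'} are made up for this
example.

```python
import difflib

old_lines = ['one', 'two', 'three']
new_lines = ['one', 'too', 'three', 'four']

# make_table() returns the HTML for a single <table> element as a string.
html = difflib.HtmlDiff().make_table(old_lines, new_lines, 'old', 'new')
```

The result is only the table markup; \method{make_file()} can be used
instead when a complete HTML page is wanted.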
| 1192 | |
\item The \module{email} package was updated to version 3.0,
which drops various deprecated APIs and removes support for Python
versions earlier than 2.3. The 3.0 version of the package uses a new
incremental parser for MIME messages, available in the
\module{email.FeedParser} module. The new parser doesn't require
reading the entire message into memory, and doesn't throw exceptions
if a message is malformed; instead it records any problems in the
\member{defects} attribute of the message. (Developed by Anthony
Baxter, Barry Warsaw, Thomas Wouters, and others.)
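
The incremental style of parsing might look like the sketch below,
which feeds a tiny message to the parser in arbitrary chunks. The
message text is invented, the fallback import is an assumption for
interpreters that spell the module \module{email.feedparser}, and the
list of recorded problems is read from the message's \member{defects}
attribute.

```python
try:
    from email.FeedParser import FeedParser   # module name used in the text above
except ImportError:
    # Assumption: interpreters without email.FeedParser provide the
    # same class under the lowercase name email.feedparser.
    from email.feedparser import FeedParser

parser = FeedParser()

# Feed the message in pieces, as if it were arriving over a network.
for chunk in ['Subject: Hel', 'lo\n', '\n', 'Body ', 'text\n']:
    parser.feed(chunk)

msg = parser.close()          # finish parsing and get the message object
subject = msg['Subject']
body = msg.get_payload()
problems = msg.defects        # empty list for a well-formed message
```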
Andrew M. Kuchling | a331e86 | 2004-09-10 13:05:22 +0000 | [diff] [blame] | 1202 | |
Raymond Hettinger | 607c00f | 2003-11-12 16:27:50 +0000 | [diff] [blame] | 1203 | \item The \module{heapq} module has been converted to C. The resulting |
Andrew M. Kuchling | fd0e494 | 2004-02-09 13:23:34 +0000 | [diff] [blame] | 1204 | tenfold improvement in speed makes the module suitable for handling |
Raymond Hettinger | 33ecffb | 2004-06-10 05:03:17 +0000 | [diff] [blame] | 1205 | high volumes of data. In addition, the module has two new functions |
| 1206 | \function{nlargest()} and \function{nsmallest()} that use heaps to |
Andrew M. Kuchling | d0b6d9d | 2004-07-04 15:35:00 +0000 | [diff] [blame] | 1207 | find the N largest or smallest values in a dataset without the |
Andrew M. Kuchling | 7642f7a | 2004-09-13 15:06:50 +0000 | [diff] [blame] | 1208 | expense of a full sort. (Contributed by Raymond Hettinger.) |
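
A brief sketch of the two new functions on a small made-up list of
numbers:

```python
import heapq

data = [5, 1, 9, 3, 7, 2, 8]

top3 = heapq.nlargest(3, data)       # three largest values, biggest first
bottom2 = heapq.nsmallest(2, data)   # two smallest values, smallest first
```

Both calls leave \code{data} itself untouched.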
Andrew M. Kuchling | 1a42025 | 2003-11-08 15:58:49 +0000 | [diff] [blame] | 1209 | |
Andrew M. Kuchling | 0c78956 | 2004-09-23 20:15:41 +0000 | [diff] [blame] | 1210 | \item The \module{httplib} module now contains constants for HTTP |
| 1211 | status codes defined in various HTTP-related RFC documents. Constants |
| 1212 | have names such as \constant{OK}, \constant{CREATED}, |
| 1213 | \constant{CONTINUE}, and \constant{MOVED_PERMANENTLY}; use pydoc to |
| 1214 | get a full list. (Contributed by Andrew Eland.) |
| 1215 | |
Andrew M. Kuchling | ce4bae6 | 2004-07-27 12:13:25 +0000 | [diff] [blame] | 1216 | \item The \module{imaplib} module now supports IMAP's THREAD command |
| 1217 | (contributed by Yves Dionne) and new \method{deleteacl()} and |
| 1218 | \method{myrights()} methods (contributed by Arnaud Mazin). |
Andrew M. Kuchling | dff9dbd | 2003-11-20 22:22:19 +0000 | [diff] [blame] | 1219 | |
Andrew M. Kuchling | ad80955 | 2003-12-06 23:19:23 +0000 | [diff] [blame] | 1220 | \item The \module{itertools} module gained a |
Andrew M. Kuchling | 3bf85f1 | 2004-07-05 01:37:07 +0000 | [diff] [blame] | 1221 | \function{groupby(\var{iterable}\optional{, \var{func}})} function. |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 1222 | \var{iterable} is something that can be iterated over to return a |
| 1223 | stream of elements, and the optional \var{func} parameter is a |
| 1224 | function that takes an element and returns a key value; if omitted, |
| 1225 | the key is simply the element itself. \function{groupby()} then |
| 1226 | groups the elements into subsequences which have matching values of |
| 1227 | the key, and returns a series of 2-tuples containing the key value |
| 1228 | and an iterator over the subsequence. |
Andrew M. Kuchling | ad80955 | 2003-12-06 23:19:23 +0000 | [diff] [blame] | 1229 | |
Here's an example to make this clearer. The \var{func} function simply
returns a number's remainder modulo 2 (0 for even, 1 for odd), so the
result of \function{groupby()} is to return consecutive runs of odd or
even numbers.
Andrew M. Kuchling | ad80955 | 2003-12-06 23:19:23 +0000 | [diff] [blame] | 1234 | |
\begin{verbatim}
>>> import itertools
>>> L = [2, 4, 6, 7, 8, 9, 11, 12, 14]
>>> for key_val, it in itertools.groupby(L, lambda x: x % 2):
...     print key_val, list(it)
...
0 [2, 4, 6]
1 [7]
0 [8]
1 [9, 11]
0 [12, 14]
\end{verbatim}
| 1248 | |
\function{groupby()} is typically used with sorted input. The logic
for \function{groupby()} is similar to the \UNIX{} \code{uniq} filter,
which makes it handy for eliminating, counting, or identifying
duplicate elements:
Raymond Hettinger | feb78c9 | 2003-12-12 13:13:47 +0000 | [diff] [blame] | 1253 | |
\begin{verbatim}
>>> word = 'abracadabra'
>>> letters = sorted(word)   # Turn string into a sorted list of letters
>>> letters
['a', 'a', 'a', 'a', 'a', 'b', 'b', 'c', 'd', 'r', 'r']
>>> for k, g in itertools.groupby(letters):
...     print k, list(g)
...
a ['a', 'a', 'a', 'a', 'a']
b ['b', 'b']
c ['c']
d ['d']
r ['r', 'r']
>>> # List unique letters
>>> [k for k, g in itertools.groupby(letters)]
['a', 'b', 'c', 'd', 'r']
>>> # Count letter occurrences
>>> [(k, len(list(g))) for k, g in itertools.groupby(letters)]
[('a', 5), ('b', 2), ('c', 1), ('d', 1), ('r', 2)]
\end{verbatim}
| 1274 | |
Andrew M. Kuchling | 7642f7a | 2004-09-13 15:06:50 +0000 | [diff] [blame] | 1275 | (Contributed by Hye-Shik Chang.) |
| 1276 | |
Raymond Hettinger | ed54d91 | 2003-12-31 01:59:18 +0000 | [diff] [blame] | 1277 | \item \module{itertools} also gained a function named |
| 1278 | \function{tee(\var{iterator}, \var{N})} that returns \var{N} independent |
| 1279 | iterators that replicate \var{iterator}. If \var{N} is omitted, the |
| 1280 | default is 2. |
Andrew M. Kuchling | 35f2b05 | 2003-12-18 13:28:13 +0000 | [diff] [blame] | 1281 | |
\begin{verbatim}
>>> L = [1,2,3]
>>> i1, i2 = itertools.tee(L)
>>> i1,i2
(<itertools.tee object at 0x402c2080>, <itertools.tee object at 0x402c2090>)
>>> list(i1)               # Run the first iterator to exhaustion
[1, 2, 3]
>>> list(i2)               # Run the second iterator to exhaustion
[1, 2, 3]
\end{verbatim}
| 1292 | |
| 1293 | Note that \function{tee()} has to keep copies of the values returned |
Raymond Hettinger | ed54d91 | 2003-12-31 01:59:18 +0000 | [diff] [blame] | 1294 | by the iterator; in the worst case, it may need to keep all of them. |
Andrew M. Kuchling | 44a31e1 | 2004-01-01 18:33:34 +0000 | [diff] [blame] | 1295 | This should therefore be used carefully if the leading iterator |
Raymond Hettinger | ed54d91 | 2003-12-31 01:59:18 +0000 | [diff] [blame] | 1296 | can run far ahead of the trailing iterator in a long stream of inputs. |
Andrew M. Kuchling | 3bf85f1 | 2004-07-05 01:37:07 +0000 | [diff] [blame] | 1297 | If the separation is large, then you might as well use |
Raymond Hettinger | ed54d91 | 2003-12-31 01:59:18 +0000 | [diff] [blame] | 1298 | \function{list()} instead. When the iterators track closely with one |
| 1299 | another, \function{tee()} is ideal. Possible applications include |
| 1300 | bookmarking, windowing, or lookahead iterators. |
Andrew M. Kuchling | 7642f7a | 2004-09-13 15:06:50 +0000 | [diff] [blame] | 1301 | (Contributed by Raymond Hettinger.) |
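
As an example of the lookahead case, where the two iterators stay
within one element of each other, here is a sketch of a helper that
pairs each element with its successor. The \function{pairwise()}
function is a hypothetical name defined only for this illustration.

```python
import itertools

def pairwise(iterable):
    """Return a list of overlapping pairs (s0, s1), (s1, s2), ..."""
    trail, lead = itertools.tee(iterable)
    # Discard the first element of one copy so it runs one step ahead.
    list(itertools.islice(lead, 1))
    return list(zip(trail, lead))

pairs = pairwise([1, 2, 3, 4])
```

Since \code{lead} is never more than one element ahead of
\code{trail}, \function{tee()} only has to buffer a single value at a
time.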
Andrew M. Kuchling | 35f2b05 | 2003-12-18 13:28:13 +0000 | [diff] [blame] | 1302 | |
Andrew M. Kuchling | 5785a13 | 2004-07-26 19:28:46 +0000 | [diff] [blame] | 1303 | \item A number of functions were added to the \module{locale} |
| 1304 | module, such as \function{bind_textdomain_codeset()} to specify a |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 1305 | particular encoding and a family of \function{l*gettext()} functions |
Andrew M. Kuchling | 5785a13 | 2004-07-26 19:28:46 +0000 | [diff] [blame] | 1306 | that return messages in the chosen encoding. |
| 1307 | (Contributed by Gustavo Niemeyer.) |
| 1308 | |
\item Some keyword arguments were added to the \module{logging}
package's \function{basicConfig()} function to simplify log
configuration. The default behavior is to log messages to standard
error, but various keyword arguments can be specified to log to a
particular file, change the logging format, or set the logging level.
For example:
Andrew M. Kuchling | bcefe69 | 2004-07-07 13:01:53 +0000 | [diff] [blame] | 1315 | |
\begin{verbatim}
import logging
logging.basicConfig(filename='/var/log/application.log',
    level=0,  # Log all messages
    format='%(levelname)s:%(process)d:%(thread)d:%(message)s')
\end{verbatim}
| 1322 | |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 1323 | Other additions to the \module{logging} package include a |
| 1324 | \method{log(\var{level}, \var{msg})} convenience method, as well as a |
| 1325 | \class{TimedRotatingFileHandler} class that rotates its log files at a |
| 1326 | timed interval. The module already had \class{RotatingFileHandler}, |
Andrew M. Kuchling | bcefe69 | 2004-07-07 13:01:53 +0000 | [diff] [blame] | 1327 | which rotated logs once the file exceeded a certain size. Both |
| 1328 | classes derive from a new \class{BaseRotatingHandler} class that can |
| 1329 | be used to implement other rotating handlers. |
| 1330 | |
Andrew M. Kuchling | 579b3e2 | 2004-10-05 20:23:34 +0000 | [diff] [blame] | 1331 | (Changes implemented by Vinay Sajip.) |
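
A rough sketch of logging at an explicitly chosen level follows. It
assumes the module-level \function{logging.log()} call, and it
attaches its own handler to the root logger rather than calling
\function{basicConfig()} so that it doesn't depend on how logging has
already been configured. The \class{ListStream} class is a
hypothetical stand-in for a real output file.

```python
import logging

class ListStream(object):
    """Hypothetical file-like object that collects whatever is written."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)

    def flush(self):
        pass


stream = ListStream()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s:%(message)s'))

root = logging.getLogger()
old_level = root.level
root.addHandler(handler)
root.setLevel(logging.DEBUG)

# Log a message at a level supplied as an argument.
logging.log(logging.WARNING, 'disk usage at %d%%', 91)

# Restore the root logger so the sketch has no lasting side effects.
root.removeHandler(handler)
root.setLevel(old_level)

output = ''.join(stream.chunks)
```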
| 1332 | |
Andrew M. Kuchling | 0c78956 | 2004-09-23 20:15:41 +0000 | [diff] [blame] | 1333 | \item The \module{marshal} module now shares interned strings on unpacking a |
| 1334 | data structure. This may shrink the size of certain pickle strings, |
| 1335 | but the primary effect is to make \file{.pyc} files significantly smaller. |
| 1336 | (Contributed by Martin von Loewis.) |
| 1337 | |
Andrew M. Kuchling | 5785a13 | 2004-07-26 19:28:46 +0000 | [diff] [blame] | 1338 | \item The \module{nntplib} module's \class{NNTP} class gained |
| 1339 | \method{description()} and \method{descriptions()} methods to retrieve |
| 1340 | newsgroup descriptions for a single group or for a range of groups. |
| 1341 | (Contributed by J\"urgen A. Erhard.) |
| 1342 | |
Andrew M. Kuchling | f8c075c | 2004-11-09 02:58:02 +0000 | [diff] [blame] | 1343 | \item Two new functions were added to the \module{operator} module, |
Andrew M. Kuchling | 35f2b05 | 2003-12-18 13:28:13 +0000 | [diff] [blame] | 1344 | \function{attrgetter(\var{attr})} and \function{itemgetter(\var{index})}. |
| 1345 | Both functions return callables that take a single argument and return |
Raymond Hettinger | ed54d91 | 2003-12-31 01:59:18 +0000 | [diff] [blame] | 1346 | the corresponding attribute or item; these callables make excellent |
Andrew M. Kuchling | bcefe69 | 2004-07-07 13:01:53 +0000 | [diff] [blame] | 1347 | data extractors when used with \function{map()} or |
| 1348 | \function{sorted()}. For example: |
Andrew M. Kuchling | 35f2b05 | 2003-12-18 13:28:13 +0000 | [diff] [blame] | 1349 | |
| 1350 | \begin{verbatim} |
Raymond Hettinger | ed54d91 | 2003-12-31 01:59:18 +0000 | [diff] [blame] | 1351 | >>> L = [('c', 2), ('d', 1), ('a', 4), ('b', 3)] |
Andrew M. Kuchling | 35f2b05 | 2003-12-18 13:28:13 +0000 | [diff] [blame] | 1352 | >>> map(operator.itemgetter(0), L) |
| 1353 | ['c', 'd', 'a', 'b'] |
| 1354 | >>> map(operator.itemgetter(1), L) |
Raymond Hettinger | ed54d91 | 2003-12-31 01:59:18 +0000 | [diff] [blame] | 1355 | [2, 1, 4, 3] |
| 1356 | >>> sorted(L, key=operator.itemgetter(1)) # Sort list by second tuple item |
| 1357 | [('d', 1), ('c', 2), ('b', 3), ('a', 4)] |
Andrew M. Kuchling | 35f2b05 | 2003-12-18 13:28:13 +0000 | [diff] [blame] | 1358 | \end{verbatim} |
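\function{attrgetter()} works the same way for attributes. The
following is only a minimal sketch: the \class{Employee} class and its
data are hypothetical and exist purely for illustration.

```python
import operator

class Employee(object):
    # Hypothetical record type, used only for this illustration.
    def __init__(self, name, salary):
        self.name = name
        self.salary = salary

staff = [Employee('carol', 300), Employee('alice', 100), Employee('bob', 200)]

# attrgetter('salary') returns a callable equivalent to: lambda e: e.salary
by_salary = sorted(staff, key=operator.attrgetter('salary'))
names = [e.name for e in by_salary]
```

Here \code{names} lists the employees in order of increasing salary.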

(Contributed by Raymond Hettinger.)

\item The \module{optparse} module was updated in various ways. The
module now passes its messages through \function{gettext.gettext()},
making it possible to internationalize Optik's help and error
messages. Help messages for options can now include the string
\code{'\%default'}, which will be replaced by the option's default
value. (Contributed by Greg Ward.)
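As a rough illustration of the \code{'\%default'} substitution, the
sketch below builds a parser and renders its help text. The program
name \code{demo} and the \longprogramopt{count} option are made up for
this example.

```python
from optparse import OptionParser

# 'demo' and the --count option are hypothetical names for illustration.
parser = OptionParser(prog='demo')
parser.add_option('-n', '--count', type='int', default=10,
                  help='number of items [default: %default]')

# format_help() returns the help text with %default already expanded.
help_text = parser.format_help()
```

The help entry for \longprogramopt{count} then reads
\samp{number of items [default: 10]}.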

\item The long-term plan is to deprecate the \module{rfc822} module
in some future Python release in favor of the \module{email} package.
To this end, the \function{email.Utils.formatdate()} function has been
changed to make it usable as a replacement for
\function{rfc822.formatdate()}. You may want to write new e-mail
processing code with this in mind. (Change implemented by Anthony
Baxter.)

\item A new \function{urandom(\var{n})} function was added to the
\module{os} module, returning a string containing \var{n} bytes of
random data. This function provides access to platform-specific
sources of randomness such as \file{/dev/urandom} on Linux or the
Windows CryptoAPI. (Contributed by Trevor Perrin.)
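A minimal usage sketch follows; the variable names are illustrative,
and the hex conversion is only one possible way to make the bytes
printable.

```python
import binascii
import os

# Ask the operating system for 16 bytes of random data.
token = os.urandom(16)

# hexlify() gives a printable form, two hex digits per byte.
printable = binascii.hexlify(token)
```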

\item Another new function: \function{os.path.lexists(\var{path})}
returns true if the file specified by \var{path} exists, whether or
not it's a symbolic link. This differs from the existing
\function{os.path.exists(\var{path})} function, which returns false if
\var{path} is a symlink that points to a destination that doesn't exist.
(Contributed by Beni Cherniavsky.)
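The difference shows up with a dangling symlink. This sketch assumes
a platform and filesystem that support \function{os.symlink()}; the
file names are arbitrary.

```python
import os
import tempfile

# Create a symlink whose target does not exist (a "dangling" link).
# Assumes a platform and filesystem that support os.symlink().
tmpdir = tempfile.mkdtemp()
link = os.path.join(tmpdir, 'dangling')
os.symlink(os.path.join(tmpdir, 'missing-target'), link)

follows_link = os.path.exists(link)    # False: the target is missing
sees_link = os.path.lexists(link)      # True: the link itself exists

# Clean up the temporary link and directory.
os.remove(link)
os.rmdir(tmpdir)
```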

\item A new \function{getsid()} function was added to the
\module{posix} module that underlies the \module{os} module.
(Contributed by J. Raynor.)

\item The \module{poplib} module now supports POP over SSL.

\item The \module{profile} module can now profile C extension functions.
% XXX more to say about this?
(Contributed by Nick Bastin.)

\item The \module{random} module has a new method called
\method{getrandbits(\var{N})} that returns a long integer \var{N}
bits in length. The existing \method{randrange()} method now uses
\method{getrandbits()} where appropriate, making generation of
arbitrarily large random numbers more efficient. (Contributed by
Raymond Hettinger.)
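For instance, a value wider than a machine word can be requested
directly. The bit count of 100 is an arbitrary choice for this sketch.

```python
import random

# Request a 100-bit random integer, wider than a machine word.
n = random.getrandbits(100)

# The result is non-negative and fits in the requested number of bits.
in_range = 0 <= n < 2 ** 100
```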

\item The regular expression language accepted by the \module{re} module
was extended with simple conditional expressions, written as
\regexp{(?(\var{group})\var{A}|\var{B})}. \var{group} is either a
numeric group ID or a group name defined with \regexp{(?P<group>...)}
earlier in the expression. If the specified group matched, the
regular expression pattern \var{A} will be tested against the string; if
the group didn't match, the pattern \var{B} will be used instead.
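One way to use this is to require a closing bracket only when an
opening one was seen. The pattern below is an illustrative sketch, not
a real address validator; it omits the \var{B} branch, so nothing extra
is required when group 1 did not match.

```python
import re

# Group 1 optionally matches '<'.  The conditional (?(1)>) then
# requires a closing '>' only if group 1 took part in the match.
pattern = re.compile(r'^(<)?(\w+@\w+(?:\.\w+)+)(?(1)>)$')

bracketed = pattern.match('<user@host.com>')    # matches
bare = pattern.match('user@host.com')           # matches
unbalanced = pattern.match('<user@host.com')    # no match: '>' is missing
```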

\item The \module{re} module is also no longer recursive, thanks to a
massive amount of work by Gustavo Niemeyer. In a recursive regular
expression engine, certain patterns result in a large amount of C
stack space being consumed, and it was possible to overflow the stack.
For example, if you matched a 30000-byte string of \samp{a} characters
against the expression \regexp{(a|b)+}, one stack frame was consumed
per character. Python 2.3 tried to check for stack overflow and raise
a \exception{RuntimeError} exception, but certain patterns could
sidestep the checking, and if you were unlucky, Python could segfault.
Python 2.4's regular expression engine can match this pattern without
problems.
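The case described above can be tried directly; the variable names in
this short sketch are illustrative.

```python
import re

# One alternation step per character; a recursive engine would need
# one C stack frame for each of the 30000 characters.
text = 'a' * 30000
match = re.match(r'(a|b)+', text)
matched_length = match.end()
```

With the non-recursive engine the match covers the whole string.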

\item A new \function{socketpair()} function, returning a pair of
connected sockets, was added to the \module{socket} module.
(Contributed by Dave Cole.)
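A minimal sketch of passing a message between the two ends, assuming a
platform that provides \function{socketpair()} (for example, Unix).
The names \code{parent} and \code{child} are arbitrary.

```python
import socket

# Assumes a platform that provides socketpair() (for example, Unix).
parent, child = socket.socketpair()

# encode() keeps the example valid on Python versions whose sockets
# carry byte strings rather than text.
message = 'ping'.encode('ascii')
parent.sendall(message)
reply = child.recv(4)

parent.close()
child.close()
```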

\item The \function{sys.exitfunc()} function has been deprecated. Code
should be using the existing \module{atexit} module, which correctly
handles calling multiple exit functions. Eventually
\function{sys.exitfunc()} will become a purely internal interface,
accessed only by \module{atexit}.

\item The \module{tarfile} module now generates GNU-format tar files
by default. (Contributed by Lars Gustaebel.)

\item The \module{threading} module now has an elegantly simple way to support
thread-local data. The module contains a \class{local} class whose
attribute values are local to different threads.

\begin{verbatim}
import threading

data = threading.local()
data.number = 42
data.url = ('www.python.org', 80)
\end{verbatim}

Other threads can assign and retrieve their own values for the
\member{number} and \member{url} attributes. You can subclass
\class{local} to initialize attributes or to add methods.
(Contributed by Jim Fulton.)
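The isolation between threads can be observed with a small experiment.
In this sketch the \function{worker()} function and the shared
\code{seen} list are hypothetical helpers added for illustration.

```python
import threading

data = threading.local()
data.number = 42          # set in the main thread

seen = []                 # ordinary shared list recording what the worker sees

def worker():
    # The worker starts with no 'number' attribute of its own.
    seen.append(hasattr(data, 'number'))
    data.number = 7       # visible only inside this thread
    seen.append(data.number)

t = threading.Thread(target=worker)
t.start()
t.join()

main_value = data.number  # still 42; the worker's assignment did not leak
```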

\item The \module{timeit} module now automatically disables periodic
garbage collection during the timing loop. This change makes
consecutive timings more comparable. (Contributed by Raymond Hettinger.)

\item The \module{weakref} module now supports a wider variety of objects
including Python functions, class instances, sets, frozensets, deques,
arrays, files, sockets, and regular expression pattern objects.
(Contributed by Raymond Hettinger.)
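As a small sketch, weak references can be taken to a function and to a
deque. The names are illustrative, and the final check assumes
CPython, whose reference counting reclaims the deque as soon as the
last strong reference goes away.

```python
import weakref
from collections import deque

def callback():
    pass

queue = deque([1, 2, 3])

func_ref = weakref.ref(callback)
queue_ref = weakref.ref(queue)

# While strong references exist, calling a ref returns the object.
alive = (func_ref() is callback) and (queue_ref() is queue)

# Dropping the last strong reference lets the deque be reclaimed.
# CPython's reference counting does this immediately; other
# implementations may delay it.
del queue
cleared = queue_ref() is None
```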

\item The \module{xmlrpclib} module now supports a multi-call extension for
transmitting multiple XML-RPC calls in a single HTTP operation.

\item The \module{mpz}, \module{rotor}, and \module{xreadlines} modules have
been removed.

\end{itemize}


%======================================================================
% whole new modules get described in subsections here

%=====================
\subsection{cookielib}

The \module{cookielib} library supports client-side handling for HTTP
cookies, mirroring the \module{Cookie} module's server-side cookie
support. Cookies are stored in cookie jars; the library transparently
stores cookies offered by the web server in the cookie jar, and
fetches the cookie from the jar when connecting to the server. As in
web browsers, policy objects control whether cookies are accepted or
not.

In order to store cookies across sessions, two implementations of
cookie jars are provided: one that stores cookies in the Netscape
format so applications can use the Mozilla or Lynx cookie files, and
one that stores cookies in the same format as the Perl libwww library.

\module{urllib2} has been changed to interact with \module{cookielib}:
\class{HTTPCookieProcessor} manages a cookie jar that is used when
accessing URLs.

% ==================
\subsection{doctest}

The \module{doctest} module underwent considerable refactoring thanks
to Edward Loper and Tim Peters. Testing can still be as simple as
running \function{doctest.testmod()}, but the refactorings allow
customizing the module's operation in various ways.

The new \class{DocTestFinder} class extracts the tests from a given
object's docstrings:

\begin{verbatim}
import doctest

def f (x, y):
    """>>> f(2,2)
4
>>> f(3,2)
6
    """
    return x*y

finder = doctest.DocTestFinder()

# Get list of DocTest instances
tests = finder.find(f)
\end{verbatim}

The new \class{DocTestRunner} class then runs individual tests and can
produce a summary of the results:

\begin{verbatim}
runner = doctest.DocTestRunner()
for t in tests:
    # run() returns (number of failures, number of examples tried)
    failed, tried = runner.run(t)

runner.summarize(verbose=1)
\end{verbatim}

The above example produces the following output:

\begin{verbatim}
1 items passed all tests:
   2 tests in f
2 tests in 1 items.
2 passed and 0 failed.
Test passed.
\end{verbatim}

\class{DocTestRunner} uses an instance of the \class{OutputChecker}
class to compare the expected output with the actual output. This
class takes a number of different flags that customize its behaviour;
ambitious users can also write a completely new subclass of
\class{OutputChecker}.

The default output checker provides a number of handy features.
For example, with the \constant{doctest.ELLIPSIS} option flag,
an ellipsis (\samp{...}) in the expected output matches any substring,
making it easier to accommodate outputs that vary in minor ways:

\begin{verbatim}
def o (n):
    """>>> o(1)
<__main__.C instance at 0x...>
>>>
"""
\end{verbatim}

Another special string, \samp{<BLANKLINE>}, matches a blank line:

\begin{verbatim}
def p (n):
    """>>> p(1)
<BLANKLINE>
>>>
"""
\end{verbatim}
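
The fragments above are not complete programs. The following
self-contained sketch shows one way the \constant{doctest.ELLIPSIS}
flag might be passed to a \class{DocTestRunner}; the class \class{C}
and the function \function{make()} are hypothetical, and the name and
globals are handed to \method{find()} explicitly so the sketch does not
depend on which module it lives in.

```python
import doctest

class C(object):
    pass

def make():
    """
    >>> make()      # the address differs from run to run
    <...C object at 0x...>
    """
    return C()

finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False, optionflags=doctest.ELLIPSIS)

failed = tried = 0
for test in finder.find(make, 'make', globs={'make': make}):
    # run() returns (number of failures, number of examples tried)
    f, t = runner.run(test)
    failed += f
    tried += t
```

Without the \constant{doctest.ELLIPSIS} flag the same example would
fail, because the literal text \samp{...} would not equal the real
module name and address.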

Another new capability is producing a diff-style display of the output
by specifying the \constant{doctest.REPORT_UDIFF} (unified diffs),
\constant{doctest.REPORT_CDIFF} (context diffs), or
\constant{doctest.REPORT_NDIFF} (delta-style) option flags. For example:

\begin{verbatim}
def g (n):
    """>>> g(4)
here
is
a
lengthy
>>>"""
    L = 'here is a rather lengthy list of words'.split()
    for word in L[:n]:
        print word
\end{verbatim}

Running the above function's tests with
\constant{doctest.REPORT_UDIFF} specified, you get the following output:

\begin{verbatim}
**********************************************************************
File "t.py", line 15, in g
Failed example:
    g(4)
Differences (unified diff with -expected +actual):
    @@ -2,3 +2,3 @@
     is
     a
    -lengthy
    +rather
**********************************************************************
\end{verbatim}


% ======================================================================
\section{Build and C API Changes}

Some of the changes to Python's build process and to the C API are:

\begin{itemize}

\item Three new convenience macros were added for common return
values from extension functions: \csimplemacro{Py_RETURN_NONE},
\csimplemacro{Py_RETURN_TRUE}, and \csimplemacro{Py_RETURN_FALSE}.
(Contributed by Brett Cannon.)

\item Another new macro, \csimplemacro{Py_CLEAR(\var{obj})},
decreases the reference count of \var{obj} and sets \var{obj} to the
null pointer. (Contributed by Jim Fulton.)

\item A new function, \cfunction{PyTuple_Pack(\var{N}, \var{obj1},
\var{obj2}, ..., \var{objN})}, constructs tuples from a variable
length argument list of Python objects. (Contributed by Raymond Hettinger.)

\item A new function, \cfunction{PyDict_Contains(\var{d}, \var{k})},
implements fast dictionary lookups without masking exceptions raised
during the look-up process. (Contributed by Raymond Hettinger.)

\item The \csimplemacro{Py_IS_NAN(\var{X})} macro returns 1 if
its float or double argument \var{X} is a NaN.
(Contributed by Tim Peters.)

\item C code can avoid unnecessary locking by using the new
\cfunction{PyEval_ThreadsInitialized()} function to tell
if any thread operations have been performed. If this function
returns false, no lock operations are needed.
(Contributed by Nick Coghlan.)

\item A new function, \cfunction{PyArg_VaParseTupleAndKeywords()},
is the same as \cfunction{PyArg_ParseTupleAndKeywords()} but takes a
\ctype{va_list} instead of a number of arguments.
(Contributed by Greg Chapman.)

\item A new method flag, \constant{METH_COEXISTS}, allows a function
defined in slots to co-exist with a \ctype{PyCFunction} having the
same name. This can halve the access time for a method such as
\method{set.__contains__()}. (Contributed by Raymond Hettinger.)

\item Python can now be built with additional profiling for the
interpreter itself, intended as an aid to people developing the
Python core. Providing \longprogramopt{--enable-profiling} to the
\program{configure} script will let you profile the interpreter with
\program{gprof}, and providing the \longprogramopt{--with-tsc}
switch enables profiling using the Pentium's Time-Stamp-Counter
register. (The \longprogramopt{--with-tsc} switch is slightly
misnamed, because the profiling feature also works on the PowerPC
platform, though that processor architecture doesn't call that
register ``the TSC register''.)

\item The \ctype{tracebackobject} type has been renamed to \ctype{PyTracebackObject}.

\end{itemize}


%======================================================================
\subsection{Port-Specific Changes}

\begin{itemize}

\item The Windows port now builds under MSVC++ 7.1 as well as version 6.
(Contributed by Martin von Loewis.)

\end{itemize}



%======================================================================
\section{Porting to Python 2.4}

This section lists previously described changes that may require
changes to your code:

\begin{itemize}

\item Left shifts and hexadecimal/octal constants that are too
large no longer trigger a \exception{FutureWarning} and return
a value limited to 32 or 64 bits; instead they return a long integer.

\item Integer operations will no longer trigger an \exception{OverflowWarning}.
The \exception{OverflowWarning} warning will disappear in Python 2.5.

\item The \function{zip()} built-in function and \function{itertools.izip()}
now return an empty list (an empty iterator, in the case of
\function{izip()}) instead of raising a \exception{TypeError}
exception if called with no arguments.
(Contributed by Raymond Hettinger.)

\item \function{dircache.listdir()} now passes exceptions to the caller
instead of returning empty lists.

\item \function{LexicalHandler.startDTD()} used to receive the public and
system IDs in the wrong order. This has been corrected; applications
relying on the wrong order need to be fixed.

\item \function{fcntl.ioctl} now warns if the \var{mutate}
argument is omitted and relevant.

\item The \module{tarfile} module now generates GNU-format tar files
by default.

\item Encountering a failure while importing a module no longer leaves
a partially-initialized module object in \code{sys.modules}.

\item \constant{None} is now a constant; code that binds a new value to
the name \samp{None} is now a syntax error.

\end{itemize}


%======================================================================
\section{Acknowledgements \label{acks}}

The author would like to thank the following people for offering
suggestions, corrections and assistance with various drafts of this
article: Hye-Shik Chang, Michael Dyck, Raymond Hettinger, Hamish Lawson,
Fredrik Lundh.

\end{document}