| \documentclass{howto} |
| \usepackage{distutils} |
| % $Id$ |
| |
| % Don't write extensive text for new sections; I'll do that. |
| % Feel free to add commented-out reminders of things that need |
| % to be covered. --amk |
| |
| \title{What's New in Python 2.4} |
| \release{1.02} |
| \author{A.M.\ Kuchling} |
| \authoraddress{ |
| \strong{Python Software Foundation}\\ |
| Email: \email{amk@amk.ca} |
| } |
| |
| \begin{document} |
| \maketitle |
| \tableofcontents |
| |
| This article explains the new features in Python 2.4.1, released on |
| March~30, 2005. |
| |
| Python 2.4 is a medium-sized release. It doesn't introduce as many |
| changes as the radical Python 2.2, but introduces more features than |
| the conservative 2.3 release. The most significant new language |
| features are function decorators and generator expressions; most other |
| changes are to the standard library. |
| |
| According to the CVS change logs, there were 481 patches applied and |
| 502 bugs fixed between Python 2.3 and 2.4. Both figures are likely to |
| be underestimates. |
| |
| This article doesn't attempt to provide a complete specification of |
| every single new feature, but instead provides a brief introduction to |
| each feature. For full details, you should refer to the documentation |
| for Python 2.4, such as the \citetitle[../lib/lib.html]{Python Library |
| Reference} and the \citetitle[../ref/ref.html]{Python Reference |
| Manual}. For explanations of a feature's implementation and design |
| rationale, you will often be referred to its PEP. |
| |
| |
| %====================================================================== |
| \section{PEP 218: Built-In Set Objects} |
| |
| Python 2.3 introduced the \module{sets} module. C implementations of |
| set data types have now been added to the Python core as two new |
| built-in types, \function{set(\var{iterable})} and |
| \function{frozenset(\var{iterable})}. They provide high-speed |
| operations for membership testing, for eliminating duplicates from |
| sequences, and for mathematical operations like unions, intersections, |
| differences, and symmetric differences. |
| |
| \begin{verbatim} |
| >>> a = set('abracadabra') # form a set from a string |
| >>> 'z' in a # fast membership testing |
| False |
| >>> a # unique letters in a |
| set(['a', 'r', 'b', 'c', 'd']) |
| >>> ''.join(a) # convert back into a string |
| 'arbcd' |
| |
| >>> b = set('alacazam') # form a second set |
| >>> a - b # letters in a but not in b |
| set(['r', 'd', 'b']) |
| >>> a | b # letters in either a or b |
| set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l']) |
| >>> a & b # letters in both a and b |
| set(['a', 'c']) |
| >>> a ^ b # letters in a or b but not both |
| set(['r', 'd', 'b', 'm', 'z', 'l']) |
| |
| >>> a.add('z') # add a new element |
| >>> a.update('wxy') # add multiple new elements |
| >>> a |
| set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'x', 'z']) |
| >>> a.remove('x') # take one element out |
| >>> a |
| set(['a', 'c', 'b', 'd', 'r', 'w', 'y', 'z']) |
| \end{verbatim} |
| |
| The \function{frozenset} type is an immutable version of \function{set}. |
| Since it is immutable and hashable, it may be used as a dictionary key or |
| as a member of another set. |
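
As a minimal sketch of the hashability point, the following uses a
\function{frozenset} as a dictionary key and as a set member. The
\code{distances} and \code{groups} names and their contents are
hypothetical, chosen only for illustration.

```python
# Hypothetical data: map an unordered pair of endpoints to a distance.
distances = {}
distances[frozenset(['A', 'B'])] = 5

# Element order does not matter, so the reversed pair finds the same entry.
lookup = distances[frozenset(['B', 'A'])]

# A frozenset can also be a member of an ordinary set.
groups = set([frozenset('ab'), frozenset('cd')])
has_ab = frozenset('ba') in groups

# A mutable set is unhashable and is rejected as a dictionary key.
try:
    distances[set(['A', 'B'])] = 7
    mutable_key_rejected = False
except TypeError:
    mutable_key_rejected = True
```

Here \code{lookup} is 5, \code{has_ab} is true, and the attempt to use a
mutable \function{set} as a key raises \exception{TypeError}.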
| |
| The \module{sets} module remains in the standard library, and may be |
| useful if you wish to subclass the \class{Set} or \class{ImmutableSet} |
| classes. There are currently no plans to deprecate the module. |
| |
| \begin{seealso} |
| \seepep{218}{Adding a Built-In Set Object Type}{Originally proposed by |
| Greg Wilson and ultimately implemented by Raymond Hettinger.} |
| \end{seealso} |
| |
| |
| %====================================================================== |
| \section{PEP 237: Unifying Long Integers and Integers} |
| |
| The lengthy transition process for this PEP, begun in Python 2.2, |
| takes another step forward in Python 2.4. In 2.3, certain integer |
| operations that would behave differently after int/long unification |
| triggered \exception{FutureWarning} warnings and returned values |
| limited to 32 or 64 bits (depending on your platform). In 2.4, these |
| expressions no longer produce a warning and instead produce a |
| different result that's usually a long integer. |
| |
| The problematic expressions are primarily left shifts and lengthy |
| hexadecimal and octal constants. For example, |
| \code{2 \textless{}\textless{} 32} results |
| in a warning in 2.3, evaluating to 0 on 32-bit platforms. In Python |
| 2.4, this expression now returns the correct answer, 8589934592. |
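
The new behaviour can be checked with a short sketch. The variable names
are illustrative only; the hexadecimal constant is an assumed second
example of a value that no longer fits in 32 bits.

```python
# A left shift that overflows 32 bits now yields the full result.
shifted = 2 << 32            # same as 2 * 2**32

# A hexadecimal constant with the top bit of a 32-bit word set is
# treated as a positive number rather than wrapping to a negative one.
hex_const = 0xFFFFFFFF

results_match = (shifted == 8589934592 and hex_const == 4294967295)
```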
| |
| \begin{seealso} |
| \seepep{237}{Unifying Long Integers and Integers}{Original PEP |
| written by Moshe Zadka and GvR. The changes for 2.4 were implemented by |
| Kalle Svensson.} |
| \end{seealso} |
| |
| |
| %====================================================================== |
| \section{PEP 289: Generator Expressions} |
| |
| The iterator feature introduced in Python 2.2 and the |
| \module{itertools} module make it easier to write programs that loop |
| through large data sets without having the entire data set in memory |
| at one time. List comprehensions don't fit into this picture very |
| well because they produce a Python list object containing all of the |
| items. This unavoidably pulls all of the objects into memory, which |
| can be a problem if your data set is very large. When trying to write |
| a functionally-styled program, it would be natural to write something |
| like: |
| |
| \begin{verbatim} |
| links = [link for link in get_all_links() if not link.followed] |
| for link in links: |
| ... |
| \end{verbatim} |
| |
| instead of |
| |
| \begin{verbatim} |
| for link in get_all_links(): |
| if link.followed: |
| continue |
| ... |
| \end{verbatim} |
| |
| The first form is more concise and perhaps more readable, but if |
| you're dealing with a large number of link objects you'd have to write |
| the second form to avoid having all link objects in memory at the same |
| time. |
| |
| Generator expressions work similarly to list comprehensions but don't |
| materialize the entire list; instead they create a generator that will |
| return elements one by one. The above example could be written as: |
| |
| \begin{verbatim} |
| links = (link for link in get_all_links() if not link.followed) |
| for link in links: |
| ... |
| \end{verbatim} |
| |
| Generator expressions always have to be written inside parentheses, as |
| in the above example. The parentheses signalling a function call also |
| count, so if you want to create an iterator that will be immediately |
| passed to a function you could write: |
| |
| \begin{verbatim} |
| print sum(obj.count for obj in list_all_objects()) |
| \end{verbatim} |
| |
| Generator expressions differ from list comprehensions in various small |
| ways. Most notably, the loop variable (\var{obj} in the above |
| example) is not accessible outside of the generator expression. List |
| comprehensions leave the variable assigned to its last value; future |
| versions of Python will change this, making list comprehensions match |
| generator expressions in this respect. |
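
The scoping rule can be illustrated with a small sketch. The function
name \function{loop_variable_leaks} and the variable \code{item_value}
are hypothetical; the function reports whether the loop variable is
still visible once the generator expression has been consumed.

```python
def loop_variable_leaks():
    # Sum the squares 0, 1, 4, 9 using a generator expression.
    total = sum(item_value * item_value for item_value in range(4))
    try:
        item_value              # not defined in this scope
    except NameError:
        return total, False     # the loop variable did not leak
    return total, True

total, leaked = loop_variable_leaks()
```

With a generator expression, \code{total} is 14 and \code{leaked} is
false.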
| |
| \begin{seealso} |
| \seepep{289}{Generator Expressions}{Proposed by Raymond Hettinger and |
| implemented by Jiwon Seo with early efforts steered by Hye-Shik Chang.} |
| \end{seealso} |
| |
| |
| %====================================================================== |
| \section{PEP 292: Simpler String Substitutions} |
| |
| Some new classes in the standard library provide an alternative |
| mechanism for substituting variables into strings; this style of |
| substitution may be better for applications where untrained |
| users need to edit templates. |
| |
| The usual way of substituting variables by name is the \code{\%} |
| operator: |
| |
| \begin{verbatim} |
| >>> '%(page)i: %(title)s' % {'page':2, 'title': 'The Best of Times'} |
| '2: The Best of Times' |
| \end{verbatim} |
| |
| When writing the template string, it can be easy to forget the |
| \samp{i} or \samp{s} after the closing parenthesis. This isn't a big |
| problem if the template is in a Python module, because you run the |
| code, get an ``Unsupported format character'' \exception{ValueError}, |
| and fix the problem. However, consider an application such as Mailman |
| where template strings or translations are being edited by users who |
| aren't aware of the Python language. The format string's syntax is |
| complicated to explain to such users, and if they make a mistake, it's |
| difficult to provide helpful feedback to them. |
| |
| PEP 292 adds a \class{Template} class to the \module{string} module |
| that uses \samp{\$} to indicate a substitution: |
| |
| \begin{verbatim} |
| >>> import string |
| >>> t = string.Template('$page: $title') |
| >>> t.substitute({'page':2, 'title': 'The Best of Times'}) |
| '2: The Best of Times' |
| \end{verbatim} |
| |
| % $ Terminate $-mode for Emacs |
| |
| If a key is missing from the dictionary, the \method{substitute} method |
| will raise a \exception{KeyError}. There's also a \method{safe_substitute} |
| method that ignores missing keys: |
| |
| \begin{verbatim} |
| >>> t = string.Template('$page: $title') |
| >>> t.safe_substitute({'page':3}) |
| '3: $title' |
| \end{verbatim} |
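
For contrast, here is a small sketch of the \exception{KeyError} raised
by \method{substitute()} when a key is missing. The flag name
\code{missing_key_reported} is illustrative; the last line also shows
that values can be supplied as keyword arguments.

```python
import string

t = string.Template('$page: $title')

# substitute() insists on a value for every placeholder.
try:
    t.substitute({'page': 3})
    missing_key_reported = False
except KeyError:
    missing_key_reported = True

# Values may also be passed as keyword arguments.
complete = t.substitute(page=3, title='The Best of Times')
```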
| |
| % $ Terminate math-mode for Emacs |
| |
| |
| \begin{seealso} |
| \seepep{292}{Simpler String Substitutions}{Written and implemented |
| by Barry Warsaw.} |
| \end{seealso} |
| |
| |
| %====================================================================== |
| \section{PEP 318: Decorators for Functions and Methods} |
| |
| Python 2.2 extended Python's object model by adding static methods and |
| class methods, but it didn't extend Python's syntax to provide any new |
| way of defining static or class methods. Instead, you had to write a |
| \keyword{def} statement in the usual way, and pass the resulting |
| method to a \function{staticmethod()} or \function{classmethod()} |
| function that would wrap up the function as a method of the new type. |
| Your code would look like this: |
| |
| \begin{verbatim} |
| class C: |
| def meth (cls): |
| ... |
| |
| meth = classmethod(meth) # Rebind name to wrapped-up class method |
| \end{verbatim} |
| |
| If the method was very long, it would be easy to miss or forget the |
| \function{classmethod()} invocation after the function body. |
| |
| The intention was always to add some syntax to make such definitions |
| more readable, but at the time of 2.2's release a good syntax was not |
| obvious. Today a good syntax \emph{still} isn't obvious but users are |
| asking for easier access to the feature; a new syntactic feature has |
| been added to meet this need. |
| |
| The new feature is called ``function decorators''. The name comes |
| from the idea that \function{classmethod}, \function{staticmethod}, |
| and friends are storing additional information on a function object; |
| they're \emph{decorating} functions with more details. |
| |
| The notation borrows from Java and uses the \character{@} character as an |
| indicator. Using the new syntax, the example above would be written: |
| |
| \begin{verbatim} |
| class C: |
| |
| @classmethod |
| def meth (cls): |
| ... |
| |
| \end{verbatim} |
| |
| The \code{@classmethod} is shorthand for the |
| \code{meth=classmethod(meth)} assignment. More generally, if you have |
| the following: |
| |
| \begin{verbatim} |
| @A |
| @B |
| @C |
| def f (): |
| ... |
| \end{verbatim} |
| |
| It's equivalent to the following pre-decorator code: |
| |
| \begin{verbatim} |
| def f(): ... |
| f = A(B(C(f))) |
| \end{verbatim} |
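
The order of application can be made visible with a sketch. The helper
\function{make_tracer} is hypothetical; it builds decorators that record
their label in the \code{applied} list and return the function
unchanged.

```python
applied = []

def make_tracer(label):
    # Hypothetical helper: build a decorator that records when it is applied.
    def decorator(func):
        applied.append(label)
        return func
    return decorator

A = make_tracer('A')
B = make_tracer('B')
C = make_tracer('C')

@A
@B
@C
def f():
    pass
```

After the definition, \code{applied} is \code{['C', 'B', 'A']}: the
decorator closest to the \keyword{def} statement runs first, matching
\code{f = A(B(C(f)))}.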
| |
| Decorators must come on the line before a function definition, one decorator |
| per line, and can't be on the same line as the def statement, meaning that |
| \code{@A def f(): ...} is illegal. You can only decorate function |
| definitions, either at the module level or inside a class; you can't |
| decorate class definitions. |
| |
| A decorator is just a function that takes the function to be decorated as an |
| argument and returns either the same function or some new object. The |
| return value of the decorator need not be callable (though it typically is), |
| unless further decorators will be applied to the result. It's easy to write |
| your own decorators. The following simple example just sets an attribute on |
| the function object: |
| |
| \begin{verbatim} |
| >>> def deco(func): |
| ... func.attr = 'decorated' |
| ... return func |
| ... |
| >>> @deco |
| ... def f(): pass |
| ... |
| >>> f |
| <function f at 0x402ef0d4> |
| >>> f.attr |
| 'decorated' |
| >>> |
| \end{verbatim} |
| |
| As a slightly more realistic example, the following decorator checks |
| that the supplied argument is an integer: |
| |
| \begin{verbatim} |
| def require_int (func): |
| def wrapper (arg): |
| assert isinstance(arg, int) |
| return func(arg) |
| |
| return wrapper |
| |
| @require_int |
| def p1 (arg): |
| print arg |
| |
| @require_int |
| def p2(arg): |
| print arg*2 |
| \end{verbatim} |
| |
| An example in \pep{318} contains a fancier version of this idea that |
| lets you both specify the required type and check the returned type. |
| |
| Decorator functions can take arguments. If arguments are supplied, |
| your decorator function is called with only those arguments and must |
| return a new decorator function; this function must take a single |
| function and return a function, as previously described. In other |
| words, \code{@A @B @C(args)} becomes: |
| |
| \begin{verbatim} |
| def f(): ... |
| _deco = C(args) |
| f = A(B(_deco(f))) |
| \end{verbatim} |
| |
| Getting this right can be slightly brain-bending, but it's not too |
| difficult. |
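
As a sketch of a decorator that takes arguments, the following
generalizes the integer check to any type. The names
\function{require_type} and \function{double} are illustrative and are
not taken from the PEP.

```python
def require_type(expected):
    # Called first, with the decorator's own arguments only.
    def decorator(func):
        # Called second, with the function being decorated.
        def wrapper(arg):
            if not isinstance(arg, expected):
                raise TypeError('expected %s' % expected.__name__)
            return func(arg)
        return wrapper
    return decorator

@require_type(int)
def double(arg):
    return arg * 2
```

Calling \code{double(21)} returns 42, while \code{double('x')} raises
\exception{TypeError} before the original function runs.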
| |
| A small related change makes the \member{func_name} attribute of |
| functions writable. This attribute is used to display function names |
| in tracebacks, so decorators should change the name of any new |
| function that's constructed and returned. |
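
A minimal sketch of such a decorator follows. It assigns to
\member{__name__}, which is assumed here to be the same attribute as
\member{func_name} under another spelling; the names \function{logged}
and \function{compute} are hypothetical.

```python
def logged(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    # Copy the original name so tracebacks mention it, not 'wrapper'.
    wrapper.__name__ = func.__name__
    return wrapper

@logged
def compute():
    return 42
```

Without the assignment, \code{compute.__name__} would be
\code{'wrapper'}.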
| |
| \begin{seealso} |
| \seepep{318}{Decorators for Functions, Methods and Classes}{Written |
| by Kevin D. Smith, Jim Jewett, and Skip Montanaro. Several people |
| wrote patches implementing function decorators, but the one that was |
| actually checked in was patch \#979728, written by Mark Russell.} |
| |
| \seeurl{http://www.python.org/moin/PythonDecoratorLibrary} |
| {This Wiki page contains several examples of decorators.} |
| |
| \end{seealso} |
| |
| |
| %====================================================================== |
| \section{PEP 322: Reverse Iteration} |
| |
| A new built-in function, \function{reversed(\var{seq})}, takes a sequence |
| and returns an iterator that loops over the elements of the sequence |
| in reverse order. |
| |
| \begin{verbatim} |
| >>> for i in reversed(xrange(1,4)): |
| ... print i |
| ... |
| 3 |
| 2 |
| 1 |
| \end{verbatim} |
| |
| Compared to extended slicing, such as \code{range(1,4)[::-1]}, |
| \function{reversed()} is easier to read, runs faster, and uses |
| substantially less memory. |
| |
| Note that \function{reversed()} only accepts sequences, not arbitrary |
| iterators. If you want to reverse an iterator, first convert it to |
| a list with \function{list()}. |
| |
| \begin{verbatim} |
| >>> input = open('/etc/passwd', 'r') |
| >>> for line in reversed(list(input)): |
| ... print line |
| ... |
| root:*:0:0:System Administrator:/var/root:/bin/tcsh |
| ... |
| \end{verbatim} |
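
The restriction to sequences can be seen in a self-contained sketch that
does not depend on a file being present. The generator \code{squares} is
an illustrative stand-in for any iterator.

```python
squares = (n * n for n in range(1, 4))   # a generator, not a sequence

# reversed() refuses the generator without consuming it.
try:
    reversed(squares)
    accepted = True
except TypeError:
    accepted = False

# Converting to a list first makes reverse iteration possible.
backwards = list(reversed(list(squares)))
```

Here \code{accepted} is false and \code{backwards} is \code{[9, 4, 1]}.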
| |
| \begin{seealso} |
| \seepep{322}{Reverse Iteration}{Written and implemented by Raymond Hettinger.} |
| |
| \end{seealso} |
| |
| |
| %====================================================================== |
| \section{PEP 324: New subprocess Module} |
| |
| The standard library provides a number of ways to execute a |
| subprocess, offering different features and different levels of |
| complexity. \function{os.system(\var{command})} is easy to use, but |
| slow (it runs a shell process which executes the command) and |
| dangerous (you have to be careful about escaping the shell's |
| metacharacters). The \module{popen2} module offers classes that can |
| capture standard output and standard error from the subprocess, but |
| the naming is confusing. The \module{subprocess} module cleans |
| this up, providing a unified interface that offers all the features |
| you might need. |
| |
| Instead of \module{popen2}'s collection of classes, |
| \module{subprocess} contains a single class called \class{Popen} |
| whose constructor supports a number of different keyword arguments. |
| |
| \begin{verbatim} |
| class Popen(args, bufsize=0, executable=None, |
| stdin=None, stdout=None, stderr=None, |
| preexec_fn=None, close_fds=False, shell=False, |
| cwd=None, env=None, universal_newlines=False, |
| startupinfo=None, creationflags=0): |
| \end{verbatim} |
| |
| \var{args} is commonly a sequence of strings that will be the |
| arguments to the program executed as the subprocess. (If the |
| \var{shell} argument is true, \var{args} can be a string which will |
| then be passed on to the shell for interpretation, just as |
| \function{os.system()} does.) |
| |
| \var{stdin}, \var{stdout}, and \var{stderr} specify what the |
| subprocess's input, output, and error streams will be. You can |
| provide a file object or a file descriptor, or you can use the |
| constant \code{subprocess.PIPE} to create a pipe between the |
| subprocess and the parent. |
| |
| The constructor has a number of handy options: |
| |
| \begin{itemize} |
| \item \var{close_fds} requests that all file descriptors be closed |
| before running the subprocess. |
| |
| \item \var{cwd} specifies the working directory in which the |
| subprocess will be executed (defaulting to whatever the parent's |
| working directory is). |
| |
| \item \var{env} is a dictionary specifying environment variables. |
| |
| \item \var{preexec_fn} is a function that gets called before the |
| child is started. |
| |
| \item \var{universal_newlines} opens the child's input and output |
| using Python's universal newline feature. |
| |
| \end{itemize} |
| |
| Once you've created the \class{Popen} instance, |
| you can call its \method{wait()} method to pause until the subprocess |
| has exited, \method{poll()} to check if it's exited without pausing, |
| or \method{communicate(\var{data})} to send the string \var{data} to |
| the subprocess's standard input. \method{communicate(\var{data})} |
| then reads any data that the subprocess has sent to its standard output |
| or standard error, returning a tuple \code{(\var{stdout_data}, |
| \var{stderr_data})}. |
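
The following sketch ties these pieces together. It assumes that
\code{sys.executable} points at a usable Python interpreter, and the
child program in \code{child_code} is a hypothetical one-liner that
upper-cases its input.

```python
import subprocess
import sys

# Hypothetical child program: upper-case whatever arrives on standard input.
child_code = 'import sys; sys.stdout.write(sys.stdin.read().upper())'

proc = subprocess.Popen([sys.executable, '-c', child_code],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        universal_newlines=True)

# Send the input, close the pipe, and collect the output in one step.
stdout_data, stderr_data = proc.communicate('hello\n')
status = proc.returncode
```

Because \var{stderr} was not redirected to a pipe, \code{stderr_data} is
\code{None}; \code{stdout_data} holds the upper-cased text and
\code{status} is 0 on success.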
| |
| \function{call()} is a shortcut that passes its arguments along to the |
| \class{Popen} constructor, waits for the command to complete, and |
| returns the status code of the subprocess. It can serve as a safer |
| analog to \function{os.system()}: |
| |
| \begin{verbatim} |
| sts = subprocess.call(['dpkg', '-i', '/tmp/new-package.deb']) |
| if sts == 0: |
| # Success |
| ... |
| else: |
| # dpkg returned an error |
| ... |
| \end{verbatim} |
| |
| The command is invoked without use of the shell. If you really do want to |
| use the shell, you can add \code{shell=True} as a keyword argument and provide |
| a string instead of a sequence: |
| |
| \begin{verbatim} |
| sts = subprocess.call('dpkg -i /tmp/new-package.deb', shell=True) |
| \end{verbatim} |
| |
| The PEP takes various examples of shell and Python code and shows how |
| they'd be translated into Python code that uses \module{subprocess}. |
| Reading this section of the PEP is highly recommended. |
| |
| \begin{seealso} |
| \seepep{324}{subprocess - New process module}{Written and implemented by Peter {\AA}strand, with assistance from Fredrik Lundh and others.} |
| \end{seealso} |
| |
| |
| %====================================================================== |
| \section{PEP 327: Decimal Data Type} |
| |
| Python has always supported floating-point (FP) numbers, based on the |
| underlying C \ctype{double} type, as a data type. However, while most |
| programming languages provide a floating-point type, many people (even |
| programmers) are unaware that floating-point numbers don't represent |
| certain decimal fractions accurately. The new \class{Decimal} type |
| can represent these fractions accurately, up to a user-specified |
| precision limit. |
| |
| |
| \subsection{Why is Decimal needed?} |
| |
| The limitations arise from the representation used for floating-point numbers. |
| FP numbers are made up of three components: |
| |
| \begin{itemize} |
| \item The sign, which is positive or negative. |
| \item The mantissa, which is a single-digit binary number |
| followed by a fractional part. For example, \code{1.01} in base-2 notation |
| is \code{1 + 0/2 + 1/4}, or 1.25 in decimal notation. |
| \item The exponent, which tells where the decimal point is located in the number represented. |
| \end{itemize} |
| |
| For example, the number 1.25 has positive sign, a mantissa value of |
| 1.01 (in binary), and an exponent of 0 (the decimal point doesn't need |
| to be shifted). The number 5 has the same sign and mantissa, but the |
| exponent is 2 because the mantissa is multiplied by 4 (2 to the power |
| of the exponent 2); 1.25 * 4 equals 5. |
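
This decomposition can be reproduced with \function{math.frexp()}, as in
the sketch below. The helper \function{decompose} is hypothetical;
\function{frexp()} returns a fraction between 0.5 and 1, so the sketch
rescales it to the form used in this section, with a mantissa between 1
and 2.

```python
import math

def decompose(x):
    """Return (sign, mantissa, exponent) for nonzero x, with 1 <= mantissa < 2."""
    if x < 0:
        sign = '-'
    else:
        sign = '+'
    fraction, exponent = math.frexp(abs(x))   # 0.5 <= fraction < 1
    # Shift one binary place so the mantissa has a single leading digit.
    return sign, fraction * 2, exponent - 1

parts_of_1_25 = decompose(1.25)
parts_of_5 = decompose(5.0)
```

Under these assumptions \code{decompose(1.25)} gives
\code{('+', 1.25, 0)} and \code{decompose(5.0)} gives
\code{('+', 1.25, 2)}.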
| |
| Modern systems usually provide floating-point support that conforms to |
| a standard called IEEE 754. C's \ctype{double} type is usually |
| implemented as a 64-bit IEEE 754 number, which uses 52 bits of space |
| for the mantissa. This means that numbers can only be specified to 52 |
| bits of precision. If you're trying to represent numbers whose |
| expansion repeats endlessly, the expansion is cut off after 52 bits. |
| Unfortunately, most software needs to produce output in base 10, and |
| common fractions in base 10 are often repeating fractions in binary. |
| For example, 1.1 decimal is binary \code{1.0001100110011 ...}; .1 = |
| 1/16 + 1/32 + 1/256 plus an infinite number of additional terms. IEEE |
| 754 has to chop off that infinitely repeating expansion after 52 binary |
| digits, so the representation is slightly inaccurate. |
| |
| Sometimes you can see this inaccuracy when the number is printed: |
| \begin{verbatim} |
| >>> 1.1 |
| 1.1000000000000001 |
| \end{verbatim} |
| |
| The inaccuracy isn't always visible when you print the number because |
| the FP-to-decimal-string conversion is provided by the C library, and |
| most C libraries try to produce sensible output. Even if it's not |
| displayed, however, the inaccuracy is still there and subsequent |
| operations can magnify the error. |
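
A short sketch shows the error surfacing after only a few operations.
The variable names are illustrative; the loop adds the floating-point
value closest to 0.1 ten times.

```python
total = 0.0
for i in range(10):
    total += 0.1

# The accumulated sum is close to 1.0 but not equal to it.
exact = (total == 1.0)
error = abs(total - 1.0)
```

Here \code{exact} is false, and \code{error} is tiny but nonzero.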
| |
| For many applications this doesn't matter. If I'm plotting points and |
| displaying them on my monitor, the difference between 1.1 and |
| 1.1000000000000001 is too small to be visible. Reports often limit |
| output to a certain number of decimal places, and if you round the |
| number to two or three or even eight decimal places, the error is |
| never apparent. However, for applications where it does matter, |
| it's a lot of work to implement your own custom arithmetic routines. |
| |
| Hence, the \class{Decimal} type was created. |
| |
| \subsection{The \class{Decimal} type} |
| |
| A new module, \module{decimal}, was added to Python's standard |
| library. It contains two classes, \class{Decimal} and |
| \class{Context}. \class{Decimal} instances represent numbers, and |
| \class{Context} instances are used to wrap up various settings such as |
| the precision and default rounding mode. |
| |
| \class{Decimal} instances are immutable, like regular Python integers |
| and FP numbers; once an instance has been created, you can't change the |
| value it represents. \class{Decimal} instances can be created from |
| integers or strings: |
| |
| \begin{verbatim} |
| >>> import decimal |
| >>> decimal.Decimal(1972) |
| Decimal("1972") |
| >>> decimal.Decimal("1.1") |
| Decimal("1.1") |
| \end{verbatim} |
| |
| You can also provide tuples containing the sign, the mantissa represented |
| as a tuple of decimal digits, and the exponent: |
| |
| \begin{verbatim} |
| >>> decimal.Decimal((1, (1, 4, 7, 5), -2)) |
| Decimal("-14.75") |
| \end{verbatim} |
| |
| Cautionary note: the sign bit is a Boolean value, so 0 is positive and |
| 1 is negative. |
| |
| Converting from floating-point numbers poses a bit of a problem: |
| should the FP number representing 1.1 turn into the decimal number for |
| exactly 1.1, or for 1.1 plus whatever inaccuracies are introduced? |
| The decision was to dodge the issue and leave such a conversion out of |
| the API. Instead, you should convert the floating-point number into a |
| string using the desired precision and pass the string to the |
| \class{Decimal} constructor: |
| |
| \begin{verbatim} |
| >>> f = 1.1 |
| >>> decimal.Decimal(str(f)) |
| Decimal("1.1") |
| >>> decimal.Decimal('%.12f' % f) |
| Decimal("1.100000000000") |
| \end{verbatim} |
| |
| Once you have \class{Decimal} instances, you can perform the usual |
| mathematical operations on them. One limitation: exponentiation |
| requires an integer exponent: |
| |
| \begin{verbatim} |
| >>> a = decimal.Decimal('35.72') |
| >>> b = decimal.Decimal('1.73') |
| >>> a+b |
| Decimal("37.45") |
| >>> a-b |
| Decimal("33.99") |
| >>> a*b |
| Decimal("61.7956") |
| >>> a/b |
| Decimal("20.64739884393063583815028902") |
| >>> a ** 2 |
| Decimal("1275.9184") |
| >>> a**b |
| Traceback (most recent call last): |
| ... |
| decimal.InvalidOperation: x ** (non-integer) |
| \end{verbatim} |
| |
| You can combine \class{Decimal} instances with integers, but not with |
| floating-point numbers: |
| |
| \begin{verbatim} |
| >>> a + 4 |
| Decimal("39.72") |
| >>> a + 4.5 |
| Traceback (most recent call last): |
| ... |
| TypeError: You can interact Decimal only with int, long or Decimal data types. |
| >>> |
| \end{verbatim} |
| |
| \class{Decimal} numbers can be used with the \module{math} and |
| \module{cmath} modules, but note that they'll be immediately converted to |
| floating-point numbers before the operation is performed, resulting in |
| a possible loss of precision and accuracy. You'll also get back a |
| regular floating-point number and not a \class{Decimal}. |
| |
| \begin{verbatim} |
| >>> import math, cmath |
| >>> d = decimal.Decimal('123456789012.345') |
| >>> math.sqrt(d) |
| 351364.18288201344 |
| >>> cmath.sqrt(-d) |
| 351364.18288201344j |
| \end{verbatim} |
| |
| \class{Decimal} instances have a \method{sqrt()} method that |
| returns a \class{Decimal}, but if you need other things such as |
| trigonometric functions you'll have to implement them. |
| |
| \begin{verbatim} |
| >>> d.sqrt() |
| Decimal("351364.1828820134592177245001") |
| \end{verbatim} |
| |
| |
| \subsection{The \class{Context} type} |
| |
| Instances of the \class{Context} class encapsulate several settings for |
| decimal operations: |
| |
| \begin{itemize} |
| \item \member{prec} is the precision, the number of significant decimal digits kept in results. |
| \item \member{rounding} specifies the rounding mode. The \module{decimal} |
| module has constants for the various possibilities: |
| \constant{ROUND_DOWN}, \constant{ROUND_CEILING}, |
| \constant{ROUND_HALF_EVEN}, and various others. |
| \item \member{traps} is a dictionary specifying what happens on |
| encountering certain error conditions: either an exception is raised or |
| a value is returned. Some examples of error conditions are |
| division by zero, loss of precision, and overflow. |
| \end{itemize} |
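
As a sketch of how these settings interact, the following builds two
explicit \class{Context} instances instead of changing the default one.
The names \code{truncating} and \code{nearest} are hypothetical; both
contexts keep three significant digits and differ only in rounding mode.

```python
import decimal

two, three = decimal.Decimal(2), decimal.Decimal(3)

# Same precision, different rounding modes.
truncating = decimal.Context(prec=3, rounding=decimal.ROUND_DOWN)
nearest = decimal.Context(prec=3, rounding=decimal.ROUND_HALF_EVEN)

# Perform the division under each context rather than the default one.
down = truncating.divide(two, three)
even = nearest.divide(two, three)
```

Under these assumptions \code{str(down)} is \code{'0.666'} and
\code{str(even)} is \code{'0.667'}.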
| |
| There's a thread-local default context available by calling |
| \function{getcontext()}; you can change the properties of this context |
| to alter the default precision, rounding, or trap handling. The |
| following example shows the effect of changing the precision of the default |
| context: |
| |
| \begin{verbatim} |
| >>> decimal.getcontext().prec |
| 28 |
| >>> decimal.Decimal(1) / decimal.Decimal(7) |
| Decimal("0.1428571428571428571428571429") |
| >>> decimal.getcontext().prec = 9 |
| >>> decimal.Decimal(1) / decimal.Decimal(7) |
| Decimal("0.142857143") |
| \end{verbatim} |
| |
| The default action for error conditions is selectable; the module can |
| either return a special value such as infinity or not-a-number, or |
| exceptions can be raised: |
| |
| \begin{verbatim} |
| >>> decimal.Decimal(1) / decimal.Decimal(0) |
| Traceback (most recent call last): |
| ... |
| decimal.DivisionByZero: x / 0 |
| >>> decimal.getcontext().traps[decimal.DivisionByZero] = False |
| >>> decimal.Decimal(1) / decimal.Decimal(0) |
| Decimal("Infinity") |
| >>> |
| \end{verbatim} |
| |
| \class{Context} instances also have various methods for formatting |
| numbers, such as \method{to_eng_string()} and \method{to_sci_string()}. |
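
The difference between the two formats is easiest to see with a sketch.
The value \code{1.23E+7} is an arbitrary illustration; engineering
notation keeps the exponent a multiple of three.

```python
import decimal

ctx = decimal.getcontext()
d = decimal.Decimal('1.23E+7')

sci = ctx.to_sci_string(d)   # scientific notation
eng = ctx.to_eng_string(d)   # engineering notation
```

Here \code{sci} is \code{'1.23E+7'} and \code{eng} is \code{'12.3E+6'}.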
| |
| For more information, see the documentation for the \module{decimal} |
| module, which includes a quick-start tutorial and a reference. |
| |
| \begin{seealso} |
| \seepep{327}{Decimal Data Type}{Written by Facundo Batista and implemented |
| by Facundo Batista, Eric Price, Raymond Hettinger, Aahz, and Tim Peters.} |
| |
| \seeurl{http://research.microsoft.com/\textasciitilde hollasch/cgindex/coding/ieeefloat.html} |
| {A more detailed overview of the IEEE-754 representation.} |
| |
| \seeurl{http://www.lahey.com/float.htm} |
| {The article uses Fortran code to illustrate many of the problems |
| that floating-point inaccuracy can cause.} |
| |
| \seeurl{http://www2.hursley.ibm.com/decimal/} |
| {A description of a decimal-based representation. This representation |
| is being proposed as a standard, and underlies the new Python decimal |
| type. Much of this material was written by Mike Cowlishaw, designer of the |
| Rexx language.} |
| |
| \end{seealso} |
| |
| |
| %====================================================================== |
| \section{PEP 328: Multi-line Imports} |
| |
| One language change is a small syntactic tweak aimed at making it |
| easier to import many names from a module. In a |
| \code{from \var{module} import \var{names}} statement, |
| \var{names} is a sequence of names separated by commas. If the sequence is |
| very long, you can either write multiple imports from the same module, |
| or you can use backslashes to escape the line endings like this: |
| |
| \begin{verbatim} |
| from SimpleXMLRPCServer import SimpleXMLRPCServer,\ |
| SimpleXMLRPCRequestHandler,\ |
| CGIXMLRPCRequestHandler,\ |
| resolve_dotted_attribute |
| \end{verbatim} |
| |
| The syntactic change in Python 2.4 simply allows putting the names |
| within parentheses. Python ignores newlines within a parenthesized |
| expression, so the backslashes are no longer needed: |
| |
| \begin{verbatim} |
| from SimpleXMLRPCServer import (SimpleXMLRPCServer, |
| SimpleXMLRPCRequestHandler, |
| CGIXMLRPCRequestHandler, |
| resolve_dotted_attribute) |
| \end{verbatim} |
| |
| The PEP also proposes that all \keyword{import} statements be absolute |
| imports, with a leading \samp{.} character to indicate a relative |
| import. This part of the PEP was not implemented for Python 2.4, |
| but was completed for Python 2.5. |
| |
| \begin{seealso} |
| \seepep{328}{Imports: Multi-Line and Absolute/Relative} |
| {Written by Aahz. Multi-line imports were implemented by |
| Dima Dorfman.} |
| \end{seealso} |
| |
| |
| %====================================================================== |
| \section{PEP 331: Locale-Independent Float/String Conversions} |
| |
| The \module{locale} module lets Python software select various |
| conversions and display conventions that are localized to a particular |
| country or language. However, the module was careful to not change |
| the numeric locale because various functions in Python's |
| implementation required that the numeric locale remain set to the |
| \code{'C'} locale. Often this was because the code was using the C library's |
| \cfunction{atof()} function. |
| |
| Not setting the numeric locale caused trouble for extensions that used |
| third-party C libraries, however, because they wouldn't have the |
| correct locale set. The motivating example was GTK+, whose user |
| interface widgets weren't displaying numbers in the current locale. |
| |
| The solution described in the PEP is to add three new functions to the |
| Python API that perform ASCII-only conversions, ignoring the locale |
| setting: |
| |
| \begin{itemize} |
| \item \cfunction{PyOS_ascii_strtod(\var{str}, \var{ptr})} |
| and \cfunction{PyOS_ascii_atof(\var{str})} |
| both convert a string to a C \ctype{double}. |
| \item \cfunction{PyOS_ascii_formatd(\var{buffer}, \var{buf_len}, \var{format}, \var{d})} converts a \ctype{double} to an ASCII string. |
| \end{itemize} |
| |
| The code for these functions came from the GLib library |
| (\url{http://developer.gnome.org/arch/gtk/glib.html}), whose |
| developers kindly relicensed the relevant functions and donated them |
| to the Python Software Foundation. The \module{locale} module |
| can now change the numeric locale, letting extensions such as GTK+ |
| produce the correct results. |
| |
| \begin{seealso} |
| \seepep{331}{Locale-Independent Float/String Conversions} |
| {Written by Christian R. Reis, and implemented by Gustavo Carneiro.} |
| \end{seealso} |
| |
| %====================================================================== |
| \section{Other Language Changes} |
| |
| Here are all of the changes that Python 2.4 makes to the core Python |
| language. |
| |
| \begin{itemize} |
| |
| \item Decorators for functions and methods were added (\pep{318}). |
| |
| \item Built-in \function{set} and \function{frozenset} types were |
| added (\pep{218}). Other new built-ins include the \function{reversed(\var{seq})} function (\pep{322}). |
| |
| \item Generator expressions were added (\pep{289}). |
| |
| \item Certain numeric expressions no longer return values restricted to 32 or 64 bits (\pep{237}). |
| |
| \item You can now put parentheses around the list of names in a |
| \code{from \var{module} import \var{names}} statement (\pep{328}). |
| |
| \item The \method{dict.update()} method now accepts the same |
| argument forms as the \class{dict} constructor. This includes any |
| mapping, any iterable of key/value pairs, and keyword arguments. |
| (Contributed by Raymond Hettinger.) |
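
As a minimal sketch of the three argument forms, the following feeds
each one to \method{update()} in turn. The \code{settings} dictionary
and its keys are invented for illustration only.

```python
# Start from an empty dictionary and feed it each accepted argument form.
settings = {}

settings.update({'host': 'localhost'})                # any mapping
settings.update([('port', 8080), ('debug', False)])   # iterable of key/value pairs
settings.update(user='guest', timeout=30)             # keyword arguments

# A mapping and keyword arguments can also be combined in a single call;
# later values replace earlier ones for the same key.
settings.update({'debug': True}, timeout=60)
```

After these calls \code{settings} holds five keys, with \code{'debug'}
and \code{'timeout'} carrying the values from the final call.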
| |
| \item The string methods \method{ljust()}, \method{rjust()}, and |
| \method{center()} now take an optional argument for specifying a |
| fill character other than a space. |
| (Contributed by Raymond Hettinger.) |
| |
| \item Strings also gained an \method{rsplit()} method that |
| works like the \method{split()} method but splits from the end of |
| the string. |
| (Contributed by Sean Reifschneider.) |
| |
| \begin{verbatim} |
| >>> 'www.python.org'.split('.', 1) |
| ['www', 'python.org'] |
| >>> 'www.python.org'.rsplit('.', 1) |
| ['www.python', 'org'] |
| \end{verbatim} |
| |
| \item Three keyword parameters, \var{cmp}, \var{key}, and |
| \var{reverse}, were added to the \method{sort()} method of lists. |
| These parameters make some common usages of \method{sort()} simpler. |
| All of these parameters are optional. |
| |
| For the \var{cmp} parameter, the value should be a comparison function |
| that takes two parameters and returns -1, 0, or +1 depending on how |
| the parameters compare. This function will then be used to sort the |
| list. Previously this was the only parameter that could be provided |
| to \method{sort()}. |
| |
| \var{key} should be a single-parameter function that takes a list |
| element and returns a comparison key for the element. The list is |
| then sorted using the comparison keys. The following example sorts a |
| list case-insensitively: |
| |
| \begin{verbatim} |
| >>> L = ['A', 'b', 'c', 'D'] |
| >>> L.sort() # Case-sensitive sort |
| >>> L |
| ['A', 'D', 'b', 'c'] |
| >>> # Using 'key' parameter to sort list |
| >>> L.sort(key=lambda x: x.lower()) |
| >>> L |
| ['A', 'b', 'c', 'D'] |
| >>> # Old-fashioned way |
| >>> L.sort(cmp=lambda x,y: cmp(x.lower(), y.lower())) |
| >>> L |
| ['A', 'b', 'c', 'D'] |
| \end{verbatim} |
| |
| The last example, which uses the \var{cmp} parameter, is the old way |
| to perform a case-insensitive sort. It works but is slower than using |
| a \var{key} parameter. Using \var{key} calls the \method{lower()} method |
| once for each element in the list while using \var{cmp} will call it |
| twice for each comparison, so using \var{key} saves on invocations of |
| the \method{lower()} method. |
| |
| For simple key functions and comparison functions, it is often |
| possible to avoid a \keyword{lambda} expression by using an unbound |
| method instead. For example, the above case-insensitive sort is best |
| written as: |
| |
| \begin{verbatim} |
| >>> L.sort(key=str.lower) |
| >>> L |
| ['A', 'b', 'c', 'D'] |
| \end{verbatim} |
| |
| Finally, the \var{reverse} parameter takes a Boolean value. If the |
| value is true, the list will be sorted into reverse order. |
| Instead of \code{L.sort() ; L.reverse()}, you can now write |
| \code{L.sort(reverse=True)}. |
| |
| The results of sorting are now guaranteed to be stable. This means |
| that two entries with equal keys will be returned in the same order as |
| they were input. For example, you can sort a list of people by name, |
| and then sort the list by age, resulting in a list sorted by age where |
| people with the same age are in name-sorted order. |
| |
| (All changes to \method{sort()} contributed by Raymond Hettinger.) |
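
The name-then-age example described above can be sketched as follows.
The \code{people} list is made-up sample data, and the two
\keyword{lambda} key functions are just one way to pick out each field.

```python
# Each entry is a (name, age) pair; the sample data is purely illustrative.
people = [('Dave', 25), ('Alice', 30), ('Carol', 25), ('Bob', 30)]

# First pass: order everyone by name.
people.sort(key=lambda person: person[0])

# Second pass: order by age.  Because the sort is stable, entries with
# equal ages keep the name order established by the first pass.
people.sort(key=lambda person: person[1])
```

The secondary key (name) is sorted first and the primary key (age)
last; stability is what makes this two-pass approach work.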
| |
| \item There is a new built-in function |
| \function{sorted(\var{iterable})} that works like the in-place |
| \method{list.sort()} method but can be used in |
| expressions. The differences are: |
| \begin{itemize} |
| \item the input may be any iterable; |
| \item a newly formed copy is sorted, leaving the original intact; and |
| \item the expression returns the new sorted copy |
| \end{itemize} |
| |
| \begin{verbatim} |
| >>> L = [9,7,8,3,2,4,1,6,5] |
| >>> [10+i for i in sorted(L)] # usable in a list comprehension |
| [11, 12, 13, 14, 15, 16, 17, 18, 19] |
| >>> L # original is left unchanged |
| [9, 7, 8, 3, 2, 4, 1, 6, 5] |
| >>> sorted('Monty Python') # any iterable may be an input |
| [' ', 'M', 'P', 'h', 'n', 'n', 'o', 'o', 't', 't', 'y', 'y'] |
| |
| >>> # List the contents of a dict sorted by key values |
| >>> colormap = dict(red=1, blue=2, green=3, black=4, yellow=5) |
| >>> for k, v in sorted(colormap.iteritems()): |
| ...     print k, v |
| ... |
| black 4 |
| blue 2 |
| green 3 |
| red 1 |
| yellow 5 |
| \end{verbatim} |
| |
| (Contributed by Raymond Hettinger.) |
| |
| \item Integer operations will no longer trigger an \exception{OverflowWarning}. |
| The \exception{OverflowWarning} warning will disappear in Python 2.5. |
| |
| \item The interpreter gained a new switch, \programopt{-m}, that |
| takes a name, searches for the corresponding module on \code{sys.path}, |
| and runs the module as a script. For example, |
| you can now run the Python profiler with \code{python -m profile}. |
| (Contributed by Nick Coghlan.) |
| |
| \item The \function{eval(\var{expr}, \var{globals}, \var{locals})} |
| and \function{execfile(\var{filename}, \var{globals}, \var{locals})} |
| functions and the \keyword{exec} statement now accept any mapping type |
| for the \var{locals} parameter. Previously this had to be a regular |
| Python dictionary. (Contributed by Raymond Hettinger.) |
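
A rough sketch of passing a non-dictionary mapping as \var{locals} is
shown below. The \class{RecordingNamespace} class is hypothetical and
exists only for this example; it implements just \method{__getitem__()},
which is all that name lookup needs here.

```python
class RecordingNamespace(object):
    """Hypothetical mapping that remembers which names were looked up."""

    def __init__(self, values):
        self.values = values
        self.lookups = []

    def __getitem__(self, name):
        self.lookups.append(name)
        # A KeyError for an unknown name lets the lookup fall through
        # to the globals and then to the built-in namespace.
        return self.values[name]


namespace = RecordingNamespace({'x': 3, 'y': 4})

# The empty dictionary serves as globals; the custom mapping as locals.
result = eval('x * y + x', {}, namespace)
```

Here \code{result} is 15, and \code{namespace.lookups} records that
\code{x} and \code{y} were fetched through the mapping.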
| |
| \item The \function{zip()} built-in function and \function{itertools.izip()} |
| now return an empty result (an empty list for \function{zip()}, an |
| exhausted iterator for \function{izip()}) if called with no arguments. |
| Previously they raised a \exception{TypeError} |
| exception. This makes them more |
| suitable for use with variable length argument lists: |
| |
| \begin{verbatim} |
| >>> def transpose(array): |
| ...     return zip(*array) |
| ... |
| >>> transpose([(1,2,3), (4,5,6)]) |
| [(1, 4), (2, 5), (3, 6)] |
| >>> transpose([]) |
| [] |
| \end{verbatim} |
| (Contributed by Raymond Hettinger.) |
| |
| \item Encountering a failure while importing a module no longer leaves |
| a partially-initialized module object in \code{sys.modules}. The |
| incomplete module object left behind would fool further imports of the |
| same module into succeeding, leading to confusing errors. |
| (Fixed by Tim Peters.) |
| |
| \item \constant{None} is now a constant; code that binds a new value to |
| the name \samp{None} is now a syntax error. |
| (Contributed by Raymond Hettinger.) |
| |
| \end{itemize} |
| |
| |
| %====================================================================== |
| \subsection{Optimizations} |
| |
| \begin{itemize} |
| |
| \item The inner loops for list and tuple slicing |
| were optimized and now run about one-third faster. The inner loops |
| for dictionaries were also optimized, resulting in performance boosts for |
| \method{keys()}, \method{values()}, \method{items()}, |
| \method{iterkeys()}, \method{itervalues()}, and \method{iteritems()}. |
| (Contributed by Raymond Hettinger.) |
| |
| \item The machinery for growing and shrinking lists was optimized for |
| speed and for space efficiency. Appending and popping from lists now |
| runs faster due to more efficient code paths and less frequent use of |
| the underlying system \cfunction{realloc()}. List comprehensions |
| also benefit. \method{list.extend()} was also optimized and no |
| longer converts its argument into a temporary list before extending |
| the base list. (Contributed by Raymond Hettinger.) |
| |
| \item \function{list()}, \function{tuple()}, \function{map()}, |
| \function{filter()}, and \function{zip()} now run several times |
| faster with non-sequence arguments that supply a \method{__len__()} |
| method. (Contributed by Raymond Hettinger.) |
| |
| \item The methods \method{list.__getitem__()}, |
| \method{dict.__getitem__()}, and \method{dict.__contains__()} are |
| now implemented as \class{method_descriptor} objects rather |
| than \class{wrapper_descriptor} objects. This form of |
| access doubles their performance and makes them more suitable for |
| use as arguments to functionals: |
| \samp{map(mydict.__getitem__, keylist)}. |
| (Contributed by Raymond Hettinger.) |
| |
| \item Added a new opcode, \code{LIST_APPEND}, that simplifies |
| the generated bytecode for list comprehensions and speeds them up |
| by about a third. (Contributed by Raymond Hettinger.) |
| |
| \item The peephole bytecode optimizer has been improved to |
| produce shorter, faster bytecode; remarkably, the resulting bytecode is |
| more readable. (Enhanced by Raymond Hettinger.) |
| |
| \item String concatenations in statements of the form \code{s = s + |
| "abc"} and \code{s += "abc"} are now performed more efficiently in |
| certain circumstances. This optimization won't be present in other |
| Python implementations such as Jython, so you shouldn't rely on it; |
| using the \method{join()} method of strings is still recommended when |
| you want to efficiently glue a large number of strings together. |
| (Contributed by Armin Rigo.) |
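
To illustrate the recommendation, this small sketch builds the same
string both ways. The \code{words} list is arbitrary sample data, and
no timings are claimed here since they depend on the implementation.

```python
# Sample input: a thousand short strings to glue together.
words = ['spam'] * 1000

# Repeated concatenation: benefits from the new optimization on CPython,
# but may copy the growing string on every step elsewhere.
pieces = ''
for word in words:
    pieces += word

# The portable idiom: collect the parts and join them in one operation.
joined = ''.join(words)
```

Both approaches produce the same 4000-character string; only
\method{join()} avoids depending on an implementation detail.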
| |
| \end{itemize} |
| |
| % pystone is almost useless for comparing different versions of Python; |
| % instead, it excels at predicting relative Python performance on |
| % different machines. |
| % So, this section would be more informative if it used other tools |
| % such as pybench and parrotbench. For a more application oriented |
| % benchmark, try comparing the timings of test_decimal.py under 2.3 |
| % and 2.4. |
| |
| The net result of the 2.4 optimizations is that Python 2.4 runs the |
| pystone benchmark around 5\% faster than Python 2.3 and 35\% faster |
| than Python 2.2. (pystone is not a particularly good benchmark, but |
| it's the most commonly used measurement of Python's performance. Your |
| own applications may show greater or smaller benefits from Python~2.4.) |
| |
| |
| %====================================================================== |
| \section{New, Improved, and Deprecated Modules} |
| |
| As usual, Python's standard library received a number of enhancements and |
| bug fixes. Here's a partial list of the most notable changes, sorted |
| alphabetically by module name. Consult the |
| \file{Misc/NEWS} file in the source tree for a more |
| complete list of changes, or look through the CVS logs for all the |
| details. |
| |
| \begin{itemize} |
| |
| \item The \module{asyncore} module's \function{loop()} function now |
| has a \var{count} parameter that lets you perform a limited number |
| of passes through the polling loop. The default is still to loop |
| forever. |
| |
| \item The \module{base64} module now has more complete RFC 3548 support |
| for Base64, Base32, and Base16 encoding and decoding, including |
| optional case folding and optional alternative alphabets. |
| (Contributed by Barry Warsaw.) |
| |
| \item The \module{bisect} module now has an underlying C implementation |
| for improved performance. |
| (Contributed by Dmitry Vasiliev.) |
| |
| \item The CJKCodecs collection of East Asian codecs, maintained |
| by Hye-Shik Chang, was integrated into 2.4. |
| The new encodings are: |
| |
| \begin{itemize} |
| \item Chinese (PRC): gb2312, gbk, gb18030, big5hkscs, hz |
| \item Chinese (ROC): big5, cp950 |
| \item Japanese: cp932, euc-jis-2004, euc-jp, |
| euc-jisx0213, iso-2022-jp, iso-2022-jp-1, iso-2022-jp-2, |
| iso-2022-jp-3, iso-2022-jp-ext, iso-2022-jp-2004, |
| shift-jis, shift-jisx0213, shift-jis-2004 |
| \item Korean: cp949, euc-kr, johab, iso-2022-kr |
| \end{itemize} |
| |
| \item Some other new encodings were added: HP Roman8, |
| ISO_8859-11, ISO_8859-16, PTCP-154, and TIS-620. |
| |
| \item The UTF-8 and UTF-16 codecs now cope better with receiving partial input. |
| Previously the \class{StreamReader} class would try to read more data, |
| making it impossible to resume decoding from the stream. The |
| \method{read()} method will now return as much data as it can and future |
| calls will resume decoding where previous ones left off. |
| (Implemented by Walter D\"orwald.) |
| |
| \item There is a new \module{collections} module for |
| various specialized collection datatypes. |
| Currently it contains just one type, \class{deque}, |
| a double-ended queue that supports efficiently adding and removing |
| elements from either end: |
| |
| \begin{verbatim} |
| >>> from collections import deque |
| >>> d = deque('ghi') # make a new deque with three items |
| >>> d.append('j') # add a new entry to the right side |
| >>> d.appendleft('f') # add a new entry to the left side |
| >>> d # show the representation of the deque |
| deque(['f', 'g', 'h', 'i', 'j']) |
| >>> d.pop() # return and remove the rightmost item |
| 'j' |
| >>> d.popleft() # return and remove the leftmost item |
| 'f' |
| >>> list(d) # list the contents of the deque |
| ['g', 'h', 'i'] |
| >>> 'h' in d # search the deque |
| True |
| \end{verbatim} |
| |
| Several modules, such as the \module{Queue} and \module{threading} |
| modules, now take advantage of \class{collections.deque} for improved |
| performance. (Contributed by Raymond Hettinger.) |
| |
| \item The \module{ConfigParser} classes have been enhanced slightly. |
| The \method{read()} method now returns a list of the files that |
| were successfully parsed, and the \method{set()} method raises |
| \exception{TypeError} if passed a \var{value} argument that isn't a |
| string. (Contributed by John Belmonte and David Goodger.) |
| |
| \item The \module{curses} module now supports the ncurses extension |
| \function{use_default_colors()}. On platforms where the terminal |
| supports transparency, this makes it possible to use a transparent |
| background. (Contributed by J\"org Lehmann.) |
| |
| \item The \module{difflib} module now includes an \class{HtmlDiff} class |
| that creates an HTML table showing a side by side comparison |
| of two versions of a text. (Contributed by Dan Gass.) |
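
A minimal sketch of producing such a table follows. The two line lists
and the column labels \code{'before'} and \code{'after'} are invented
for the example.

```python
import difflib

# Two versions of a small text, given as lists of lines.
old_lines = ['one', 'two', 'three']
new_lines = ['one', 'too', 'three', 'four']

# make_table() returns a string holding a single HTML <table> element;
# the last two arguments label the left and right columns.
table = difflib.HtmlDiff().make_table(old_lines, new_lines,
                                      'before', 'after')
```

The companion \method{make_file()} method takes the same arguments and
wraps the table in a complete HTML document.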
| |
| \item The \module{email} package was updated to version 3.0, |
| which dropped various deprecated APIs and removed support for Python |
| versions earlier than 2.3. The 3.0 version of the package uses a new |
| incremental parser for MIME messages, available in the |
| \module{email.FeedParser} module. The new parser doesn't require |
| reading the entire message into memory, and doesn't throw exceptions |
| if a message is malformed; instead it records any problems in the |
| \member{defects} attribute of the message. (Developed by Anthony |
| Baxter, Barry Warsaw, Thomas Wouters, and others.) |
| |
| \item The \module{heapq} module has been converted to C. The resulting |
| tenfold improvement in speed makes the module suitable for handling |
| high volumes of data. In addition, the module has two new functions |
| \function{nlargest()} and \function{nsmallest()} that use heaps to |
| find the N largest or smallest values in a dataset without the |
| expense of a full sort. (Contributed by Raymond Hettinger.) |
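
For instance, assuming a small list of made-up scores, the two new
functions can be used like this:

```python
import heapq

# Arbitrary sample data.
scores = [56, 91, 17, 73, 88, 42, 65, 29]

# Each call returns a new list; the input is left unchanged.
top_three = heapq.nlargest(3, scores)       # largest first
bottom_three = heapq.nsmallest(3, scores)   # smallest first
```

With this data \code{top_three} is \code{[91, 88, 73]} and
\code{bottom_three} is \code{[17, 29, 42]}.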
| |
| \item The \module{httplib} module now contains constants for HTTP |
| status codes defined in various HTTP-related RFC documents. Constants |
| have names such as \constant{OK}, \constant{CREATED}, |
| \constant{CONTINUE}, and \constant{MOVED_PERMANENTLY}; use pydoc to |
| get a full list. (Contributed by Andrew Eland.) |
| |
| \item The \module{imaplib} module now supports IMAP's THREAD command |
| (contributed by Yves Dionne) and new \method{deleteacl()} and |
| \method{myrights()} methods (contributed by Arnaud Mazin). |
| |
| \item The \module{itertools} module gained a |
| \function{groupby(\var{iterable}\optional{, \var{func}})} function. |
| \var{iterable} is something that can be iterated over to return a |
| stream of elements, and the optional \var{func} parameter is a |
| function that takes an element and returns a key value; if omitted, |
| the key is simply the element itself. \function{groupby()} then |
| groups the elements into subsequences which have matching values of |
| the key, and returns a series of 2-tuples containing the key value |
| and an iterator over the subsequence. |
| |
| Here's an example to make this clearer. The key function, \var{func}, simply |
| returns whether a number is even or odd, so the result of |
| \function{groupby()} is to return consecutive runs of odd or even |
| numbers. |
| |
| \begin{verbatim} |
| >>> import itertools |
| >>> L = [2, 4, 6, 7, 8, 9, 11, 12, 14] |
| >>> for key_val, it in itertools.groupby(L, lambda x: x % 2): |
| ...     print key_val, list(it) |
| ... |
| 0 [2, 4, 6] |
| 1 [7] |
| 0 [8] |
| 1 [9, 11] |
| 0 [12, 14] |
| >>> |
| \end{verbatim} |
| |
| \function{groupby()} is typically used with sorted input. The logic |
| for \function{groupby()} is similar to the \UNIX{} \code{uniq} filter |
| which makes it handy for eliminating, counting, or identifying |
| duplicate elements: |
| |
| \begin{verbatim} |
| >>> word = 'abracadabra' |
| >>> letters = sorted(word) # Turn string into a sorted list of letters |
| >>> letters |
| ['a', 'a', 'a', 'a', 'a', 'b', 'b', 'c', 'd', 'r', 'r'] |
| >>> for k, g in itertools.groupby(letters): |
| ...     print k, list(g) |
| ... |
| a ['a', 'a', 'a', 'a', 'a'] |
| b ['b', 'b'] |
| c ['c'] |
| d ['d'] |
| r ['r', 'r'] |
| >>> # List unique letters |
| >>> [k for k, g in itertools.groupby(letters)] |
| ['a', 'b', 'c', 'd', 'r'] |
| >>> # Count letter occurrences |
| >>> [(k, len(list(g))) for k, g in itertools.groupby(letters)] |
| [('a', 5), ('b', 2), ('c', 1), ('d', 1), ('r', 2)] |
| \end{verbatim} |
| |
| (Contributed by Hye-Shik Chang.) |
| |
| \item \module{itertools} also gained a function named |
| \function{tee(\var{iterator}, \var{N})} that returns \var{N} independent |
| iterators that replicate \var{iterator}. If \var{N} is omitted, the |
| default is 2. |
| |
| \begin{verbatim} |
| >>> L = [1,2,3] |
| >>> i1, i2 = itertools.tee(L) |
| >>> i1,i2 |
| (<itertools.tee object at 0x402c2080>, <itertools.tee object at 0x402c2090>) |
| >>> list(i1) # Run the first iterator to exhaustion |
| [1, 2, 3] |
| >>> list(i2) # Run the second iterator to exhaustion |
| [1, 2, 3] |
| \end{verbatim} |
| |
| Note that \function{tee()} has to keep copies of the values returned |
| by the iterator; in the worst case, it may need to keep all of them. |
| This should therefore be used carefully if the leading iterator |
| can run far ahead of the trailing iterator in a long stream of inputs. |
| If the separation is large, then you might as well use |
| \function{list()} instead. When the iterators track closely with one |
| another, \function{tee()} is ideal. Possible applications include |
| bookmarking, windowing, or lookahead iterators. |
| (Contributed by Raymond Hettinger.) |
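
As one sketch of the windowing use case, the hypothetical
\function{pairwise()} helper below yields overlapping pairs of
neighbouring elements. It is written for a current interpreter; under
Python 2.4 you would call \code{second.next()} and use
\function{itertools.izip()} instead.

```python
import itertools


def pairwise(iterable):
    """Yield overlapping pairs: (s0, s1), (s1, s2), (s2, s3), ..."""
    first, second = itertools.tee(iterable)
    # Advance the second copy by one element so it runs one step ahead.
    next(second, None)
    return zip(first, second)


pairs = list(pairwise([1, 2, 3, 4]))
```

The two iterators never drift more than one element apart, so
\function{tee()} only has to buffer a single value at a time.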
| |
| \item A number of functions were added to the \module{locale} |
| module, such as \function{bind_textdomain_codeset()} to specify a |
| particular encoding and a family of \function{l*gettext()} functions |
| that return messages in the chosen encoding. |
| (Contributed by Gustavo Niemeyer.) |
| |
| \item Some keyword arguments were added to the \module{logging} |
| package's \function{basicConfig()} function to simplify log |
| configuration. The default behavior is to log messages to standard |
| error, but various keyword arguments can be specified to log to a |
| particular file, change the logging format, or set the logging level. |
| For example: |
| |
| \begin{verbatim} |
| import logging |
| logging.basicConfig(filename='/var/log/application.log', |
|     level=0,  # Log all messages |
|     format='%(levelname)s:%(process)d:%(thread)d:%(message)s') |
| \end{verbatim} |
| |
| Other additions to the \module{logging} package include a |
| \method{log(\var{level}, \var{msg})} convenience method, as well as a |
| \class{TimedRotatingFileHandler} class that rotates its log files at a |
| timed interval. The module already had \class{RotatingFileHandler}, |
| which rotated logs once the file exceeded a certain size. Both |
| classes derive from a new \class{BaseRotatingHandler} class that can |
| be used to implement other rotating handlers. |
| |
| (Changes implemented by Vinay Sajip.) |
| |
| \item The \module{marshal} module now shares interned strings on unpacking a |
| data structure. This may shrink the size of certain pickle strings, |
| but the primary effect is to make \file{.pyc} files significantly smaller. |
| (Contributed by Martin von~L\"owis.) |
| |
| \item The \module{nntplib} module's \class{NNTP} class gained |
| \method{description()} and \method{descriptions()} methods to retrieve |
| newsgroup descriptions for a single group or for a range of groups. |
| (Contributed by J\"urgen A. Erhard.) |
| |
| \item Two new functions were added to the \module{operator} module, |
| \function{attrgetter(\var{attr})} and \function{itemgetter(\var{index})}. |
| Both functions return callables that take a single argument and return |
| the corresponding attribute or item; these callables make excellent |
| data extractors when used with \function{map()} or |
| \function{sorted()}. For example: |
| |
| \begin{verbatim} |
| >>> import operator |
| >>> L = [('c', 2), ('d', 1), ('a', 4), ('b', 3)] |
| >>> map(operator.itemgetter(0), L) |
| ['c', 'd', 'a', 'b'] |
| >>> map(operator.itemgetter(1), L) |
| [2, 1, 4, 3] |
| >>> sorted(L, key=operator.itemgetter(1)) # Sort list by second tuple item |
| [('d', 1), ('c', 2), ('b', 3), ('a', 4)] |
| \end{verbatim} |
| |
| (Contributed by Raymond Hettinger.) |
| |
| \item The \module{optparse} module was updated in various ways. The |
| module now passes its messages through \function{gettext.gettext()}, |
| making it possible to internationalize Optik's help and error |
| messages. Help messages for options can now include the string |
| \code{'\%default'}, which will be replaced by the option's default |
| value. (Contributed by Greg Ward.) |
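
A short sketch of the \code{'\%default'} substitution is given below.
The program name \code{fetch} and the \programopt{--retries} option are
invented for illustration.

```python
from optparse import OptionParser

# A hypothetical command-line tool with a single integer option.
parser = OptionParser(usage='%prog [options]', prog='fetch')
parser.add_option('-r', '--retries', type='int', default=3,
                  help='number of attempts [default: %default]')

# format_help() returns the text that --help would print; the
# '%default' marker in the help string is replaced by the value 3.
help_text = parser.format_help()

# Parsing an empty argument list leaves the option at its default.
options, args = parser.parse_args([])
```

Keeping the default in one place means the help text cannot fall out
of step with the value actually used.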
| |
| \item The long-term plan is to deprecate the \module{rfc822} module |
| in some future Python release in favor of the \module{email} package. |
| To this end, the \function{email.Utils.formatdate()} function has been |
| changed to make it usable as a replacement for |
| \function{rfc822.formatdate()}. You may want to write new e-mail |
| processing code with this in mind. (Change implemented by Anthony |
| Baxter.) |
| |
| \item A new \function{urandom(\var{n})} function was added to the |
| \module{os} module, returning a string containing \var{n} bytes of |
| random data. This function provides access to platform-specific |
| sources of randomness such as \file{/dev/urandom} on Linux or the |
| Windows CryptoAPI. (Contributed by Trevor Perrin.) |
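
For example, a random token might be generated along these lines; the
choice of 16 bytes and the variable names are arbitrary.

```python
import binascii
import os

# Ask the operating system for 16 bytes of random data.
token = os.urandom(16)

# The raw bytes are rarely printable, so render them as hexadecimal
# text (two hex digits per byte) before storing or displaying them.
hex_token = binascii.hexlify(token)
```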
| |
| \item Another new function: \function{os.path.lexists(\var{path})} |
| returns true if the file specified by \var{path} exists, whether or |
| not it's a symbolic link. This differs from the existing |
| \function{os.path.exists(\var{path})} function, which returns false if |
| \var{path} is a symlink that points to a destination that doesn't exist. |
| (Contributed by Beni Cherniavsky.) |
| |
| \item A new \function{getsid()} function was added to the |
| \module{posix} module that underlies the \module{os} module. |
| (Contributed by J. Raynor.) |
| |
| \item The \module{poplib} module now supports POP over SSL. (Contributed by |
| Hector Urtubia.) |
| |
| \item The \module{profile} module can now profile C extension functions. |
| (Contributed by Nick Bastin.) |
| |
| \item The \module{random} module has a new method called |
| \method{getrandbits(\var{N})} that returns a long integer \var{N} |
| bits in length. The existing \method{randrange()} method now uses |
| \method{getrandbits()} where appropriate, making generation of |
| arbitrarily large random numbers more efficient. (Contributed by |
| Raymond Hettinger.) |
| |
| \item The regular expression language accepted by the \module{re} module |
| was extended with simple conditional expressions, written as |
| \regexp{(?(\var{group})\var{A}|\var{B})}. \var{group} is either a |
| numeric group ID or a group name defined with \regexp{(?P<group>...)} |
| earlier in the expression. If the specified group matched, the |
| regular expression pattern \var{A} will be tested against the string; if |
| the group didn't match, the pattern \var{B} will be used instead. |
| (Contributed by Gustavo Niemeyer.) |
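
As an illustration, the deliberately simplistic pattern below accepts an
e-mail-like address either bare or wrapped in angle brackets, but not
with only one bracket. The pattern is a sketch and not a serious address
validator.

```python
import re

# Group 1 optionally matches an opening '<'.  The conditional at the end
# requires a closing '>' if group 1 matched, and end-of-string otherwise.
pattern = re.compile(r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)')

bracketed = pattern.match('<user@host.com>')   # matches
bare = pattern.match('user@host.com')          # matches
unclosed = pattern.match('<user@host.com')     # no match: '>' is missing
unopened = pattern.match('user@host.com>')     # no match: stray '>'
```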
| |
| \item The \module{re} module is also no longer recursive, thanks to a |
| massive amount of work by Gustavo Niemeyer. In a recursive regular |
| expression engine, certain patterns result in a large amount of C |
| stack space being consumed, and it was possible to overflow the stack. |
| For example, if you matched a 30000-byte string of \samp{a} characters |
| against the expression \regexp{(a|b)+}, one stack frame was consumed |
| per character. Python 2.3 tried to check for stack overflow and raise |
| a \exception{RuntimeError} exception, but certain patterns could |
| sidestep the checking and if you were unlucky Python could segfault. |
| Python 2.4's regular expression engine can match this pattern without |
| problems. |
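
The case described above can be reproduced with a few lines; this is
only a sketch of the scenario, using the same pattern and string length
mentioned in the text.

```python
import re

# A 30000-character string of 'a' characters.
text = 'a' * 30000

# With a non-recursive engine this match completes without exhausting
# the C stack, however long the repetition runs.
match = re.match(r'(a|b)+', text)
matched_length = match.end()
```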
| |
| \item The \module{signal} module now performs tighter error-checking |
| on the parameters to the \function{signal.signal()} function. For |
| example, you can't set a handler on the \constant{SIGKILL} signal; |
| previous versions of Python would quietly accept this, but 2.4 will |
| raise a \exception{RuntimeError} exception. |
| |
| \item Two new functions were added to the \module{socket} module. |
| \function{socketpair()} returns a pair of connected sockets and |
| \function{getservbyport(\var{port})} looks up the service name for a |
| given port number. (Contributed by Dave Cole and Barry Warsaw.) |
| |
| \item The \function{sys.exitfunc()} function has been deprecated. Code |
| should be using the existing \module{atexit} module, which correctly |
| handles calling multiple exit functions. Eventually |
| \function{sys.exitfunc()} will become a purely internal interface, |
| accessed only by \module{atexit}. |
| |
| \item The \module{tarfile} module now generates GNU-format tar files |
| by default. (Contributed by Lars Gustaebel.) |
| |
| \item The \module{threading} module now has an elegantly simple way to support |
| thread-local data. The module contains a \class{local} class whose |
| attribute values are local to different threads. |
| |
| \begin{verbatim} |
| import threading |
| |
| data = threading.local() |
| data.number = 42 |
| data.url = ('www.python.org', 80) |
| \end{verbatim} |
| |
| Other threads can assign and retrieve their own values for the |
| \member{number} and \member{url} attributes. You can subclass |
| \class{local} to initialize attributes or to add methods. |
| (Contributed by Jim Fulton.) |
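
The following sketch shows the per-thread isolation in action. The
\function{worker()} function and the \code{seen} dictionary are
hypothetical names used only to carry results back to the main thread.

```python
import threading

data = threading.local()
data.number = 42          # set in the main thread only

seen = {}                 # ordinary shared dict used to report results


def worker():
    # The attribute assigned by the main thread is not visible here.
    seen['had_number'] = hasattr(data, 'number')
    # Assigning in this thread does not disturb the main thread's value.
    data.number = 7
    seen['worker_number'] = data.number


thread = threading.Thread(target=worker)
thread.start()
thread.join()

main_number = data.number  # still the main thread's own value
```

After the worker finishes, \code{main_number} is still 42 even though
the worker stored 7 under the same attribute name.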
| |
| \item The \module{timeit} module now automatically disables periodic |
| garbage collection during the timing loop. This change makes |
| consecutive timings more comparable. (Contributed by Raymond Hettinger.) |
| |
| \item The \module{weakref} module now supports a wider variety of objects |
| including Python functions, class instances, sets, frozensets, deques, |
| arrays, files, sockets, and regular expression pattern objects. |
| (Contributed by Raymond Hettinger.) |
| |
| \item The \module{xmlrpclib} module now supports a multi-call extension for |
| transmitting multiple XML-RPC calls in a single HTTP operation. |
| (Contributed by Brian Quinlan.) |
| |
| \item The \module{mpz}, \module{rotor}, and \module{xreadlines} modules have |
| been removed. |
| |
| \end{itemize} |
| |
| |
| %====================================================================== |
| % whole new modules get described in subsections here |
| |
| %===================== |
| \subsection{cookielib} |
| |
| The \module{cookielib} library supports client-side handling for HTTP |
| cookies, mirroring the \module{Cookie} module's server-side cookie |
| support. Cookies are stored in cookie jars; the library transparently |
| stores cookies offered by the web server in the cookie jar, and |
| fetches the cookie from the jar when connecting to the server. As in |
| web browsers, policy objects control whether cookies are accepted or |
| not. |
| |
| In order to store cookies across sessions, two implementations of |
| cookie jars are provided: one that stores cookies in the Netscape |
| format so applications can use the Mozilla or Lynx cookie files, and |
| one that stores cookies in the same format as the Perl libwww library. |
| |
| \module{urllib2} has been changed to interact with \module{cookielib}: |
| \class{HTTPCookieProcessor} manages a cookie jar that is used when |
| accessing URLs. |
| |
| This module was contributed by John J. Lee. |
| |
| |
| % ================== |
| \subsection{doctest} |
| |
| The \module{doctest} module underwent considerable refactoring thanks |
| to Edward Loper and Tim Peters. Testing can still be as simple as |
| running \function{doctest.testmod()}, but the refactorings allow |
| customizing the module's operation in various ways. |
| |
| The new \class{DocTestFinder} class extracts the tests from a given |
| object's docstrings: |
| |
| \begin{verbatim} |
| import doctest |
| |
| def f (x, y): |
|     """>>> f(2,2) |
|     4 |
|     >>> f(3,2) |
|     6 |
|     """ |
|     return x*y |
| |
| finder = doctest.DocTestFinder() |
| |
| # Get list of DocTest instances |
| tests = finder.find(f) |
| \end{verbatim} |
| |
| The new \class{DocTestRunner} class then runs individual tests and can |
| produce a summary of the results: |
| |
| \begin{verbatim} |
| runner = doctest.DocTestRunner() |
| for t in tests: |
|     tried, failed = runner.run(t) |
| |
| runner.summarize(verbose=1) |
| \end{verbatim} |
| |
| The above example produces the following output: |
| |
| \begin{verbatim} |
| 1 items passed all tests: |
|    2 tests in f |
| 2 tests in 1 items. |
| 2 passed and 0 failed. |
| Test passed. |
| \end{verbatim} |
| |
| \class{DocTestRunner} uses an instance of the \class{OutputChecker} |
| class to compare the expected output with the actual output. This |
| class takes a number of different flags that customize its behavior; |
| ambitious users can also write a completely new subclass of |
| \class{OutputChecker}. |
| |
| The default output checker provides a number of handy features. |
| For example, with the \constant{doctest.ELLIPSIS} option flag, |
| an ellipsis (\samp{...}) in the expected output matches any substring, |
| making it easier to accommodate outputs that vary in minor ways: |
| |
| \begin{verbatim} |
| def o (n): |
|     """ |
|     >>> o(1) |
|     <__main__.C instance at 0x...> |
|     >>> |
|     """ |
| \end{verbatim} |
| |
| Another special string, \samp{<BLANKLINE>}, matches a blank line: |
| |
| \begin{verbatim} |
| def p (n): |
|     """ |
|     >>> p(1) |
|     <BLANKLINE> |
|     >>> |
|     """ |
| \end{verbatim} |
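Both features can also be tried out directly on an \class{OutputChecker} instance, without writing a docstring. The sketch below calls \method{check_output()} with an expected string, an actual string and a set of option flags; the sample strings and variable names are made up for illustration.

```python
import doctest

checker = doctest.OutputChecker()

# ELLIPSIS: "..." in the expected text stands for any substring.
want = "<__main__.C instance at 0x...>\n"
got = "<__main__.C instance at 0x00B1F2A8>\n"
ellipsis_off = checker.check_output(want, got, 0)
ellipsis_on = checker.check_output(want, got, doctest.ELLIPSIS)

# <BLANKLINE>: stands for an empty line in the expected text.
want = "first\n<BLANKLINE>\nlast\n"
got = "first\n\nlast\n"
blank_default = checker.check_output(want, got, 0)
blank_disabled = checker.check_output(
    want, got, doctest.DONT_ACCEPT_BLANKLINE)
```

The ellipsis only matches when \constant{doctest.ELLIPSIS} is given, whereas \samp{<BLANKLINE>} is honoured by default and has to be switched off with \constant{doctest.DONT_ACCEPT_BLANKLINE}.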
| |
| Another new capability is producing a diff-style display of the output |
| by specifying the \constant{doctest.REPORT_UDIFF} (unified diffs), |
| \constant{doctest.REPORT_CDIFF} (context diffs), or |
| \constant{doctest.REPORT_NDIFF} (delta-style) option flags. For example: |
| |
| \begin{verbatim} |
| def g (n): |
|     """ |
|     >>> g(4) |
|     here |
|     is |
|     a |
|     lengthy |
|     >>>""" |
|     L = 'here is a rather lengthy list of words'.split() |
|     for word in L[:n]: |
|         print word |
| \end{verbatim} |
| |
| Running the above function's tests with |
| \constant{doctest.REPORT_UDIFF} specified, you get the following output: |
| |
| \begin{verbatim} |
| ********************************************************************** |
| File "t.py", line 15, in g |
| Failed example: |
|     g(4) |
| Differences (unified diff with -expected +actual): |
|     @@ -2,3 +2,3 @@ |
|      is |
|      a |
|     -lengthy |
|     +rather |
| ********************************************************************** |
| \end{verbatim} |
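One way to obtain such a report, sketched here under a few assumptions, is to pass the flag to \class{DocTestRunner} through its \var{optionflags} argument and to capture the report through the \var{out} argument of \method{run()}. The list \code{report} and the variable \code{report_text} are names invented for this example, and \code{print(word)} is written with parentheses so that the snippet runs on Python 3 as well.

```python
import doctest

def g(n):
    """
    >>> g(4)
    here
    is
    a
    lengthy
    """
    words = 'here is a rather lengthy list of words'.split()
    for word in words[:n]:
        print(word)

report = []   # collects the text that would otherwise go to stdout
runner = doctest.DocTestRunner(verbose=False,
                               optionflags=doctest.REPORT_UDIFF)
for test in doctest.DocTestFinder().find(g, globs={"g": g}):
    # The example is expected to fail: g(4) prints "rather", not "lengthy".
    failed, tried = runner.run(test, out=report.append)

report_text = "".join(report)
```

Replacing \constant{doctest.REPORT_UDIFF} with \constant{doctest.REPORT_CDIFF} or \constant{doctest.REPORT_NDIFF} selects the other two diff styles.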
| |
| |
| % ====================================================================== |
| \section{Build and C API Changes} |
| |
| Some of the changes to Python's build process and to the C API are: |
| |
| \begin{itemize} |
| |
| \item Three new convenience macros were added for common return |
| values from extension functions: \csimplemacro{Py_RETURN_NONE}, |
| \csimplemacro{Py_RETURN_TRUE}, and \csimplemacro{Py_RETURN_FALSE}. |
| (Contributed by Brett Cannon.) |
| |
| \item Another new macro, \csimplemacro{Py_CLEAR(\var{obj})}, |
| decreases the reference count of \var{obj} and sets \var{obj} to the |
| null pointer. (Contributed by Jim Fulton.) |
| |
| \item A new function, \cfunction{PyTuple_Pack(\var{N}, \var{obj1}, |
| \var{obj2}, ..., \var{objN})}, constructs tuples from a variable |
| length argument list of Python objects. (Contributed by Raymond Hettinger.) |
| |
| \item A new function, \cfunction{PyDict_Contains(\var{d}, \var{k})}, |
| implements fast dictionary lookups without masking exceptions raised |
| during the look-up process. (Contributed by Raymond Hettinger.) |
| |
| \item The \csimplemacro{Py_IS_NAN(\var{X})} macro returns 1 if |
| its float or double argument \var{X} is a NaN. |
| (Contributed by Tim Peters.) |
| |
| \item C code can avoid unnecessary locking by using the new |
| \cfunction{PyEval_ThreadsInitialized()} function to tell |
| if any thread operations have been performed. If this function |
| returns false, no lock operations are needed. |
| (Contributed by Nick Coghlan.) |
| |
| \item A new function, \cfunction{PyArg_VaParseTupleAndKeywords()}, |
| is the same as \cfunction{PyArg_ParseTupleAndKeywords()} but takes a |
| \ctype{va_list} instead of a number of arguments. |
| (Contributed by Greg Chapman.) |
| |
| \item A new method flag, \constant{METH_COEXIST}, allows a function |
| defined in slots to co-exist with a \ctype{PyCFunction} having the |
| same name. This can halve the access time for a method such as |
| \method{set.__contains__()}. (Contributed by Raymond Hettinger.) |
| |
| \item Python can now be built with additional profiling for the |
| interpreter itself, intended as an aid to people developing the |
| Python core. Providing \longprogramopt{--enable-profiling} to the |
| \program{configure} script will let you profile the interpreter with |
| \program{gprof}, and providing the \longprogramopt{--with-tsc} |
| switch enables profiling using the Pentium's Time-Stamp-Counter |
| register. Note that the \longprogramopt{--with-tsc} switch is slightly |
| misnamed, because the profiling feature also works on the PowerPC |
| platform, though that processor architecture doesn't call that |
| register ``the TSC register''. (Contributed by Jeremy Hylton.) |
| |
| \item The \ctype{tracebackobject} type has been renamed to \ctype{PyTracebackObject}. |
| |
| \end{itemize} |
| |
| |
| %====================================================================== |
| \subsection{Port-Specific Changes} |
| |
| \begin{itemize} |
| |
| \item The Windows port now builds under MSVC++ 7.1 as well as version 6. |
| (Contributed by Martin von~L\"owis.) |
| |
| \end{itemize} |
| |
| |
| |
| %====================================================================== |
| \section{Porting to Python 2.4} |
| |
| This section lists previously described changes that may require |
| changes to your code: |
| |
| \begin{itemize} |
| |
| \item Left shifts and hexadecimal/octal constants that are too |
| large no longer trigger a \exception{FutureWarning} and return |
| a value limited to 32 or 64 bits; instead they return a long integer. |
| |
| \item Integer operations will no longer trigger an \exception{OverflowWarning}. |
| The \exception{OverflowWarning} warning will disappear in Python 2.5. |
| |
| \item The \function{zip()} built-in function and \function{itertools.izip()} |
| no longer raise a \exception{TypeError} exception if called with no |
| arguments; \function{zip()} now returns an empty list and |
| \function{itertools.izip()} returns an empty iterator. |
| |
| \item You can no longer compare \class{date} instances with |
| \class{datetime} instances from the \module{datetime} module. Two |
| instances of different classes will now always be unequal, and |
| relative comparisons (\code{<}, \code{>}) will raise a \exception{TypeError}. |
| |
| \item \function{dircache.listdir()} now passes exceptions to the caller |
| instead of returning empty lists. |
| |
| \item \function{LexicalHandler.startDTD()} used to receive the public and |
| system IDs in the wrong order. This has been corrected; applications |
| relying on the wrong order need to be fixed. |
| |
| \item \function{fcntl.ioctl()} now warns if the \var{mutate} |
| argument is omitted in a call where it is relevant. |
| |
| \item The \module{tarfile} module now generates GNU-format tar files |
| by default. |
| |
| \item Encountering a failure while importing a module no longer leaves |
| a partially-initialized module object in \code{sys.modules}. |
| |
| \item \constant{None} is now a constant; code that binds a new value to |
| the name \samp{None} is now a syntax error. |
| |
| \item The \function{signal.signal()} function now raises a |
| \exception{RuntimeError} exception for certain illegal values; |
| previously these errors would pass silently. For example, you can no |
| longer set a handler on the \constant{SIGKILL} signal. |
| |
| \end{itemize} |
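A few of the items above can be checked with a short script. The following is only a sketch: the variable names are illustrative, \code{list()} is wrapped around \function{zip()} so that the same code also runs on Python 3 (where \function{zip()} returns an iterator), and the remaining items, such as the \module{tarfile} and \module{signal} changes, are left out because they depend on the platform or on external files.

```python
import datetime

# Large shifts and hex constants yield positive integers instead of
# wrapping around to a value limited to 32 or 64 bits.
shifted = 1 << 64
big_hex = 0xFFFFFFFFFFFFFFFF

# zip() with no arguments gives an empty result rather than TypeError.
empty = list(zip())

# date and datetime instances never compare equal, and ordering
# comparisons between them raise TypeError.
day = datetime.date(2005, 3, 30)
moment = datetime.datetime(2005, 3, 30)
are_equal = (day == moment)
try:
    day < moment
    ordering_raises = False
except TypeError:
    ordering_raises = True

# Binding a new value to None is rejected when the code is compiled.
try:
    compile("None = 1", "<example>", "exec")
    none_is_constant = False
except SyntaxError:
    none_is_constant = True
```

Code that relied on the old behaviour in any of these cases is the code most likely to need attention when moving to Python 2.4.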
| |
| |
| %====================================================================== |
| \section{Acknowledgements \label{acks}} |
| |
| The author would like to thank the following people for offering |
| suggestions, corrections and assistance with various drafts of this |
| article: Koray Can, Hye-Shik Chang, Michael Dyck, Raymond Hettinger, |
| Brian Hurt, Hamish Lawson, Fredrik Lundh, Sean Reifschneider, |
| Sadruddin Rejeb. |
| |
| \end{document} |