| \documentclass{howto} | 
 |  | 
 | \title{Idioms and Anti-Idioms in Python} | 
 |  | 
 | \release{0.00} | 
 |  | 
 | \author{Moshe Zadka} | 
 | \authoraddress{howto@zadka.site.co.il} | 
 |  | 
 | \begin{document} | 
 | \maketitle | 
 |  | 
 | This document is placed in the public domain. | 
 |  | 
 | \begin{abstract} | 
 | \noindent | 
 | This document can be considered a companion to the tutorial. It | 
 | shows how to use Python, and even more importantly, how {\em not} | 
 | to use Python.  | 
 | \end{abstract} | 
 |  | 
 | \tableofcontents | 
 |  | 
 | \section{Language Constructs You Should Not Use} | 
 |  | 
 | While Python has relatively few gotchas compared to other languages, it | 
 | still has some constructs which are only useful in corner cases, or are | 
 | plain dangerous.  | 
 |  | 
 | \subsection{from module import *} | 
 |  | 
 | \subsubsection{Inside Function Definitions} | 
 |  | 
 | \code{from module import *} is {\em invalid} inside function definitions. | 
 | While many versions of Python do not check for the invalidity, that does not | 
 | make it any more valid, any more than having a smart lawyer makes a man innocent. | 
 | Never use it that way. Even in versions where it was accepted, it made | 
 | the function execute more slowly, because the compiler could not be certain | 
 | which names were local and which were global. In Python 2.1 this construct | 
 | causes warnings, and sometimes even errors. | 
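
As a hedged illustration of that last point: on recent Python 3 interpreters the construct is rejected outright when the code is compiled, which \code{compile()} makes visible without running anything. The \code{source} string and the function name \code{f} are made up for this sketch; older versions may only warn.

```python
# Hypothetical function body that uses the star import inside a def.
source = "def f():\n    from math import *\n    return sqrt(4)\n"

try:
    compile(source, "<example>", "exec")
    outcome = "accepted"
except SyntaxError:
    outcome = "rejected"

print(outcome)
```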
 |  | 
 | \subsubsection{At Module Level} | 
 |  | 
 | While it is valid to use \code{from module import *} at module level, it | 
 | is usually a bad idea. For one, it loses an important property Python | 
 | otherwise has: you can find where each top-level name is defined with | 
 | a simple ``search'' function in your favourite editor. You also open yourself | 
 | to trouble in the future, if some module grows additional functions or | 
 | classes.  | 
 |  | 
 | One of the most awful questions asked on the newsgroup is why this code: | 
 |  | 
 | \begin{verbatim} | 
 | f = open("www") | 
 | f.read() | 
 | \end{verbatim} | 
 |  | 
 | does not work. Of course, it works just fine (assuming you have a file | 
 | called ``www''). But it does not work if somewhere in the module, the | 
 | statement \code{from os import *} is present. The \module{os} module | 
 | has a function called \function{open()} which returns an integer. While | 
 | the module is very useful, shadowing builtins is one of its least useful properties. | 
 |  | 
 | Remember, you can never know for sure what names a module exports, so either | 
 | take what you need --- \code{from module import name1, name2}, or keep them in | 
 | the module and access on a per-need basis ---  | 
 | \code{import module;print module.name}. | 
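
A minimal sketch of the difference, assuming a throwaway temporary file created just for the demonstration: with a qualified \code{import os}, the builtin \function{open()} and \function{os.open()} stay distinct, and it is obvious which one returns a file object and which one returns an integer descriptor.

```python
import os
import tempfile

# Create a scratch file so the example does not depend on "www" existing.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello\n")
os.close(fd)

f = open(path)                        # builtin: returns a file object
text = f.read()
f.close()

raw_fd = os.open(path, os.O_RDONLY)   # os.open: returns an integer
is_integer = isinstance(raw_fd, int)
os.close(raw_fd)
os.remove(path)
```

Had \code{from os import *} been used instead, the plain name \code{open} would have referred to the second function.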
 |  | 
 | \subsubsection{When It Is Just Fine} | 
 |  | 
 | There are situations in which \code{from module import *} is just fine: | 
 |  | 
 | \begin{itemize} | 
 |  | 
 | \item The interactive prompt. For example, \code{from math import *} makes | 
 |       Python an amazing scientific calculator. | 
 |  | 
 | \item When extending a module in C with a module in Python. | 
 |  | 
 | \item When the module advertises itself as \code{from import *} safe. | 
 |  | 
 | \end{itemize} | 
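
One common way for a module to advertise itself as safe is to define \code{__all__}, the list of names a star import should copy. The sketch below builds a hypothetical module named \code{shapes_demo} in memory (the module and its names \code{area} and \code{scale} are invented for the example) and suggests that only the advertised name ends up in the importing namespace.

```python
import sys
import types

# Build a stand-in module in memory; a real one would live in shapes_demo.py.
module_source = (
    "__all__ = ['area']\n"
    "scale = 2\n"
    "def area(w, h):\n"
    "    return w * h * scale\n"
)
shapes_demo = types.ModuleType("shapes_demo")
exec(module_source, shapes_demo.__dict__)
sys.modules["shapes_demo"] = shapes_demo

# Run the star import in an explicit dictionary so it can be inspected.
namespace = {}
exec("from shapes_demo import *", namespace)
imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)
```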
 |  | 
 | \subsection{Unadorned \keyword{exec}, \function{execfile} and friends} | 
 |  | 
 | The word ``unadorned'' refers to the use without an explicit dictionary, | 
 | in which case those constructs evaluate code in the {\em current} environment. | 
 | This is dangerous for the same reasons \code{from import *} is dangerous --- | 
 | it might step over variables you are counting on and mess up things for | 
 | the rest of your code. Simply do not do that. | 
 |  | 
 | Bad examples: | 
 |  | 
 | \begin{verbatim} | 
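 | >>> import sys | 
 | >>> for name in sys.argv[1:]: | 
 | ...     exec "%s=1" % name  # may clobber any name in the current scope | 
 | ... | 
 | >>> def func(s, **kw): | 
 | ...     for var, val in kw.items(): | 
 | ...         exec "s.%s=val" % var  # invalid! | 
 | ... | 
 | >>> execfile("handler.py")  # runs in, and may overwrite, the current namespace | 
 | >>> handle() | 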
 | \end{verbatim} | 
 |  | 
 | Good examples: | 
 |  | 
 | \begin{verbatim} | 
 | >>> import sys | 
 | >>> d = {} | 
 | >>> for name in sys.argv[1:]: | 
 | ...     d[name] = 1 | 
 | ... | 
 | >>> def func(s, **kw): | 
 | ...     for var, val in kw.items(): | 
 | ...         setattr(s, var, val) | 
 | ... | 
 | >>> d = {} | 
 | >>> execfile("handler.py", d, d) | 
 | >>> handle = d['handle'] | 
 | >>> handle() | 
 | \end{verbatim} | 
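
The same ideas in a form that also runs on newer interpreters, where \code{exec} is a function and \function{execfile} is gone. This is only a sketch: \code{handler_source} stands in for the contents of a handler file, and \code{Settings} and \code{configure} are names invented for the example.

```python
# Stand-in for the text of a handler file.
handler_source = "def handle():\n    return 'handled'\n"

# Evaluate the code in its own dictionary instead of the current namespace.
sandbox = {}
exec(handler_source, sandbox)
result = sandbox["handle"]()


class Settings(object):
    pass


def configure(obj, **kw):
    # setattr replaces the string-building exec from the bad example.
    for var, val in kw.items():
        setattr(obj, var, val)


settings = Settings()
configure(settings, debug=1, name="demo")
```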
 |  | 
 | \subsection{from module import name1, name2} | 
 |  | 
 | This is a ``don't'' which is much weaker than the previous ``don't''s, | 
 | but it is still something you should not do without a good reason. | 
 | It is usually a bad idea because you suddenly | 
 | have an object which lives in two separate namespaces. When the binding | 
 | in one namespace changes, the binding in the other will not, so there | 
 | will be a discrepancy between them. This happens when, for example, | 
 | one module is reloaded, or changes the definition of a function at runtime.  | 
 |  | 
 | Bad example: | 
 |  | 
 | \begin{verbatim} | 
 | # foo.py | 
 | a = 1 | 
 |  | 
 | # bar.py | 
 | from foo import a | 
 | if something(): | 
 |     a = 2 # danger: foo.a != a  | 
 | \end{verbatim} | 
 |  | 
 | Good example: | 
 |  | 
 | \begin{verbatim} | 
 | # foo.py | 
 | a = 1 | 
 |  | 
 | # bar.py | 
 | import foo | 
 | if something(): | 
 |     foo.a = 2 | 
 | \end{verbatim} | 
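
The discrepancy can be observed directly. In this sketch a stand-in for \file{foo.py} is built in memory under the hypothetical name \code{foo_demo}, so the example is self-contained; with a real module on disk the behaviour should be the same.

```python
import sys
import types

# Stand-in for foo.py, built in memory so the example is self-contained.
foo_demo = types.ModuleType("foo_demo")
foo_demo.a = 1
sys.modules["foo_demo"] = foo_demo

from foo_demo import a   # copies the current binding into this namespace
import foo_demo          # keeps access going through the module

foo_demo.a = 2           # rebinding the name inside the module ...
local_copy = a           # ... does not touch the copied name
```

After the rebinding, \code{foo_demo.a} and the copied \code{a} no longer agree, which is exactly the situation the qualified form avoids.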
 |  | 
 | \subsection{except:} | 
 |  | 
 | Python has the \code{except:} clause, which catches all exceptions. | 
 | Since {\em every} error in Python raises an exception, this makes many | 
 | programming errors look like runtime problems, and hinders | 
 | the debugging process. | 
 |  | 
 | The following code shows a great example: | 
 |  | 
 | \begin{verbatim} | 
 | try: | 
 |     foo = opne("file") # misspelled "open" | 
 | except: | 
 |     sys.exit("could not open file!") | 
 | \end{verbatim} | 
 |  | 
 | The second line triggers a \exception{NameError} which is caught by the | 
 | except clause. The program will exit, and you will have no idea that | 
 | this has nothing to do with the readability of \code{"file"}. | 
 |  | 
 | The example above is better written as: | 
 |  | 
 | \begin{verbatim} | 
 | try: | 
 |     foo = opne("file") # will be changed to "open" as soon as we run it | 
 | except IOError: | 
 |     sys.exit("could not open file") | 
 | \end{verbatim} | 
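
To make the difference concrete, the two handlers can be compared side by side. The function names \code{bare_handler} and \code{narrow_handler} are invented for this sketch, and the functions return the message instead of exiting so that the result can be inspected.

```python
def bare_handler(path):
    try:
        return opne(path)          # misspelled "open"
    except:
        return "could not open file"


def narrow_handler(path):
    try:
        return opne(path)          # same typo
    except IOError:
        return "could not open file"


# The bare clause swallows the NameError and reports a file problem.
masked = bare_handler("file")

# The narrow clause lets the NameError through, exposing the typo.
try:
    narrow_handler("file")
    surfaced = None
except NameError as exc:
    surfaced = type(exc).__name__
```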
 |  | 
 | There are some situations in which the \code{except:} clause is useful: | 
 | for example, in a framework when running callbacks, it is good not to | 
 | let any callback disturb the framework. | 
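
A minimal sketch of such a framework loop, with \code{run_callbacks} and the sample callbacks invented for the example. Note one deliberate substitution: it uses \code{except Exception:} rather than a bare \code{except:}, on the assumption that on newer interpreters this still lets \exception{KeyboardInterrupt} and \exception{SystemExit} through. It also records each traceback, so a failing callback is isolated without being hidden.

```python
import traceback


def run_callbacks(callbacks):
    """Call every callback; collect tracebacks instead of stopping."""
    failures = []
    for callback in callbacks:
        try:
            callback()
        except Exception:
            # Record the failure so it is not silently lost.
            failures.append(traceback.format_exc())
    return failures


def good():
    return "ok"


def bad():
    raise ValueError("broken callback")


ran = []
failures = run_callbacks([bad, good, lambda: ran.append("last")])
```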
 |  | 
 | \section{Exceptions} | 
 |  | 
 | Exceptions are a useful feature of Python. You should learn to raise | 
 | them whenever something unexpected occurs, and catch them only where | 
 | you can do something about them. | 
 |  | 
 | The following is a very popular anti-idiom: | 
 |  | 
 | \begin{verbatim} | 
 | def get_status(file): | 
 |     if not os.path.exists(file): | 
 |         print "file not found" | 
 |         sys.exit(1) | 
 |     return open(file).readline() | 
 | \end{verbatim} | 
 |  | 
 | Consider the case where the file gets deleted between the time the call to  | 
 | \function{os.path.exists} is made and the time \function{open} is called. | 
 | The last line will then raise an \exception{IOError}. The same would | 
 | happen if \var{file} exists but has no read permission. Since testing this | 
 | on a normal machine with existing and non-existing files makes it seem bugless, | 
 | the test results will look fine, and the code will get | 
 | shipped. Then an unhandled \exception{IOError} escapes to the user, who | 
 | has to watch the ugly traceback. | 
 |  | 
 | Here is a better way to do it. | 
 |  | 
 | \begin{verbatim} | 
 | def get_status(file): | 
 |     try: | 
 |         return open(file).readline() | 
 |     except (IOError, OSError): | 
 |         print "file not found" | 
 |         sys.exit(1) | 
 | \end{verbatim} | 
 |  | 
 | In this version, {\em either} the file gets opened and the line is read | 
 | (so it works even on flaky NFS or SMB connections), or the message | 
 | is printed and the application aborted. | 
 |  | 
 | Still, \function{get_status} makes too many assumptions: that it | 
 | will only be used in a short-running script, and not, say, in a long-running | 
 | server. Sure, the caller could do something like | 
 |  | 
 | \begin{verbatim} | 
 | try: | 
 |     status = get_status(log) | 
 | except SystemExit: | 
 |     status = None | 
 | \end{verbatim} | 
 |  | 
 | but there is a better way. Try to use as few \code{except} clauses in your code | 
 | as you can: those you do use will usually be a catch-all in \function{main}, | 
 | or inside calls which should always succeed. | 
 |  | 
 | So, the best version is probably | 
 |  | 
 | \begin{verbatim} | 
 | def get_status(file): | 
 |     return open(file).readline() | 
 | \end{verbatim} | 
 |  | 
 | The caller can deal with the exception if it wants (for example, if it  | 
 | tries several files in a loop), or just let the exception filter upwards | 
 | to {\em its} caller. | 
 |  | 
 | The last version is not very good either: due to implementation details, | 
 | the file would not be closed when an exception is raised until the handler | 
 | finishes, and perhaps not at all in non-C implementations (e.g., Jython). | 
 | Closing the file explicitly in a \code{finally} clause avoids this: | 
 |  | 
 | \begin{verbatim} | 
 | def get_status(file): | 
 |     fp = open(file) | 
 |     try: | 
 |         return fp.readline() | 
 |     finally: | 
 |         fp.close() | 
 | \end{verbatim} | 
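
The earlier remark about a caller that tries several files in a loop might look like the following sketch. The helper \code{first\_status} and the scratch file are assumptions made for the example; the point is only that the decision about what a missing file means now sits with the caller.

```python
import os
import tempfile


def get_status(file):
    fp = open(file)
    try:
        return fp.readline()
    finally:
        fp.close()


def first_status(paths):
    """Return the first line of the first readable file, or None."""
    for path in paths:
        try:
            return get_status(path)
        except (IOError, OSError):
            continue
    return None


# Scratch data for the demonstration.
fd, existing = tempfile.mkstemp()
os.write(fd, b"running\n")
os.close(fd)
missing = existing + ".missing"

status = first_status([missing, existing])
nothing = first_status([missing])
os.remove(existing)
```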
 |  | 
 | \section{Using the Batteries} | 
 |  | 
 | Every so often, people seem to rewrite things that are already in the Python | 
 | library, usually poorly. While the occasional module has a poor interface, | 
 | it is usually much better to use the rich standard library and data | 
 | types that come with Python than to invent your own. | 
 |  | 
 | A useful module very few people know about is \module{os.path}. It  | 
 | always has the correct path arithmetic for your operating system, and | 
 | will usually be much better than whatever you come up with yourself. | 
 |  | 
 | Compare: | 
 |  | 
 | \begin{verbatim} | 
 | # ugh! | 
 | return dir+"/"+file | 
 | # better | 
 | return os.path.join(dir, file) | 
 | \end{verbatim} | 
 |  | 
 | More useful functions in \module{os.path}: \function{basename},  | 
 | \function{dirname} and \function{splitext}. | 
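
A short sketch of these functions working together, using a made-up relative path. The separator in the results depends on the operating system, which is the reason to let \module{os.path} do the arithmetic.

```python
import os.path

# Hypothetical path, assembled with the separator of the current platform.
path = os.path.join("logs", "2001", "server.log")

directory = os.path.dirname(path)      # everything before the last component
filename = os.path.basename(path)      # the last component
stem, extension = os.path.splitext(filename)

# Putting the pieces back together gives the original path again.
rebuilt = os.path.join(directory, stem + extension)
```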
 |  | 
 | There are also many useful builtin functions people seem not to be | 
 | aware of for some reason: \function{min()} and \function{max()} can | 
 | find the minimum/maximum of any sequence with comparable semantics, | 
 | for example, yet many people write their own max/min. Another highly | 
 | useful function is \function{reduce()}. A classic use of \function{reduce()} | 
 | is something like | 
 |  | 
 | \begin{verbatim} | 
 | import sys, operator | 
 | nums = map(float, sys.argv[1:]) | 
 | print reduce(operator.add, nums)/len(nums) | 
 | \end{verbatim} | 
 |  | 
 | This cute little script prints the average of all numbers given on the | 
 | command line. The \function{reduce()} adds up all the numbers, and | 
 | the rest is just some pre- and postprocessing. | 
 |  | 
 | Similarly, note that \function{float()}, \function{int()} and | 
 | \function{long()} all accept arguments of type string, and so are | 
 | suited to parsing, assuming you are ready to deal with the | 
 | \exception{ValueError} they raise. | 
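
A sketch of parsing with \function{float()} while handling the \exception{ValueError}, combined with \function{reduce()}, \function{min()} and \function{max()}. The helper \code{parse\_numbers} and the sample arguments are invented for the example, and \function{reduce()} is imported from \module{functools} so that the code also runs on newer interpreters, where it is no longer a builtin.

```python
import operator
from functools import reduce   # a builtin on older interpreters


def parse_numbers(args):
    """Split args into parsed floats and strings that are not numbers."""
    numbers, rejected = [], []
    for arg in args:
        try:
            numbers.append(float(arg))
        except ValueError:
            rejected.append(arg)
    return numbers, rejected


# Sample input standing in for sys.argv[1:].
numbers, rejected = parse_numbers(["1.5", "2.5", "oops", "5"])
average = reduce(operator.add, numbers) / len(numbers)
smallest, largest = min(numbers), max(numbers)
```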
 |  | 
 | \section{Using Backslash to Continue Statements} | 
 |  | 
 | Since Python treats a newline as a statement terminator, | 
 | and since statements are often longer than is comfortable to put | 
 | on one line, many people do: | 
 |  | 
 | \begin{verbatim} | 
 | if foo.bar()['first'][0] == baz.quux(1, 2)[5:9] and \ | 
 |    calculate_number(10, 20) != forbulate(500, 360): | 
 |       pass | 
 | \end{verbatim} | 
 |  | 
 | You should realize that this is dangerous: a stray space after the | 
 | \code{\textbackslash} would make this line wrong, and stray spaces are notoriously | 
 | hard to see in editors. In this case, at least it would be a syntax | 
 | error, but if the code were: | 
 |  | 
 | \begin{verbatim} | 
 | value = foo.bar()['first'][0]*baz.quux(1, 2)[5:9] \ | 
 |         + calculate_number(10, 20)*forbulate(500, 360) | 
 | \end{verbatim} | 
 |  | 
 | then it would just be subtly wrong. | 
 |  | 
 | It is usually much better to use the implicit continuation inside parentheses. | 
 |  | 
 | This version is bulletproof: | 
 |  | 
 | \begin{verbatim} | 
 | value = (foo.bar()['first'][0]*baz.quux(1, 2)[5:9]  | 
 |         + calculate_number(10, 20)*forbulate(500, 360)) | 
 | \end{verbatim} | 
 |  | 
 | \end{document} |