% Last change (from repository blame): Andrew M. Kuchling, commit e8f44d6, 2005-08-30 01:25:05 +0000
\documentclass{howto}

\title{Idioms and Anti-Idioms in Python}

\release{0.00}

\author{Moshe Zadka}
\authoraddress{howto@zadka.site.co.il}

\begin{document}
\maketitle

This document is placed in the public domain.

\begin{abstract}
\noindent
This document can be considered a companion to the tutorial. It
shows how to use Python, and even more importantly, how {\em not}
to use Python.
\end{abstract}

\tableofcontents

\section{Language Constructs You Should Not Use}

While Python has relatively few gotchas compared to other languages, it
still has some constructs which are only useful in corner cases, or are
plain dangerous.

\subsection{from module import *}

\subsubsection{Inside Function Definitions}

\code{from module import *} is {\em invalid} inside function definitions.
While many versions of Python do not check for the invalidity, that does not
make it more valid, any more than having a smart lawyer makes a man innocent.
Do not use it like that, ever. Even in versions where it was accepted, it made
the function execution slower, because the compiler could not be certain
which names are local and which are global. In Python 2.1 this construct
causes warnings, and sometimes even errors.
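
On a current interpreter the rule can be checked directly. The following is
a minimal sketch, assuming Python 3, where the compiler rejects the construct
outright; the function name \code{f} and the file label \code{<example>} are
made up for illustration.

```python
# Source text of a function that attempts a star-import in its body.
source = "def f():\n    from math import *\n"

try:
    # compile() only parses and compiles the text; nothing is executed.
    compile(source, "<example>", "exec")
    rejected = False
except SyntaxError:
    # Python 3 refuses the construct at compile time.
    rejected = True

print("star-import inside a function rejected:", rejected)
```

Using \code{compile()} keeps the check self-contained: the offending code
never has to appear in a real module.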

\subsubsection{At Module Level}

While it is valid to use \code{from module import *} at module level, it
is usually a bad idea. For one, this loses an important property Python
otherwise has: you can find where each top-level name is defined with
a simple ``search'' function in your favourite editor. You also open yourself
to trouble in the future, if some module grows additional functions or
classes.

One of the most awful questions asked on the newsgroup is why this code:

\begin{verbatim}
f = open("www")
f.read()
\end{verbatim}

does not work. Of course, it works just fine (assuming you have a file
called ``www''). But it does not work if somewhere in the module the
statement \code{from os import *} is present. The \module{os} module
has a function called \function{open()} which returns an integer. While
that function is very useful, shadowing builtins is one of its least useful
properties.

Remember, you can never know for sure what names a module exports, so either
take what you need (\code{from module import name1, name2}) or keep the names
in the module and access them as needed
(\code{import module; print module.name}).
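
The shadowing of \function{open()} described above can be reproduced without
polluting a real module. This is a small sketch, assuming Python 3; the
dictionary \code{namespace} stands in for the globals of a hypothetical
module that contains the star-import.

```python
import os

# Run the star-import inside a throwaway namespace instead of this module.
namespace = {}
exec("from os import *", namespace)

# The name "open" in that namespace is now os.open, not the builtin.
shadowed = namespace["open"] is os.open

# os.open returns a low-level integer file descriptor, not a file object.
fd = os.open(os.devnull, os.O_RDONLY)
fd_is_int = isinstance(fd, int)
os.close(fd)

print(shadowed, fd_is_int)
```

An integer has no \code{read()} method, which is why the snippet from the
newsgroup question fails once the star-import is present.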

\subsubsection{When It Is Just Fine}

There are situations in which \code{from module import *} is just fine:

\begin{itemize}

\item The interactive prompt. For example, \code{from math import *} makes
      Python an amazing scientific calculator.

\item When extending a module in C with a module in Python.

\item When the module advertises itself as \code{from module import *} safe.

\end{itemize}
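
One way a module can make itself safe for star-imports is to define
\code{\_\_all\_\_}, the list of names that \code{from module import *}
should export. The sketch below assumes Python 3 and builds a hypothetical
module called \code{shapes\_demo} in memory, so that no file on disk is
needed; the names \code{area} and \code{helper} are invented for the
example.

```python
import sys
import types

# Build a hypothetical module in memory and register it for import.
mod = types.ModuleType("shapes_demo")
exec(
    "__all__ = ['area']\n"
    "def area(w, h):\n"
    "    return w * h\n"
    "def helper():\n"
    "    return 'internal'\n",
    mod.__dict__,
)
sys.modules["shapes_demo"] = mod

# Star-import into a scratch namespace and see what arrived.
namespace = {}
exec("from shapes_demo import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))

print(exported)
```

Only the names listed in \code{\_\_all\_\_} are copied, so \code{helper}
stays out of the importing namespace.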

\subsection{Unadorned \keyword{exec}, \function{execfile} and friends}

The word ``unadorned'' refers to use without an explicit dictionary,
in which case those constructs evaluate code in the {\em current} environment.
This is dangerous for the same reasons \code{from import *} is dangerous:
it might step over variables you are counting on and mess up things for
the rest of your code. Simply do not do that.

Bad examples:

\begin{verbatim}
>>> for name in sys.argv[1:]:
...     exec "%s=1" % name
...
>>> def func(s, **kw):
...     for var, val in kw.items():
...         exec "s.%s=val" % var  # invalid!
...
>>> execfile("handler.py")
>>> handle()
\end{verbatim}

Good examples:

\begin{verbatim}
>>> d = {}
>>> for name in sys.argv[1:]:
...     d[name] = 1
...
>>> def func(s, **kw):
...     for var, val in kw.items():
...         setattr(s, var, val)
...
>>> d = {}
>>> execfile("handler.py", d, d)
>>> handle = d['handle']
>>> handle()
\end{verbatim}
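The same idea carries over to newer interpreters. The following sketch
assumes Python 3, where \code{exec} is a function and \function{execfile}
no longer exists; the source string stands in for the contents of a
hypothetical handler file, and the class \code{Settings} is invented for the
example.

```python
counter = 10  # a name the surrounding code is counting on

# Stand-in for the text of a handler file; it defines its own "counter".
source = (
    "counter = 0\n"
    "def handle():\n"
    "    return 'handled'\n"
)

# Passing an explicit dictionary keeps the executed code out of our globals.
env = {}
exec(source, env)
handle = env["handle"]
result = handle()


class Settings:
    """Empty container used to demonstrate setattr()."""


def configure(obj, **kw):
    # setattr() replaces the exec-based attribute assignment.
    for name, value in kw.items():
        setattr(obj, name, value)


settings = Settings()
configure(settings, verbose=True, retries=3)

print(counter, env["counter"], result, settings.retries)
```

The outer \code{counter} is untouched because the executed code only ever
sees the dictionary it was given.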

\subsection{from module import name1, name2}

This is a ``don't'' which is much weaker than the previous ``don't''s
but is still something you should not do without a good reason.
It is usually a bad idea because you suddenly
have an object which lives in two separate namespaces. When the binding
in one namespace changes, the binding in the other will not, so there
will be a discrepancy between them. This happens when, for example,
one module is reloaded, or changes the definition of a function at runtime.

Bad example:

\begin{verbatim}
# foo.py
a = 1

# bar.py
from foo import a
if something():
    a = 2 # danger: foo.a != a
\end{verbatim}

Good example:

\begin{verbatim}
# foo.py
a = 1

# bar.py
import foo
if something():
    foo.a = 2
\end{verbatim}
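The discrepancy between the two namespaces is easy to observe. This is a
minimal sketch, assuming Python 3; the module \code{foo\_demo} is a
hypothetical stand-in for \code{foo.py}, created in memory so the example
runs on its own.

```python
import sys
import types

# Stand-in for foo.py: a module with a single global, a = 1.
foo_demo = types.ModuleType("foo_demo")
foo_demo.a = 1
sys.modules["foo_demo"] = foo_demo

from foo_demo import a   # copies the current binding into this namespace
import foo_demo as foo   # keeps a reference to the module itself

# Something rebinds the name inside the module at runtime.
foo.a = 2

stale_copy = a       # still the old object
live_value = foo.a   # follows the module

print(stale_copy, live_value)
```

Access through the module always sees the current binding, while the
imported name keeps whatever it was bound to at import time.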

\subsection{except:}

Python has the \code{except:} clause, which catches all exceptions.
Since {\em every} error in Python raises an exception, this makes many
programming errors look like runtime problems, and hinders
the debugging process.

The following code shows a great example:

\begin{verbatim}
try:
    foo = opne("file") # misspelled "open"
except:
    sys.exit("could not open file!")
\end{verbatim}

The second line triggers a \exception{NameError} which is caught by the
except clause. The program will exit, and you will have no idea that
this has nothing to do with the readability of \code{"file"}.

The example above is better written as:

\begin{verbatim}
try:
    foo = opne("file") # will be changed to "open" as soon as we run it
except IOError:
    sys.exit("could not open file")
\end{verbatim}
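The difference between the two versions can be observed side by side. The
sketch below assumes Python 3, where \exception{IOError} is an alias of
\exception{OSError}; the function names are invented for the example, and
the misspelling \code{opne} is deliberate.

```python
def load_broad(path):
    try:
        return opne(path)  # deliberate typo for "open"
    except:  # catches everything, including the NameError from the typo
        return "could not open file"


def load_narrow(path):
    try:
        return opne(path)  # same typo
    except OSError:  # only genuine I/O failures are handled here
        return "could not open file"


# The broad handler disguises the typo as a file problem.
broad_result = load_broad("file")

# The narrow handler lets the programming error surface.
try:
    load_narrow("file")
    outcome = "typo hidden"
except NameError:
    outcome = "typo surfaced as NameError"

print(broad_result, "/", outcome)
```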

There are some situations in which the \code{except:} clause is useful:
for example, in a framework when running callbacks, it is good not to
let any callback disturb the framework.
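
A callback runner along those lines might look like the sketch below. It
assumes Python 3 and is only an illustration, not a description of any
particular framework; the names \code{run\_callbacks}, \code{good} and
\code{bad} are made up. It also catches \exception{Exception} rather than
using a bare \code{except:}, a small variation so that an interrupt from the
keyboard still stops the program.

```python
import traceback


def run_callbacks(callbacks, event):
    """Call every callback; record failures instead of propagating them."""
    failures = []
    for callback in callbacks:
        try:
            callback(event)
        except Exception:
            # Keep the traceback so the error is reported, not swallowed.
            failures.append((callback.__name__, traceback.format_exc()))
    return failures


seen = []


def good(event):
    seen.append(event)


def bad(event):
    raise ValueError("broken callback")


failures = run_callbacks([bad, good], "startup")

print(seen, [name for name, _ in failures])
```

Recording the traceback matters: a catch-all that silently discards errors
brings back the debugging problem this section warns about.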

\section{Exceptions}

Exceptions are a useful feature of Python. You should learn to raise
them whenever something unexpected occurs, and catch them only where
you can do something about them.

The following is a very popular anti-idiom:

\begin{verbatim}
def get_status(file):
    if not os.path.exists(file):
        print "file not found"
        sys.exit(1)
    return open(file).readline()
\end{verbatim}

Consider the case where the file gets deleted between the call to
\function{os.path.exists} and the call to \function{open}.
Then the last line will raise an \exception{IOError}. The same would
happen if \var{file} exists but has no read permission. Since testing this
on a normal machine with existing and non-existing files makes it seem bugless,
the test results will seem fine, and the code will get
shipped. Then an unhandled \exception{IOError} escapes to the user, who
has to watch the ugly traceback.

Here is a better way to do it.

\begin{verbatim}
def get_status(file):
    try:
        return open(file).readline()
    except (IOError, OSError):
        print "file not found"
        sys.exit(1)
\end{verbatim}

In this version, {\em either} the file gets opened and the line is read
(so it works even on flaky NFS or SMB connections), or the message
is printed and the application aborted.

Still, \function{get_status} makes too many assumptions: that it
will only be used in a short-running script, and not, say, in a
long-running server. Sure, the caller could do something like

\begin{verbatim}
try:
    status = get_status(log)
except SystemExit:
    status = None
\end{verbatim}

but that is a workaround for a decision the function should not have made.
So, try to use as few \code{except} clauses in your code as possible; those
will usually be a catch-all in \function{main}, or inside calls which
should always succeed.

So, the best version is probably

\begin{verbatim}
def get_status(file):
    return open(file).readline()
\end{verbatim}

The caller can deal with the exception if it wants (for example, if it
tries several files in a loop), or just let the exception filter upwards
to {\em its} caller.

Even that short version is not very good: due to implementation details,
the file would not be closed when an exception is raised until the handler
finishes, and perhaps not at all in non-C implementations (e.g., Jython).
Closing the file explicitly avoids the problem:

\begin{verbatim}
def get_status(file):
    fp = open(file)
    try:
        return fp.readline()
    finally:
        fp.close()
\end{verbatim}
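On newer interpreters the \code{with} statement expresses the same cleanup
more compactly. The following is a sketch, assuming Python 3; the helper
\code{first\_status} and the file names are hypothetical, and show a caller
that tries several files in a loop and handles the exception itself.

```python
import os
import tempfile


def get_status(path):
    # The with statement closes the file even if readline() raises.
    with open(path) as fp:
        return fp.readline()


def first_status(paths):
    """Return the first line of the first readable file, or None."""
    for path in paths:
        try:
            return get_status(path)
        except OSError:
            continue  # missing or unreadable: try the next candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "status.log")
    missing = os.path.join(tmp, "missing.log")
    with open(present, "w") as out:
        out.write("ok\nsecond line\n")

    status = first_status([missing, present])
    nothing = first_status([missing])

print(repr(status), nothing)
```

Here \function{get_status} stays free of error handling, and the policy for
a missing file lives in the one caller that knows what to do about it.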

\section{Using the Batteries}

Every so often, people seem to rewrite things that are already in the
Python library, usually poorly. While the occasional module has a poor
interface, it is usually much better to use the rich standard library and
data types that come with Python than to invent your own.

A useful module very few people know about is \module{os.path}. It
always has the correct path arithmetic for your operating system, and
will usually be much better than whatever you come up with yourself.

Compare:

\begin{verbatim}
# ugh!
return dir+"/"+file
# better
return os.path.join(dir, file)
\end{verbatim}
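As a rough illustration of \function{os.path.join} working together with a
few of the module's other helpers, consider the sketch below. It assumes
Python 3, and the directory and file names are invented.

```python
import os.path

# Build a path with the separator that is correct for this platform.
path = os.path.join("logs", "archive", "app.tar.gz")

# Take it apart again without any manual string slicing.
directory = os.path.dirname(path)
filename = os.path.basename(path)
stem, extension = os.path.splitext(filename)

print(directory, filename, stem, extension)
```

Note that \function{splitext} only splits off the last extension, so the
stem of \code{app.tar.gz} is \code{app.tar}.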

More useful functions in \module{os.path}: \function{basename},
\function{dirname} and \function{splitext}.

There are also many useful builtin functions people seem not to be
aware of for some reason: \function{min()} and \function{max()} can
find the minimum/maximum of any sequence with comparable semantics,
for example, yet many people write their own max/min. Another highly
useful function is \function{reduce()}. A classical use of \function{reduce()}
is something like

\begin{verbatim}
import sys, operator
nums = map(float, sys.argv[1:])
print reduce(operator.add, nums)/len(nums)
\end{verbatim}

This cute little script prints the average of all numbers given on the
command line. The \function{reduce()} call adds up all the numbers, and
the rest is just some pre- and postprocessing.

In the same vein, note that \function{float()}, \function{int()} and
\function{long()} all accept arguments of type string, and so are
suited to parsing, assuming you are ready to deal with the
\exception{ValueError} they raise.
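
The two ideas combine naturally: parse with \function{float()}, handle the
\exception{ValueError}, then aggregate with the builtins. This is a sketch,
assuming Python 3, where \function{reduce()} lives in the
\module{functools} module and \function{long()} has been merged into
\function{int()}; the function \code{parse\_numbers} and the sample
arguments are made up.

```python
import operator
from functools import reduce


def parse_numbers(args):
    """Split strings into parsed floats and the ones that failed."""
    numbers, rejected = [], []
    for arg in args:
        try:
            numbers.append(float(arg))
        except ValueError:
            rejected.append(arg)  # not a number; report it, don't crash
    return numbers, rejected


# Stand-in for sys.argv[1:].
args = ["1", "2.5", "oops", "4"]
numbers, rejected = parse_numbers(args)

average = reduce(operator.add, numbers) / len(numbers)
smallest, largest = min(numbers), max(numbers)

print(average, smallest, largest, rejected)
```

For plain addition the builtin \function{sum()} does the same job as the
\function{reduce()} call; \function{reduce()} earns its keep with other
binary operations.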

\section{Using Backslash to Continue Statements}

Since Python treats a newline as a statement terminator,
and since statements are often longer than is comfortable to put
on one line, many people do:

\begin{verbatim}
if foo.bar()['first'][0] == baz.quux(1, 2)[5:9] and \
   calculate_number(10, 20) != forbulate(500, 360):
      pass
\end{verbatim}

You should realize that this is dangerous: a stray space after the
\code{\textbackslash} would make this line wrong, and stray spaces are
notoriously hard to see in editors. In this case, at least it would be a
syntax error, but if the code was:

\begin{verbatim}
value = foo.bar()['first'][0]*baz.quux(1, 2)[5:9] \
        + calculate_number(10, 20)*forbulate(500, 360)
\end{verbatim}

then it would just be subtly wrong.

It is usually much better to use the implicit continuation inside
parentheses. This version is bulletproof:

\begin{verbatim}
value = (foo.bar()['first'][0]*baz.quux(1, 2)[5:9]
        + calculate_number(10, 20)*forbulate(500, 360))
\end{verbatim}
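The fragility of backslash continuation can be checked mechanically. The
sketch below assumes Python 3 and uses made-up arithmetic in place of the
calls above. It tries three variants: a stray space after the backslash, a
continuation whose backslash has been lost and whose second line is not
indented (one way the result can end up silently wrong), and the
parenthesized form.

```python
# Variant 1: a space sneaks in after the backslash.
stray_space = "value = 2 * 3 \\ \n    + 4 * 5\n"
try:
    compile(stray_space, "<example>", "exec")
    stray_space_rejected = False
except SyntaxError:
    stray_space_rejected = True

# Variant 2: the backslash is lost and the second line is not indented.
# This is two valid statements; the second one is computed and discarded.
lost_backslash = "value = 2 * 3\n+ 4 * 5\n"
env = {}
exec(lost_backslash, env)
silently_wrong = env["value"]

# Variant 3: parentheses keep the expression together across lines.
parenthesized = "value = (2 * 3\n         + 4 * 5)\n"
env = {}
exec(parenthesized, env)
intended = env["value"]

print(stray_space_rejected, silently_wrong, intended)
```

Inside parentheses there is no continuation character to lose or damage,
which is what makes that form robust against whitespace accidents.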
342
343\end{document}