% Doc/howto/doanddont.tex (platform/external/python/cpython2)
% blob: adbde66987744bd3c0c3ee8130972b5aff7929a2
% Blame: Andrew M. Kuchling, e8f44d6, 2005-08-30 01:25:05 +0000
\documentclass{howto}

\title{Idioms and Anti-Idioms in Python}

\release{0.00}

\author{Moshe Zadka}
\authoraddress{howto@zadka.site.co.il}

\begin{document}
\maketitle

This document is placed in the public domain.

\begin{abstract}
\noindent
This document can be considered a companion to the tutorial. It
shows how to use Python, and even more importantly, how {\em not}
to use Python.
\end{abstract}

\tableofcontents

\section{Language Constructs You Should Not Use}

While Python has relatively few gotchas compared to other languages, it
still has some constructs which are only useful in corner cases, or are
plain dangerous.

\subsection{from module import *}

\subsubsection{Inside Function Definitions}

\code{from module import *} is {\em invalid} inside function definitions.
While many versions of Python do not check for the invalidity, that does not
make it more valid, no more than having a smart lawyer makes a man innocent.
Do not use it like that, ever. Even in versions where it was accepted, it made
the function execute more slowly, because the compiler could not be certain
which names were local and which were global. In Python 2.1 this construct
causes warnings, and sometimes even errors.
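As a small check of how later interpreters treat this construct, the sketch
below compiles a function that contains a star import. It is written in
Python 3 syntax (an assumption; the rest of this document uses Python 2),
where the construct is rejected at compile time. The function name
\code{area} is made up for the example.

```python
# Python 3 sketch: a star import inside a function body does not compile.
# The source string and the name "area" are illustrative only.
source = (
    "def area(r):\n"
    "    from math import *\n"
    "    return pi * r * r\n"
)

try:
    compile(source, "<example>", "exec")
    rejected = False
except SyntaxError:
    # Python 3 only allows "import *" at module level.
    rejected = True

print(rejected)
```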

\subsubsection{At Module Level}

While it is valid to use \code{from module import *} at module level, it
is usually a bad idea. For one, this loses an important property Python
otherwise has: you can find where each toplevel name is defined with
a simple ``search'' function in your favourite editor. You also open yourself
to trouble in the future, if some module grows additional functions or
classes.

One of the most awful questions asked on the newsgroup is why this code:

\begin{verbatim}
f = open("www")
f.read()
\end{verbatim}

does not work. Of course, it works just fine (assuming you have a file
called ``www''). But it does not work if the statement
\code{from os import *} is present somewhere in the module. The \module{os}
module has a function called \function{open()} which returns an integer
file descriptor, and an integer has no \method{read()} method. While
\function{os.open()} is very useful, shadowing builtins is one of its least
useful properties.
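The shadowing can be observed directly. The sketch below is written in
Python 3 syntax (an assumption; the surrounding examples are Python 2) and
runs the star import inside a scratch dictionary, so the example's own
namespace is left alone. The names \code{namespace}, \code{shadowed} and
\code{untouched} are made up for the illustration.

```python
import builtins
import os

# Run the star import in a scratch namespace instead of this module.
namespace = {}
exec("from os import *", namespace)

# After the import, the name "open" refers to os.open ...
shadowed = namespace["open"] is os.open
# ... which is a different function from the builtin open().
untouched = builtins.open is not os.open

print(shadowed, untouched)
```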

Remember, you can never know for sure what names a module exports, so either
take what you need (\code{from module import name1, name2}) or keep the names
in the module and access them as needed
(\code{import module; print module.name}).

\subsubsection{When It Is Just Fine}

There are situations in which \code{from module import *} is just fine:

\begin{itemize}

\item The interactive prompt. For example, \code{from math import *} makes
Python an amazing scientific calculator.

\item When extending a module in C with a module in Python.

\item When the module advertises itself as \code{from import *} safe.

\end{itemize}

\subsection{Unadorned \keyword{exec}, \function{execfile} and friends}

The word ``unadorned'' refers to the use without an explicit dictionary,
in which case those constructs evaluate code in the {\em current} environment.
This is dangerous for the same reasons \code{from import *} is dangerous ---
it might step over variables you are counting on and mess up things for
the rest of your code. Simply do not do that.

Bad examples:

\begin{verbatim}
>>> for name in sys.argv[1:]:
...     exec "%s=1" % name
...
>>> def func(s, **kw):
...     for var, val in kw.items():
...         exec "s.%s=val" % var  # invalid!
...
>>> execfile("handler.py")
>>> handle()
\end{verbatim}

Good examples:

\begin{verbatim}
>>> d = {}
>>> for name in sys.argv[1:]:
...     d[name] = 1
...
>>> def func(s, **kw):
...     for var, val in kw.items():
...         setattr(s, var, val)
...
>>> d = {}
>>> execfile("handler.py", d, d)
>>> handle = d['handle']
>>> handle()
\end{verbatim}
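The same idea carries over to Python 3, where \keyword{exec} is a function
and \function{execfile} no longer exists. The following is a minimal sketch
under that assumption; the source string stands in for the contents of a
handler file, and the names \code{handler_source} and \code{namespace} are
hypothetical.

```python
# Python 3 sketch: run code in an explicit dictionary, not the current scope.
handler_source = '''
def handle():
    return "handled"
'''

namespace = {}
exec(handler_source, namespace)   # every definition lands in "namespace"

handle = namespace["handle"]      # take out only what is needed
result = handle()
print(result)
```

Because the executed code only ever sees \code{namespace}, it cannot step
over variables of the code that runs it.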

\subsection{from module import name1, name2}

This is a ``don't'' which is much weaker than the previous ``don't''s
but is still something you should not do unless you have a good reason.
It is usually a bad idea because you suddenly have an object which lives
in two separate namespaces. When the binding in one namespace changes,
the binding in the other will not, so there will be a discrepancy between
them. This happens when, for example, one module is reloaded, or changes
the definition of a function at runtime.

Bad example:

\begin{verbatim}
# foo.py
a = 1

# bar.py
from foo import a
if something():
    a = 2 # danger: foo.a != a
\end{verbatim}

Good example:

\begin{verbatim}
# foo.py
a = 1

# bar.py
import foo
if something():
    foo.a = 2
\end{verbatim}
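The discrepancy between the two namespaces can be reproduced without any
files. In the sketch below (Python 3 syntax, an assumption) a module object
built with \code{types.ModuleType} is a hypothetical stand-in for
\file{foo.py}, and a plain assignment plays the role of
\code{from foo import a}.

```python
import types

# Hypothetical stand-in for the module defined in foo.py.
foo = types.ModuleType("foo")
foo.a = 1

# "from foo import a" copies the current binding into the local namespace.
a = foo.a

# Later, the module rebinds its own name.
foo.a = 2

# The local copy is stale; access through the module sees the new value.
stale = (a, foo.a)
print(stale)
```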

\subsection{except:}

Python has the \code{except:} clause, which catches all exceptions.
Since {\em every} error in Python raises an exception, this makes many
programming errors look like runtime problems, and hinders
the debugging process.

The following code shows a great example of why this is bad:

\begin{verbatim}
try:
    foo = opne("file") # misspelled "open"
except:
    sys.exit("could not open file!")
\end{verbatim}

The second line triggers a \exception{NameError} which is caught by the
except clause. The program will exit, and you will have no idea that
the failure has nothing to do with whether \code{"file"} can be read.

The example above is better written as

\begin{verbatim}
try:
    foo = opne("file") # will be changed to "open" as soon as we run it
except IOError:
    sys.exit("could not open file")
\end{verbatim}

There are some situations in which the \code{except:} clause is useful:
for example, in a framework when running callbacks, it is good not to
let any callback disturb the framework.
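A minimal sketch of such a framework loop follows, in Python 3 syntax (an
assumption). It catches \exception{Exception} rather than using a bare
\code{except:}, so that \exception{SystemExit} and
\exception{KeyboardInterrupt} still propagate; that choice, and the names
\code{run_callbacks}, \code{good} and \code{bad}, are illustrative and not
taken from any particular framework.

```python
import traceback

def run_callbacks(callbacks):
    """Call every callback; return the exceptions that were raised."""
    failures = []
    for callback in callbacks:
        try:
            callback()
        except Exception as exc:      # deliberately broad: isolate the callback
            traceback.print_exc()     # still report the problem on stderr
            failures.append(exc)
    return failures

calls = []

def good():
    calls.append("good")

def bad():
    raise ValueError("broken callback")

# The failing callback does not stop the one that follows it.
failures = run_callbacks([bad, good])
```

The traceback is still printed, so a programming error inside a callback is
not silently hidden.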

\section{Exceptions}

Exceptions are a useful feature of Python. You should learn to raise
them whenever something unexpected occurs, and catch them only where
you can do something about them.

The following is a very popular anti-idiom:

\begin{verbatim}
def get_status(file):
    if not os.path.exists(file):
        print "file not found"
        sys.exit(1)
    return open(file).readline()
\end{verbatim}

Consider the case where the file gets deleted between the time the call to
\function{os.path.exists} is made and the time \function{open} is called.
Then the last line will raise an \exception{IOError}. The same would
happen if \var{file} exists but has no read permission. Since testing
on a normal machine with existing and non-existing files makes the code seem
bug-free, the test results will look fine and the code will get
shipped. Then an unhandled \exception{IOError} escapes to the user, who
has to watch the ugly traceback.

Here is a better way to do it:

\begin{verbatim}
def get_status(file):
    try:
        return open(file).readline()
    except (IOError, OSError):
        print "file not found"
        sys.exit(1)
\end{verbatim}

In this version, either the file gets opened and the line is read
(so it works even on flaky NFS or SMB connections), or the message
is printed and the application aborted.

Still, \function{get_status} makes too many assumptions --- that it
will only be used in a short running script, and not, say, in a long
running server. Sure, the caller could do something like

\begin{verbatim}
try:
    status = get_status(log)
except SystemExit:
    status = None
\end{verbatim}

That works, but it is clumsy. So, try to use as few \code{except} clauses
in your code as possible; those will usually be a catch-all in
\function{main}, or inside calls which should always succeed.

So, the best version is probably

\begin{verbatim}
def get_status(file):
    return open(file).readline()
\end{verbatim}

The caller can deal with the exception if it wants (for example, if it
tries several files in a loop), or just let the exception filter upwards
to {\em its} caller.
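The ``several files in a loop'' case might look like the sketch below. It
is written in Python 3 syntax (an assumption), where \exception{IOError} is
an alias of \exception{OSError}. The helper \code{first_status} and the
file names are hypothetical, and the temporary directory exists only to
make the example self-contained.

```python
import os
import tempfile

def get_status(path):
    # No except clause here: errors are left to the caller.
    with open(path) as fp:
        return fp.readline()

def first_status(paths):
    """Return the first line of the first readable file, or None."""
    for path in paths:
        try:
            return get_status(path)
        except OSError:       # missing file, no permission, ...
            continue          # this caller knows what to do: try the next one
    return None

with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "status.log")
    with open(present, "w") as fp:
        fp.write("ok\nsecond line\n")
    missing = os.path.join(tmp, "missing.log")

    found = first_status([missing, present])
    nothing = first_status([missing])
```

Only the caller that can actually recover catches the exception;
\code{get_status} itself stays free of error handling.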

The last version is not very good either: due to implementation details,
the file would not be closed when an exception is raised until the handler
finishes, and perhaps not at all in non-C implementations (e.g., Jython).
It is safer to close the file explicitly:

\begin{verbatim}
def get_status(file):
    fp = open(file)
    try:
        return fp.readline()
    finally:
        fp.close()
\end{verbatim}
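In Python 2.6 and later, and in Python 3, the \keyword{with} statement
expresses the same \code{try}/\code{finally} pattern more compactly. The
sketch below uses Python 3 syntax; the list \code{seen} exists only so the
example can check afterwards that the file was closed, and the file name is
made up.

```python
import os
import tempfile

seen = []   # illustration only: remembers the file object for inspection

def get_status(path):
    # The file is closed when the block is left, even by an exception.
    with open(path) as fp:
        seen.append(fp)
        return fp.readline()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "status.log")
    with open(path, "w") as out:
        out.write("running\n")
    status = get_status(path)

closed_afterwards = seen[0].closed
```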

\section{Using the Batteries}

Every so often, people seem to rewrite things that are already in the
Python library, usually poorly. While the occasional module has a poor
interface, it is usually much better to use the rich standard library and
data types that come with Python than to invent your own.

A useful module very few people know about is \module{os.path}. It
always has the correct path arithmetic for your operating system, and
will usually be much better than whatever you come up with yourself.

Compare:

\begin{verbatim}
# ugh!
return dir+"/"+file
# better
return os.path.join(dir, file)
\end{verbatim}

More useful functions in \module{os.path}: \function{basename},
\function{dirname} and \function{splitext}.
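A short illustration of these functions follows. It uses
\module{posixpath}, the POSIX implementation behind \module{os.path}, so
that the results are the same on every platform; in real code you would
normally just use \module{os.path}. The path itself is made up.

```python
import posixpath   # os.path is this module on POSIX systems

path = posixpath.join("/var/log", "app.tar.gz")

directory = posixpath.dirname(path)      # everything before the last slash
filename = posixpath.basename(path)      # everything after the last slash
root, ext = posixpath.splitext(path)     # split off the last extension only

print(path, directory, filename, root, ext)
```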

There are also many useful builtin functions people seem not to be
aware of for some reason: \function{min()} and \function{max()} can
find the minimum/maximum of any sequence with comparable semantics,
for example, yet many people write their own max/min. Another highly
useful function is \function{reduce()}. A classical use of
\function{reduce()} is something like

\begin{verbatim}
import sys, operator
nums = map(float, sys.argv[1:])
print reduce(operator.add, nums)/len(nums)
\end{verbatim}

This cute little script prints the average of all numbers given on the
command line. The \function{reduce()} adds up all the numbers, and
the rest is just some pre- and postprocessing.
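For comparison, here is a sketch of the same computation in Python 3 (an
assumption; the script above is Python 2), where \function{reduce()} lives
in the \module{functools} module. A fixed list of strings stands in for the
command-line arguments, and the function name \code{average} is made up.

```python
import operator
from functools import reduce   # reduce() is not a builtin in Python 3

def average(nums):
    # reduce() folds the list with "+", exactly as in the script above.
    return reduce(operator.add, nums) / len(nums)

args = ["1", "2", "6"]                  # stand-in for sys.argv[1:]
nums = [float(arg) for arg in args]

mean = average(nums)
same_with_sum = sum(nums) / len(nums)   # the builtin sum() does the same job
smallest, largest = min(nums), max(nums)
```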

Along the same lines, note that \function{float()}, \function{int()} and
\function{long()} all accept string arguments, and so are
suited to parsing, assuming you are ready to deal with the
\exception{ValueError} they raise.
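A minimal sketch of parsing with these constructors follows, in Python 3
syntax (an assumption; Python 3 has no separate \function{long()}). The
helper \code{parse_number} is hypothetical, and returning \code{None} for
unparsable input is just one possible policy.

```python
def parse_number(text):
    """Return text as an int or a float, or None if it is neither."""
    try:
        return int(text)
    except ValueError:
        pass                 # not an integer literal; try float next
    try:
        return float(text)
    except ValueError:
        return None          # the caller decides what a bad value means

parsed = [parse_number(s) for s in ["42", " 7 ", "3.5", "abc"]]
print(parsed)
```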

\section{Using Backslash to Continue Statements}

Since Python treats a newline as a statement terminator,
and since statements are often longer than is comfortable to put
on one line, many people do:

\begin{verbatim}
if foo.bar()['first'][0] == baz.quux(1, 2)[5:9] and \
   calculate_number(10, 20) != forbulate(500, 360):
    pass
\end{verbatim}

You should realize that this is dangerous: a stray space after the
\code{\\} would make this line wrong, and stray spaces are notoriously
hard to see in editors. In this case, at least it would be a syntax
error, but if the code was:

\begin{verbatim}
value = foo.bar()['first'][0]*baz.quux(1, 2)[5:9] \
        + calculate_number(10, 20)*forbulate(500, 360)
\end{verbatim}
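
then a related slip, such as losing the backslash while the second line
starts at the same indentation, could leave two statements that still run,
and the result would just be subtly wrong.

It is usually much better to use the implicit continuation inside parentheses.
This version is bulletproof: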

\begin{verbatim}
value = (foo.bar()['first'][0]*baz.quux(1, 2)[5:9]
        + calculate_number(10, 20)*forbulate(500, 360))
\end{verbatim}
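The difference can be checked mechanically. The sketch below (Python 3
syntax, an assumption) builds two small source strings with a deliberate
trailing space and compiles them; simple arithmetic stands in for the
function calls above, and the variable names are made up.

```python
# Inside parentheses, a trailing space before the line break is harmless.
paren_src = "value = (1 * 2 \n         + 3 * 4)\n"
namespace = {}
exec(paren_src, namespace)
paren_value = namespace["value"]

# After a backslash, the same stray space makes the source invalid.
backslash_src = "value = 1 * 2 \\ \n        + 3 * 4\n"
try:
    compile(backslash_src, "<example>", "exec")
    backslash_ok = True
except SyntaxError:
    backslash_ok = False

print(paren_value, backslash_ok)
```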

\end{document}