************************************
  Idioms and Anti-Idioms in Python
************************************

:Author: Moshe Zadka

This document is placed in the public domain.


.. topic:: Abstract

   This document can be considered a companion to the tutorial. It shows how to use
   Python, and even more importantly, how *not* to use Python.


Language Constructs You Should Not Use
======================================

While Python has relatively few gotchas compared to other languages, it still
has some constructs which are only useful in corner cases, or are plain
dangerous.


from module import \*
---------------------


Inside Function Definitions
^^^^^^^^^^^^^^^^^^^^^^^^^^^

``from module import *`` is *invalid* inside function definitions. While many
versions of Python do not check for the invalidity, that does not make it any
more valid, any more than having a smart lawyer makes a man innocent. Never use
it like that. Even in versions where it was accepted, it made the function
execution slower, because the compiler could not be certain which names are
local and which are global. In Python 2.1 this construct causes warnings, and
sometimes even errors.
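
As a rough check of how strict later interpreters are, the sketch below asks the
compiler about a function that contains a star-import. It assumes a Python 3
interpreter, where this is rejected at compile time; the function ``f`` and the
choice of :mod:`math` are made up for illustration.

```python
# Source text of a function that star-imports inside its body.
source = (
    "def f():\n"
    "    from math import *\n"
    "    return pi\n"
)

try:
    compile(source, "<example>", "exec")
    rejected = False
except SyntaxError as exc:
    # Python 3 refuses the construct before any code runs.
    rejected = True
    print("rejected at compile time:", exc.msg)
```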


At Module Level
^^^^^^^^^^^^^^^

While it is valid to use ``from module import *`` at module level, it is usually
a bad idea. For one, this loses an important property Python otherwise has ---
you can know where each top-level name is defined by a simple "search" function
in your favourite editor. You also open yourself to trouble in the future, if
some module grows additional functions or classes.

One of the most awful questions asked on the newsgroup is why this code::

   f = open("www")
   f.read()

does not work. Of course, it works just fine (assuming you have a file called
"www"). But it does not work if somewhere in the module, the statement ``from os
import *`` is present. The :mod:`os` module has a function called :func:`open`
which returns an integer. While it is very useful, shadowing builtins is one of
its least useful properties.
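
The shadowing is easy to see for yourself. The sketch below runs the star-import
in a scratch dictionary (so nothing in the real module gets clobbered) and looks
at what ``open`` has become. It is written in Python 3 syntax, where ``exec`` is a
function; the name ``namespace`` is only illustrative.

```python
import os

namespace = {}
exec("from os import *", namespace)   # star-import into a scratch namespace

shadowed = namespace["open"]
print(shadowed is os.open)            # the low-level os.open, not the builtin

# os.open wants a flags argument and hands back an integer file descriptor.
fd = shadowed(os.devnull, os.O_RDONLY)
print(type(fd).__name__)
os.close(fd)
```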

Remember, you can never know for sure what names a module exports, so either
take what you need --- ``from module import name1, name2`` --- or keep them in
the module and access them on a per-need basis --- ``import module; print
module.name``.


When It Is Just Fine
^^^^^^^^^^^^^^^^^^^^

There are situations in which ``from module import *`` is just fine:

* The interactive prompt. For example, ``from math import *`` makes Python an
  amazing scientific calculator.

* When extending a module in C with a module in Python.

* When the module advertises itself as ``from import *`` safe.


Unadorned :keyword:`exec`, :func:`execfile` and friends
-------------------------------------------------------

The word "unadorned" refers to use without an explicit dictionary, in which
case those constructs evaluate code in the *current* environment. This is
dangerous for the same reasons ``from import *`` is dangerous --- it might step
over variables you are counting on and mess up things for the rest of your code.
Simply do not do that.

Bad examples::

   >>> for name in sys.argv[1:]:
   ...     exec "%s=1" % name
   >>> def func(s, **kw):
   ...     for var, val in kw.items():
   ...         exec "s.%s=val" % var  # invalid!
   >>> execfile("handler.py")
   >>> handle()

Good examples::

   >>> d = {}
   >>> for name in sys.argv[1:]:
   ...     d[name] = 1
   >>> def func(s, **kw):
   ...     for var, val in kw.items():
   ...         setattr(s, var, val)
   >>> d = {}
   >>> execfile("handler.py", d, d)
   >>> handle = d['handle']
   >>> handle()
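
The same idea in Python 3 syntax, where ``exec`` is a function and
:func:`execfile` no longer exists, might look like the sketch below. The string
``handler_source`` stands in for the contents of a handler file and is made up
for illustration.

```python
# Stand-in for the text of a handler file (hypothetical content).
handler_source = """
counter = 0

def handle():
    return "handled"
"""

counter = 42            # a name of our own that must survive

namespace = {}          # explicit dictionary: the code runs in here only
exec(handler_source, namespace)

handle = namespace["handle"]   # pull out just what we need
print(handle(), counter)
```

Because the executed code only ever sees ``namespace``, its own ``counter`` cannot
step on ours.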


from module import name1, name2
-------------------------------

This is a "don't" which is much weaker than the previous "don't"s but is still
something you should not do unless you have good reasons. The reason it is
usually a bad idea is that you suddenly have an object which lives in two
separate namespaces. When the binding in one namespace changes, the binding in
the other will not, so there will be a discrepancy between them. This happens
when, for example, one module is reloaded, or changes the definition of a
function at runtime.

Bad example::

   # foo.py
   a = 1

   # bar.py
   from foo import a
   if something():
       a = 2 # danger: foo.a != a

Good example::

   # foo.py
   a = 1

   # bar.py
   import foo
   if something():
       foo.a = 2
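
To watch the two bindings drift apart without creating files, the sketch below
builds a throwaway ``foo`` module in memory. Registering it in ``sys.modules`` is
only a trick to keep the example in one piece, not something the text above
recommends.

```python
import sys
import types

# Build a stand-in for foo.py in memory (hypothetical module, for illustration).
foo = types.ModuleType("foo")
foo.a = 1
sys.modules["foo"] = foo

from foo import a      # copies the current binding into this namespace

foo.a = 2              # rebind the name inside foo ...
print(a, foo.a)        # ... and the copy made here is left behind
```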


except:
-------

Python has the ``except:`` clause, which catches all exceptions. Since *every*
error in Python raises an exception, using ``except:`` can make many programming
errors look like runtime problems, which hinders the debugging process.

The following code shows a great example of why this is bad::

   try:
       foo = opne("file") # misspelled "open"
   except:
       sys.exit("could not open file!")

The second line triggers a :exc:`NameError`, which is caught by the except
clause. The program will exit, and you will have no idea that the failure has
nothing to do with the readability of ``"file"``.

The example above is better written as::

   try:
       foo = opne("file") # will be changed to "open" as soon as we run it
   except IOError:
       sys.exit("could not open file")

There are some situations in which the ``except:`` clause is useful: for
example, in a framework when running callbacks, it is good not to let any
callback disturb the framework.
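
A minimal sketch of such a callback runner follows. The names ``run_callbacks``,
``good`` and ``broken`` are invented for the example. Note one deliberate
substitution: it catches ``Exception`` rather than using a bare ``except:``, on
the assumption that the framework still wants :exc:`KeyboardInterrupt` and
:exc:`SystemExit` to get through.

```python
import traceback

def run_callbacks(callbacks, event):
    """Call every callback; collect failures instead of letting them escape."""
    failures = []
    for callback in callbacks:
        try:
            callback(event)
        except Exception:
            # Assumption: Exception rather than a bare "except:", so that
            # KeyboardInterrupt and SystemExit still stop the program.
            failures.append((callback, traceback.format_exc()))
    return failures

seen = []

def good(event):
    seen.append(event)

def broken(event):
    raise ValueError("bad event")

failures = run_callbacks([broken, good], "start")
print(seen, len(failures))
```

The broken callback is recorded together with its traceback, and the one after
it still runs.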


Exceptions
==========

Exceptions are a useful feature of Python. You should learn to raise them
whenever something unexpected occurs, and catch them only where you can do
something about them.

The following is a very popular anti-idiom::

   def get_status(file):
       if not os.path.exists(file):
           print "file not found"
           sys.exit(1)
       return open(file).readline()

Consider the case where the file gets deleted between the time the call to
:func:`os.path.exists` is made and the time :func:`open` is called. In that case
the last line will raise an :exc:`IOError`. The same would happen if *file*
exists but has no read permission. Since testing this on a normal machine with
existing and non-existing files makes it seem bugless, the test results will
look fine, and the code will get shipped. Then an unhandled :exc:`IOError`
escapes to the user, who has to watch the ugly traceback.

Here is a better way to do it::

   def get_status(file):
       try:
           return open(file).readline()
       except (IOError, OSError):
           print "file not found"
           sys.exit(1)

In this version, *either* the file gets opened and the line is read (so it
works even on flaky NFS or SMB connections), or the message is printed and the
application aborted.

Still, :func:`get_status` makes too many assumptions --- that it will only be
used in a short-running script, and not, say, in a long-running server. Sure,
the caller could do something like::

   try:
       status = get_status(log)
   except SystemExit:
       status = None

But it is better to use as few ``except`` clauses in your code as possible ---
those will usually be a catch-all in :func:`main`, or inside calls which should
always succeed.

So, the best version is probably::

   def get_status(file):
       return open(file).readline()

The caller can deal with the exception if it wants (for example, if it tries
several files in a loop), or just let the exception filter upwards to *its*
caller.

The last version is not very good either --- due to implementation details, the
file would not be closed when an exception is raised until the handler finishes,
and perhaps not at all in non-C implementations (e.g., Jython). A version that
closes the file explicitly is::

   def get_status(file):
       fp = open(file)
       try:
           return fp.readline()
       finally:
           fp.close()


Using the Batteries
===================

Every so often, people seem to rewrite things that are already in the Python
library, usually poorly. While the occasional module has a poor interface, it is
usually much better to use the rich standard library and data types that come
with Python than to invent your own.

A useful module very few people know about is :mod:`os.path`. It always has the
correct path arithmetic for your operating system, and will usually be much
better than whatever you come up with yourself.

Compare::

   # ugh!
   return dir+"/"+file
   # better
   return os.path.join(dir, file)

More useful functions in :mod:`os.path`: :func:`basename`, :func:`dirname` and
:func:`splitext`.
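
A quick tour of those three functions, using a made-up path, might look like
this. The output is left out on purpose, since the separator depends on the
platform the snippet runs on.

```python
import os.path

# Hypothetical path, built portably from its components.
path = os.path.join("logs", "2009", "app.tar.gz")

print(os.path.basename(path))   # final component of the path
print(os.path.dirname(path))    # everything before the final component
print(os.path.splitext(path))   # (root, extension); only the last extension splits off
```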

There are also many useful built-in functions people seem not to be aware of for
some reason: :func:`min` and :func:`max` can find the minimum/maximum of any
sequence with comparable semantics, for example, yet many people write their own
:func:`max`/:func:`min`. Another highly useful function is :func:`reduce`. A
classical use of :func:`reduce` is something like::

   import sys, operator
   nums = map(float, sys.argv[1:])
   print reduce(operator.add, nums)/len(nums)

This cute little script prints the average of all numbers given on the command
line. The :func:`reduce` adds up all the numbers, and the rest is just some
pre- and postprocessing.
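
For plain addition the built-in :func:`sum` does the same job, and it is arguably
an even better advertisement for the batteries. The sketch below compares the
two in Python 3 syntax, where :func:`reduce` has to be imported from
:mod:`functools`; the input list replaces ``sys.argv`` so the example runs on its
own.

```python
import operator
from functools import reduce   # reduce is not a builtin in Python 3

def average(numbers):
    # sum() replaces reduce(operator.add, ...) for the common case.
    return sum(numbers) / len(numbers)

nums = [float(s) for s in ["1", "2.5", "4"]]   # stand-in for sys.argv[1:]
print(average(nums), reduce(operator.add, nums) / len(nums))
```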

Along the same lines, note that :func:`float`, :func:`int` and :func:`long` all
accept arguments of type string, and so are suited to parsing --- assuming you
are ready to deal with the :exc:`ValueError` they raise.
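
Being "ready to deal with" the error can be as small as the sketch below, which
sorts input strings into parsed numbers and rejects. The helper name
``parse_numbers`` is invented for the example.

```python
def parse_numbers(words):
    """Split words into parsed floats and the strings that failed to parse."""
    numbers, rejected = [], []
    for word in words:
        try:
            numbers.append(float(word))
        except ValueError:
            # Only the parsing failure is caught; anything else propagates.
            rejected.append(word)
    return numbers, rejected

numbers, rejected = parse_numbers(["3", "4.5", "abc", ""])
print(numbers, rejected)
```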


Using Backslash to Continue Statements
======================================

Since Python treats a newline as a statement terminator, and since statements
are often longer than is comfortable to put on one line, many people do::

   if foo.bar()['first'][0] == baz.quux(1, 2)[5:9] and \
           calculate_number(10, 20) != forbulate(500, 360):
       pass

You should realize that this is dangerous: a stray space after the ``\`` would
make this line wrong, and stray spaces are notoriously hard to see in editors.
In this case, at least it would be a syntax error, but if the code was::

   value = foo.bar()['first'][0]*baz.quux(1, 2)[5:9] \
           + calculate_number(10, 20)*forbulate(500, 360)

then a mistake could be harder to spot: if the ``\`` were lost and the second
line left unindented, that line would still be a valid statement on its own, and
``value`` would just be subtly wrong.

It is usually much better to use the implicit continuation inside parentheses.
This version is bulletproof::

   value = (foo.bar()['first'][0]*baz.quux(1, 2)[5:9]
           + calculate_number(10, 20)*forbulate(500, 360))
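
The difference can be checked without an editor by handing small snippets to
:func:`compile`. The sketch below is in Python 3 syntax; the helper ``compiles``
and the toy expressions are made up for illustration.

```python
def compiles(source):
    """Return True if source is syntactically valid Python."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True

# A space after the backslash: the compiler rejects the line outright.
with_stray_space = "value = 1 + \\ \n    2\n"

# The backslash got lost and the second line is not indented: both lines
# are valid statements, so value silently ends up as 1 instead of 3.
lost_backslash = "value = 1\n+ 2\n"

# Parentheses: trailing spaces at the end of a line are harmless.
parenthesized = "value = (1 + \n    2)\n"

namespace = {}
exec(parenthesized, namespace)
print(compiles(with_stray_space), compiles(lost_backslash), namespace["value"])
```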