************************************
  Idioms and Anti-Idioms in Python
************************************

:Author: Moshe Zadka

This document is placed in the public domain.


.. topic:: Abstract

   This document can be considered a companion to the tutorial. It shows how to use
   Python, and even more importantly, how *not* to use Python.


Language Constructs You Should Not Use
======================================

While Python has relatively few gotchas compared to other languages, it still
has some constructs which are only useful in corner cases, or are plain
dangerous.


from module import \*
---------------------


Inside Function Definitions
^^^^^^^^^^^^^^^^^^^^^^^^^^^

``from module import *`` is *invalid* inside function definitions. While many
versions of Python do not check for this, that does not make it valid, any more
than having a smart lawyer makes a man innocent. Never use it that way. Even in
versions where it was accepted, it made function execution slower, because the
compiler could not be certain which names were local and which were global. In
Python 2.1 this construct causes warnings, and sometimes even errors.


At Module Level
^^^^^^^^^^^^^^^

While it is valid to use ``from module import *`` at module level, it is usually
a bad idea. For one, this loses an important property Python otherwise has:
you can find where each top-level name is defined with a simple "search"
function in your favourite editor. You also open yourself to trouble in the
future, if some module grows additional functions or classes.

One of the most awful questions asked on the newsgroup is why this code::

   f = open("www")
   f.read()

does not work. Of course, it works just fine (assuming you have a file called
"www"). But it does not work if somewhere in the module, the statement ``from
os import *`` is present. The :mod:`os` module has a function called
:func:`open` which returns an integer. While it is very useful, shadowing a
builtin is one of its least useful properties.

Remember, you can never know for sure what names a module exports, so either
take what you need (``from module import name1, name2``) or keep the names in
the module and access them as needed (``import module; print module.name``).

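The shadowing described above can be reproduced without polluting a real
module. The following is a minimal sketch rather than part of the original
text: it runs the star import inside a scratch dictionary (the name
``scratch`` is made up for the illustration) and uses the call form of
``exec``, so it should behave the same under Python 2.6+ and Python 3.

```python
import os

# Run the star import in a throwaway namespace instead of this module's globals.
scratch = {}
exec("from os import *", scratch)

# After the import, the name "open" refers to os.open, not the builtin.
shadowed_open = scratch["open"]
open_is_shadowed = shadowed_open is os.open

# A module that did not do the star import still sees the builtin.
builtin_is_different = open is not os.open
```

Any later ``open("www")`` in a module that did the star import would therefore
call ``os.open``, which expects flags and returns a file descriptor.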

When It Is Just Fine
^^^^^^^^^^^^^^^^^^^^

There are situations in which ``from module import *`` is just fine:

* The interactive prompt. For example, ``from math import *`` makes Python an
  amazing scientific calculator.

* When extending a module in C with a module in Python.

* When the module advertises itself as ``from import *`` safe.

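The text does not say how a module advertises itself as star-import safe. One
common mechanism, assumed here, is for the module to define ``__all__``, which
limits what ``from module import *`` copies. The sketch below builds a
stand-in module in memory; the module name ``shapes_demo`` and the functions
in it are hypothetical.

```python
import sys
import types

# Source for a small stand-in module; all names in it are made up.
module_source = (
    "__all__ = ['area']\n"
    "def area(w, h):\n"
    "    return w * h\n"
    "def debug_dump():\n"
    "    return 'not exported'\n"
)

demo = types.ModuleType("shapes_demo")
exec(module_source, demo.__dict__)
sys.modules["shapes_demo"] = demo   # make it importable by name

# Star-import into a scratch namespace and see which names arrive.
scratch = {}
exec("from shapes_demo import *", scratch)
imported = sorted(name for name in scratch if not name.startswith("__"))
```

Only ``area`` ends up in ``imported``; ``debug_dump`` stays reachable as
``shapes_demo.debug_dump`` for code that imports the module normally.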

Unadorned :keyword:`exec`, :func:`execfile` and friends
-------------------------------------------------------

The word "unadorned" refers to the use without an explicit dictionary, in which
case those constructs evaluate code in the *current* environment. This is
dangerous for the same reasons ``from import *`` is dangerous: it might
overwrite variables you are counting on and break the rest of your code.
Simply do not do that.

Bad examples::

   >>> for name in sys.argv[1:]:
   >>>     exec "%s=1" % name
   >>> def func(s, **kw):
   >>>     for var, val in kw.items():
   >>>         exec "s.%s=val" % var  # invalid!
   >>> execfile("handler.py")
   >>> handle()

Good examples::

   >>> d = {}
   >>> for name in sys.argv[1:]:
   >>>     d[name] = 1
   >>> def func(s, **kw):
   >>>     for var, val in kw.items():
   >>>         setattr(s, var, val)
   >>> d = {}
   >>> execfile("handler.py", d, d)
   >>> handle = d['handle']
   >>> handle()

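The effect of the explicit dictionary can be seen in a self-contained sketch.
It is only an illustration under stated assumptions: the handler source is an
inline string (hypothetical content standing in for ``handler.py``), and
``exec`` is used in its call form instead of ``execfile`` so that the snippet
should run under Python 2.6+ and Python 3.

```python
# Hypothetical handler source; in the examples above it would live in handler.py.
handler_source = (
    "counter = 99\n"
    "def handle():\n"
    "    return 'handled'\n"
)

counter = 1        # a name the surrounding code relies on

namespace = {}     # explicit dictionary: the code runs here, not in our globals
exec(handler_source, namespace)

handle = namespace["handle"]   # pull out only what is needed
result = handle()
```

The handler's own ``counter`` ends up in ``namespace``, so the ``counter``
defined by the surrounding code keeps its value.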

from module import name1, name2
-------------------------------

This is a "don't" which is much weaker than the previous "don't"s but is still
something you should not do without a good reason. It is usually a bad idea
because you suddenly have an object which lives in two separate namespaces.
When the binding in one namespace changes, the binding in the other will not,
so there will be a discrepancy between them. This happens when, for example,
one module is reloaded, or changes the definition of a function at runtime.

Bad example::

   # foo.py
   a = 1

   # bar.py
   from foo import a
   if something():
       a = 2  # danger: foo.a != a

Good example::

   # foo.py
   a = 1

   # bar.py
   import foo
   if something():
       foo.a = 2

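The discrepancy between the two namespaces also shows up in the opposite
direction, when the module rebinds the name after it has been imported. A
minimal sketch, using an in-memory stand-in for ``foo.py`` (the module name
``foo_demo`` is hypothetical):

```python
import sys
import types

# Stand-in for foo.py, registered so that it can be imported by name.
foo_demo = types.ModuleType("foo_demo")
foo_demo.a = 1
sys.modules["foo_demo"] = foo_demo

from foo_demo import a      # copies the current binding into this namespace

foo_demo.a = 2              # the module rebinds its name ...
stale_copy = a              # ... but the imported name still holds the old value
live_value = foo_demo.a     # attribute access sees the current binding
```

Here ``stale_copy`` stays at the old value while ``live_value`` follows the
module, which is the argument for ``import foo`` plus ``foo.a``.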

except:
-------

Python has the ``except:`` clause, which catches all exceptions. Since *every*
error in Python raises an exception, using ``except:`` can make many
programming errors look like runtime problems, which hinders the debugging
process.

The following code shows a great example of why this is bad::

   try:
       foo = opne("file")  # misspelled "open"
   except:
       sys.exit("could not open file!")

The second line triggers a :exc:`NameError`, which is caught by the except
clause. The program will exit, and the error message the program prints will
make you think the problem is the readability of ``"file"`` when in fact
the real error has nothing to do with ``"file"``.

A better way to write the above is ::

   try:
       foo = opne("file")
   except IOError:
       sys.exit("could not open file")

When this is run, Python will produce a traceback showing the :exc:`NameError`,
and it will be immediately apparent what needs to be fixed.

.. index:: bare except, except; bare

Because ``except:`` catches *all* exceptions, including :exc:`SystemExit`,
:exc:`KeyboardInterrupt`, and :exc:`GeneratorExit` (which is not an error and
should not normally be caught by user code), using a bare ``except:`` is almost
never a good idea. In situations where you need to catch all "normal" errors,
such as in a framework that runs callbacks, you can catch the base class for
all normal exceptions, :exc:`Exception`. Unfortunately, in Python 2.x it is
possible for third-party code to raise exceptions that do not inherit from
:exc:`Exception`, so in Python 2.x there are some cases where you may have to
use a bare ``except:`` and manually re-raise the exceptions you don't want
to catch.

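The callback case mentioned above might look like the following sketch. The
function ``run_callbacks`` and the three sample callbacks are hypothetical
names introduced for the illustration, and the code assumes Python 2.6 or
later, where :exc:`SystemExit`, :exc:`KeyboardInterrupt` and
:exc:`GeneratorExit` do not inherit from :exc:`Exception`.

```python
def run_callbacks(callbacks):
    """Call each callback, collecting ordinary errors instead of stopping."""
    failures = []
    for callback in callbacks:
        try:
            callback()
        except Exception as err:
            # Ordinary errors are recorded. SystemExit, KeyboardInterrupt and
            # GeneratorExit are not subclasses of Exception, so they propagate.
            failures.append((callback.__name__, err))
    return failures


def works():
    return "ok"


def broken():
    raise ValueError("bad input")


def wants_to_quit():
    raise SystemExit(2)


failures = run_callbacks([works, broken])

try:
    run_callbacks([wants_to_quit])
    exit_propagated = False
except SystemExit:
    exit_propagated = True
```

The failing callback is reported without hiding the request to exit, which a
bare ``except:`` in ``run_callbacks`` would have swallowed.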

Exceptions
==========

Exceptions are a useful feature of Python. You should learn to raise them
whenever something unexpected occurs, and catch them only where you can do
something about them.

The following is a very popular anti-idiom ::

   def get_status(file):
       if not os.path.exists(file):
           print "file not found"
           sys.exit(1)
       return open(file).readline()

Consider the case where the file gets deleted between the time the call to
:func:`os.path.exists` is made and the time :func:`open` is called. In that
case the last line will raise an :exc:`IOError`. The same thing would happen
if *file* exists but has no read permission. Because tests on a normal machine
with existing and non-existent files both pass, the code seems bug-free and
gets shipped. Later an unhandled :exc:`IOError` (or perhaps some other
:exc:`EnvironmentError`) escapes to the user, who gets to watch the ugly
traceback.

Here is a somewhat better way to do it. ::

   def get_status(file):
       try:
           return open(file).readline()
       except EnvironmentError as err:
           print "Unable to open file: {}".format(err)
           sys.exit(1)

In this version, *either* the file gets opened and the line is read (so it
works even on flaky NFS or SMB connections), or an error message is printed
that provides all the available information on why the open failed, and the
application is aborted.

However, even this version of :func:`get_status` makes too many assumptions:
that it will only be used in a short-running script, and not, say, in a
long-running server. Sure, the caller could do something like ::

   try:
       status = get_status(log)
   except SystemExit:
       status = None

But there is a better way. You should try to use as few ``except`` clauses in
your code as you can; the ones you do use will usually be inside calls which
should always succeed, or a catch-all in a main function.

So, an even better version of :func:`get_status` is probably ::

   def get_status(file):
       return open(file).readline()

The caller can deal with the exception if it wants (for example, if it tries
several files in a loop), or just let the exception filter upwards to *its*
caller.

But the last version still has a serious problem: due to implementation
details in CPython, the file would not be closed when an exception is raised
until the exception handler finishes; and, worse, in other implementations
(e.g., Jython) it might not be closed at all, regardless of whether an
exception is raised.

The best version of this function uses the ``open()`` call as a context
manager, which will ensure that the file gets closed as soon as the
function returns::

   def get_status(file):
       with open(file) as fp:
           return fp.readline()

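The earlier remark about a caller that tries several files in a loop can be
sketched as follows. This is an illustration only: ``first_status`` and the
file names are hypothetical, and the temporary directory exists just to make
the example self-contained.

```python
import os
import shutil
import tempfile


def get_status(file):
    with open(file) as fp:
        return fp.readline()


def first_status(candidates):
    """Return the first line of the first readable file, or None."""
    for path in candidates:
        try:
            return get_status(path)
        except EnvironmentError:
            continue   # this caller knows what to do: try the next file
    return None


# Usage with one missing file and one real file.
workdir = tempfile.mkdtemp()
missing = os.path.join(workdir, "missing.log")
present = os.path.join(workdir, "present.log")
with open(present, "w") as fp:
    fp.write("status: ok\nsecond line\n")

status = first_status([missing, present])
nothing = first_status([missing])
shutil.rmtree(workdir)
```

Because ``get_status`` neither prints nor exits, the decision about what a
missing file means stays with the caller.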

Using the Batteries
===================

Every so often, people seem to rewrite things that are already in the Python
library, usually poorly. While the occasional module has a poor interface, it
is usually much better to use the rich standard library and data types that
come with Python than to invent your own.

A useful module very few people know about is :mod:`os.path`. It always has the
correct path arithmetic for your operating system, and will usually be much
better than whatever you come up with yourself.

Compare::

   # ugh!
   return dir+"/"+file
   # better
   return os.path.join(dir, file)

More useful functions in :mod:`os.path`: :func:`basename`, :func:`dirname` and
:func:`splitext`.

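A short sketch of these functions in use. It calls :mod:`posixpath`, the
POSIX implementation behind :mod:`os.path`, only so that the results are the
same on every platform; ordinary code would call ``os.path`` directly. The
path components are made up.

```python
import posixpath   # POSIX flavour of os.path, used here for predictable results

path = posixpath.join("logs", "2010", "server.log")   # "logs/2010/server.log"
directory = posixpath.dirname(path)                   # "logs/2010"
filename = posixpath.basename(path)                   # "server.log"
stem, extension = posixpath.splitext(filename)        # "server", ".log"
```
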
There are also many useful built-in functions people seem not to be aware of
for some reason: :func:`min` and :func:`max` can find the minimum/maximum of
any sequence with comparable semantics, for example, yet many people write
their own :func:`max`/:func:`min`. Another highly useful function is
:func:`reduce`, which can be used to repeatedly apply a binary operation to a
sequence, reducing it to a single value. For example, compute a factorial
with a series of multiply operations::

   >>> n = 4
   >>> import operator
   >>> reduce(operator.mul, range(1, n+1))
   24

When it comes to parsing numbers, note that :func:`float`, :func:`int` and
:func:`long` all accept string arguments and will reject ill-formed strings
by raising a :exc:`ValueError`.

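Taken together, these builtins replace a fair amount of hand-written code.
The sketch below is one possible illustration: ``parse_numbers`` and the
sample input are hypothetical, and ``reduce`` is imported from
:mod:`functools` (available since Python 2.6) so that the snippet should also
run where it is no longer a builtin.

```python
from functools import reduce
import operator


def parse_numbers(fields):
    """Convert strings to floats, separating out the ones that do not parse."""
    numbers, rejected = [], []
    for field in fields:
        try:
            numbers.append(float(field))
        except ValueError:
            rejected.append(field)   # float() refused the ill-formed string
    return numbers, rejected


numbers, rejected = parse_numbers(["3.5", "12", "abc", "-1e2", ""])
smallest = min(numbers)
largest = max(numbers)
product = reduce(operator.mul, [1, 2, 3, 4])
```

No hand-written validation of the number syntax is needed; the conversion
function is the validator.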

Using Backslash to Continue Statements
======================================

Since Python treats a newline as a statement terminator, and since statements
are often more than is comfortable to put in one line, many people do::

   if foo.bar()['first'][0] == baz.quux(1, 2)[5:9] and \
      calculate_number(10, 20) != forbulate(500, 360):
         pass

You should realize that this is dangerous: a stray space after the ``\`` would
make this line wrong, and stray spaces are notoriously hard to see in editors.
A stray space is at least reported as a syntax error. Losing the backslash
altogether can be worse; if the code was::

   value = foo.bar()['first'][0]*baz.quux(1, 2)[5:9] \
           + calculate_number(10, 20)*forbulate(500, 360)

and the backslash went missing while the second line ended up unindented, both
lines would still be valid statements and ``value`` would just be subtly wrong.

It is usually much better to use the implicit continuation inside parentheses.
This version is bulletproof::

   value = (foo.bar()['first'][0]*baz.quux(1, 2)[5:9]
           + calculate_number(10, 20)*forbulate(500, 360))

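The sensitivity to stray spaces can be checked with :func:`compile`. The
following is a small sketch with simplified expressions; the helper
``compiles`` and the sample source strings are made up for the illustration.

```python
def compiles(source):
    """Report whether the given source text is accepted by the compiler."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# The first string has a space between the backslash and the newline.
backslash_with_space = "value = 1 \\ \n        + 2\n"
backslash_clean = "value = 1 \\\n        + 2\n"
# The parenthesized form has a trailing space after "1" as well.
parens_with_space = "value = (1 \n         + 2)\n"

results = {
    "backslash_with_space": compiles(backslash_with_space),
    "backslash_clean": compiles(backslash_clean),
    "parens_with_space": compiles(parens_with_space),
}

# The parenthesized form still evaluates to the intended sum.
namespace = {}
exec(parens_with_space, namespace)
value = namespace["value"]
```

Only the backslash form is sensitive to the invisible trailing space; the
parenthesized form compiles and produces the intended value either way.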