************************************
  Idioms and Anti-Idioms in Python
************************************

:Author: Moshe Zadka

This document is placed in the public domain.


.. topic:: Abstract

   This document can be considered a companion to the tutorial. It shows how to use
   Python, and even more importantly, how *not* to use Python.


Language Constructs You Should Not Use
======================================

While Python has relatively few gotchas compared to other languages, it still
has some constructs which are only useful in corner cases, or are plain
dangerous.


from module import \*
---------------------


Inside Function Definitions
^^^^^^^^^^^^^^^^^^^^^^^^^^^

``from module import *`` is *invalid* inside function definitions. While many
older versions of Python did not check for the invalidity, that does not make it
any more valid, just as having a smart lawyer does not make a man innocent.
Never use it that way. Even in versions where it was accepted, it made the
function execute more slowly, because the compiler could not be certain which
names were local and which were global. In Python 2.1 this construct causes
warnings, and sometimes even errors; in Python 3 it is always a
:exc:`SyntaxError`.


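The restriction can be checked safely by handing source text to the built-in
:func:`compile` instead of executing it. This is a minimal sketch, assuming a
Python 3 interpreter; ``is_valid_source`` is a helper name made up for this
illustration::

   def is_valid_source(source):
       """Return True if *source* compiles as a module, False on SyntaxError."""
       try:
           compile(source, "<example>", "exec")
       except SyntaxError:
           return False
       return True

   # The same statement, once inside a function body and once at module level.
   star_in_function = "def f():\n    from math import *\n"
   star_in_module = "from math import *\n"

   print(is_valid_source(star_in_function))  # False on Python 3
   print(is_valid_source(star_in_module))    # True

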
At Module Level
^^^^^^^^^^^^^^^

While it is valid to use ``from module import *`` at module level it is usually
a bad idea. For one, this loses an important property Python otherwise has ---
you can know where each toplevel name is defined by a simple "search" function
in your favourite editor. You also open yourself to trouble in the future, if
some module grows additional functions or classes.

One of the most awful questions asked on the newsgroup is why this code::

   f = open("www")
   f.read()

does not work. Of course, it works just fine (assuming you have a file called
"www".) But it does not work if somewhere in the module, the statement ``from
os import *`` is present. The :mod:`os` module has a function called
:func:`open` which returns an integer. While it is very useful, shadowing a
builtin is one of its least useful properties.

Remember, you can never know for sure what names a module exports, so either
take what you need --- ``from module import name1, name2``, or keep them in the
module and access on a per-need basis --- ``import module; print(module.name)``.


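The shadowing described above is easy to reproduce without polluting your own
module. The sketch below runs the star-import inside a scratch dictionary
(``scratch`` is just an illustrative name) and then checks what ``open``
refers to there::

   import builtins
   import os

   scratch = {}
   # Executing the statement with "scratch" as its globals has the same
   # effect on that dictionary as a module-level star-import has on a module.
   exec("from os import *", scratch)

   print(scratch["open"] is os.open)        # True: the name now means os.open
   print(scratch["open"] is builtins.open)  # False: the builtin is hidden

After the import, a call such as ``open("www")`` in that namespace reaches
:func:`os.open`, which expects a flags argument and returns a file descriptor
rather than a file object.

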
When It Is Just Fine
^^^^^^^^^^^^^^^^^^^^

There are situations in which ``from module import *`` is just fine:

* The interactive prompt. For example, ``from math import *`` makes Python an
  amazing scientific calculator.

* When extending a module in C with a module in Python.

* When the module advertises itself as ``from import *`` safe.


from module import name1, name2
-------------------------------

This is a "don't" which is much weaker than the previous "don't"s but is still
something you should not do unless you have a good reason. It is usually a bad
idea because you suddenly have an object which lives in two separate
namespaces. When the binding in one namespace changes, the binding in the other
will not, so there will be a discrepancy between them. This happens when, for
example, one module is reloaded, or changes the definition of a function at
runtime.

Bad example::

   # foo.py
   a = 1

   # bar.py
   from foo import a
   if something():
       a = 2 # danger: foo.a != a

Good example::

   # foo.py
   a = 1

   # bar.py
   import foo
   if something():
       foo.a = 2


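The discrepancy can be observed directly. This sketch builds a throwaway
module object in memory so that it runs as a single file; in real code ``foo``
would be an ordinary ``foo.py``, and registering it in :data:`sys.modules` by
hand is only a device for the demonstration::

   import sys
   import types

   # Stand-in for a file foo.py containing "a = 1".
   foo = types.ModuleType("foo")
   foo.a = 1
   sys.modules["foo"] = foo

   from foo import a   # binds a second name, "a", to the same object

   foo.a = 2           # rebinding inside foo leaves our "a" untouched

   print(a)      # 1
   print(foo.a)  # 2

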
except:
-------

Python has the ``except:`` clause, which catches all exceptions. Since *every*
error in Python raises an exception, using ``except:`` can make many
programming errors look like runtime problems, which hinders the debugging
process.

The following code shows a great example of why this is bad::

   try:
       foo = opne("file") # misspelled "open"
   except:
       sys.exit("could not open file!")

The second line triggers a :exc:`NameError`, which is caught by the except
clause. The program will exit, and the error message the program prints will
make you think the problem is the readability of ``"file"`` when in fact
the real error has nothing to do with ``"file"``.

A better way to write the above is ::

   try:
       foo = opne("file")
   except IOError:
       sys.exit("could not open file")

When this is run, Python will produce a traceback showing the :exc:`NameError`,
and it will be immediately apparent what needs to be fixed.

.. index:: bare except, except; bare

Because ``except:`` catches *all* exceptions, including :exc:`SystemExit`,
:exc:`KeyboardInterrupt`, and :exc:`GeneratorExit` (which is not an error and
should not normally be caught by user code), using a bare ``except:`` is almost
never a good idea. In situations where you need to catch all "normal" errors,
such as in a framework that runs callbacks, you can catch the base class for
all normal exceptions, :exc:`Exception`.


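For the callback case, a sketch might look like the following.
``run_callbacks`` and the two sample callbacks are invented for this example
and do not come from any particular framework::

   def run_callbacks(callbacks):
       """Call each callback; collect ordinary errors instead of stopping."""
       failures = []
       for callback in callbacks:
           try:
               callback()
           except Exception as exc:   # deliberately not a bare "except:"
               failures.append((callback.__name__, exc))
       return failures

   def ok():
       pass

   def broken():
       raise ValueError("bad input")

   failures = run_callbacks([ok, broken])
   print([name for name, _ in failures])  # ['broken']

Because :exc:`KeyboardInterrupt` and :exc:`SystemExit` do not derive from
:exc:`Exception`, a callback that raises either of them still stops the loop,
which is usually what you want.

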
Exceptions
==========

Exceptions are a useful feature of Python. You should learn to raise them
whenever something unexpected occurs, and catch them only where you can do
something about them.

The following is a very popular anti-idiom ::

   import os
   import sys

   def get_status(file):
       if not os.path.exists(file):
           print("file not found")
           sys.exit(1)
       return open(file).readline()

Consider the case where the file gets deleted between the time the call to
:func:`os.path.exists` is made and the time :func:`open` is called. In that
case the last line will raise an :exc:`IOError`. The same thing would happen
if *file* exists but has no read permission. Since testing on a normal machine
with existing and non-existent files reveals no bug, the test results will look
fine, and the code will get shipped. Later an unhandled :exc:`IOError` (or
perhaps some other :exc:`EnvironmentError`) escapes to the user, who gets to
watch the ugly traceback.

Here is a somewhat better way to do it. ::

   import sys

   def get_status(file):
       try:
           return open(file).readline()
       except EnvironmentError as err:
           print("Unable to open file: {}".format(err))
           sys.exit(1)

In this version, *either* the file gets opened and the line is read (so it
works even on flaky NFS or SMB connections), or an error message is printed
that provides all the available information on why the open failed, and the
application is aborted.

However, even this version of :func:`get_status` makes too many assumptions ---
that it will only be used in a short-running script, and not, say, in a
long-running server. Sure, the caller could do something like ::

   try:
       status = get_status(log)
   except SystemExit:
       status = None

But there is a better way. You should try to use as few ``except`` clauses in
your code as you can --- the ones you do use will usually be inside calls which
should always succeed, or a catch-all in a main function.

So, an even better version of :func:`get_status` is probably ::

   def get_status(file):
       return open(file).readline()

The caller can deal with the exception if it wants (for example, if it tries
several files in a loop), or just let the exception filter upwards to *its*
caller.

But the last version still has a serious problem --- due to implementation
details in CPython, the file would not be closed when an exception is raised
until the exception handler finishes; and, worse, in other implementations
(e.g., Jython) it might not be closed at all regardless of whether or not
an exception is raised.

The best version of this function uses the ``open()`` call as a context
manager, which will ensure that the file gets closed as soon as the
function returns::

   def get_status(file):
       with open(file) as fp:
           return fp.readline()


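As an illustration of a caller that does want to handle the error, here is a
sketch of the "several files in a loop" case mentioned above. ``first_status``
and the file names are hypothetical. The sketch catches :exc:`OSError`, on the
assumption of Python 3.3 or later, where :exc:`IOError` and
:exc:`EnvironmentError` are aliases of it::

   def get_status(file):
       with open(file) as fp:
           return fp.readline()

   def first_status(candidates):
       """Return the first line of the first readable file, or None."""
       for path in candidates:
           try:
               return get_status(path)
           except OSError:
               continue   # this one failed; try the next candidate
       return None

   # Prints None, unless files with these names happen to exist.
   print(first_status(["no-such-status-1.txt", "no-such-status-2.txt"]))

Because :func:`get_status` itself neither prints nor exits, the same function
serves both this caller and one that simply lets the exception propagate.

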
Using the Batteries
===================

Every so often, people rewrite things that are already in the Python library,
usually poorly. While the occasional module has a poor interface, it is usually
much better to use the rich standard library and data types that come with
Python than to invent your own.

A useful module very few people know about is :mod:`os.path`. It always has the
correct path arithmetic for your operating system, and will usually be much
better than whatever you come up with yourself.

Compare::

   # ugh!
   return dir+"/"+file
   # better
   return os.path.join(dir, file)

More useful functions in :mod:`os.path`: :func:`basename`, :func:`dirname` and
:func:`splitext`.

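A short sketch of these functions working together; the path components are
made up for the example, and the separator in the joined path depends on the
operating system, which is the point of using :mod:`os.path`::

   import os.path

   path = os.path.join("logs", "2010", "server.log")
   name = os.path.basename(path)

   print(name)                    # server.log
   print(os.path.splitext(name))  # ('server', '.log')

   # dirname() undoes the last join(), whatever the separator is.
   print(os.path.dirname(path) == os.path.join("logs", "2010"))  # True
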
There are also many useful built-in functions people seem not to be aware of
for some reason: :func:`min` and :func:`max` can find the minimum/maximum of
any sequence with comparable semantics, for example, yet many people write
their own :func:`max`/:func:`min`. Another highly useful function is
:func:`functools.reduce` which can be used to repeatedly apply a binary
operation to a sequence, reducing it to a single value. For example, compute
a factorial with a series of multiply operations::

   >>> n = 4
   >>> import operator, functools
   >>> functools.reduce(operator.mul, range(1, n+1))
   24

When it comes to parsing numbers, note that :func:`float` and :func:`int` both
accept string arguments and will reject ill-formed strings by raising a
:exc:`ValueError`.


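In practice this means a hand-written digit check is rarely needed: attempt
the conversion and handle the exception. A minimal sketch, in which
``parse_int`` and its ``default`` parameter are names chosen for this example::

   def parse_int(text, default=None):
       """Return int(text), or *default* if text is not a well-formed integer."""
       try:
           return int(text)
       except ValueError:
           return default

   print(parse_int("42"))     # 42
   print(parse_int(" 42\n"))  # 42, surrounding whitespace is accepted
   print(parse_int("4x2"))    # None
   print(float("1e3"))        # 1000.0

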
Using Backslash to Continue Statements
======================================

Since Python treats a newline as a statement terminator, and since statements
are often longer than fits comfortably on one line, many people do::

   if foo.bar()['first'][0] == baz.quux(1, 2)[5:9] and \
      calculate_number(10, 20) != forbulate(500, 360):
         pass

You should realize that this is dangerous: a stray space after the ``\`` would
make this line wrong, and stray spaces are notoriously hard to see in editors.
In this case, at least it would be a syntax error, but if the code was::

   value = foo.bar()['first'][0]*baz.quux(1, 2)[5:9] \
           + calculate_number(10, 20)*forbulate(500, 360)

then it would just be subtly wrong.

It is usually much better to use the implicit continuation inside parentheses.

This version is bulletproof::

   value = (foo.bar()['first'][0]*baz.quux(1, 2)[5:9]
           + calculate_number(10, 20)*forbulate(500, 360))
