************************************
  Idioms and Anti-Idioms in Python
************************************

:Author: Moshe Zadka

This document is placed in the public domain.


.. topic:: Abstract

   This document can be considered a companion to the tutorial. It shows how to use
   Python, and even more importantly, how *not* to use Python.


Language Constructs You Should Not Use
======================================

While Python has relatively few gotchas compared to other languages, it still
has some constructs which are only useful in corner cases, or are plain
dangerous.


from module import \*
---------------------


Inside Function Definitions
^^^^^^^^^^^^^^^^^^^^^^^^^^^

``from module import *`` is *invalid* inside function definitions. While many
older versions of Python did not check for the invalidity, that did not make it
more valid, no more than having a smart lawyer makes a man innocent. Do not use
it like that ever. Even in versions where it was accepted, it made the function
execution slower, because the compiler could not be certain which names are
local and which are global. In Python 2.1 this construct causes warnings, and
sometimes even errors. Python 3 rejects it outright with a :exc:`SyntaxError`.

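A quick way to see how a given interpreter treats this construct is to compile
a small piece of source text. The sketch below assumes Python 3; the helper name
``compiles_ok`` is made up for illustration.

```python
def compiles_ok(source):
    """Return True if the source text compiles, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# The same star-import, once inside a function and once at module level.
inside_function = "def f():\n    from math import *\n    return pi\n"
at_module_level = "from math import *\n"

print(compiles_ok(inside_function))
print(compiles_ok(at_module_level))
```

Only the module-level form compiles; the function-level form is refused before
any code runs.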

At Module Level
^^^^^^^^^^^^^^^

While it is valid to use ``from module import *`` at module level, it is
usually a bad idea. For one, this loses an important property Python otherwise
has --- you can know where each toplevel name is defined by a simple "search"
function in your favourite editor. You also open yourself to trouble in the
future, if some module grows additional functions or classes.

One of the most awful questions asked on the newsgroup is why this code::

   f = open("www")
   f.read()

does not work. Of course, it works just fine (assuming you have a file called
"www"). But it does not work if somewhere in the module, the statement ``from os
import *`` is present. The :mod:`os` module has a function called :func:`open`
which returns an integer. While it is very useful, shadowing builtins is one of
its least useful properties.

Remember, you can never know for sure what names a module exports, so either
take what you need --- ``from module import name1, name2``, or keep them in the
module and access on a per-need basis --- ``import module; print(module.name)``.

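The shadowing can be observed without polluting a real module by running the
star-import inside a throwaway namespace. This is only a sketch; the variable
names ``namespace`` and ``shadowed`` are illustrative.

```python
import builtins
import os

namespace = {}
exec("from os import *", namespace)   # star-import into a scratch namespace

shadowed = namespace["open"]
print(shadowed is os.open)        # is the name now bound to os.open?
print(shadowed is builtins.open)  # is it still the builtin open?
```

After the star-import, the plain name ``open`` refers to the low-level
``os.open``, which returns a file descriptor rather than a file object, so a
later ``f.read()`` fails.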

When It Is Just Fine
^^^^^^^^^^^^^^^^^^^^

There are situations in which ``from module import *`` is just fine:

* The interactive prompt. For example, ``from math import *`` makes Python an
  amazing scientific calculator.

* When extending a module in C with a module in Python.

* When the module advertises itself as ``from import *`` safe.

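One common way for a module to limit what a star-import exposes is to define
``__all__``. The text above does not name a mechanism, so treat this as one
possible reading; the module name ``shapes_demo`` and its functions are invented
for the example.

```python
import sys
import types

# Build a throwaway module called "shapes_demo" (a made-up name).
mod = types.ModuleType("shapes_demo")
exec(
    "__all__ = ['area']\n"
    "def area(w, h):\n"
    "    return w * h\n"
    "def debug_dump():\n"
    "    return 'internal'\n",
    mod.__dict__,
)
sys.modules["shapes_demo"] = mod

# Star-import into a scratch namespace and list what arrived.
namespace = {}
exec("from shapes_demo import *", namespace)
imported = sorted(name for name in namespace if name != "__builtins__")
print(imported)
```

Only the name listed in ``__all__`` is imported; ``debug_dump`` stays behind
even though it is a public-looking function.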

from module import name1, name2
-------------------------------

This is a "don't" which is much weaker than the previous "don't"s but is still
something you should not do if you don't have good reasons to do that. The
reason it is usually a bad idea is that you suddenly have an object which lives
in two separate namespaces. When the binding in one namespace changes, the
binding in the other will not, so there will be a discrepancy between them. This
happens when, for example, one module is reloaded, or changes the definition of
a function at runtime.

Bad example::

   # foo.py
   a = 1

   # bar.py
   from foo import a
   if something():
       a = 2 # danger: foo.a != a

Good example::

   # foo.py
   a = 1

   # bar.py
   import foo
   if something():
       foo.a = 2

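The discrepancy is easy to reproduce in a single file by registering a stand-in
module by hand. In this sketch ``foo_demo`` is a made-up module name playing the
role of ``foo.py``.

```python
import sys
import types

# Stand-in for foo.py; "foo_demo" is a hypothetical module name.
foo_demo = types.ModuleType("foo_demo")
foo_demo.a = 1
sys.modules["foo_demo"] = foo_demo

from foo_demo import a   # copies the current binding into this namespace

foo_demo.a = 2           # rebinding the name inside the module...
print(a, foo_demo.a)     # ...leaves the imported name unchanged
```

The imported ``a`` keeps its old value while ``foo_demo.a`` has moved on, which
is exactly the two-namespace drift described above.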

except:
-------

Python has the ``except:`` clause, which catches all exceptions. Since *every*
error in Python raises an exception, this makes many programming errors look
like runtime problems, and hinders the debugging process.

The following code shows a great example of the problem::

   try:
       foo = opne("file") # misspelled "open"
   except:
       sys.exit("could not open file!")

The second line triggers a :exc:`NameError` which is caught by the except
clause. The program will exit, and you will have no idea that this has nothing
to do with the readability of ``"file"``.

The example above is better written ::

   try:
       foo = opne("file") # will be changed to "open" as soon as we run it
   except IOError:
       sys.exit("could not open file")

There are some situations in which the ``except:`` clause is useful: for
example, in a framework when running callbacks, it is good not to let any
callback disturb the framework.

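A minimal sketch of such a callback runner follows. It is an illustration, not
a prescribed design: the function name ``run_callbacks`` is invented, and it
uses ``except Exception`` rather than a bare ``except:`` so that
:exc:`KeyboardInterrupt` and :exc:`SystemExit` still get through.

```python
import logging


def run_callbacks(callbacks):
    """Call each callback; log failures instead of propagating them."""
    results = []
    for callback in callbacks:
        try:
            results.append(callback())
        except Exception:
            # Record the full traceback so the bug stays visible.
            logging.exception("callback %r failed", callback)
            results.append(None)
    return results


results = run_callbacks([lambda: 1, lambda: 1 / 0, lambda: 3])
print(results)
```

The failing callback is logged with its traceback and replaced by ``None``,
while the callbacks before and after it still run.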

Exceptions
==========

Exceptions are a useful feature of Python. You should learn to raise them
whenever something unexpected occurs, and catch them only where you can do
something about them.

The following is a very popular anti-idiom ::

   def get_status(file):
       if not os.path.exists(file):
           print("file not found")
           sys.exit(1)
       return open(file).readline()

Consider the case where the file gets deleted between the time the call to
:func:`os.path.exists` is made and the time :func:`open` is called. That means
the last line will raise an :exc:`IOError`. The same would happen if *file*
exists but has no read permission. Since testing this on a normal machine on
existing and non-existing files makes it seem bugless, the test results will
seem fine, and the code will get shipped. Then an unhandled :exc:`IOError`
escapes to the user, who has to watch the ugly traceback.

Here is a better way to do it. ::

   def get_status(file):
       try:
           return open(file).readline()
       except (IOError, OSError):
           print("file not found")
           sys.exit(1)

In this version, *either* the file gets opened and the line is read (so it
works even on flaky NFS or SMB connections), or the message is printed and the
application aborted.

Still, :func:`get_status` makes too many assumptions --- that it will only be
used in a short running script, and not, say, in a long running server. Sure,
the caller could do something like ::

   try:
       status = get_status(log)
   except SystemExit:
       status = None

That only works around the problem, though. Try to use as few ``except``
clauses in your code as you can --- those will usually be a catch-all in the
:func:`main`, or inside calls which should always succeed.

So, the best version is probably ::

   def get_status(file):
       return open(file).readline()

The caller can deal with the exception if it wants (for example, if it tries
several files in a loop), or just let the exception filter upwards to *its*
caller.

The last version is not very good either --- due to implementation details, the
file would not be closed when an exception is raised until the handler finishes,
and perhaps not at all in non-C implementations (e.g., Jython). Closing the file
in a ``finally`` clause avoids this::

   def get_status(file):
       fp = open(file)
       try:
           return fp.readline()
       finally:
           fp.close()

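On interpreters that support the ``with`` statement, the same cleanup can be
written more compactly, and a caller that tries several files can handle the
exception itself. This is a sketch under those assumptions; ``first_status`` and
the file names are invented for the example.

```python
import os
import tempfile


def get_status(file):
    # The with statement closes the file on normal exit and on error.
    with open(file) as fp:
        return fp.readline()


def first_status(paths):
    """Return the first line of the first readable file, or None."""
    for path in paths:
        try:
            return get_status(path)
        except OSError:
            continue  # unreadable or missing: try the next candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "status.log")
    with open(present, "w") as fp:
        fp.write("ok\nsecond line\n")
    missing = os.path.join(tmp, "missing.log")
    found = first_status([missing, present])
    nothing = first_status([missing])

print(repr(found), nothing)
```

Here :func:`get_status` stays free of policy: it neither prints nor exits, and
the caller decides what a missing file means.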

Using the Batteries
===================

Every so often, people seem to be writing stuff in the Python library again,
usually poorly. While the occasional module has a poor interface, it is usually
much better to use the rich standard library and data types that come with
Python than to invent your own.

A useful module very few people know about is :mod:`os.path`. It always has the
correct path arithmetic for your operating system, and will usually be much
better than whatever you come up with yourself.

Compare::

   # ugh!
   return dir+"/"+file
   # better
   return os.path.join(dir, file)

More useful functions in :mod:`os.path`: :func:`basename`, :func:`dirname` and
:func:`splitext`.

There are also many useful builtin functions people seem not to be aware of for
some reason: :func:`min` and :func:`max` can find the minimum/maximum of any
sequence with comparable semantics, for example, yet many people write their own
:func:`max`/:func:`min`. Another highly useful function is
:func:`functools.reduce`. A classical use of :func:`reduce` is something like
::

   import sys, operator, functools
   # list() is needed so that len() works on the result of map()
   nums = list(map(float, sys.argv[1:]))
   print(functools.reduce(operator.add, nums) / len(nums))

This cute little script prints the average of all numbers given on the command
line. The :func:`reduce` adds up all the numbers, and the rest is just some
pre- and postprocessing.

On the same note, :func:`float` and :func:`int` accept arguments of type
string, and so are suited to parsing --- assuming you are ready to deal with
the :exc:`ValueError` they raise.

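The helpers mentioned in this section can be tried together in a few lines. The
sketch below is illustrative only: ``parse_numbers`` and the sample path and
inputs are made up, and the rejected strings are collected rather than reported.

```python
import os.path


def parse_numbers(args):
    """Split strings into parsed floats and the strings that failed."""
    numbers, rejected = [], []
    for text in args:
        try:
            numbers.append(float(text))
        except ValueError:
            rejected.append(text)
    return numbers, rejected


# Path arithmetic without hard-coding a separator.
path = os.path.join("logs", "report.txt")
name = os.path.basename(path)
stem, ext = os.path.splitext(name)

# Parsing with float() and then using the builtin min() and max().
numbers, rejected = parse_numbers(["3", "1.5", "oops", "8"])
print(os.path.dirname(path), name, stem, ext)
print(min(numbers), max(numbers), rejected)
```

Catching :exc:`ValueError` right where the string is parsed keeps the handler
narrow, in line with the advice in the Exceptions section.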

Using Backslash to Continue Statements
======================================

Since Python treats a newline as a statement terminator, and since statements
are often longer than is comfortable to put in one line, many people do::

   if foo.bar()['first'][0] == baz.quux(1, 2)[5:9] and \
           calculate_number(10, 20) != forbulate(500, 360):
       pass

You should realize that this is dangerous: a stray space after the ``\`` would
make this line wrong, and stray spaces are notoriously hard to see in editors.
In this case, at least it would be a syntax error. The backslash is also easy
to lose when lines are edited or rewrapped, and if the code was::

   value = foo.bar()['first'][0]*baz.quux(1, 2)[5:9] \
           + calculate_number(10, 20)*forbulate(500, 360)

and the backslash went missing while the second line ended up unindented, both
lines would still be valid statements, and ``value`` would just be subtly wrong.

It is usually much better to use the implicit continuation inside parentheses.
This version is bulletproof::

   value = (foo.bar()['first'][0]*baz.quux(1, 2)[5:9]
           + calculate_number(10, 20)*forbulate(500, 360))

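The failure modes can be checked by executing small source strings. This sketch
assumes Python 3 and uses toy arithmetic in place of the calls above. The helper
``run`` is invented, and the third variant (a backslash lost during editing,
with the continuation line left unindented) is added here for illustration.

```python
def run(source):
    """Execute source text and return what it bound to the name `value`."""
    namespace = {}
    exec(source, namespace)
    return namespace["value"]


parenthesized = "value = (1 + 2\n         + 3)\n"
stray_space = "value = 1 + 2 \\ \n        + 3\n"   # a space follows the backslash
lost_backslash = "value = 1 + 2\n+ 3\n"            # backslash gone, line unindented

try:
    run(stray_space)
    stray_result = "accepted"
except SyntaxError:
    stray_result = "SyntaxError"

print(run(parenthesized), stray_result, run(lost_backslash))
```

The parenthesized form gives the full sum and the stray space is rejected by
the compiler. The lost backslash silently drops the second term, because
``+ 3`` on its own line is a valid expression statement.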