:tocdepth: 2

===============
Programming FAQ
===============

.. contents::
| 8 | |
| 9 | General Questions |
| 10 | ================= |
| 11 | |
| 12 | Is there a source code level debugger with breakpoints, single-stepping, etc.? |
| 13 | ------------------------------------------------------------------------------ |
| 14 | |
| 15 | Yes. |
| 16 | |
| 17 | The pdb module is a simple but adequate console-mode debugger for Python. It is |
| 18 | part of the standard Python library, and is :mod:`documented in the Library |
| 19 | Reference Manual <pdb>`. You can also write your own debugger by using the code |
| 20 | for pdb as an example. |
| 21 | |
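As a minimal sketch of the console workflow, the hypothetical ``average()``
function below enters pdb only when its ``debug`` flag (an assumption made for
this example) is set.  At the ``(Pdb)`` prompt, commands such as ``n`` (next),
``s`` (step), ``p total`` (print a value) and ``c`` (continue) are available.

```python
import pdb


def average(values, debug=False):
    """Return the arithmetic mean of a non-empty sequence (illustrative helper)."""
    total = sum(values)
    if debug:
        # Execution pauses here and the (Pdb) prompt appears on the console.
        pdb.set_trace()
    return total / len(values)


# With debug left at False the function runs without stopping.
print(average([2, 4, 6]))
```

Calling ``average([2, 4, 6], debug=True)`` from a terminal would stop inside the
function so that ``total`` and ``values`` can be inspected.
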
| 22 | The IDLE interactive development environment, which is part of the standard |
| 23 | Python distribution (normally available as Tools/scripts/idle), includes a |
| 24 | graphical debugger. There is documentation for the IDLE debugger at |
| 25 | http://www.python.org/idle/doc/idle2.html#Debugger. |
| 26 | |
PythonWin is a Python IDE that includes a GUI debugger based on pdb.  The
PythonWin debugger colors breakpoints and offers further features such as
debugging non-PythonWin programs.  PythonWin is available as part of the `Python
for Windows Extensions <http://sourceforge.net/projects/pywin32/>`__ project and
as a part of the ActivePython distribution (see
http://www.activestate.com/Products/ActivePython/index.html).
| 33 | |
| 34 | `Boa Constructor <http://boa-constructor.sourceforge.net/>`_ is an IDE and GUI |
| 35 | builder that uses wxWidgets. It offers visual frame creation and manipulation, |
| 36 | an object inspector, many views on the source like object browsers, inheritance |
| 37 | hierarchies, doc string generated html documentation, an advanced debugger, |
| 38 | integrated help, and Zope support. |
| 39 | |
| 40 | `Eric <http://www.die-offenbachs.de/eric/index.html>`_ is an IDE built on PyQt |
| 41 | and the Scintilla editing component. |
| 42 | |
| 43 | Pydb is a version of the standard Python debugger pdb, modified for use with DDD |
| 44 | (Data Display Debugger), a popular graphical debugger front end. Pydb can be |
| 45 | found at http://bashdb.sourceforge.net/pydb/ and DDD can be found at |
| 46 | http://www.gnu.org/software/ddd. |
| 47 | |
| 48 | There are a number of commercial Python IDEs that include graphical debuggers. |
| 49 | They include: |
| 50 | |
| 51 | * Wing IDE (http://wingware.com/) |
| 52 | * Komodo IDE (http://www.activestate.com/Products/Komodo) |
| 53 | |
| 54 | |
| 55 | Is there a tool to help find bugs or perform static analysis? |
| 56 | ------------------------------------------------------------- |
| 57 | |
| 58 | Yes. |
| 59 | |
| 60 | PyChecker is a static analysis tool that finds bugs in Python source code and |
| 61 | warns about code complexity and style. You can get PyChecker from |
| 62 | http://pychecker.sf.net. |
| 63 | |
`Pylint <http://www.logilab.org/projects/pylint>`_ is another tool that checks
if a module satisfies a coding standard, and also makes it possible to write
plug-ins to add a custom feature.  In addition to the bug checking that
PyChecker performs, Pylint offers some additional features such as checking line
length, whether variable names are well-formed according to your coding
standard, whether declared interfaces are fully implemented, and more.
http://www.logilab.org/card/pylint_manual provides a full list of Pylint's
features.
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 72 | |
| 73 | |
| 74 | How can I create a stand-alone binary from a Python script? |
| 75 | ----------------------------------------------------------- |
| 76 | |
| 77 | You don't need the ability to compile Python to C code if all you want is a |
| 78 | stand-alone program that users can download and run without having to install |
| 79 | the Python distribution first. There are a number of tools that determine the |
| 80 | set of modules required by a program and bind these modules together with a |
| 81 | Python binary to produce a single executable. |
| 82 | |
One is the freeze tool, which is included in the Python source tree as
``Tools/freeze``.  It converts Python byte code to C arrays; with a C compiler
you can embed all your modules into a new program, which is then linked with the
standard Python modules.
| 87 | |
| 88 | It works by scanning your source recursively for import statements (in both |
| 89 | forms) and looking for the modules in the standard Python path as well as in the |
| 90 | source directory (for built-in modules). It then turns the bytecode for modules |
| 91 | written in Python into C code (array initializers that can be turned into code |
| 92 | objects using the marshal module) and creates a custom-made config file that |
| 93 | only contains those built-in modules which are actually used in the program. It |
| 94 | then compiles the generated C code and links it with the rest of the Python |
| 95 | interpreter to form a self-contained binary which acts exactly like your script. |
| 96 | |
| 97 | Obviously, freeze requires a C compiler. There are several other utilities |
| 98 | which don't. One is Thomas Heller's py2exe (Windows only) at |
| 99 | |
| 100 | http://www.py2exe.org/ |
| 101 | |
| 102 | Another is Christian Tismer's `SQFREEZE <http://starship.python.net/crew/pirx>`_ |
| 103 | which appends the byte code to a specially-prepared Python interpreter that can |
| 104 | find the byte code in the executable. |
| 105 | |
| 106 | Other tools include Fredrik Lundh's `Squeeze |
| 107 | <http://www.pythonware.com/products/python/squeeze>`_ and Anthony Tuininga's |
| 108 | `cx_Freeze <http://starship.python.net/crew/atuining/cx_Freeze/index.html>`_. |
| 109 | |
| 110 | |
| 111 | Are there coding standards or a style guide for Python programs? |
| 112 | ---------------------------------------------------------------- |
| 113 | |
| 114 | Yes. The coding style required for standard library modules is documented as |
| 115 | :pep:`8`. |
| 116 | |
| 117 | |
| 118 | My program is too slow. How do I speed it up? |
| 119 | --------------------------------------------- |
| 120 | |
| 121 | That's a tough one, in general. There are many tricks to speed up Python code; |
| 122 | consider rewriting parts in C as a last resort. |
| 123 | |
| 124 | In some cases it's possible to automatically translate Python to C or x86 |
| 125 | assembly language, meaning that you don't have to modify your code to gain |
| 126 | increased speed. |
| 127 | |
| 128 | .. XXX seems to have overlap with other questions! |
| 129 | |
`Cython <http://cython.org>`_ and `Pyrex
<http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/>`_ can compile a slightly
modified version of Python code into a C extension, and can be used on many
different platforms.
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 133 | |
| 134 | `Psyco <http://psyco.sourceforge.net>`_ is a just-in-time compiler that |
| 135 | translates Python code into x86 assembly language. If you can use it, Psyco can |
| 136 | provide dramatic speedups for critical functions. |
| 137 | |
| 138 | The rest of this answer will discuss various tricks for squeezing a bit more |
| 139 | speed out of Python code. *Never* apply any optimization tricks unless you know |
| 140 | you need them, after profiling has indicated that a particular function is the |
| 141 | heavily executed hot spot in the code. Optimizations almost always make the |
| 142 | code less clear, and you shouldn't pay the costs of reduced clarity (increased |
| 143 | development time, greater likelihood of bugs) unless the resulting performance |
| 144 | benefit is worth it. |
| 145 | |
| 146 | There is a page on the wiki devoted to `performance tips |
| 147 | <http://wiki.python.org/moin/PythonSpeed/PerformanceTips>`_. |
| 148 | |
| 149 | Guido van Rossum has written up an anecdote related to optimization at |
| 150 | http://www.python.org/doc/essays/list2str.html. |
| 151 | |
| 152 | One thing to notice is that function and (especially) method calls are rather |
| 153 | expensive; if you have designed a purely OO interface with lots of tiny |
| 154 | functions that don't do much more than get or set an instance variable or call |
| 155 | another method, you might consider using a more direct way such as directly |
| 156 | accessing instance variables. Also see the standard module :mod:`profile` which |
| 157 | makes it possible to find out where your program is spending most of its time |
| 158 | (if you have some patience -- the profiling itself can slow your program down by |
| 159 | an order of magnitude). |
| 160 | |
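A minimal profiling sketch follows.  It uses :mod:`cProfile`, the
lower-overhead sibling of :mod:`profile` in the standard library, together with
:mod:`pstats`; the function ``slow_sum()`` is a hypothetical stand-in for real
application code.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive loop so that it shows up in the profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Collect the report in a string instead of printing straight to stdout.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)  # show at most the five most expensive entries
report = stream.getvalue()
print(report)
```

The report lists call counts and cumulative times per function, which is usually
enough to decide where optimization effort is worthwhile.
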
| 161 | Remember that many standard optimization heuristics you may know from other |
| 162 | programming experience may well apply to Python. For example it may be faster |
| 163 | to send output to output devices using larger writes rather than smaller ones in |
| 164 | order to reduce the overhead of kernel system calls. Thus CGI scripts that |
| 165 | write all output in "one shot" may be faster than those that write lots of small |
| 166 | pieces of output. |
| 167 | |
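The sketch below contrasts the two write patterns.  ``io.StringIO`` stands in
for a real output device, and both helper names are hypothetical; whether the
single write is measurably faster depends on the buffering of the actual stream,
so measure before relying on it.

```python
import io


def write_piecewise(stream, rows):
    """Issue two write() calls per row."""
    for row in rows:
        stream.write(row)
        stream.write("\n")


def write_one_shot(stream, rows):
    """Assemble the whole payload first, then issue a single write() call."""
    stream.write("".join(row + "\n" for row in rows))


rows = ["line %d" % i for i in range(3)]
piecewise, one_shot = io.StringIO(), io.StringIO()
write_piecewise(piecewise, rows)
write_one_shot(one_shot, rows)
```

Both streams end up with identical contents; only the number of ``write()``
calls differs.
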
| 168 | Also, be sure to use Python's core features where appropriate. For example, |
| 169 | slicing allows programs to chop up lists and other sequence objects in a single |
| 170 | tick of the interpreter's mainloop using highly optimized C implementations. |
Thus to get the same effect as::

   L2 = []
   for i in range(3):
       L2.append(L1[i])

it is much shorter and far faster to use ::

   L2 = list(L1[:3])  # "list" is redundant if L1 is a list.
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 180 | |
Note that the functionally-oriented built-in functions such as :func:`map`,
:func:`zip`, and friends can be a convenient accelerator for loops that
perform a single task.  For example to pair the elements of two lists
together::

   >>> list(zip([1, 2, 3], [4, 5, 6]))
   [(1, 4), (2, 5), (3, 6)]

or to compute a number of sines::

   >>> import math
   >>> list(map(math.sin, (1, 2, 3, 4)))
   [0.8414709848078965, 0.9092974268256817, 0.1411200080598672, -0.7568024953079282]
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 193 | |
| 194 | The operation completes very quickly in such cases. |
| 195 | |
Other examples include the ``join()`` and ``split()`` :ref:`methods
of string objects <string-methods>`.

For example if s1..s7 are large (10K+) strings then
``"".join([s1,s2,s3,s4,s5,s6,s7])`` may be far faster than the more obvious
``s1+s2+s3+s4+s5+s6+s7``, since the "summation" will compute many
subexpressions, whereas ``join()`` does all the copying in one pass.  For
manipulating strings, use the ``replace()`` and the ``format()`` :ref:`methods
on string objects <string-methods>`.  Use regular expressions only when you're
not dealing with constant string patterns.
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 206 | |
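The two approaches can be compared side by side.  This is a sketch with
hypothetical helper names; it only shows that both produce the same string,
while the actual speed difference should be measured (for instance with
:mod:`timeit`) on realistic data.

```python
def build_with_plus(pieces):
    """Concatenate step by step; each step creates a new intermediate string."""
    result = ""
    for piece in pieces:
        result = result + piece
    return result


def build_with_join(pieces):
    """Let str.join() copy every piece once into the final string."""
    return "".join(pieces)


pieces = ["spam", "eggs", "ham"] * 1000
joined = build_with_join(pieces)
summed = build_with_plus(pieces)
```
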
Be sure to use the :meth:`list.sort` built-in method to do sorting, and see the
`sorting mini-HOWTO <http://wiki.python.org/moin/HowTo/Sorting>`_ for examples
of moderately advanced usage.  :meth:`list.sort` beats other techniques for
sorting in all but the most extreme circumstances.
| 211 | |
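For illustration, here is a short sketch of :meth:`list.sort` with a ``key``
function; the ``records`` data is made up for the example.  The built-in
:func:`sorted` is used alongside it to obtain a new list without touching the
original.

```python
records = [("carol", 31), ("alice", 25), ("bob", 31)]

# Sort in place by age, oldest first.  The sort is stable, so records with
# equal ages keep their original relative order even with reverse=True.
records.sort(key=lambda record: record[1], reverse=True)

# sorted() returns a new list and leaves records unchanged.
by_name = sorted(records, key=lambda record: record[0])
```
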
Another common trick is to "push loops into functions or methods."  For example
suppose you have a program that runs slowly and you use the profiler to
determine that a Python function ``ff()`` is being called lots of times.  If you
notice that ``ff()``::

   def ff(x):
       ...  # do something with x computing result...
       return result

tends to be called in loops like::

   newlist = list(map(ff, oldlist))

or::

   for x in sequence:
       value = ff(x)
       ...  # do something with value...

then you can often eliminate function call overhead by rewriting ``ff()`` to::

   def ffseq(seq):
       resultseq = []
       for x in seq:
           ...  # do something with x computing result...
           resultseq.append(result)
       return resultseq

and rewrite the two examples to ``newlist = ffseq(oldlist)`` and to::

   for value in ffseq(sequence):
       ...  # do something with value...
| 244 | |
| 245 | Single calls to ``ff(x)`` translate to ``ffseq([x])[0]`` with little penalty. |
| 246 | Of course this technique is not always appropriate and there are other variants |
| 247 | which you can figure out. |
| 248 | |
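A concrete, runnable instance of this rewrite might look as follows.  The body
``x * x + 1`` is an arbitrary placeholder for "do something with x"; any real
computation would take its place.

```python
def ff(x):
    """Per-item version: one Python function call for every element."""
    return x * x + 1


def ffseq(seq):
    """Batched version: one call, with the loop moved inside the function."""
    resultseq = []
    for x in seq:
        resultseq.append(x * x + 1)
    return resultseq


oldlist = [1, 2, 3]
per_item = [ff(x) for x in oldlist]
batched = ffseq(oldlist)
single = ffseq([4])[0]  # a single call expressed through the batched form
```
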
| 249 | You can gain some performance by explicitly storing the results of a function or |
| 250 | method lookup into a local variable. A loop like:: |
| 251 | |
   for key in token:
       counts[key] = counts.get(key, 0) + 1

resolves ``counts.get`` every iteration.  If the method isn't going to change, a
slightly faster implementation is::

   counts_get = counts.get  # look up the method once
   for key in token:
       counts[key] = counts_get(key, 0) + 1
| 261 | |
Default arguments can be used to determine values once, when the function is
defined, instead of on every call.  This can only be done for functions or
objects which will not be changed during program execution, such as replacing ::
| 265 | |
| 266 | def degree_sin(deg): |
| 267 | return math.sin(deg * math.pi / 180.0) |
| 268 | |
| 269 | with :: |
| 270 | |
| 271 | def degree_sin(deg, factor=math.pi/180.0, sin=math.sin): |
| 272 | return sin(deg * factor) |
| 273 | |
| 274 | Because this trick uses default arguments for terms which should not be changed, |
| 275 | it should only be used when you are not concerned with presenting a possibly |
| 276 | confusing API to your users. |
| 277 | |
| 278 | |
| 279 | Core Language |
| 280 | ============= |
| 281 | |
R. David Murray | c04a694 | 2009-11-14 22:21:32 +0000 | [diff] [blame] | 282 | Why am I getting an UnboundLocalError when the variable has a value? |
| 283 | -------------------------------------------------------------------- |
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 284 | |
R. David Murray | c04a694 | 2009-11-14 22:21:32 +0000 | [diff] [blame] | 285 | It can be a surprise to get the UnboundLocalError in previously working |
| 286 | code when it is modified by adding an assignment statement somewhere in |
| 287 | the body of a function. |
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 288 | |
R. David Murray | c04a694 | 2009-11-14 22:21:32 +0000 | [diff] [blame] | 289 | This code: |
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 290 | |
R. David Murray | c04a694 | 2009-11-14 22:21:32 +0000 | [diff] [blame] | 291 | >>> x = 10 |
| 292 | >>> def bar(): |
| 293 | ... print(x) |
| 294 | >>> bar() |
| 295 | 10 |
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 296 | |
R. David Murray | c04a694 | 2009-11-14 22:21:32 +0000 | [diff] [blame] | 297 | works, but this code: |
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 298 | |
R. David Murray | c04a694 | 2009-11-14 22:21:32 +0000 | [diff] [blame] | 299 | >>> x = 10 |
| 300 | >>> def foo(): |
| 301 | ... print(x) |
| 302 | ... x += 1 |
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 303 | |
R. David Murray | c04a694 | 2009-11-14 22:21:32 +0000 | [diff] [blame] | 304 | results in an UnboundLocalError: |
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 305 | |
R. David Murray | c04a694 | 2009-11-14 22:21:32 +0000 | [diff] [blame] | 306 | >>> foo() |
| 307 | Traceback (most recent call last): |
| 308 | ... |
| 309 | UnboundLocalError: local variable 'x' referenced before assignment |
| 310 | |
This is because when you make an assignment to a variable in a scope, that
variable becomes local to that scope and shadows any similarly named variable
in the outer scope.  Since the last statement in ``foo`` assigns a new value to
``x``, the compiler recognizes it as a local variable.  Consequently, when the
earlier ``print(x)`` attempts to print the uninitialized local variable, an
error results.
| 317 | |
| 318 | In the example above you can access the outer scope variable by declaring it |
| 319 | global: |
| 320 | |
| 321 | >>> x = 10 |
| 322 | >>> def foobar(): |
| 323 | ... global x |
| 324 | ... print(x) |
| 325 | ... x += 1 |
| 326 | >>> foobar() |
| 327 | 10 |
| 328 | |
| 329 | This explicit declaration is required in order to remind you that (unlike the |
| 330 | superficially analogous situation with class and instance variables) you are |
| 331 | actually modifying the value of the variable in the outer scope: |
| 332 | |
| 333 | >>> print(x) |
| 334 | 11 |
| 335 | |
| 336 | You can do a similar thing in a nested scope using the :keyword:`nonlocal` |
| 337 | keyword: |
| 338 | |
| 339 | >>> def foo(): |
| 340 | ... x = 10 |
| 341 | ... def bar(): |
| 342 | ... nonlocal x |
| 343 | ... print(x) |
| 344 | ... x += 1 |
| 345 | ... bar() |
| 346 | ... print(x) |
| 347 | >>> foo() |
| 348 | 10 |
| 349 | 11 |
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 350 | |
| 351 | |
| 352 | What are the rules for local and global variables in Python? |
| 353 | ------------------------------------------------------------ |
| 354 | |
In Python, variables that are only referenced inside a function are implicitly
global.  If a variable is assigned a value anywhere within the function's body,
it's assumed to be local unless it is explicitly declared as global.
| 360 | |
| 361 | Though a bit surprising at first, a moment's consideration explains this. On |
| 362 | one hand, requiring :keyword:`global` for assigned variables provides a bar |
| 363 | against unintended side-effects. On the other hand, if ``global`` was required |
| 364 | for all global references, you'd be using ``global`` all the time. You'd have |
Georg Brandl | c4a55fc | 2010-02-06 18:46:57 +0000 | [diff] [blame] | 365 | to declare as global every reference to a built-in function or to a component of |
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 366 | an imported module. This clutter would defeat the usefulness of the ``global`` |
| 367 | declaration for identifying side-effects. |
| 368 | |
| 369 | |
| 370 | How do I share global variables across modules? |
| 371 | ------------------------------------------------ |
| 372 | |
| 373 | The canonical way to share information across modules within a single program is |
| 374 | to create a special module (often called config or cfg). Just import the config |
| 375 | module in all modules of your application; the module then becomes available as |
| 376 | a global name. Because there is only one instance of each module, any changes |
| 377 | made to the module object get reflected everywhere. For example: |
| 378 | |
| 379 | config.py:: |
| 380 | |
| 381 | x = 0 # Default value of the 'x' configuration setting |
| 382 | |
| 383 | mod.py:: |
| 384 | |
| 385 | import config |
| 386 | config.x = 1 |
| 387 | |
| 388 | main.py:: |
| 389 | |
| 390 | import config |
| 391 | import mod |
Georg Brandl | 62eaaf6 | 2009-12-19 17:51:41 +0000 | [diff] [blame] | 392 | print(config.x) |
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 393 | |
| 394 | Note that using a module is also the basis for implementing the Singleton design |
| 395 | pattern, for the same reason. |
| 396 | |
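The same behaviour can be reproduced in a single file.  In this sketch the
module ``demo_config`` is a hypothetical stand-in for ``config.py``: it is built
by hand with :class:`types.ModuleType` and registered in :data:`sys.modules`, so
that the ``import`` statements inside the two functions find it just as they
would find a real file.

```python
import sys
import types

# Stand-in for a real config.py: create the module object by hand.
config = types.ModuleType("demo_config")
config.x = 0  # default value of the 'x' configuration setting
sys.modules["demo_config"] = config


def mod_like_code():
    """Plays the role of mod.py: changes the shared setting."""
    import demo_config
    demo_config.x = 1


def main_like_code():
    """Plays the role of main.py: reads the shared setting."""
    import demo_config
    return demo_config.x


mod_like_code()
print(main_like_code())
```

Both functions receive the very same module object, which is why the change
made by one is visible to the other.
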
| 397 | |
| 398 | What are the "best practices" for using import in a module? |
| 399 | ----------------------------------------------------------- |
| 400 | |
In general, don't use ``from modulename import *``.  Doing so clutters the
importer's namespace.  Some people avoid this idiom even with the few modules
that were designed to be imported in this manner.  Modules designed in this
manner include :mod:`tkinter` and :mod:`threading`.
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 405 | |
| 406 | Import modules at the top of a file. Doing so makes it clear what other modules |
| 407 | your code requires and avoids questions of whether the module name is in scope. |
| 408 | Using one import per line makes it easy to add and delete module imports, but |
| 409 | using multiple imports per line uses less screen space. |
| 410 | |
| 411 | It's good practice if you import modules in the following order: |
| 412 | |
Georg Brandl | 62eaaf6 | 2009-12-19 17:51:41 +0000 | [diff] [blame] | 413 | 1. standard library modules -- e.g. ``sys``, ``os``, ``getopt``, ``re`` |
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 414 | 2. third-party library modules (anything installed in Python's site-packages |
| 415 | directory) -- e.g. mx.DateTime, ZODB, PIL.Image, etc. |
| 416 | 3. locally-developed modules |
| 417 | |
| 418 | Never use relative package imports. If you're writing code that's in the |
| 419 | ``package.sub.m1`` module and want to import ``package.sub.m2``, do not just |
Georg Brandl | 11b6362 | 2009-12-20 14:21:27 +0000 | [diff] [blame] | 420 | write ``from . import m2``, even though it's legal. Write ``from package.sub |
| 421 | import m2`` instead. See :pep:`328` for details. |
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 422 | |
| 423 | It is sometimes necessary to move imports to a function or class to avoid |
| 424 | problems with circular imports. Gordon McMillan says: |
| 425 | |
| 426 | Circular imports are fine where both modules use the "import <module>" form |
| 427 | of import. They fail when the 2nd module wants to grab a name out of the |
| 428 | first ("from module import name") and the import is at the top level. That's |
| 429 | because names in the 1st are not yet available, because the first module is |
| 430 | busy importing the 2nd. |
| 431 | |
| 432 | In this case, if the second module is only used in one function, then the import |
| 433 | can easily be moved into that function. By the time the import is called, the |
| 434 | first module will have finished initializing, and the second module can do its |
| 435 | import. |
| 436 | |
| 437 | It may also be necessary to move imports out of the top level of code if some of |
| 438 | the modules are platform-specific. In that case, it may not even be possible to |
| 439 | import all of the modules at the top of the file. In this case, importing the |
| 440 | correct modules in the corresponding platform-specific code is a good option. |
| 441 | |
Only move imports into a local scope, such as inside a function definition, if
it's necessary to solve a problem such as avoiding a circular import, or if you
are trying to reduce the initialization time of a module.  This technique is
| 445 | especially helpful if many of the imports are unnecessary depending on how the |
| 446 | program executes. You may also want to move imports into a function if the |
| 447 | modules are only ever used in that function. Note that loading a module the |
| 448 | first time may be expensive because of the one time initialization of the |
| 449 | module, but loading a module multiple times is virtually free, costing only a |
| 450 | couple of dictionary lookups. Even if the module name has gone out of scope, |
| 451 | the module is probably available in :data:`sys.modules`. |
| 452 | |
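As a small sketch of a function-local import, the hypothetical ``encode()``
helper below imports :mod:`json` only when it is called.  The first call pays
for loading the module (unless something else has already imported it); later
calls find it in :data:`sys.modules`.

```python
import sys


def encode(payload):
    # Deferred import: nothing is loaded until encode() actually runs.
    import json
    return json.dumps(payload, sort_keys=True)


first = encode({"b": 2, "a": 1})
second = encode({"a": 1})  # the import statement is now only a cache lookup
json_cached = "json" in sys.modules
```
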
| 453 | If only instances of a specific class use a module, then it is reasonable to |
| 454 | import the module in the class's ``__init__`` method and then assign the module |
| 455 | to an instance variable so that the module is always available (via that |
| 456 | instance variable) during the life of the object. Note that to delay an import |
| 457 | until the class is instantiated, the import must be inside a method. Putting |
| 458 | the import inside the class but outside of any method still causes the import to |
| 459 | occur when the module is initialized. |
| 460 | |
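A sketch of this pattern, using a hypothetical ``ReportWriter`` class as the
only user of :mod:`csv`:

```python
import io


class ReportWriter:
    """Hypothetical class; the only user of csv in this sketch."""

    def __init__(self):
        # The import runs when an instance is created, not when the
        # enclosing module is initialized.
        import csv
        self._csv = csv  # keep the module reachable for the object's lifetime

    def write_rows(self, stream, rows):
        self._csv.writer(stream, lineterminator="\n").writerows(rows)


buffer = io.StringIO()
ReportWriter().write_rows(buffer, [("name", "qty"), ("spam", 3)])
```
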
| 461 | |
| 462 | How can I pass optional or keyword parameters from one function to another? |
| 463 | --------------------------------------------------------------------------- |
| 464 | |
| 465 | Collect the arguments using the ``*`` and ``**`` specifiers in the function's |
| 466 | parameter list; this gives you the positional arguments as a tuple and the |
| 467 | keyword arguments as a dictionary. You can then pass these arguments when |
| 468 | calling another function by using ``*`` and ``**``:: |
| 469 | |
| 470 | def f(x, *args, **kwargs): |
| 471 | ... |
| 472 | kwargs['width'] = '14.3c' |
| 473 | ... |
| 474 | g(x, *args, **kwargs) |
| 475 | |
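To make the forwarding visible, the following sketch supplies a trivial ``g()``
that simply returns what it received; both function bodies are illustrative
only.

```python
def g(x, *args, **kwargs):
    """Return the received arguments so the forwarding can be inspected."""
    return x, args, kwargs


def f(x, *args, **kwargs):
    kwargs['width'] = '14.3c'      # add or override a keyword argument
    return g(x, *args, **kwargs)   # unpack both collections again


result = f(1, 2, 3, colour='red')
```

Here ``args`` arrives in ``g()`` as the tuple ``(2, 3)`` and ``kwargs`` as a
dictionary holding both ``colour`` and the added ``width``.
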
In the unlikely case that you care about Python versions older than 2.0, use
``apply()``, which no longer exists in Python 3::
| 478 | |
| 479 | def f(x, *args, **kwargs): |
| 480 | ... |
| 481 | kwargs['width'] = '14.3c' |
| 482 | ... |
| 483 | apply(g, (x,)+args, kwargs) |
| 484 | |
| 485 | |
| 486 | How do I write a function with output parameters (call by reference)? |
| 487 | --------------------------------------------------------------------- |
| 488 | |
| 489 | Remember that arguments are passed by assignment in Python. Since assignment |
| 490 | just creates references to objects, there's no alias between an argument name in |
| 491 | the caller and callee, and so no call-by-reference per se. You can achieve the |
| 492 | desired effect in a number of ways. |
| 493 | |
| 494 | 1) By returning a tuple of the results:: |
| 495 | |
| 496 | def func2(a, b): |
| 497 | a = 'new-value' # a and b are local names |
| 498 | b = b + 1 # assigned to new objects |
| 499 | return a, b # return new values |
| 500 | |
| 501 | x, y = 'old-value', 99 |
| 502 | x, y = func2(x, y) |
Georg Brandl | 62eaaf6 | 2009-12-19 17:51:41 +0000 | [diff] [blame] | 503 | print(x, y) # output: new-value 100 |
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 504 | |
| 505 | This is almost always the clearest solution. |
| 506 | |
| 507 | 2) By using global variables. This isn't thread-safe, and is not recommended. |
| 508 | |
| 509 | 3) By passing a mutable (changeable in-place) object:: |
| 510 | |
| 511 | def func1(a): |
| 512 | a[0] = 'new-value' # 'a' references a mutable list |
| 513 | a[1] = a[1] + 1 # changes a shared object |
| 514 | |
| 515 | args = ['old-value', 99] |
| 516 | func1(args) |
Georg Brandl | 62eaaf6 | 2009-12-19 17:51:41 +0000 | [diff] [blame] | 517 | print(args[0], args[1]) # output: new-value 100 |
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 518 | |
| 519 | 4) By passing in a dictionary that gets mutated:: |
| 520 | |
| 521 | def func3(args): |
| 522 | args['a'] = 'new-value' # args is a mutable dictionary |
| 523 | args['b'] = args['b'] + 1 # change it in-place |
| 524 | |
   args = {'a': 'old-value', 'b': 99}
| 526 | func3(args) |
Georg Brandl | 62eaaf6 | 2009-12-19 17:51:41 +0000 | [diff] [blame] | 527 | print(args['a'], args['b']) |
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 528 | |
| 529 | 5) Or bundle up values in a class instance:: |
| 530 | |
| 531 | class callByRef: |
| 532 | def __init__(self, **args): |
| 533 | for (key, value) in args.items(): |
| 534 | setattr(self, key, value) |
| 535 | |
| 536 | def func4(args): |
| 537 | args.a = 'new-value' # args is a mutable callByRef |
| 538 | args.b = args.b + 1 # change object in-place |
| 539 | |
| 540 | args = callByRef(a='old-value', b=99) |
| 541 | func4(args) |
Georg Brandl | 62eaaf6 | 2009-12-19 17:51:41 +0000 | [diff] [blame] | 542 | print(args.a, args.b) |
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 543 | |
| 544 | |
| 545 | There's almost never a good reason to get this complicated. |
| 546 | |
| 547 | Your best choice is to return a tuple containing the multiple results. |
| 548 | |
| 549 | |
| 550 | How do you make a higher order function in Python? |
| 551 | -------------------------------------------------- |
| 552 | |
| 553 | You have two choices: you can use nested scopes or you can use callable objects. |
| 554 | For example, suppose you wanted to define ``linear(a,b)`` which returns a |
| 555 | function ``f(x)`` that computes the value ``a*x+b``. Using nested scopes:: |
| 556 | |
| 557 | def linear(a, b): |
| 558 | def result(x): |
| 559 | return a * x + b |
| 560 | return result |
| 561 | |
| 562 | Or using a callable object:: |
| 563 | |
| 564 | class linear: |
| 565 | |
| 566 | def __init__(self, a, b): |
| 567 | self.a, self.b = a, b |
| 568 | |
| 569 | def __call__(self, x): |
| 570 | return self.a * x + self.b |
| 571 | |
| 572 | In both cases, :: |
| 573 | |
| 574 | taxes = linear(0.3, 2) |
| 575 | |
| 576 | gives a callable object where ``taxes(10e6) == 0.3 * 10e6 + 2``. |
| 577 | |
| 578 | The callable object approach has the disadvantage that it is a bit slower and |
| 579 | results in slightly longer code. However, note that a collection of callables |
| 580 | can share their signature via inheritance:: |
| 581 | |
| 582 | class exponential(linear): |
| 583 | # __init__ inherited |
| 584 | def __call__(self, x): |
| 585 | return self.a * (x ** self.b) |
| 586 | |
Objects can encapsulate state for several methods::
| 588 | |
| 589 | class counter: |
| 590 | |
| 591 | value = 0 |
| 592 | |
| 593 | def set(self, x): |
| 594 | self.value = x |
| 595 | |
| 596 | def up(self): |
| 597 | self.value = self.value + 1 |
| 598 | |
| 599 | def down(self): |
| 600 | self.value = self.value - 1 |
| 601 | |
| 602 | count = counter() |
| 603 | inc, dec, reset = count.up, count.down, count.set |
| 604 | |
| 605 | Here ``inc()``, ``dec()`` and ``reset()`` act like functions which share the |
| 606 | same counting variable. |
| 607 | |
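The class is repeated here so that the sketch runs on its own; the calls at the
end show the three bound methods operating on one shared ``value``.

```python
class counter:

    value = 0

    def set(self, x):
        self.value = x

    def up(self):
        self.value = self.value + 1

    def down(self):
        self.value = self.value - 1


count = counter()
inc, dec, reset = count.up, count.down, count.set

inc()
inc()
dec()
after_steps = count.value   # two increments and one decrement
reset(10)
after_reset = count.value
```

Note that the assignments inside the methods create an instance attribute, so
the class attribute ``counter.value`` itself stays at ``0``.
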
| 608 | |
| 609 | How do I copy an object in Python? |
| 610 | ---------------------------------- |
| 611 | |
Try :func:`copy.copy` or :func:`copy.deepcopy` for the general case.
Not all objects can be copied, but most can.
| 614 | |
| 615 | Some objects can be copied more easily. Dictionaries have a :meth:`~dict.copy` |
| 616 | method:: |
| 617 | |
| 618 | newdict = olddict.copy() |
| 619 | |
| 620 | Sequences can be copied by slicing:: |
| 621 | |
| 622 | new_l = l[:] |
| 623 | |
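The difference between a shallow and a deep copy only shows once the object
contains other mutable objects. The following is a minimal sketch; the nested
list ``original`` is made up for illustration.

```python
import copy

# Hypothetical nested data, used only for illustration.
original = [[1, 2], [3, 4]]

shallow = copy.copy(original)     # new outer list, same inner lists
deep = copy.deepcopy(original)    # new outer list and new inner lists

original[0].append(99)

print(shallow[0])   # the shared inner list sees the change
print(deep[0])      # the deep copy is unaffected
```

Slicing and ``dict.copy()`` behave like :func:`copy.copy` here: they copy only
the outer container.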

How can I find the methods or attributes of an object?
------------------------------------------------------

For an instance x of a user-defined class, ``dir(x)`` returns an alphabetized
list of names: the instance's own attributes together with the methods and
attributes defined by its class.

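As a small illustration, the sketch below filters the output of ``dir()`` down
to the public names and separates methods from data attributes. The class
``Point`` and its attributes are hypothetical.

```python
class Point:
    """Hypothetical class used only for this illustration."""
    scale = 1                      # class attribute

    def __init__(self, x, y):
        self.x, self.y = x, y      # instance attributes

    def norm1(self):
        return abs(self.x) + abs(self.y)

p = Point(3, -4)

# Keep only the names that do not start with an underscore.
public = [name for name in dir(p) if not name.startswith('_')]
print(public)

# Split the public names into methods and plain data attributes.
methods = [name for name in public if callable(getattr(p, name))]
data = [name for name in public if name not in methods]
```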

How can my code discover the name of an object?
-----------------------------------------------

Generally speaking, it can't, because objects don't really have names.
Essentially, assignment always binds a name to a value; the same is true of
``def`` and ``class`` statements, but in that case the value is a
callable. Consider the following session::

   >>> class A:
   ...     pass
   ...
   >>> B = A
   >>> a = B()
   >>> b = a
   >>> print(b)
   <__main__.A object at 0x16D07CC>
   >>> print(a)
   <__main__.A object at 0x16D07CC>

Arguably the class has a name: even though it is bound to two names and invoked
through the name B, the created instance is still reported as an instance of
class A. However, it is impossible to say whether the instance's name is a or
b, since both names are bound to the same value.

Generally speaking it should not be necessary for your code to "know the names"
of particular values. Unless you are deliberately writing introspective
programs, this is usually an indication that a change of approach might be
beneficial.

In comp.lang.python, Fredrik Lundh once gave an excellent analogy in answer to
this question:

   The same way as you get the name of that cat you found on your porch: the cat
   (object) itself cannot tell you its name, and it doesn't really care -- so
   the only way to find out what it's called is to ask all your neighbours
   (namespaces) if it's their cat (object)...

   ....and don't be surprised if you'll find that it's known by many names, or
   no name at all!

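"Asking the neighbours" can be sketched as a search through one namespace for
every name bound to a given object. The helper ``names_of`` is hypothetical, and
the explicit dictionary ``ns`` stands in for a real namespace such as
``globals()`` or ``vars(module)``.

```python
def names_of(obj, namespace):
    """Return the sorted names in *namespace* that are bound to *obj*."""
    return sorted(name for name, value in namespace.items() if value is obj)

class A:
    pass

B = A
a = B()
b = a

# Stand-in for a real namespace such as globals().
ns = {'A': A, 'B': B, 'a': a, 'b': b}

print(names_of(a, ns))    # both names refer to the same instance
print(names_of(A, ns))    # the class is bound to two names as well
print(A.__name__)         # the name recorded when the class was defined
```

The search only covers the namespace it is given, so an object can still be
known elsewhere by other names, or by none.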

What's up with the comma operator's precedence?
-----------------------------------------------

Comma is not an operator in Python. Consider this session::

   >>> "a" in "b", "a"
   (False, 'a')

Since the comma is not an operator, but a separator between expressions, the
above is evaluated as if you had entered::

   >>> ("a" in "b"), "a"

not::

   >>> "a" in ("b", "a")

The same is true of the various assignment operators (``=``, ``+=`` etc). They
are not truly operators but syntactic delimiters in assignment statements.


Is there an equivalent of C's "?:" ternary operator?
----------------------------------------------------

Yes, this feature was added in Python 2.5. The syntax would be as follows::

   [on_true] if [expression] else [on_false]

   x, y = 50, 25

   small = x if x < y else y

For versions previous to 2.5 the answer would be 'No'.

.. XXX remove rest?

In many cases you can mimic ``a ? b : c`` with ``a and b or c``, but there's a
flaw: if *b* is zero (or empty, or ``None`` -- anything that tests false) then
*c* will be selected instead. In many cases you can prove by looking at the
code that this can't happen (e.g. because *b* is a constant or has a type that
can never be false), but in general this can be a problem.

Tim Peters (who wishes it was Steve Majewski) suggested the following solution:
``(a and [b] or [c])[0]``. Because ``[b]`` is a singleton list it is never
false, so the wrong path is never taken; then applying ``[0]`` to the whole
thing gets the *b* or *c* that you really wanted. Ugly, but it gets you there
in the rare cases where it is really inconvenient to rewrite your code using
'if'.

The best course is usually to write a simple ``if...else`` statement. Another
solution is to implement the ``?:`` operator as a function::

   from inspect import isfunction

   def q(cond, on_true, on_false):
       if cond:
           if not isfunction(on_true):
               return on_true
           else:
               return on_true()
       else:
           if not isfunction(on_false):
               return on_false
           else:
               return on_false()

In most cases you'll pass b and c directly: ``q(a, b, c)``. To avoid evaluating
b or c when they shouldn't be, encapsulate them within a lambda function, e.g.:
``q(a, lambda: b, lambda: c)``.

Before the conditional expression was added, it was often asked *why* Python
had no if-then-else expression. There were several answers: many languages do
just fine without one; it can easily lead to less readable code; no
sufficiently "Pythonic" syntax had been discovered; a search of the standard
library found remarkably few places where using an if-then-else expression
would make the code more understandable.

In 2003, :pep:`308` was written proposing several possible syntaxes and the
community was asked to vote on the issue. The vote was inconclusive. Most
people liked one of the syntaxes, but also hated other syntaxes; many votes
implied that people preferred no ternary operator rather than having a syntax
they hated.

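The flaw in ``a and b or c`` is easiest to see side by side with the
alternatives. The three helper functions below are hypothetical names that wrap
the idioms discussed above so they can be compared on the same inputs.

```python
def pick_and_or(a, b, c):
    # Old idiom: returns c whenever b is false, even if a is true.
    return a and b or c

def pick_list_trick(a, b, c):
    # Wrapping in one-element lists keeps the middle operand true.
    return (a and [b] or [c])[0]

def pick_conditional(a, b, c):
    # Conditional expression (Python 2.5 and later).
    return b if a else c

print(pick_and_or(True, 0, 'fallback'))        # wrong branch is selected
print(pick_list_trick(True, 0, 'fallback'))    # 0, as intended
print(pick_conditional(True, 0, 'fallback'))   # 0, as intended
```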

Is it possible to write obfuscated one-liners in Python?
--------------------------------------------------------

Yes. Usually this is done by nesting :keyword:`lambda` within
:keyword:`lambda`. See the following three examples, due to Ulf Bartelt::

   from functools import reduce

   # Primes < 1000
   print(list(filter(None,map(lambda y:y*reduce(lambda x,y:x*y!=0,
   map(lambda x,y=y:y%x,range(2,int(pow(y,0.5)+1))),1),range(2,1000)))))

   # First 10 Fibonacci numbers
   print(list(map(lambda x,f=lambda x,f:(f(x-1,f)+f(x-2,f)) if x>1 else 1:
   f(x,f), range(10))))

   # Mandelbrot set
   print((lambda Ru,Ro,Iu,Io,IM,Sx,Sy:reduce(lambda x,y:x+y,map(lambda y,
   Iu=Iu,Io=Io,Ru=Ru,Ro=Ro,Sy=Sy,L=lambda yc,Iu=Iu,Io=Io,Ru=Ru,Ro=Ro,i=IM,
   Sx=Sx,Sy=Sy:reduce(lambda x,y:x+y,map(lambda x,xc=Ru,yc=yc,Ru=Ru,Ro=Ro,
   i=i,Sx=Sx,F=lambda xc,yc,x,y,k,f=lambda xc,yc,x,y,k,f:(k<=0)or (x*x+y*y
   >=4.0) or 1+f(xc,yc,x*x-y*y+xc,2.0*x*y+yc,k-1,f):f(xc,yc,x,y,k,f):chr(
   64+F(Ru+x*(Ro-Ru)/Sx,yc,0,0,i)),range(Sx))):L(Iu+y*(Io-Iu)/Sy),range(Sy
   ))))(-2.1, 0.7, -1.2, 1.2, 30, 80, 24))
   #    \___ ___/  \___ ___/  |   |   |__ lines on screen
   #        V          V      |   |______ columns on screen
   #        |          |      |__________ maximum of "iterations"
   #        |          |_________________ range on y axis
   #        |____________________________ range on x axis

Don't try this at home, kids!


Numbers and strings
===================

How do I specify hexadecimal and octal integers?
------------------------------------------------

To specify an octal integer, precede the octal value with a zero, and then a
lower or uppercase "o". For example, to set the variable "a" to the octal value
"10" (8 in decimal), type::

   >>> a = 0o10
   >>> a
   8

Hexadecimal is just as easy. Simply precede the hexadecimal number with a zero,
and then a lower or uppercase "x". Hexadecimal digits can be specified in lower
or uppercase. For example, in the Python interpreter::

   >>> a = 0xa5
   >>> a
   165
   >>> b = 0XB2
   >>> b
   178


Why does -22 // 10 return -3?
-----------------------------

It's primarily driven by the desire that ``i % j`` have the same sign as ``j``.
If you want that, and also want::

   i == (i // j) * j + (i % j)

then integer division has to return the floor. C also requires that identity to
hold, and then compilers that truncate ``i // j`` need to make ``i % j`` have
the same sign as ``i``.

There are few real use cases for ``i % j`` when ``j`` is negative. When ``j``
is positive, there are many, and in virtually all of them it's more useful for
``i % j`` to be ``>= 0``. If the clock says 10 now, what did it say 200 hours
ago? ``-190 % 12 == 2`` is useful; ``-190 % 12 == -10`` is a bug waiting to
bite.

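Both properties can be checked directly. In the sketch below, ``check`` is a
hypothetical helper; :func:`divmod` and :func:`math.fmod` are standard, and
``math.fmod`` is shown only as a contrast because its result follows the sign
of the dividend, like truncating division.

```python
import math

def check(i, j):
    """Return (i // j, i % j) after verifying the two properties in the text."""
    q, r = divmod(i, j)
    assert i == q * j + r                    # the identity always holds
    assert r == 0 or (r > 0) == (j > 0)      # remainder has the sign of j
    return q, r

print(check(-22, 10))
print(check(22, -10))
print(check(-190, 12))     # the clock example

# For contrast, truncation toward zero gives a different quotient and a
# remainder with the sign of the dividend.
print(math.trunc(-22 / 10), math.fmod(-22, 10))
```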

How do I convert a string to a number?
--------------------------------------

For integers, use the built-in :func:`int` type constructor, e.g. ``int('144')
== 144``. Similarly, :func:`float` converts to floating-point,
e.g. ``float('144') == 144.0``.

By default, these interpret the number as decimal, so that ``int('0144') ==
144`` and ``int('0x144')`` raises :exc:`ValueError`. ``int(string, base)`` takes
the base to convert from as a second optional argument, so ``int('0x144', 16) ==
324``. If the base is specified as 0, the number is interpreted using Python's
rules: a leading '0o' indicates octal, and '0x' indicates a hex number.

Do not use the built-in function :func:`eval` if all you need is to convert
strings to numbers. :func:`eval` will be significantly slower and it presents a
security risk: someone could pass you a Python expression that might have
unwanted side effects. For example, someone could pass
``__import__('os').system("rm -rf $HOME")`` which would erase your home
directory.

:func:`eval` also has the effect of interpreting numbers as Python expressions,
so that e.g. ``eval('09')`` gives a syntax error because Python does not allow
leading '0' in a decimal number (except '0').

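The conversions described above can be tried in one place. This is only a
sketch; ``to_int`` is a hypothetical helper showing one way to handle invalid
input without reaching for :func:`eval`.

```python
print(int('144'))          # decimal by default
print(int('0144'))         # leading zeros are accepted in decimal strings
print(int('0x144', 16))    # the prefix is allowed when it matches the base
print(int('144', 8))       # the same digits read as octal
print(int('0x144', 0))     # base 0: infer the base from the prefix
print(int('0o144', 0))
print(float('144'))

def to_int(text, default=None):
    """Hypothetical helper: return *default* instead of raising ValueError."""
    try:
        return int(text)
    except ValueError:
        return default

print(to_int('0x144'))     # not a decimal string, so the default is returned
```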

How do I convert a number to a string?
--------------------------------------

To convert, e.g., the number 144 to the string '144', use the built-in type
constructor :func:`str`. If you want a hexadecimal or octal representation, use
the built-in functions :func:`hex` or :func:`oct`. For fancy formatting, see
the :ref:`string-formatting` section, e.g. ``"{:04d}".format(144)`` yields
``'0144'`` and ``"{:.3f}".format(1/3)`` yields ``'0.333'``.


How do I modify a string in place?
----------------------------------

You can't, because strings are immutable. If you need an object with this
ability, try converting the string to a list or use the array module::

   >>> s = "Hello, world"
   >>> a = list(s)
   >>> print(a)
   ['H', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd']
   >>> a[7:] = list("there!")
   >>> ''.join(a)
   'Hello, there!'

   >>> import array
   >>> a = array.array('u', s)
   >>> print(a)
   array('u', 'Hello, world')
   >>> a[0] = 'y'
   >>> print(a)
   array('u', 'yello, world')
   >>> a.tounicode()
   'yello, world'


How do I use strings to call functions/methods?
-----------------------------------------------

There are various techniques.

* The best is to use a dictionary that maps strings to functions. The primary
  advantage of this technique is that the strings do not need to match the names
  of the functions. This is also the primary technique used to emulate a case
  construct::

     def a():
         pass

     def b():
         pass

     dispatch = {'go': a, 'stop': b}  # Note lack of parens for funcs

     dispatch[get_input()]()  # Note trailing parens to call function

* Use the built-in function :func:`getattr`::

     import foo
     getattr(foo, 'bar')()

  Note that :func:`getattr` works on any object, including classes, class
  instances, modules, and so on.

  This is used in several places in the standard library, like this::

     class Foo:
         def do_foo(self):
             ...

         def do_bar(self):
             ...

     f = getattr(foo_instance, 'do_' + opname)
     f()


* Use :func:`locals` or :func:`eval` to resolve the function name::

     def myFunc():
         print("hello")

     fname = "myFunc"

     f = locals()[fname]
     f()

     f = eval(fname)
     f()

  Note: Using :func:`eval` is slow and dangerous. If you don't have absolute
  control over the contents of the string, someone could pass a string that
  resulted in an arbitrary function being executed.

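The first two techniques can be sketched with explicit handling of unknown
names, which the snippets above leave out. All names here (``DISPATCH``,
``run``, ``Robot``, ``run_method``) are hypothetical and chosen only for
illustration.

```python
def go():
    return 'going'

def stop():
    return 'stopped'

# Map command strings to functions; the strings need not match the names.
DISPATCH = {'go': go, 'stop': stop}

def run(command):
    """Look up *command* and call it; unknown commands raise ValueError."""
    try:
        handler = DISPATCH[command]
    except KeyError:
        raise ValueError('unknown command: {!r}'.format(command))
    return handler()

class Robot:
    def do_left(self):
        return 'turned left'

def run_method(obj, opname):
    # getattr() with a default avoids AttributeError for unknown names.
    method = getattr(obj, 'do_' + opname, None)
    if method is None:
        raise ValueError('unknown operation: {!r}'.format(opname))
    return method()

print(run('go'))
print(run_method(Robot(), 'left'))
```
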

Is there an equivalent to Perl's chomp() for removing trailing newlines from strings?
-------------------------------------------------------------------------------------

Starting with Python 2.2, you can use ``S.rstrip("\r\n")`` to remove all
occurrences of any line terminator from the end of the string ``S`` without
removing other trailing whitespace. If the string ``S`` represents more than
one line, with several empty lines at the end, the line terminators for all the
blank lines will be removed::

   >>> lines = ("line 1 \r\n"
   ...          "\r\n"
   ...          "\r\n")
   >>> lines.rstrip("\n\r")
   'line 1 '

Since this is typically only desired when reading text one line at a time, using
``S.rstrip()`` this way works well.

For older versions of Python, there are two partial substitutes:

- If you want to remove all trailing whitespace, use the ``rstrip()`` method of
  string objects. This removes all trailing whitespace, not just a single
  newline.

- Otherwise, if there is only one line in the string ``S``, use
  ``S.splitlines()[0]``.

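If only a single terminator should go, ``rstrip()`` removes too much. A minimal
sketch of a stricter helper follows; the name ``chomp`` is hypothetical and not
part of the standard library.

```python
def chomp(s):
    """Remove at most one trailing line terminator from *s*."""
    # Test the two-character terminator first so '\r\n' counts as one.
    for terminator in ('\r\n', '\n', '\r'):
        if s.endswith(terminator):
            return s[:-len(terminator)]
    return s

text = 'line 1 \r\n\r\n'
print(repr(chomp(text)))             # one terminator removed
print(repr(text.rstrip('\r\n')))     # all trailing terminators removed
```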

Is there a scanf() or sscanf() equivalent?
------------------------------------------

Not as such.

For simple input parsing, the easiest approach is usually to split the line into
whitespace-delimited words using the :meth:`~str.split` method of string objects
and then convert decimal strings to numeric values using :func:`int` or
:func:`float`. ``split()`` supports an optional "sep" parameter which is useful
if the line uses something other than whitespace as a separator.

For more complicated input parsing, regular expressions are more powerful
than C's :c:func:`sscanf` and better suited for the task.

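Both approaches are sketched below on a made-up input line. The pattern is an
assumption about that line's layout and plays roughly the role of a format
string such as ``"temp %f %c at %d:%d"``.

```python
import re

line = "temp 21.5 C at 14:07"      # made-up input line

# split() approach: whitespace-delimited fields converted by hand.
fields = line.split()
value = float(fields[1])

# Regular-expression approach: one group per field to extract.
pattern = re.compile(r'temp\s+([-+]?\d+(?:\.\d*)?)\s+(\w)\s+at\s+(\d+):(\d+)')
match = pattern.match(line)
if match:
    temp = float(match.group(1))
    unit = match.group(2)
    hour, minute = int(match.group(3)), int(match.group(4))
    print(temp, unit, hour, minute)
```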

What does 'UnicodeDecodeError' or 'UnicodeEncodeError' error mean?
-------------------------------------------------------------------

See the :ref:`unicode-howto`.


Sequences (Tuples/Lists)
========================

How do I convert between tuples and lists?
------------------------------------------

The type constructor ``tuple(seq)`` converts any sequence (actually, any
iterable) into a tuple with the same items in the same order.

For example, ``tuple([1, 2, 3])`` yields ``(1, 2, 3)`` and ``tuple('abc')``
yields ``('a', 'b', 'c')``. If the argument is a tuple, it does not make a copy
but returns the same object, so it is cheap to call :func:`tuple` when you
aren't sure that an object is already a tuple.

The type constructor ``list(seq)`` converts any sequence or iterable into a list
with the same items in the same order. For example, ``list((1, 2, 3))`` yields
``[1, 2, 3]`` and ``list('abc')`` yields ``['a', 'b', 'c']``. If the argument
is a list, it makes a copy just like ``seq[:]`` would.


What's a negative index?
------------------------

Python sequences are indexed with positive numbers and negative numbers. For
positive numbers, 0 is the first index, 1 is the second index, and so forth. For
negative indices, -1 is the last index, -2 is the penultimate (next to last)
index, and so forth. Think of ``seq[-n]`` as the same as ``seq[len(seq)-n]``.

Using negative indices can be very convenient. For example, ``S[:-1]`` is all of
the string except for its last character, which is useful for removing the
trailing newline from a string.


How do I iterate over a sequence in reverse order?
--------------------------------------------------

Use the :func:`reversed` built-in function, which is new in Python 2.4::

   for x in reversed(sequence):
       ... # do something with x...

This won't touch your original sequence. It returns an iterator that walks the
sequence in reverse order, so no reversed copy is built.

With Python 2.3, you can use an extended slice syntax, which does build a
reversed copy::

   for x in sequence[::-1]:
       ... # do something with x...


How do you remove duplicates from a list?
-----------------------------------------

See the Python Cookbook for a long discussion of many ways to do this:

   http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52560

If you don't mind reordering the list, sort it and then scan from the end of the
list, deleting duplicates as you go::

   if mylist:
       mylist.sort()
       last = mylist[-1]
       for i in range(len(mylist)-2, -1, -1):
           if last == mylist[i]:
               del mylist[i]
           else:
               last = mylist[i]

If all elements of the list may be used as dictionary keys (i.e. they are all
hashable) this is often faster ::

   d = {}
   for x in mylist:
       d[x] = 1
   mylist = list(d.keys())

With the built-in ``set`` type (Python 2.4 and later), the following is possible
instead::

   mylist = list(set(mylist))

This converts the list into a set, thereby removing duplicates, and then back
into a list.

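None of the approaches above keeps the original order of the items. If order
matters and the items are hashable, a set of already-seen items can be combined
with a result list. The function name ``unique_in_order`` is hypothetical.

```python
def unique_in_order(items):
    """Return a new list with duplicates removed, keeping first occurrences."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

mylist = [3, 1, 3, 2, 1]
print(unique_in_order(mylist))   # first occurrences, original order
print(sorted(set(mylist)))       # set() drops the order; sorted() imposes one
```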

How do you make an array in Python?
-----------------------------------

Use a list::

   ["this", 1, "is", "an", "array"]

Lists are equivalent to C or Pascal arrays in their time complexity; the primary
difference is that a Python list can contain objects of many different types.

The ``array`` module also provides methods for creating arrays of fixed types
with compact representations, but they are slower to index than lists. Also
note that the Numeric extensions and others define array-like structures with
various characteristics as well.

To get Lisp-style linked lists, you can emulate cons cells using tuples::

   lisp_list = ("like", ("this", ("example", None) ) )

If mutability is desired, you could use lists instead of tuples. Here the
analogue of lisp car is ``lisp_list[0]`` and the analogue of cdr is
``lisp_list[1]``. Only do this if you're sure you really need to, because it's
usually a lot slower than using Python lists.


How do I create a multidimensional list?
----------------------------------------

You probably tried to make a multidimensional array like this::

   A = [[None] * 2] * 3

This looks correct if you print it::

   >>> A
   [[None, None], [None, None], [None, None]]

But when you assign a value, it shows up in multiple places::

   >>> A[0][0] = 5
   >>> A
   [[5, None], [5, None], [5, None]]

The reason is that replicating a list with ``*`` doesn't create copies, it only
creates references to the existing objects. The ``*3`` creates a list
containing 3 references to the same list of length two. Changes to one row will
show in all rows, which is almost certainly not what you want.

The suggested approach is to create a list of the desired length first and then
fill in each element with a newly created list::

   A = [None] * 3
   for i in range(3):
       A[i] = [None] * 2

This generates a list containing 3 different lists of length two. You can also
use a list comprehension::

   w, h = 2, 3
   A = [[None] * w for i in range(h)]

Or, you can use an extension that provides a matrix datatype; `Numeric Python
<http://numpy.scipy.org/>`_ is the best known.

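The sharing can be confirmed with the ``is`` operator. The variable names
``shared`` and ``separate`` in this sketch are made up for illustration.

```python
w, h = 2, 3

shared = [[None] * w] * h                    # h references to ONE row
separate = [[None] * w for _ in range(h)]    # h distinct rows

print(shared[0] is shared[1])       # the rows are the same object
print(separate[0] is separate[1])   # the rows are different objects

shared[0][0] = 5
separate[0][0] = 5
print(shared)       # the assignment shows up in every row
print(separate)     # only the first row changed
```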

How do I apply a method to a sequence of objects?
-------------------------------------------------

Use a list comprehension::

   result = [obj.method() for obj in mylist]


Dictionaries
============

How can I get a dictionary to display its keys in a consistent order?
---------------------------------------------------------------------

You can't. Dictionaries store their keys in an unpredictable order, so the
display order of a dictionary's elements will be similarly unpredictable.

This can be frustrating if you want to save a printable version to a file, make
some changes and then compare it with some other printed dictionary. In this
case, use the ``pprint`` module to pretty-print the dictionary; the items will
be presented in order sorted by the key.

A more complicated solution is to subclass ``dict`` to create a
``SortedDict`` class that prints itself in a predictable order. Here's one
simpleminded implementation of such a class::

   class SortedDict(dict):
       def __repr__(self):
           keys = sorted(self.keys())
           result = ("{!r}: {!r}".format(k, self[k]) for k in keys)
           return "{{{}}}".format(", ".join(result))

       __str__ = __repr__

This will work for many common situations you might encounter, though it's far
from a perfect solution. The largest flaw is that if some values in the
dictionary are also dictionaries, their values won't be presented in any
particular order.


I want to do a complicated sort: can you do a Schwartzian Transform in Python?
------------------------------------------------------------------------------

The technique, attributed to Randal Schwartz of the Perl community, sorts the
elements of a list by a metric which maps each element to its "sort value". In
Python, just use the ``key`` argument for the ``sort()`` method::

   Isorted = L[:]
   Isorted.sort(key=lambda s: int(s[10:15]))

The ``key`` argument is new in Python 2.4. For older versions, this kind of
sorting is quite simple to do with list comprehensions. To sort a list of
strings by their uppercase values::

   tmp1 = [(x.upper(), x) for x in L]  # Schwartzian transform
   tmp1.sort()
   Usorted = [x[1] for x in tmp1]

To sort by the integer value of a subfield extending from positions 10-15 in
each string::

   tmp2 = [(int(s[10:15]), s) for s in L]  # Schwartzian transform
   tmp2.sort()
   Isorted = [x[1] for x in tmp2]

For versions prior to 3.0, Isorted may also be computed by ::

   def intfield(s):
       return int(s[10:15])

   def Icmp(s1, s2):
       return cmp(intfield(s1), intfield(s2))

   Isorted = L[:]
   Isorted.sort(Icmp)

but since this method calls ``intfield()`` many times for each element of L, it
is slower than the Schwartzian Transform.

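The ``key`` form and the explicit decorate-sort-undecorate form can be compared
on sample data. The records in ``L`` are made up, with a zero-padded integer in
positions 10-14. Adding the index to each decorated tuple is an addition to the
recipe above: it breaks ties so that the strings themselves are never compared.

```python
# Made-up records: characters 10-14 hold a zero-padded integer field.
L = ["item-aaa  00042 x", "item-bbb  00007 y", "item-ccc  00100 z"]

def intfield(s):
    return int(s[10:15])

# key= computes the sort value once per element.
by_key = sorted(L, key=intfield)

# Explicit decorate-sort-undecorate; the index i breaks ties.
decorated = [(intfield(s), i, s) for i, s in enumerate(L)]
decorated.sort()
by_dsu = [s for _, _, s in decorated]

print(by_key)
print(by_key == by_dsu)
```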

How can I sort one list by values from another list?
----------------------------------------------------

Merge them into an iterator of tuples, sort the resulting list, and then pick
out the element you want. ::

   >>> list1 = ["what", "I'm", "sorting", "by"]
   >>> list2 = ["something", "else", "to", "sort"]
   >>> pairs = zip(list1, list2)
   >>> pairs = sorted(pairs)
   >>> pairs
   [("I'm", 'else'), ('by', 'sort'), ('sorting', 'to'), ('what', 'something')]
   >>> result = [x[1] for x in pairs]
   >>> result
   ['else', 'sort', 'to', 'something']


An alternative for the last step is::

   >>> result = []
   >>> for p in pairs: result.append(p[1])

If you find this more legible, you might prefer to use this instead of the final
list comprehension. However, it is almost twice as slow for long lists. Why?
First, the ``append()`` operation has to reallocate memory, and while it uses
some tricks to avoid doing that each time, it still has to do it occasionally,
and that costs quite a bit. Second, the expression ``result.append`` requires an
extra attribute lookup, and third, there's a speed reduction from having to make
all those function calls.

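A variation worth knowing, shown here only as a sketch: passing a ``key``
function makes the sort look at the first item of each pair only, so the items
of the second list are never compared with each other, even when two sort
values are equal.

```python
list1 = ["what", "I'm", "sorting", "by"]
list2 = ["something", "else", "to", "sort"]

# Sort on the first item of each pair; ties keep their original order.
pairs = sorted(zip(list1, list2), key=lambda pair: pair[0])
result = [value for _, value in pairs]

print(result)
```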

Objects
=======

What is a class?
----------------

A class is the particular object type created by executing a class statement.
Class objects are used as templates to create instance objects, which embody
both the data (attributes) and code (methods) specific to a datatype.

A class can be based on one or more other classes, called its base class(es). It
then inherits the attributes and methods of its base classes. This allows an
object model to be successively refined by inheritance. You might have a
generic ``Mailbox`` class that provides basic accessor methods for a mailbox,
and subclasses such as ``MboxMailbox``, ``MaildirMailbox``, ``OutlookMailbox``
that handle various specific mailbox formats.


What is a method?
-----------------

A method is a function on some object ``x`` that you normally call as
``x.name(arguments...)``. Methods are defined as functions inside the class
definition::

   class C:
       def meth(self, arg):
           return arg * 2 + self.attribute


What is self?
-------------

Self is merely a conventional name for the first argument of a method. A method
defined as ``meth(self, a, b, c)`` should be called as ``x.meth(a, b, c)`` for
some instance ``x`` of the class in which the definition occurs; the called
method will think it is called as ``meth(x, a, b, c)``.

See also :ref:`why-self`.

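The equivalence between the two call forms can be checked directly. This short
sketch reuses the class ``C`` from the previous answer and adds an ``__init__``
(an addition for illustration) so that ``self.attribute`` exists.

```python
class C:
    def __init__(self, attribute):
        self.attribute = attribute

    def meth(self, arg):
        return arg * 2 + self.attribute


x = C(10)

bound_result = x.meth(3)        # x is passed as self automatically
explicit_result = C.meth(x, 3)  # the same call, with self spelled out

print(bound_result, explicit_result)
```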

How do I check if an object is an instance of a given class or of a subclass of it?
-----------------------------------------------------------------------------------

Use the built-in function ``isinstance(obj, cls)``. You can check if an object
is an instance of any of a number of classes by providing a tuple instead of a
single class, e.g. ``isinstance(obj, (class1, class2, ...))``, and can also
check whether an object is one of Python's built-in types, e.g.
``isinstance(obj, str)`` or ``isinstance(obj, (int, float, complex))``.

Note that most programs do not use :func:`isinstance` on user-defined classes
very often. If you are developing the classes yourself, a more proper
object-oriented style is to define methods on the classes that encapsulate a
particular behaviour, instead of checking the object's class and doing a
different thing based on what class it is. For example, if you have a function
that does something::

   def search(obj):
       if isinstance(obj, Mailbox):
           # ... code to search a mailbox
       elif isinstance(obj, Document):
           # ... code to search a document
       elif ...

A better approach is to define a ``search()`` method on all the classes and just
call it::

   class Mailbox:
       def search(self):
           # ... code to search a mailbox

   class Document:
       def search(self):
           # ... code to search a document

   obj.search()

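A runnable version of the second approach might look like the sketch below.
The constructor arguments, the ``term`` parameter and the ``search_all()``
helper are assumptions made for the example; the point is only that the caller
never inspects the class of ``obj``.

```python
class Mailbox:
    def __init__(self, messages):
        self.messages = messages

    def search(self, term):
        # Mailbox-specific search: one hit per matching message.
        return [m for m in self.messages if term in m]


class Document:
    def __init__(self, text):
        self.text = text

    def search(self, term):
        # Document-specific search: one hit per matching line.
        return [line for line in self.text.splitlines() if term in line]


def search_all(objects, term):
    """Collect hits from any objects that provide a search() method."""
    hits = []
    for obj in objects:
        hits.extend(obj.search(term))
    return hits


hits = search_all(
    [Mailbox(["spam offer", "lunch?"]), Document("eggs\nspam and eggs")],
    "spam",
)
print(hits)
```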

What is delegation?
-------------------

Delegation is an object-oriented technique (also called a design pattern).
Let's say you have an object ``x`` and want to change the behaviour of just one
of its methods. You can create a new class that provides a new implementation
of the method you're interested in changing and delegates all other methods to
the corresponding method of ``x``.

Python programmers can easily implement delegation. For example, the following
class behaves like a file but converts all written data to uppercase::

   class UpperOut:

       def __init__(self, outfile):
           self._outfile = outfile

       def write(self, s):
           self._outfile.write(s.upper())

       def __getattr__(self, name):
           return getattr(self._outfile, name)

Here the ``UpperOut`` class redefines the ``write()`` method to convert the
argument string to uppercase before calling the underlying
``self._outfile.write()`` method. All other methods are delegated to the
underlying ``self._outfile`` object. The delegation is accomplished via the
``__getattr__`` method; consult :ref:`the language reference <attribute-access>`
for more information about controlling attribute access.

Note that for more general cases delegation can get trickier. When attributes
must be set as well as retrieved, the class must define a :meth:`__setattr__`
method too, and it must do so carefully. The basic implementation of
:meth:`__setattr__` is roughly equivalent to the following::

   class X:
       ...
       def __setattr__(self, name, value):
           self.__dict__[name] = value
       ...

Most :meth:`__setattr__` implementations must modify ``self.__dict__`` to store
local state for self without causing an infinite recursion.

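One possible way to combine the two pieces is sketched below: the wrapper keeps
its own state in ``self.__dict__`` and forwards every other attribute
assignment to the wrapped object. ``MemoryFile`` is a hypothetical stand-in for
a real file object, used only to keep the example self-contained.

```python
class MemoryFile:
    """Hypothetical file-like target that records what is written to it."""

    def __init__(self):
        self.chunks = []
        self.encoding = "ascii"

    def write(self, s):
        self.chunks.append(s)


class UpperOut:
    def __init__(self, outfile):
        # Store local state directly in __dict__. A plain
        # "self._outfile = outfile" would go through __setattr__ below
        # before _outfile exists and recurse without end.
        self.__dict__["_outfile"] = outfile

    def write(self, s):
        self._outfile.write(s.upper())

    def __getattr__(self, name):
        # Called only when normal lookup fails, so _outfile is found first.
        return getattr(self._outfile, name)

    def __setattr__(self, name, value):
        # Forward all other assignments to the wrapped object.
        setattr(self._outfile, name, value)


target = MemoryFile()
out = UpperOut(target)
out.write("hello")
out.encoding = "utf-8"  # lands on target, not on the wrapper

print(target.chunks, target.encoding, out.encoding)
```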

How do I call a method defined in a base class from a derived class that overrides it?
--------------------------------------------------------------------------------------

Use the built-in :func:`super` function::

   class Derived(Base):
       def meth(self):
           super(Derived, self).meth()

For versions prior to 3.0, you may be using classic classes: For a class
definition such as ``class Derived(Base): ...`` you can call method ``meth()``
defined in ``Base`` (or one of ``Base``'s base classes) as ``Base.meth(self,
arguments...)``. Here, ``Base.meth`` is an unbound method, so you need to
provide the ``self`` argument.

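A complete, minimal example follows. It uses the zero-argument form
``super()``, which Python 3 accepts inside a method as shorthand for
``super(Derived, self)``; the return values are invented so that the call
order is visible.

```python
class Base:
    def meth(self):
        return ["Base.meth"]


class Derived(Base):
    def meth(self):
        # Extend the inherited behaviour instead of replacing it.
        calls = super().meth()  # same as super(Derived, self).meth()
        calls.append("Derived.meth")
        return calls


calls = Derived().meth()
print(calls)
```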

How can I organize my code to make it easier to change the base class?
----------------------------------------------------------------------

You could define an alias for the base class, assign the real base class to it
before your class definition, and use the alias throughout your class. Then all
you have to change is the value assigned to the alias. Incidentally, this trick
is also handy if you want to decide dynamically (e.g. depending on availability
of resources) which base class to use. Example::

   BaseAlias = <real base class>

   class Derived(BaseAlias):
       def meth(self):
           BaseAlias.meth(self)
           ...

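The dynamic case could be sketched as follows. ``FastBase``, ``PortableBase``
and the ``USE_FAST`` switch are hypothetical names; in real code the switch
would typically be set after probing for some resource.

```python
class FastBase:
    def meth(self):
        return "fast"


class PortableBase:
    def meth(self):
        return "portable"


USE_FAST = False  # hypothetical switch, decided before the class is defined

# The only line that has to change when the base class changes.
BaseAlias = FastBase if USE_FAST else PortableBase


class Derived(BaseAlias):
    def meth(self):
        return "derived+" + BaseAlias.meth(self)


print(Derived().meth())
```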

How do I create static class data and static class methods?
-----------------------------------------------------------

Both static data and static methods (in the sense of C++ or Java) are supported
in Python.

For static data, simply define a class attribute. To assign a new value to the
attribute, you have to explicitly use the class name in the assignment::

   class C:
       count = 0   # number of times C.__init__ called

       def __init__(self):
           C.count = C.count + 1

       def getcount(self):
           return C.count  # or return self.count

``c.count`` also refers to ``C.count`` for any ``c`` such that ``isinstance(c,
C)`` holds, unless overridden by ``c`` itself or by some class on the base-class
search path from ``c.__class__`` back to ``C``.

Caution: within a method of C, an assignment like ``self.count = 42`` creates a
new and unrelated instance attribute named "count" in ``self``'s own dict.
Rebinding of a class-static data name must always specify the class whether
inside a method or not::

   C.count = 314

Static methods are possible since Python 2.2::

   class C:
       def static(arg1, arg2, arg3):
           # No 'self' parameter!
           ...
       static = staticmethod(static)

With Python 2.4's decorators, this can also be written as ::

   class C:
       @staticmethod
       def static(arg1, arg2, arg3):
           # No 'self' parameter!
           ...

However, a far more straightforward way to get the effect of a static method is
via a simple module-level function::

   def getcount():
       return C.count

If your code is structured so as to define one class (or tightly related class
hierarchy) per module, this supplies the desired encapsulation.

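The caution about ``self.count`` is easy to see in a short session. The sketch
below also calls a static method through the class; the ``describe()`` method
is an addition for illustration only.

```python
class C:
    count = 0  # shared by all instances

    def __init__(self):
        C.count += 1  # rebinding goes through the class name

    @staticmethod
    def describe(n):
        # No 'self' parameter: callable as C.describe(...) or c.describe(...)
        return "%d instance(s)" % n


a = C()
b = C()
shared = (C.count, a.count, b.count)  # every lookup finds the class attribute

a.count = 42  # creates an instance attribute on a only
after = (C.count, a.count, b.count)

print(shared, after, C.describe(C.count))
```

After the assignment, ``a.count`` hides the class attribute for ``a`` alone,
while ``C.count`` and ``b.count`` are unchanged.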

How can I overload constructors (or methods) in Python?
-------------------------------------------------------

This answer actually applies to all methods, but the question usually comes up
first in the context of constructors.

In C++ you'd write

.. code-block:: c++

    class C {
        C() { cout << "No arguments\n"; }
        C(int i) { cout << "Argument is " << i << "\n"; }
    };

In Python you have to write a single constructor that catches all cases using
default arguments. For example::

   class C:
       def __init__(self, i=None):
           if i is None:
               print("No arguments")
           else:
               print("Argument is", i)

This is not entirely equivalent, but close enough in practice.

You could also try a variable-length argument list, e.g. ::

   def __init__(self, *args):
       ...

The same approach works for all method definitions.

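A sketch of the variable-length form is shown below. The ``description``
attribute is invented for the example so that the chosen branch can be
inspected. Unlike the ``i=None`` version, this form can tell ``C()`` apart from
``C(None)``, which is one of the ways the default-argument version is not
entirely equivalent to the C++ code.

```python
class C:
    def __init__(self, *args):
        # Dispatch on the number of positional arguments received.
        if len(args) == 0:
            self.description = "No arguments"
        elif len(args) == 1:
            self.description = "Argument is %r" % (args[0],)
        else:
            raise TypeError(
                "C() takes at most 1 argument (%d given)" % len(args)
            )


print(C().description)
print(C(5).description)
print(C(None).description)
```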

I try to use __spam and I get an error about _SomeClassName__spam.
------------------------------------------------------------------

Variable names with double leading underscores are "mangled" to provide a simple
but effective way to define class private variables. Any identifier of the form
``__spam`` (at least two leading underscores, at most one trailing underscore)
is textually replaced with ``_classname__spam``, where ``classname`` is the
current class name with any leading underscores stripped.

This doesn't guarantee privacy: an outside user can still deliberately access
the ``_classname__spam`` attribute, and private values are visible in the
object's ``__dict__``. Many Python programmers never bother to use private
variable names at all.

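The mangling can be observed directly. In this small sketch the class name
``SomeClassName`` is taken from the question title; everything else is made up
for the demonstration.

```python
class SomeClassName:
    def __init__(self):
        self.__spam = 1  # stored as _SomeClassName__spam

    def spam(self):
        return self.__spam  # mangled the same way, so this works


obj = SomeClassName()

print(obj.spam())
print(obj._SomeClassName__spam)  # deliberate access from outside still works
print(sorted(vars(obj)))         # the mangled name is visible in __dict__

try:
    obj.__spam  # outside a class body no mangling is applied
except AttributeError as exc:
    print("AttributeError:", exc)
```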

My class defines __del__ but it is not called when I delete the object.
-----------------------------------------------------------------------

There are several possible reasons for this.

The ``del`` statement does not necessarily call :meth:`__del__` -- it simply
decrements the object's reference count, and if this reaches zero
:meth:`__del__` is called.

If your data structures contain circular links (e.g. a tree where each child has
a parent reference and each parent has a list of children) the reference counts
will never go back to zero. Once in a while Python runs an algorithm to detect
such cycles, but the garbage collector might run some time after the last
reference to your data structure vanishes, so your :meth:`__del__` method may be
called at an inconvenient and random time. This is inconvenient if you're trying
to reproduce a problem. Worse, the order in which objects' :meth:`__del__`
methods are executed is arbitrary. You can run :func:`gc.collect` to force a
collection, but there *are* pathological cases where objects will never be
collected.

Despite the cycle collector, it's still a good idea to define an explicit
``close()`` method on objects to be called whenever you're done with them. The
``close()`` method can then remove attributes that refer to subobjects. Don't
call :meth:`__del__` directly -- :meth:`__del__` should call ``close()`` and
``close()`` should make sure that it can be called more than once for the same
object.

Another way to avoid cyclical references is to use the :mod:`weakref` module,
which allows you to point to objects without incrementing their reference count.
Tree data structures, for instance, should use weak references for their parent
and sibling references (if they need them!).

.. XXX relevant for Python 3?

   If the object has ever been a local variable in a function that caught an
   exception in an except clause, chances are that a reference to the object
   still exists in that function's stack frame as contained in the stack trace.
   Normally, calling :func:`sys.exc_clear` will take care of this by clearing
   the last recorded exception.

Finally, if your :meth:`__del__` method raises an exception, a warning message
is printed to :data:`sys.stderr`.

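The two recommendations above, an explicit ``close()`` that is safe to call
repeatedly and weak references for parent links, might be combined as in the
following sketch. The ``Node`` class and its attributes are hypothetical.

```python
import weakref


class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self.closed = False
        # Weak reference to the parent, so a child never keeps it alive.
        self._parent = weakref.ref(parent) if parent is not None else None
        if parent is not None:
            parent.children.append(self)

    @property
    def parent(self):
        # Returns None if there is no parent or it has been collected.
        return self._parent() if self._parent is not None else None

    def close(self):
        if self.closed:  # safe to call more than once
            return
        self.closed = True
        for child in self.children:
            child.close()
        self.children = []  # drop references to subobjects

    def __del__(self):
        self.close()


root = Node("root")
leaf = Node("leaf", root)

print(leaf.parent is root)
root.close()
root.close()  # the second call is a no-op
print(root.closed, leaf.closed, root.children)
```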

How do I get a list of all instances of a given class?
------------------------------------------------------

Python does not keep track of all instances of a class (or of a built-in type).
You can program the class's constructor to keep track of all instances by
keeping a list of weak references to each instance.

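A minimal sketch of that idea follows. The class name ``Tracked`` and the
``instances()`` helper are assumptions for the example. Because only weak
references are stored, the registry does not keep any instance alive.

```python
import gc
import weakref


class Tracked:
    _instances = []  # weak references to every instance created so far

    def __init__(self, name):
        self.name = name
        Tracked._instances.append(weakref.ref(self))

    @classmethod
    def instances(cls):
        """Return the live instances, pruning references that have died."""
        live = []
        alive_refs = []
        for ref in cls._instances:
            obj = ref()
            if obj is not None:
                live.append(obj)
                alive_refs.append(ref)
        cls._instances[:] = alive_refs
        return live


a = Tracked("a")
b = Tracked("b")
names_before = [t.name for t in Tracked.instances()]

del b
gc.collect()  # not needed on CPython, but harmless elsewhere
names_after = [t.name for t in Tracked.instances()]

print(names_before, names_after)
```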

Modules
=======

How do I create a .pyc file?
----------------------------

When a module is imported for the first time (or when the source is more recent
than the current compiled file) a ``.pyc`` file containing the compiled code
should be created in the same directory as the ``.py`` file.

One reason that a ``.pyc`` file may not be created is permissions problems with
the directory. This can happen, for example, if you develop as one user but run
as another, such as if you are testing with a web server. Creation of a ``.pyc``
file is automatic if you're importing a module and Python has the ability
(permissions, free space, etc...) to write the compiled module back to the
directory.

Running Python on a top level script is not considered an import and no ``.pyc``
will be created. For example, if you have a top-level module ``abc.py`` that
imports another module ``xyz.py``, when you run ``abc``, ``xyz.pyc`` will be
created since ``xyz`` is imported, but no ``abc.pyc`` file will be created since
``abc.py`` isn't being imported.

If you need to create ``abc.pyc`` -- that is, to create a ``.pyc`` file for a
module that is not imported -- you can, using the :mod:`py_compile` and
:mod:`compileall` modules.

The :mod:`py_compile` module can manually compile any module. One way is to use
the ``compile()`` function in that module interactively::

   >>> import py_compile
   >>> py_compile.compile('abc.py')

This will write the ``.pyc`` to the same location as ``abc.py`` (or you can
override that with the optional parameter ``cfile``).

You can also automatically compile all files in a directory or directories using
the :mod:`compileall` module. You can do it from the shell prompt by running
``compileall.py`` and providing the path of a directory containing Python files
to compile::

   python -m compileall .

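The ``cfile`` parameter can be used to choose the output path explicitly, as
in the sketch below. The temporary directory and file contents are made up for
the example. Note that on Python 3.2 and later the default output location is a
``__pycache__`` subdirectory rather than the directory of the source file, so
passing ``cfile`` is one way to get a predictable path.

```python
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    # Create a throwaway module to compile.
    source = os.path.join(workdir, "abc.py")
    with open(source, "w") as f:
        f.write("print('hello from abc')\n")

    # Ask for the compiled file next to the source, and raise on errors.
    target = os.path.join(workdir, "abc.pyc")
    py_compile.compile(source, cfile=target, doraise=True)

    compiled_exists = os.path.exists(target)

print(compiled_exists)
```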

How do I find the current module name?
--------------------------------------

A module can find out its own module name by looking at the predefined global
variable ``__name__``. If this has the value ``'__main__'``, the program is
running as a script. Many modules that are usually used by importing them also
provide a command-line interface or a self-test, and only execute this code
after checking ``__name__``::

   def main():
       print('Running test...')
       ...

   if __name__ == '__main__':
       main()


How can I have modules that mutually import each other?
-------------------------------------------------------

Suppose you have the following modules:

foo.py::

   from bar import bar_var
   foo_var = 1

bar.py::

   from foo import foo_var
   bar_var = 2

The problem is that the interpreter will perform the following steps:

* main imports foo
* Empty globals for foo are created
* foo is compiled and starts executing
* foo imports bar
* Empty globals for bar are created
* bar is compiled and starts executing
* bar imports foo (which is a no-op since there already is a module named foo)
* bar.foo_var = foo.foo_var

The last step fails, because Python isn't done with interpreting ``foo`` yet and
the global symbol dictionary for ``foo`` is still empty.

The same thing happens when you use ``import foo``, and then try to access
``foo.foo_var`` in global code.

There are (at least) three possible workarounds for this problem.

Guido van Rossum recommends avoiding all uses of ``from <module> import ...``,
and placing all code inside functions. Initializations of global variables and
class variables should use constants or built-in functions only. This means
everything from an imported module is referenced as ``<module>.<name>``.

Jim Roskind suggests performing steps in the following order in each module:

* exports (globals, functions, and classes that don't need imported base
  classes)
* ``import`` statements
* active code (including globals that are initialized from imported values).

Van Rossum doesn't like this approach much because the imports appear in a
strange place, but it does work.

Matthias Urlichs recommends restructuring your code so that the recursive import
is not necessary in the first place.

These solutions are not mutually exclusive.

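The first workaround can be tried out with two throwaway modules. The sketch
below writes them to a temporary directory; the module names ``foo_demo`` and
``bar_demo`` and the ``total()`` functions are hypothetical. Each module uses a
plain ``import`` and only touches the other module's globals inside a function,
that is, after both modules have finished loading.

```python
import importlib
import os
import sys
import tempfile

FOO_SOURCE = """\
import bar_demo          # plain import, no 'from ... import'
foo_var = 1

def total():
    return foo_var + bar_demo.bar_var   # looked up at call time
"""

BAR_SOURCE = """\
import foo_demo
bar_var = 2

def total():
    return bar_var + foo_demo.foo_var
"""

with tempfile.TemporaryDirectory() as workdir:
    for filename, text in (("foo_demo.py", FOO_SOURCE),
                           ("bar_demo.py", BAR_SOURCE)):
        with open(os.path.join(workdir, filename), "w") as f:
            f.write(text)

    sys.path.insert(0, workdir)
    try:
        foo_demo = importlib.import_module("foo_demo")
        bar_demo = importlib.import_module("bar_demo")
    finally:
        sys.path.remove(workdir)

print(foo_demo.total(), bar_demo.total())
```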

__import__('x.y.z') returns <module 'x'>; how do I get z?
---------------------------------------------------------

Try::

   __import__('x.y.z').y.z

For more realistic situations, you may have to do something like ::

   m = __import__(s)
   for i in s.split(".")[1:]:
       m = getattr(m, i)

See :mod:`importlib` for a convenience function called
:func:`~importlib.import_module`.

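Both approaches can be compared on a real dotted name from the standard
library. In this sketch the ``import_dotted()`` wrapper is a hypothetical name
for the loop shown above.

```python
import importlib


def import_dotted(name):
    """Import a dotted module name and return the innermost module."""
    module = __import__(name)  # returns the top-level package
    for part in name.split(".")[1:]:
        module = getattr(module, part)
    return module


via_loop = import_dotted("xml.dom.minidom")
via_importlib = importlib.import_module("xml.dom.minidom")

print(via_loop.__name__)
print(via_loop is via_importlib)
```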


When I edit an imported module and reimport it, the changes don't show up. Why does this happen?
-------------------------------------------------------------------------------------------------

For reasons of efficiency as well as consistency, Python only reads the module
file on the first time a module is imported. If it didn't, in a program
consisting of many modules where each one imports the same basic module, the
basic module would be parsed and re-parsed many times. To force rereading of a
changed module, do this::

   import imp
   import modname
   imp.reload(modname)

Warning: this technique is not 100% fool-proof. In particular, modules
containing statements like ::

   from modname import some_objects

will continue to work with the old version of the imported objects. If the
module contains class definitions, existing class instances will *not* be
updated to use the new class definition. This can result in the following
paradoxical behaviour:

   >>> import imp
   >>> import cls
   >>> c = cls.C()                # Create an instance of C
   >>> imp.reload(cls)
   <module 'cls' from 'cls.py'>
   >>> isinstance(c, cls.C)       # isinstance is false?!?
   False

The nature of the problem is made clear if you print out the "identity" of the
class objects:

   >>> hex(id(c.__class__))
   '0x7352a0'
   >>> hex(id(cls.C))
   '0x4198d0'