:tocdepth: 2

===============
Programming FAQ
===============

.. only:: html

   .. contents::

General Questions
=================

Is there a source code level debugger with breakpoints, single-stepping, etc.?
------------------------------------------------------------------------------

Yes.

Several debuggers for Python are described below, and the built-in function
:func:`breakpoint` allows you to drop into any of them.

The pdb module is a simple but adequate console-mode debugger for Python. It is
part of the standard Python library, and is :mod:`documented in the Library
Reference Manual <pdb>`. You can also write your own debugger by using the code
for pdb as an example.

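As a minimal sketch of how :func:`breakpoint` hands control to a debugger: it
calls :func:`sys.breakpointhook`, which by default starts pdb. The example
below temporarily installs a recording hook so that it runs without stopping at
a prompt. The names ``fake_debugger`` and ``average`` are hypothetical and only
serve the illustration.

```python
import sys

calls = []

def fake_debugger(*args, **kwargs):
    # Hypothetical stand-in for a real debugger entry point such as pdb.set_trace.
    calls.append("debugger entered")

def average(values):
    breakpoint()  # dispatches to whatever sys.breakpointhook currently is
    return sum(values) / len(values)

original_hook = sys.breakpointhook
sys.breakpointhook = fake_debugger
try:
    result = average([1, 2, 3])
finally:
    sys.breakpointhook = original_hook  # restore the default hook

print(result, calls)
```

With the default hook left in place, the same ``breakpoint()`` call would open
an interactive pdb session at that line instead.
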
The IDLE interactive development environment, which is part of the standard
Python distribution (normally available as Tools/scripts/idle), includes a
graphical debugger.

PythonWin is a Python IDE that includes a GUI debugger based on pdb. The
PythonWin debugger colors breakpoints and has quite a few cool features such as
debugging non-PythonWin programs. PythonWin is available as part of the
`pywin32 <https://github.com/mhammond/pywin32>`_ project and
as a part of the
`ActivePython <https://www.activestate.com/products/python/>`_ distribution.

`Eric <http://eric-ide.python-projects.org/>`_ is an IDE built on PyQt
and the Scintilla editing component.

`trepan3k <https://github.com/rocky/python3-trepan/>`_ is a gdb-like debugger.

`Visual Studio Code <https://code.visualstudio.com/>`_ is an IDE with debugging
tools that integrates with version-control software.

There are a number of commercial Python IDEs that include graphical debuggers.
They include:

* `Wing IDE <https://wingware.com/>`_
* `Komodo IDE <https://www.activestate.com/products/komodo-ide/>`_
* `PyCharm <https://www.jetbrains.com/pycharm/>`_


Are there tools to help find bugs or perform static analysis?
-------------------------------------------------------------

Yes.

`Pylint <https://www.pylint.org/>`_ and
`Pyflakes <https://github.com/PyCQA/pyflakes>`_ do basic checking that will
help you catch bugs sooner.

Static type checkers such as `Mypy <http://mypy-lang.org/>`_,
`Pyre <https://pyre-check.org/>`_, and
`Pytype <https://github.com/google/pytype>`_ can check type hints in Python
source code.


.. _faq-create-standalone-binary:

How can I create a stand-alone binary from a Python script?
-----------------------------------------------------------

You don't need the ability to compile Python to C code if all you want is a
stand-alone program that users can download and run without having to install
the Python distribution first. There are a number of tools that determine the
set of modules required by a program and bind these modules together with a
Python binary to produce a single executable.

One such tool is freeze, which is included in the Python source tree as
``Tools/freeze``. It converts Python byte code to C arrays; with a C compiler
you can embed all your modules into a new program, which is then linked with
the standard Python modules.

It works by scanning your source recursively for import statements (in both
forms) and looking for the modules in the standard Python path as well as in the
source directory (for built-in modules). It then turns the bytecode for modules
written in Python into C code (array initializers that can be turned into code
objects using the marshal module) and creates a custom-made config file that
only contains those built-in modules which are actually used in the program. It
then compiles the generated C code and links it with the rest of the Python
interpreter to form a self-contained binary which acts exactly like your script.

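The import-scanning step described above can be sketched with the standard
library's :mod:`modulefinder` module. This is only an illustration of the idea,
not a description of what any particular packaging tool does internally; the
script contents and the file name ``demo_script.py`` are hypothetical.

```python
import os
import tempfile
from modulefinder import ModuleFinder

# Hypothetical two-line script, written to a temporary file for the demo.
source = "import colorsys\nprint(colorsys.rgb_to_hsv(1, 0, 0))\n"

with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "demo_script.py")
    with open(script, "w") as f:
        f.write(source)

    finder = ModuleFinder()
    finder.run_script(script)      # analyses the imports; the script is not run
    found = sorted(finder.modules)

print(found)
```

The resulting list names the script itself (as ``__main__``) together with
every module it pulls in, which is the information a freezing tool needs before
it can bundle anything.
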
The following packages can help with the creation of console and GUI
executables:

* `Nuitka <https://nuitka.net/>`_ (Cross-platform)
* `PyInstaller <http://www.pyinstaller.org/>`_ (Cross-platform)
* `PyOxidizer <https://pyoxidizer.readthedocs.io/en/stable/>`_ (Cross-platform)
* `cx_Freeze <https://marcelotduarte.github.io/cx_Freeze/>`_ (Cross-platform)
* `py2app <https://github.com/ronaldoussoren/py2app>`_ (macOS only)
* `py2exe <http://www.py2exe.org/>`_ (Windows only)


Are there coding standards or a style guide for Python programs?
----------------------------------------------------------------

Yes. The coding style required for standard library modules is documented as
:pep:`8`.


Core Language
=============

Why am I getting an UnboundLocalError when the variable has a value?
--------------------------------------------------------------------

It can be a surprise to get an :exc:`UnboundLocalError` in previously working
code when it is modified by adding an assignment statement somewhere in
the body of a function.

This code:

   >>> x = 10
   >>> def bar():
   ...     print(x)
   >>> bar()
   10

works, but this code:

   >>> x = 10
   >>> def foo():
   ...     print(x)
   ...     x += 1

results in an :exc:`UnboundLocalError`:

   >>> foo()
   Traceback (most recent call last):
     ...
   UnboundLocalError: local variable 'x' referenced before assignment

This is because when you make an assignment to a variable in a scope, that
variable becomes local to that scope and shadows any similarly named variable
in the outer scope. Since the last statement in ``foo`` assigns a new value to
``x``, the compiler recognizes it as a local variable. Consequently, when the
earlier ``print(x)`` attempts to print the uninitialized local variable, an
error results.

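The fact that this decision is made when the function is compiled, not when it
runs, can be checked by inspecting the code object. The sketch below uses two
hypothetical functions, ``reads_global`` and ``assigns_local``; in the second
one the assignment is never even reached, yet ``x`` is still treated as local.

```python
x = 10

def reads_global():
    return x              # no assignment in the body, so x is looked up globally

def assigns_local():
    try:
        print(x)          # x is local here because of the assignment below
    except UnboundLocalError as exc:
        return type(exc).__name__
    x = 1                 # never executed, but still makes x a local name

print("x" in reads_global.__code__.co_varnames)
print("x" in assigns_local.__code__.co_varnames)
print(assigns_local())
```
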
In the example above you can access the outer scope variable by declaring it
global:

   >>> x = 10
   >>> def foobar():
   ...     global x
   ...     print(x)
   ...     x += 1
   >>> foobar()
   10

This explicit declaration is required in order to remind you that (unlike the
superficially analogous situation with class and instance variables) you are
actually modifying the value of the variable in the outer scope:

   >>> print(x)
   11

You can do a similar thing in a nested scope using the :keyword:`nonlocal`
keyword:

   >>> def foo():
   ...    x = 10
   ...    def bar():
   ...        nonlocal x
   ...        print(x)
   ...        x += 1
   ...    bar()
   ...    print(x)
   >>> foo()
   10
   11


What are the rules for local and global variables in Python?
------------------------------------------------------------

In Python, variables that are only referenced inside a function are implicitly
global. If a variable is assigned a value anywhere within the function's body,
it's assumed to be a local unless explicitly declared as global.

Though a bit surprising at first, a moment's consideration explains this. On
one hand, requiring :keyword:`global` for assigned variables provides a bar
against unintended side-effects. On the other hand, if ``global`` were required
for all global references, you'd be using ``global`` all the time. You'd have
to declare as global every reference to a built-in function or to a component of
an imported module. This clutter would defeat the usefulness of the ``global``
declaration for identifying side-effects.


Why do lambdas defined in a loop with different values all return the same result?
----------------------------------------------------------------------------------

Assume you use a for loop to define a few different lambdas (or even plain
functions), e.g.::

   >>> squares = []
   >>> for x in range(5):
   ...     squares.append(lambda: x**2)

This gives you a list that contains 5 lambdas that calculate ``x**2``. You
might expect that, when called, they would return, respectively, ``0``, ``1``,
``4``, ``9``, and ``16``. However, when you actually try you will see that
they all return ``16``::

   >>> squares[2]()
   16
   >>> squares[4]()
   16

This happens because ``x`` is not local to the lambdas, but is defined in
the outer scope, and it is accessed when the lambda is called, not when it
is defined. At the end of the loop, the value of ``x`` is ``4``, so all the
functions now return ``4**2``, i.e. ``16``. You can also verify this by
changing the value of ``x`` and seeing how the results of the lambdas change::

   >>> x = 8
   >>> squares[2]()
   64

In order to avoid this, you need to save the values in variables local to the
lambdas, so that they don't rely on the value of the global ``x``::

   >>> squares = []
   >>> for x in range(5):
   ...     squares.append(lambda n=x: n**2)

Here, ``n=x`` creates a new variable ``n`` local to the lambda and computed
when the lambda is defined so that it has the same value that ``x`` had at
that point in the loop. This means that the value of ``n`` will be ``0``
in the first lambda, ``1`` in the second, ``2`` in the third, and so on.
Therefore each lambda will now return the correct result::

   >>> squares[2]()
   4
   >>> squares[4]()
   16

Note that this behaviour is not peculiar to lambdas, but applies to regular
functions too.


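Two other ways to capture the value at definition time are sketched below, as
alternatives to the default-argument idiom rather than replacements for it: a
factory function, whose call creates a fresh scope for each closure, and
:func:`functools.partial`. The names ``make_power`` and ``square`` are
hypothetical.

```python
from functools import partial

def make_power(exponent):
    # Each call creates a new scope, so every returned function
    # closes over its own ``exponent``.
    def power(base):
        return base ** exponent
    return power

powers = [make_power(n) for n in range(3)]

def square(n):
    return n ** 2

# partial() stores the current value of x when the object is created.
squares = [partial(square, x) for x in range(5)]

print([p(2) for p in powers])
print([s() for s in squares])
```
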
How do I share global variables across modules?
------------------------------------------------

The canonical way to share information across modules within a single program is
to create a special module (often called config or cfg). Just import the config
module in all modules of your application; the module then becomes available as
a global name. Because there is only one instance of each module, any changes
made to the module object get reflected everywhere. For example:

config.py::

   x = 0   # Default value of the 'x' configuration setting

mod.py::

   import config
   config.x = 1

main.py::

   import config
   import mod
   print(config.x)

Note that using a module is also the basis for implementing the Singleton design
pattern, for the same reason.


What are the "best practices" for using import in a module?
-----------------------------------------------------------

In general, don't use ``from modulename import *``. Doing so clutters the
importer's namespace, and makes it much harder for linters to detect undefined
names.

Import modules at the top of a file. Doing so makes it clear what other modules
your code requires and avoids questions of whether the module name is in scope.
Using one import per line makes it easy to add and delete module imports, but
using multiple imports per line uses less screen space.

It's good practice to import modules in the following order:

1. standard library modules -- e.g. ``sys``, ``os``, ``getopt``, ``re``
2. third-party library modules (anything installed in Python's site-packages
   directory) -- e.g. ``mx.DateTime``, ``ZODB``, ``PIL.Image``
3. locally-developed modules

It is sometimes necessary to move imports to a function or class to avoid
problems with circular imports. Gordon McMillan says:

   Circular imports are fine where both modules use the "import <module>" form
   of import. They fail when the 2nd module wants to grab a name out of the
   first ("from module import name") and the import is at the top level. That's
   because names in the 1st are not yet available, because the first module is
   busy importing the 2nd.

In this case, if the second module is only used in one function, then the import
can easily be moved into that function. By the time the import is called, the
first module will have finished initializing, and the second module can do its
import.

It may also be necessary to move imports out of the top level of code if some of
the modules are platform-specific. In that case, it may not even be possible to
import all of the modules at the top of the file. In this case, importing the
correct modules in the corresponding platform-specific code is a good option.

Only move imports into a local scope, such as inside a function definition, if
it's necessary to solve a problem such as avoiding a circular import, or if you
are trying to reduce the initialization time of a module. This technique is
especially helpful if many of the imports are unnecessary depending on how the
program executes. You may also want to move imports into a function if the
modules are only ever used in that function. Note that loading a module the
first time may be expensive because of the one-time initialization of the
module, but loading a module multiple times is virtually free, costing only a
couple of dictionary lookups. Even if the module name has gone out of scope,
the module is probably available in :data:`sys.modules`.


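A minimal sketch of a function-level import follows. The function
``render_report`` is hypothetical; the point is only that the ``import``
statements run when the function is first called, and that the loaded module
then stays cached in :data:`sys.modules` for later calls.

```python
import sys

def render_report(rows):
    # Deferred imports: these run on each call, but the modules are only
    # initialized once and afterwards are found in sys.modules.
    import csv
    import io

    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()

text = render_report([["name", "qty"], ["apple", 3]])
print(text)
print("csv" in sys.modules)
```
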
Why are default values shared between objects?
----------------------------------------------

This type of bug commonly bites neophyte programmers. Consider this function::

   def foo(mydict={}):  # Danger: shared reference to one dict for all calls
       ... compute something ...
       mydict[key] = value
       return mydict

The first time you call this function, ``mydict`` contains a single item. The
second time, ``mydict`` contains two items because when ``foo()`` begins
executing, ``mydict`` starts out with an item already in it.

It is often expected that a function call creates new objects for default
values. This is not what happens. Default values are created exactly once, when
the function is defined. If that object is changed, like the dictionary in this
example, subsequent calls to the function will refer to this changed object.

By definition, immutable objects such as numbers, strings, tuples, and ``None``
are safe from change. Changes to mutable objects such as dictionaries, lists,
and class instances can lead to confusion.

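A runnable version of the pitfall, using a hypothetical function ``add_entry``,
makes the sharing visible: both calls return the very same dictionary, which is
stored on the function object itself.

```python
def add_entry(key, value, store={}):   # the shared-default pitfall
    store[key] = value
    return store

first = add_entry("a", 1)
second = add_entry("b", 2)

print(second)                    # contains both entries
print(first is second)           # the same dict object was returned twice
print(add_entry.__defaults__)    # the default lives on the function object
```
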
Because of this feature, it is good programming practice to not use mutable
objects as default values. Instead, use ``None`` as the default value and
inside the function, check if the parameter is ``None`` and create a new
list/dictionary/whatever if it is. For example, don't write::

   def foo(mydict={}):
       ...

but::

   def foo(mydict=None):
       if mydict is None:
           mydict = {}  # create a new dict for local namespace

This feature can be useful. When you have a function that's time-consuming to
compute, a common technique is to cache the parameters and the resulting value
of each call to the function, and return the cached value if the same value is
requested again. This is called "memoizing", and can be implemented like this::

   # Callers can only provide two parameters and optionally pass _cache by keyword
   def expensive(arg1, arg2, *, _cache={}):
       if (arg1, arg2) in _cache:
           return _cache[(arg1, arg2)]

       # Calculate the value
       result = ... expensive computation ...
       _cache[(arg1, arg2)] = result           # Store result in the cache
       return result

You could use a global variable containing a dictionary instead of the default
value; it's a matter of taste.


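The standard library also provides :func:`functools.lru_cache`, which memoizes
a function without a hand-written cache. The sketch below is an alternative to
the default-argument technique, not the same mechanism; it assumes the
arguments are hashable, and the power computation merely stands in for the
expensive work.

```python
from functools import lru_cache

call_count = 0

@lru_cache(maxsize=None)
def expensive(arg1, arg2):
    global call_count
    call_count += 1          # counts real computations only
    return arg1 ** arg2      # placeholder for the expensive computation

expensive(2, 10)
expensive(2, 10)             # served from the cache
print(call_count, expensive.cache_info().hits)
```
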
How can I pass optional or keyword parameters from one function to another?
---------------------------------------------------------------------------

Collect the arguments using the ``*`` and ``**`` specifiers in the function's
parameter list; this gives you the positional arguments as a tuple and the
keyword arguments as a dictionary. You can then pass these arguments when
calling another function by using ``*`` and ``**``::

   def f(x, *args, **kwargs):
       ...
       kwargs['width'] = '14.3c'
       ...
       g(x, *args, **kwargs)


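A self-contained variant of the same pattern is sketched below. The functions
``draw_box`` and ``draw_wide_box`` are hypothetical; the wrapper overrides one
keyword option and forwards everything else unchanged.

```python
def draw_box(label, *, width="10c", height="5c"):
    # Hypothetical target function that accepts keyword options.
    return f"{label}: {width} x {height}"

def draw_wide_box(label, *args, **kwargs):
    kwargs["width"] = "14.3c"                  # override one option
    return draw_box(label, *args, **kwargs)    # forward the rest as-is

print(draw_wide_box("frame", height="2c"))
```
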
.. index::
   single: argument; difference from parameter
   single: parameter; difference from argument

.. _faq-argument-vs-parameter:

What is the difference between arguments and parameters?
--------------------------------------------------------

:term:`Parameters <parameter>` are defined by the names that appear in a
function definition, whereas :term:`arguments <argument>` are the values
actually passed to a function when calling it. Parameters define what kinds of
arguments a function can accept. For example, given the function definition::

   def func(foo, bar=None, **kwargs):
       pass

*foo*, *bar* and *kwargs* are parameters of ``func``. However, when calling
``func``, for example::

   func(42, bar=314, extra=somevar)

the values ``42``, ``314``, and ``somevar`` are arguments.


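The distinction can also be observed with :func:`inspect.signature`, as in the
following sketch: the signature lists the parameters and their kinds, while
``bind()`` shows how the arguments of one particular call are matched to them.
The string ``"somevar"`` stands in for the variable used in the call above.

```python
import inspect

def func(foo, bar=None, **kwargs):
    pass

signature = inspect.signature(func)

# Parameters: names from the definition, each with a kind.
kinds = {name: param.kind.name for name, param in signature.parameters.items()}
print(kinds)

# Arguments: the values of one call, mapped onto those parameters.
bound = signature.bind(42, bar=314, extra="somevar")
print(bound.arguments)
```
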
Why did changing list 'y' also change list 'x'?
------------------------------------------------

If you wrote code like::

   >>> x = []
   >>> y = x
   >>> y.append(10)
   >>> y
   [10]
   >>> x
   [10]

you might be wondering why appending an element to ``y`` changed ``x`` too.

There are two factors that produce this result:

1) Variables are simply names that refer to objects. Doing ``y = x`` doesn't
   create a copy of the list -- it creates a new variable ``y`` that refers to
   the same object ``x`` refers to. This means that there is only one object
   (the list), and both ``x`` and ``y`` refer to it.
2) Lists are :term:`mutable`, which means that you can change their content.

After the call to :meth:`~list.append`, the content of the mutable object has
changed from ``[]`` to ``[10]``. Since both the variables refer to the same
object, using either name accesses the modified value ``[10]``.

If we instead assign an immutable object to ``x``::

   >>> x = 5  # ints are immutable
   >>> y = x
   >>> x = x + 1  # 5 can't be mutated, we are creating a new object here
   >>> x
   6
   >>> y
   5

we can see that in this case ``x`` and ``y`` are not equal anymore. This is
because integers are :term:`immutable`, and when we do ``x = x + 1`` we are not
mutating the int ``5`` by incrementing its value; instead, we are creating a
new object (the int ``6``) and assigning it to ``x`` (that is, changing which
object ``x`` refers to). After this assignment we have two objects (the ints
``6`` and ``5``) and two variables that refer to them (``x`` now refers to
``6`` but ``y`` still refers to ``5``).

Some operations (for example ``y.append(10)`` and ``y.sort()``) mutate the
object, whereas superficially similar operations (for example ``y = y + [10]``
and ``sorted(y)``) create a new object. By convention in Python, followed
throughout the standard library, a method whose only purpose is to mutate an
object in place returns ``None`` to help avoid getting the two types of
operations confused. So if you mistakenly write ``y.sort()`` thinking it will
give you a sorted copy of ``y``, you'll instead end up with ``None``, which
will likely cause your program to generate an easily diagnosed error.

However, there is one class of operations where the same operation sometimes
has different behaviors with different types: the augmented assignment
operators. For example, ``+=`` mutates lists but not tuples or ints (``a_list
+= [1, 2, 3]`` is equivalent to ``a_list.extend([1, 2, 3])`` and mutates
``a_list``, whereas ``some_tuple += (1, 2, 3)`` and ``some_int += 1`` create
new objects).

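The difference in augmented assignment can be checked by keeping a second name
for the object before applying ``+=``, as in this short sketch (``alias`` and
``old_tuple`` are illustrative names):

```python
a_list = [0]
alias = a_list
a_list += [1, 2, 3]          # in place: the existing list object is extended
print(alias, alias is a_list)

some_tuple = (0,)
old_tuple = some_tuple
some_tuple += (1, 2, 3)      # builds a new tuple and rebinds the name
print(old_tuple, old_tuple is some_tuple)
```
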
In other words:

* If we have a mutable object (:class:`list`, :class:`dict`, :class:`set`,
  etc.), we can use some specific operations to mutate it and all the variables
  that refer to it will see the change.
* If we have an immutable object (:class:`str`, :class:`int`, :class:`tuple`,
  etc.), all the variables that refer to it will always see the same value,
  but operations that transform that value into a new value always return a new
  object.

If you want to know if two variables refer to the same object or not, you can
use the :keyword:`is` operator, or the built-in function :func:`id`.


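For instance, the following sketch contrasts identity with equality: ``y`` is a
second name for the list bound to ``x``, while ``z`` is a separate list that
merely has the same contents.

```python
x = [1, 2]
y = x            # second name for the same list
z = list(x)      # equal contents, but a separate object

print(y is x, id(y) == id(x))
print(z == x, z is x)
```
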
How do I write a function with output parameters (call by reference)?
---------------------------------------------------------------------

Remember that arguments are passed by assignment in Python. Since assignment
just creates references to objects, there's no alias between an argument name in
the caller and callee, and so no call-by-reference per se. You can achieve the
desired effect in a number of ways.

1) By returning a tuple of the results::

      >>> def func1(a, b):
      ...     a = 'new-value'        # a and b are local names
      ...     b = b + 1              # assigned to new objects
      ...     return a, b            # return new values
      ...
      >>> x, y = 'old-value', 99
      >>> func1(x, y)
      ('new-value', 100)

   This is almost always the clearest solution.

2) By using global variables. This isn't thread-safe, and is not recommended.

3) By passing a mutable (changeable in-place) object::

      >>> def func2(a):
      ...     a[0] = 'new-value'     # 'a' references a mutable list
      ...     a[1] = a[1] + 1        # changes a shared object
      ...
      >>> args = ['old-value', 99]
      >>> func2(args)
      >>> args
      ['new-value', 100]

4) By passing in a dictionary that gets mutated::

      >>> def func3(args):
      ...     args['a'] = 'new-value'     # args is a mutable dictionary
      ...     args['b'] = args['b'] + 1   # change it in-place
      ...
      >>> args = {'a': 'old-value', 'b': 99}
      >>> func3(args)
      >>> args
      {'a': 'new-value', 'b': 100}

5) Or bundle up values in a class instance::

      >>> class Namespace:
      ...     def __init__(self, /, **args):
      ...         for key, value in args.items():
      ...             setattr(self, key, value)
      ...
      >>> def func4(args):
      ...     args.a = 'new-value'        # args is a mutable Namespace
      ...     args.b = args.b + 1         # change object in-place
      ...
      >>> args = Namespace(a='old-value', b=99)
      >>> func4(args)
      >>> vars(args)
      {'a': 'new-value', 'b': 100}


   There's almost never a good reason to get this complicated.

Your best choice is to return a tuple containing the multiple results.


How do you make a higher order function in Python?
--------------------------------------------------

You have two choices: you can use nested scopes or you can use callable objects.
For example, suppose you wanted to define ``linear(a,b)`` which returns a
function ``f(x)`` that computes the value ``a*x+b``. Using nested scopes::

   def linear(a, b):
       def result(x):
           return a * x + b
       return result

Or using a callable object::

   class linear:

       def __init__(self, a, b):
           self.a, self.b = a, b

       def __call__(self, x):
           return self.a * x + self.b

In both cases, ::

   taxes = linear(0.3, 2)

gives a callable object where ``taxes(10e6) == 0.3 * 10e6 + 2``.

| 596 | The callable object approach has the disadvantage that it is a bit slower and |
| 597 | results in slightly longer code. However, note that a collection of callables |
| 598 | can share their signature via inheritance:: |
| 599 | |
| 600 | class exponential(linear): |
| 601 | # __init__ inherited |
| 602 | def __call__(self, x): |
| 603 | return self.a * (x ** self.b) |
| 604 | |
| 605 | Object can encapsulate state for several methods:: |
| 606 | |
| 607 | class counter: |
| 608 | |
| 609 | value = 0 |
| 610 | |
| 611 | def set(self, x): |
| 612 | self.value = x |
| 613 | |
| 614 | def up(self): |
| 615 | self.value = self.value + 1 |
| 616 | |
| 617 | def down(self): |
| 618 | self.value = self.value - 1 |
| 619 | |
| 620 | count = counter() |
| 621 | inc, dec, reset = count.up, count.down, count.set |
| 622 | |
| 623 | Here ``inc()``, ``dec()`` and ``reset()`` act like functions which share the |
| 624 | same counting variable. |
| 625 | |
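As a quick check of the two approaches, the following sketch builds the same
function both ways and then exercises the shared-state counter. The capitalized
name ``Linear`` is used here only so that both versions can live in one file;
it is an assumption of this example, not a name from the FAQ.

```python
def linear(a, b):
    """Closure version: the inner function remembers a and b."""
    def result(x):
        return a * x + b
    return result


class Linear:
    """Callable-object version of the same idea."""

    def __init__(self, a, b):
        self.a, self.b = a, b

    def __call__(self, x):
        return self.a * x + self.b


class counter:
    """Bound methods of one instance share that instance's state."""

    value = 0

    def set(self, x):
        self.value = x

    def up(self):
        self.value = self.value + 1

    def down(self):
        self.value = self.value - 1


from_closure = linear(3, 2)
from_object = Linear(3, 2)
print(from_closure(10), from_object(10))   # both compute 3 * 10 + 2

count = counter()
inc, dec, reset = count.up, count.down, count.set
inc()
inc()
dec()
after_steps = count.value    # two increments and one decrement
reset(10)
print(after_steps, count.value)
```

Either callable can be passed wherever a plain function is expected, so the
choice mostly comes down to whether you need inheritance or several methods
sharing state.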
| 626 | |
How do I copy an object in Python?
----------------------------------

In general, try :func:`copy.copy` or :func:`copy.deepcopy`.
Not all objects can be copied, but most can.

Some objects can be copied more easily. Dictionaries have a :meth:`~dict.copy`
method::

   newdict = olddict.copy()

Sequences can be copied by slicing::

   new_l = l[:]
| 641 | |
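The difference between the two :mod:`copy` functions only shows up when the
object contains other mutable objects. A minimal sketch, using a made-up
dictionary as the object to copy:

```python
import copy

original = {"name": "config", "values": [1, 2, 3]}

shallow = copy.copy(original)      # new dict, but the inner list is shared
deep = copy.deepcopy(original)     # new dict and a new inner list

original["values"].append(4)

print(shallow["values"])   # the shared list reflects the append
print(deep["values"])      # the deep copy is unaffected
```

``olddict.copy()`` and ``l[:]`` are shallow copies as well, so the same sharing
applies to them.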
| 642 | |
How can I find the methods or attributes of an object?
------------------------------------------------------

For an instance ``x`` of a user-defined class, ``dir(x)`` returns an
alphabetized list of names containing the instance attributes and the methods
and attributes defined by its class.
| 649 | |
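For instance, with a small hypothetical ``Point`` class (the class and its
attribute names are invented for this illustration), filtering out the
underscore-prefixed names leaves just the parts you defined yourself:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def norm1(self):
        return abs(self.x) + abs(self.y)


p = Point(1, -2)

# dir() also lists inherited special names such as __init__ and __repr__,
# so keep only the names that do not start with an underscore.
public_names = [name for name in dir(p) if not name.startswith("_")]
print(public_names)
```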
| 650 | |
How can my code discover the name of an object?
-----------------------------------------------

Generally speaking, it can't, because objects don't really have names.
Essentially, assignment always binds a name to a value; the same is true of
``def`` and ``class`` statements, but in that case the value is a
callable. Consider the following code::

   >>> class A:
   ...     pass
   ...
   >>> B = A
   >>> a = B()
   >>> b = a
   >>> print(b)
   <__main__.A object at 0x16D07CC>
   >>> print(a)
   <__main__.A object at 0x16D07CC>

Arguably the class has a name: even though it is bound to two names and invoked
through the name ``B``, the created instance is still reported as an instance of
class ``A``. However, it is impossible to say whether the instance's name is
``a`` or ``b``, since both names are bound to the same value.

Generally speaking, it should not be necessary for your code to "know the names"
of particular values. Unless you are deliberately writing introspective
programs, this is usually an indication that a change of approach might be
beneficial.

In comp.lang.python, Fredrik Lundh once gave an excellent analogy in answer to
this question:

   The same way as you get the name of that cat you found on your porch: the cat
   (object) itself cannot tell you its name, and it doesn't really care -- so
   the only way to find out what it's called is to ask all your neighbours
   (namespaces) if it's their cat (object)...

   ...and don't be surprised if you'll find that it's known by many names, or
   no name at all!
| 690 | |
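For the rare introspective program, "asking the neighbours" can be sketched as
a scan over a namespace. The helper ``names_of`` and the hand-built
``namespace`` dictionary below are hypothetical stand-ins; a real program might
pass ``globals()`` or ``vars(some_module)`` instead.

```python
def names_of(obj, namespace):
    """Return, in sorted order, every name in namespace bound to obj."""
    return sorted(name for name, value in namespace.items() if value is obj)


class A:
    pass


B = A
a = B()
b = a

# A hand-built namespace standing in for a module's globals().
namespace = {"A": A, "B": B, "a": a, "b": b}

instance_names = names_of(a, namespace)      # the instance has two names
class_names = names_of(A, namespace)         # so does the class
no_names = names_of(object(), namespace)     # a fresh object has none here

print(instance_names, class_names, no_names)
print(type(a).__name__)    # the class still records the name it was defined with
```

The result is a list precisely because there is no single right answer: an
object can be bound to many names, or to none.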
| 691 | |
What's up with the comma operator's precedence?
-----------------------------------------------

Comma is not an operator in Python. Consider this session::

   >>> "a" in "b", "a"
   (False, 'a')

Since the comma is not an operator, but a separator between expressions, the
above is evaluated as if you had entered::

   ("a" in "b"), "a"

not::

   "a" in ("b", "a")

The same is true of the various assignment operators (``=``, ``+=``, etc.). They
are not truly operators but syntactic delimiters in assignment statements.
| 711 | |
| 712 | |
Is there an equivalent of C's "?:" ternary operator?
----------------------------------------------------

Yes, there is. The syntax is as follows::

   [on_true] if [expression] else [on_false]

   x, y = 50, 25
   small = x if x < y else y

Before this syntax was introduced in Python 2.5, a common idiom was to use
logical operators::

   [expression] and [on_true] or [on_false]

However, this idiom is unsafe, as it can give wrong results when *on_true*
has a false boolean value. Therefore, it is always better to use
the ``... if ... else ...`` form.
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 731 | |
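The failure mode of the old idiom is easy to reproduce. In this small sketch
(the variable names are chosen for illustration), *on_true* is ``0``, which is
a false value:

```python
condition = True
on_true = 0          # a false value: this is what trips up the old idiom
on_false = -1

# "condition and on_true" evaluates to 0, which is false, so "or" falls
# through to on_false even though the condition held.
old_idiom = condition and on_true or on_false

# The conditional expression only tests the condition itself.
conditional = on_true if condition else on_false

print(old_idiom, conditional)
```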
| 732 | |
Is it possible to write obfuscated one-liners in Python?
--------------------------------------------------------

Yes. Usually this is done by nesting :keyword:`lambda` within
:keyword:`!lambda`. See the following three examples, due to Ulf Bartelt::

   from functools import reduce

   # Primes < 1000
   print(list(filter(None,map(lambda y:y*reduce(lambda x,y:x*y!=0,
   map(lambda x,y=y:y%x,range(2,int(pow(y,0.5)+1))),1),range(2,1000)))))

   # First 10 Fibonacci numbers
   print(list(map(lambda x,f=lambda x,f:(f(x-1,f)+f(x-2,f)) if x>1 else 1:
   f(x,f), range(10))))

   # Mandelbrot set
   print((lambda Ru,Ro,Iu,Io,IM,Sx,Sy:reduce(lambda x,y:x+y,map(lambda y,
   Iu=Iu,Io=Io,Ru=Ru,Ro=Ro,Sy=Sy,L=lambda yc,Iu=Iu,Io=Io,Ru=Ru,Ro=Ro,i=IM,
   Sx=Sx,Sy=Sy:reduce(lambda x,y:x+y,map(lambda x,xc=Ru,yc=yc,Ru=Ru,Ro=Ro,
   i=i,Sx=Sx,F=lambda xc,yc,x,y,k,f=lambda xc,yc,x,y,k,f:(k<=0)or (x*x+y*y
   >=4.0) or 1+f(xc,yc,x*x-y*y+xc,2.0*x*y+yc,k-1,f):f(xc,yc,x,y,k,f):chr(
   64+F(Ru+x*(Ro-Ru)/Sx,yc,0,0,i)),range(Sx))):L(Iu+y*(Io-Iu)/Sy),range(Sy
   ))))(-2.1, 0.7, -1.2, 1.2, 30, 80, 24))
   #    \___ ___/  \___ ___/  |   |   |__ lines on screen
   #        V          V      |   |______ columns on screen
   #        |          |      |__________ maximum of "iterations"
   #        |          |_________________ range on y axis
   #        |____________________________ range on x axis

Don't try this at home, kids!
| 764 | |
| 765 | |
.. _faq-positional-only-arguments:

What does the slash (/) in the parameter list of a function mean?
-----------------------------------------------------------------

A slash in the parameter list of a function denotes that the parameters prior to
it are positional-only. Positional-only parameters are the ones without an
externally-usable name. Upon calling a function that accepts positional-only
parameters, arguments are mapped to parameters based solely on their position.
For example, :func:`divmod` is a function that accepts positional-only
parameters. Its documentation looks like this::

   >>> help(divmod)
   Help on built-in function divmod in module builtins:

   divmod(x, y, /)
       Return the tuple (x//y, x%y).  Invariant: div*y + mod == x.

The slash at the end of the parameter list means that both parameters are
positional-only. Thus, calling :func:`divmod` with keyword arguments would lead
to an error::

   >>> divmod(x=3, y=4)
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: divmod() takes no keyword arguments
Lysandros Nikolaou | 1aeeaeb | 2019-03-10 12:30:11 +0100 | [diff] [blame] | 792 | |
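The same marker can be used in your own functions on Python 3.8 and later. The
``clamp`` helper below is a hypothetical example written for this FAQ entry,
not a standard library function:

```python
def clamp(value, low, high, /):
    """Limit value to the range [low, high]; all parameters are positional-only."""
    return max(low, min(value, high))


clamped = clamp(15, 0, 10)      # positional call works as usual
print(clamped)

# Passing the same arguments by keyword is rejected with a TypeError.
error = None
try:
    clamp(value=15, low=0, high=10)
except TypeError as exc:
    error = exc
print(type(error).__name__)
```

Because callers cannot refer to the parameter names, those names can later be
changed without breaking any calling code.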
Lysandros Nikolaou | 1aeeaeb | 2019-03-10 12:30:11 +0100 | [diff] [blame] | 793 | |
Numbers and strings
===================

How do I specify hexadecimal and octal integers?
------------------------------------------------

To specify an octal integer, precede the octal value with a zero, and then a
lower or uppercase "o". For example, to set the variable ``a`` to the octal
value "10" (8 in decimal), type::

   >>> a = 0o10
   >>> a
   8

Hexadecimal is just as easy. Simply precede the hexadecimal number with a zero,
and then a lower or uppercase "x". Hexadecimal digits can be specified in lower
or uppercase. For example, in the Python interpreter::

   >>> a = 0xa5
   >>> a
   165
   >>> b = 0XB2
   >>> b
   178
| 818 | |
| 819 | |
Why does -22 // 10 return -3?
-----------------------------

It's primarily driven by the desire that ``i % j`` have the same sign as ``j``.
If you want that, and also want::

   i == (i // j) * j + (i % j)

then integer division has to return the floor. C also requires that identity to
hold, and then compilers that truncate ``i // j`` need to make ``i % j`` have
the same sign as ``i``.

There are few real use cases for ``i % j`` when ``j`` is negative. When ``j``
is positive, there are many, and in virtually all of them it's more useful for
``i % j`` to be ``>= 0``. If the clock says 10 now, what did it say 200 hours
ago? ``-190 % 12 == 2`` is useful; ``-190 % 12 == -10`` is a bug waiting to
bite.
| 837 | |
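The identity and the two conventions can be checked directly. In this sketch,
:func:`math.trunc` and :func:`math.fmod` stand in for C-style truncating
division and remainder, purely for comparison:

```python
import math

i, j = -22, 10

# Python's floor convention: the remainder takes the sign of j.
quotient, remainder = i // j, i % j
assert i == quotient * j + remainder       # the identity from the text

# Truncating convention, for comparison: the remainder takes the sign of i.
truncated = math.trunc(i / j)
c_remainder = math.fmod(i, j)
assert i == truncated * j + c_remainder    # the identity holds here as well

# The clock example: 200 hours before 10 o'clock on a 12-hour dial.
hour = (10 - 200) % 12

print(quotient, remainder, truncated, c_remainder, hour)
```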
| 838 | |
How do I convert a string to a number?
--------------------------------------

For integers, use the built-in :func:`int` type constructor, e.g. ``int('144')
== 144``. Similarly, :func:`float` converts to floating-point,
e.g. ``float('144') == 144.0``.

By default, these interpret the number as decimal, so that ``int('0144') ==
144`` holds true, and ``int('0x144')`` raises :exc:`ValueError`. ``int(string,
base)`` takes the base to convert from as a second optional argument, so
``int('0x144', 16) == 324``. If the base is specified as 0, the number is
interpreted using Python's rules: a leading '0o' indicates octal, and '0x'
indicates a hex number.

Do not use the built-in function :func:`eval` if all you need is to convert
strings to numbers. :func:`eval` will be significantly slower and it presents a
security risk: someone could pass you a Python expression that might have
unwanted side effects. For example, someone could pass
``__import__('os').system("rm -rf $HOME")`` which would erase your home
directory.

:func:`eval` also has the effect of interpreting numbers as Python expressions,
so that e.g. ``eval('09')`` gives a syntax error because Python does not allow
leading '0' in a decimal number (except '0').
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 863 | |
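The conversions described above can be tried side by side. The variable names
in this short sketch are illustrative only:

```python
decimal_int = int('144')
as_float = float('144')
leading_zero = int('0144')          # still read as decimal
explicit_hex = int('0x144', 16)     # base given explicitly
auto_hex = int('0x144', 0)          # base 0: prefix decides the base
auto_octal = int('0o144', 0)

# Without a base, the '0x' prefix is not valid decimal input.
error = None
try:
    int('0x144')
except ValueError as exc:
    error = exc

print(decimal_int, as_float, leading_zero, explicit_hex, auto_hex, auto_octal)
print(type(error).__name__)
```

Catching :exc:`ValueError` like this is also the usual way to validate
untrusted input without resorting to :func:`eval`.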
| 864 | |
How do I convert a number to a string?
--------------------------------------

To convert, e.g., the number 144 to the string '144', use the built-in type
constructor :func:`str`. If you want a hexadecimal or octal representation, use
the built-in functions :func:`hex` or :func:`oct`. For fancy formatting, see
the :ref:`f-strings` and :ref:`formatstrings` sections,
e.g. ``"{:04d}".format(144)`` yields ``'0144'`` and
``"{:.3f}".format(1.0/3.0)`` yields ``'0.333'``.
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 874 | |
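A brief sketch of these options together, including the f-string spelling of
the two :meth:`str.format` examples (the variable names are illustrative):

```python
n = 144

plain = str(n)
hexadecimal = hex(n)              # includes the '0x' prefix
octal = oct(n)                    # includes the '0o' prefix
bare_hex = format(n, 'x')         # same digits without the prefix

padded = f"{n:04d}"               # same result as "{:04d}".format(144)
third = f"{1.0 / 3.0:.3f}"        # same result as "{:.3f}".format(1.0/3.0)

print(plain, hexadecimal, octal, bare_hex, padded, third)
```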
| 875 | |
How do I modify a string in place?
----------------------------------

You can't, because strings are immutable. In most situations, you should
simply construct a new string from the various parts you want to assemble
it from. However, if you need an object with the ability to modify Unicode
data in place, try using an :class:`io.StringIO` object or the :mod:`array`
module::

   >>> import io
   >>> s = "Hello, world"
   >>> sio = io.StringIO(s)
   >>> sio.getvalue()
   'Hello, world'
   >>> sio.seek(7)
   7
   >>> sio.write("there!")
   6
   >>> sio.getvalue()
   'Hello, there!'

   >>> import array
   >>> a = array.array('u', s)
   >>> print(a)
   array('u', 'Hello, world')
   >>> a[0] = 'y'
   >>> print(a)
   array('u', 'yello, world')
   >>> a.tounicode()
   'yello, world'
| 906 | |
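For the common case of simply building a new string from parts, slicing and
:meth:`str.replace` are usually enough. A minimal sketch using the same
example string:

```python
s = "Hello, world"

# Keep the first seven characters and attach a new ending.
by_slicing = s[:7] + "there!"

# Or let str.replace() build the new string.
by_replace = s.replace("world", "there!")

print(by_slicing, by_replace, s)   # s itself is unchanged
```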
| 907 | |
How do I use strings to call functions/methods?
-----------------------------------------------

There are various techniques.

* The best is to use a dictionary that maps strings to functions. The primary
  advantage of this technique is that the strings do not need to match the names
  of the functions. This is also the primary technique used to emulate a case
  construct::

     def a():
         pass

     def b():
         pass

     dispatch = {'go': a, 'stop': b}  # Note lack of parens for funcs

     dispatch[get_input()]()  # Note trailing parens to call function

* Use the built-in function :func:`getattr`::

     import foo
     getattr(foo, 'bar')()

  Note that :func:`getattr` works on any object, including classes, class
  instances, modules, and so on.

  This is used in several places in the standard library, like this::

     class Foo:
         def do_foo(self):
             ...

         def do_bar(self):
             ...

     f = getattr(foo_instance, 'do_' + opname)
     f()


* Use :func:`locals` to resolve the function name::

     def myFunc():
         print("hello")

     fname = "myFunc"

     f = locals()[fname]
     f()
| 958 | |
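The first two techniques can be put together into a small runnable sketch.
All names here (``start``, ``stop``, ``Robot``, ``run`` and ``perform``) are
hypothetical and exist only for this illustration:

```python
def start():
    return "starting"


def stop():
    return "stopping"


# Dictionary dispatch: the keys need not match the function names.
dispatch = {"go": start, "halt": stop}


def run(command):
    """Look up a command string and call the matching function."""
    return dispatch[command]()


class Robot:
    def do_wave(self):
        return "waving"

    def do_bow(self):
        return "bowing"


def perform(robot, opname):
    """Resolve 'do_<opname>' with getattr() and call it."""
    method = getattr(robot, "do_" + opname, None)
    if method is None:
        raise ValueError(f"unknown operation: {opname!r}")
    return method()


print(run("go"), run("halt"))
print(perform(Robot(), "wave"), perform(Robot(), "bow"))
```

Passing a default to :func:`getattr` lets the code report an unknown name
cleanly instead of surfacing an :exc:`AttributeError` from the lookup.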
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 959 | |
Is there an equivalent to Perl's chomp() for removing trailing newlines from strings?
-------------------------------------------------------------------------------------

You can use ``S.rstrip("\r\n")`` to remove all occurrences of any line
terminator from the end of the string ``S`` without removing other trailing
whitespace. If the string ``S`` represents more than one line, with several
empty lines at the end, the line terminators for all the blank lines will
be removed::

   >>> lines = ("line 1 \r\n"
   ...          "\r\n"
   ...          "\r\n")
   >>> lines.rstrip("\n\r")
   'line 1 '

Since this is typically only desired when reading text one line at a time, using
``S.rstrip()`` this way works well.
| 977 | |
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 978 | |
Is there a scanf() or sscanf() equivalent?
------------------------------------------

Not as such.

For simple input parsing, the easiest approach is usually to split the line into
whitespace-delimited words using the :meth:`~str.split` method of string objects
and then convert decimal strings to numeric values using :func:`int` or
:func:`float`. ``split()`` supports an optional "sep" parameter which is useful
if the line uses something other than whitespace as a separator.

For more complicated input parsing, regular expressions are more powerful
than C's :c:func:`sscanf` and better suited for the task.
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 992 | |
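Both approaches are sketched below on made-up input lines; the field layout
and the regular expression are assumptions chosen for the example:

```python
import re

# Simple case: whitespace-delimited fields, converted one by one.
line = "42 3.14 widget"
count_text, price_text, name = line.split()
count, price = int(count_text), float(price_text)

# A different separator is handled by passing sep to split().
fields = "1,2,3".split(",")
numbers = [int(field) for field in fields]

# More structured input: a regular expression with capture groups.
record = "id=17; temp=-4.5C"
match = re.fullmatch(r"id=(\d+); temp=(-?\d+(?:\.\d+)?)C", record)
ident, temp = int(match.group(1)), float(match.group(2))

print(count, price, name, numbers, ident, temp)
```

In real code, check that ``match`` is not ``None`` before using it, since
:func:`re.fullmatch` returns ``None`` when the input does not fit the pattern.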
| 993 | |
What does 'UnicodeDecodeError' or 'UnicodeEncodeError' error mean?
-------------------------------------------------------------------

See the :ref:`unicode-howto`.
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 998 | |
| 999 | |
Performance
===========

My program is too slow. How do I speed it up?
---------------------------------------------

That's a tough one, in general. First, here is a list of things to
remember before diving further:

* Performance characteristics vary across Python implementations. This FAQ
  focuses on :term:`CPython`.
* Behaviour can vary across operating systems, especially when talking about
  I/O or multi-threading.
* You should always find the hot spots in your program *before* attempting to
  optimize any code (see the :mod:`profile` module).
* Writing benchmark scripts will allow you to iterate quickly when searching
  for improvements (see the :mod:`timeit` module).
* It is highly recommended to have good code coverage (through unit testing
  or any other technique) before potentially introducing regressions hidden
  in sophisticated optimizations.

That being said, there are many tricks to speed up Python code. Here are
some general principles which go a long way towards reaching acceptable
performance levels:

* Making your algorithms faster (or changing to faster ones) can yield
  much larger benefits than trying to sprinkle micro-optimization tricks
  all over your code.

* Use the right data structures. Study documentation for the :ref:`bltin-types`
  and the :mod:`collections` module.

* When the standard library provides a primitive for doing something, it is
  likely (although not guaranteed) to be faster than any alternative you
  may come up with. This is doubly true for primitives written in C, such
  as builtins and some extension types. For example, be sure to use
  either the :meth:`list.sort` built-in method or the related :func:`sorted`
  function to do sorting (and see the :ref:`sortinghowto` for examples
  of moderately advanced usage).

* Abstractions tend to create indirections and force the interpreter to work
  more. If the levels of indirection outweigh the amount of useful work
  done, your program will be slower. You should avoid excessive abstraction,
  especially in the form of tiny functions or methods (which are also often
  detrimental to readability).

If you have reached the limit of what pure Python can allow, there are tools
to take you further. For example, `Cython <http://cython.org>`_ can
compile a slightly modified version of Python code into a C extension, and
can be used on many different platforms. Cython can take advantage of
compilation (and optional type annotations) to make your code significantly
faster than when interpreted. If you are confident in your C programming
skills, you can also :ref:`write a C extension module <extending-index>`
yourself.

.. seealso::
   The wiki page devoted to `performance tips
   <https://wiki.python.org/moin/PythonSpeed/PerformanceTips>`_.
Antoine Pitrou | 432259f | 2011-12-09 23:10:31 +0100 | [diff] [blame] | 1058 | |
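A benchmark script of the kind mentioned above can be very short. This sketch
uses :func:`timeit.timeit` to compare two hypothetical ways of building the
same list; the function names and the repeat count are arbitrary choices, and
the timings will differ from machine to machine:

```python
import timeit


def squares_loop(n):
    """Build the list with an explicit loop and append()."""
    result = []
    for i in range(n):
        result.append(i * i)
    return result


def squares_comprehension(n):
    """Build the same list with a list comprehension."""
    return [i * i for i in range(n)]


# Check that both candidates agree before comparing their speed.
assert squares_loop(1000) == squares_comprehension(1000)

# Each call to timeit() runs the callable 200 times and returns total seconds.
loop_time = timeit.timeit(lambda: squares_loop(1000), number=200)
comp_time = timeit.timeit(lambda: squares_comprehension(1000), number=200)

print(f"loop: {loop_time:.4f}s  comprehension: {comp_time:.4f}s")
```

Verifying that the alternatives produce identical results before timing them
guards against "optimizations" that are only faster because they are wrong.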
.. _efficient_string_concatenation:

What is the most efficient way to concatenate many strings together?
--------------------------------------------------------------------

:class:`str` and :class:`bytes` objects are immutable, therefore concatenating
many strings together is inefficient as each concatenation creates a new
object. In the general case, the total runtime cost is quadratic in the
total string length.

To accumulate many :class:`str` objects, the recommended idiom is to place
them into a list and call :meth:`str.join` at the end::

   chunks = []
   for s in my_strings:
       chunks.append(s)
   result = ''.join(chunks)

Another reasonably efficient idiom is to use :class:`io.StringIO`.

To accumulate many :class:`bytes` objects, the recommended idiom is to extend
a :class:`bytearray` object using in-place concatenation (the ``+=`` operator)::

   result = bytearray()
   for b in my_bytes_objects:
       result += b
| 1085 | |
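The three idioms are shown together in the sketch below. The sample inputs
``my_strings`` and ``my_bytes_objects`` are placeholder data invented for the
example:

```python
import io

my_strings = ["alpha", "beta", "gamma"]

# Idiom 1: collect the pieces, then join once at the end.
chunks = []
for s in my_strings:
    chunks.append(s)
joined = "".join(chunks)

# Idiom 2: write the pieces into an in-memory text buffer.
buffer = io.StringIO()
for s in my_strings:
    buffer.write(s)
via_stringio = buffer.getvalue()

# For bytes: extend a mutable bytearray in place.
my_bytes_objects = [b"\x00\x01", b"\x02", b"\x03\x04"]
result = bytearray()
for b in my_bytes_objects:
    result += b
payload = bytes(result)     # convert back if an immutable object is needed

print(joined, via_stringio, payload)
```

When the pieces already sit in an iterable, ``"".join(my_strings)`` can be
called on it directly without the intermediate list.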
| 1086 | |
Sequences (Tuples/Lists)
========================

How do I convert between tuples and lists?
------------------------------------------

The type constructor ``tuple(seq)`` converts any sequence (actually, any
iterable) into a tuple with the same items in the same order.

For example, ``tuple([1, 2, 3])`` yields ``(1, 2, 3)`` and ``tuple('abc')``
yields ``('a', 'b', 'c')``. If the argument is a tuple, it does not make a copy
but returns the same object, so it is cheap to call :func:`tuple` when you
aren't sure that an object is already a tuple.

The type constructor ``list(seq)`` converts any sequence or iterable into a list
with the same items in the same order. For example, ``list((1, 2, 3))`` yields
``[1, 2, 3]`` and ``list('abc')`` yields ``['a', 'b', 'c']``. If the argument
is a list, it makes a copy just like ``seq[:]`` would.
| 1105 | |
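The copy-versus-same-object behaviour described above can be observed with the
``is`` operator. A small sketch, with illustrative variable names:

```python
t = (1, 2, 3)
same_tuple = tuple(t)        # already a tuple: returned as-is, per the text
from_list = tuple([1, 2, 3])
from_string = tuple('abc')

items = [1, 2, 3]
copied = list(items)         # always a new list, like items[:]
copied.append(4)             # does not affect the original

print(same_tuple is t, from_list, from_string)
print(items, copied)
```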
| 1106 | |
What's a negative index?
------------------------

Python sequences are indexed with positive numbers and negative numbers. For
positive numbers, 0 is the first index, 1 is the second index, and so forth. For
negative indices, -1 is the last index, -2 is the penultimate (next to last)
index, and so forth. Think of ``seq[-n]`` as the same as ``seq[len(seq)-n]``.

Using negative indices can be very convenient. For example, ``S[:-1]`` is all of
the string except for its last character, which is useful for removing the
trailing newline from a string.
| 1118 | |
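The equivalence between ``seq[-n]`` and ``seq[len(seq)-n]`` can be checked for
every valid ``n``. A short sketch with made-up data:

```python
seq = ['a', 'b', 'c', 'd']

# The equivalence holds for n from 1 up to len(seq).
for n in range(1, len(seq) + 1):
    assert seq[-n] == seq[len(seq) - n]

last, penultimate = seq[-1], seq[-2]

line = "some text\n"
without_newline = line[:-1]      # everything except the final character

print(last, penultimate, repr(without_newline))
```

Note that ``S[:-1]`` drops the last character whatever it is, so it is only
safe when the string is known to end with a newline.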
| 1119 | |
How do I iterate over a sequence in reverse order?
--------------------------------------------------

Use the :func:`reversed` built-in function::

   for x in reversed(sequence):
       ...  # do something with x ...

This won't touch your original sequence. :func:`reversed` returns an iterator
that walks the sequence from its last item to its first, so no reversed copy
is built.
| 1130 | |
Georg Brandl | d741315 | 2009-10-11 21:25:26 +0000 | [diff] [blame] | 1131 | |
How do you remove duplicates from a list?
-----------------------------------------

See the Python Cookbook for a long discussion of many ways to do this:

   https://code.activestate.com/recipes/52560/

If you don't mind reordering the list, sort it and then scan from the end of the
list, deleting duplicates as you go::

   if mylist:
       mylist.sort()
       last = mylist[-1]
       for i in range(len(mylist)-2, -1, -1):
           if last == mylist[i]:
               del mylist[i]
           else:
               last = mylist[i]

If all elements of the list may be used as set keys (i.e. they are all
:term:`hashable`) this is often faster::

   mylist = list(set(mylist))

This converts the list into a set, thereby removing duplicates, and then back
into a list.
| 1158 | |
| 1159 | |
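If the original order matters and the items are hashable, one further option
is sketched here. It is not taken from the recipe linked above; it relies on
the fact that dictionaries preserve insertion order in Python 3.7 and later:

.. code-block:: python

   mylist = [3, 1, 3, 2, 1]

   # Dictionary keys are unique and keep their first-seen order.
   deduped = list(dict.fromkeys(mylist))

   # Going through a set also removes duplicates, but the order of the
   # result is not guaranteed to match the original list.
   as_set = set(mylist)

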
How do you remove multiple items from a list?
---------------------------------------------

As with removing duplicates, explicitly iterating in reverse with a
delete condition is one possibility. However, it is easier and faster
to use slice replacement with an implicit or explicit forward iteration.
Here are three variations::

   mylist[:] = filter(keep_function, mylist)
   mylist[:] = (x for x in mylist if keep_condition)
   mylist[:] = [x for x in mylist if keep_condition]

The list comprehension may be fastest.


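As a concrete sketch of the list comprehension variant, with a hypothetical
``is_even`` function standing in for the keep condition:

.. code-block:: python

   def is_even(x):
       """Hypothetical keep function: keep even numbers only."""
       return x % 2 == 0

   mylist = [1, 2, 3, 4, 5, 6]
   alias = mylist    # a second reference to the same list object

   mylist[:] = [x for x in mylist if is_even(x)]

   # Slice assignment changes the list in place, so every reference to it
   # sees the result.
   same_object = alias is mylist

Assigning to ``mylist[:]`` rather than to ``mylist`` is what keeps other
references such as ``alias`` in step with the change.

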
How do you make an array in Python?
-----------------------------------

Use a list::

   ["this", 1, "is", "an", "array"]

Lists are equivalent to C or Pascal arrays in their time complexity; the primary
difference is that a Python list can contain objects of many different types.

The ``array`` module also provides methods for creating arrays of fixed types
with compact representations, but they are slower to index than lists. Also
note that NumPy and other third-party packages define array-like structures with
various characteristics as well.

To get Lisp-style linked lists, you can emulate cons cells using tuples::

   lisp_list = ("like", ("this", ("example", None) ) )

If mutability is desired, you could use lists instead of tuples. Here the
analogue of Lisp ``car`` is ``lisp_list[0]`` and the analogue of ``cdr`` is
``lisp_list[1]``. Only do this if you're sure you really need to, because it's
usually a lot slower than using Python lists.


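The following sketch shows a typed array from the ``array`` module and a walk
over the cons-cell list from the example above. The names ``counts``,
``rejected`` and ``items`` are illustrative only:

.. code-block:: python

   from array import array

   # 'i' is the type code for signed C ints.
   counts = array('i', [1, 2, 3])
   counts.append(4)

   # Unlike a list, a typed array rejects items of the wrong type.
   try:
       counts.append(1.5)
   except TypeError:
       rejected = True
   else:
       rejected = False

   # Walk the cons cells: element 0 is the car, element 1 is the cdr.
   lisp_list = ("like", ("this", ("example", None)))
   items = []
   cell = lisp_list
   while cell is not None:
       items.append(cell[0])
       cell = cell[1]

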
.. _faq-multidimensional-list:

How do I create a multidimensional list?
----------------------------------------

You probably tried to make a multidimensional array like this::

   >>> A = [[None] * 2] * 3

This looks correct if you print it:

.. testsetup::

   A = [[None] * 2] * 3

.. doctest::

   >>> A
   [[None, None], [None, None], [None, None]]

But when you assign a value, it shows up in multiple places:

.. testsetup::

   A = [[None] * 2] * 3

.. doctest::

   >>> A[0][0] = 5
   >>> A
   [[5, None], [5, None], [5, None]]

The reason is that replicating a list with ``*`` doesn't create copies, it only
creates references to the existing objects. The ``*3`` creates a list
containing 3 references to the same list of length two. Changes to one row will
show in all rows, which is almost certainly not what you want.

The suggested approach is to create a list of the desired length first and then
fill in each element with a newly created list::

   A = [None] * 3
   for i in range(3):
       A[i] = [None] * 2

This generates a list containing 3 different lists of length two. You can also
use a list comprehension::

   w, h = 2, 3
   A = [[None] * w for i in range(h)]

Or, you can use an extension that provides a matrix datatype; `NumPy
<http://www.numpy.org/>`_ is the best known.


How do I apply a method to a sequence of objects?
-------------------------------------------------

Use a list comprehension::

   result = [obj.method() for obj in mylist]

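For instance, with sample data (``words`` and ``stacks`` are made up for this
sketch):

.. code-block:: python

   words = ['spam', 'eggs', 'ham']

   # Call str.upper() on every item and collect the return values.
   shouted = [word.upper() for word in words]

   # When the method is called only for its side effect, a plain loop
   # avoids building a throwaway list of None values.
   stacks = [[3, 1, 2], [9, 7, 8]]
   for stack in stacks:
       stack.sort()
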
.. _faq-augmented-assignment-tuple-error:

Why does a_tuple[i] += ['item'] raise an exception when the addition works?
---------------------------------------------------------------------------

This is because of a combination of the fact that augmented assignment
operators are *assignment* operators, and the difference between mutable and
immutable objects in Python.

This discussion applies in general when augmented assignment operators are
applied to elements of a tuple that point to mutable objects, but we'll use
a ``list`` and ``+=`` as our exemplar.

If you wrote::

   >>> a_tuple = (1, 2)
   >>> a_tuple[0] += 1
   Traceback (most recent call last):
      ...
   TypeError: 'tuple' object does not support item assignment

The reason for the exception should be immediately clear: ``1`` is added to the
object ``a_tuple[0]`` points to (``1``), producing the result object, ``2``,
but when we attempt to assign the result of the computation, ``2``, to element
``0`` of the tuple, we get an error because we can't change what an element of
a tuple points to.

Under the covers, what this augmented assignment statement is doing is
approximately this::

   >>> result = a_tuple[0] + 1
   >>> a_tuple[0] = result
   Traceback (most recent call last):
     ...
   TypeError: 'tuple' object does not support item assignment

It is the assignment part of the operation that produces the error, since a
tuple is immutable.

When you write something like::

   >>> a_tuple = (['foo'], 'bar')
   >>> a_tuple[0] += ['item']
   Traceback (most recent call last):
     ...
   TypeError: 'tuple' object does not support item assignment

The exception is a bit more surprising, and even more surprising is the fact
that even though there was an error, the append worked::

   >>> a_tuple[0]
   ['foo', 'item']

To see why this happens, you need to know that (a) if an object implements an
``__iadd__`` magic method, it gets called when the ``+=`` augmented assignment
is executed, and its return value is what gets used in the assignment statement;
and (b) for lists, ``__iadd__`` is equivalent to calling ``extend`` on the list
and returning the list. That's why we say that for lists, ``+=`` is a
"shorthand" for ``list.extend``::

   >>> a_list = []
   >>> a_list += [1]
   >>> a_list
   [1]

This is equivalent to::

   >>> result = a_list.__iadd__([1])
   >>> a_list = result

The object pointed to by ``a_list`` has been mutated, and the pointer to the
mutated object is assigned back to ``a_list``. The end result of the
assignment is a no-op, since it is a pointer to the same object that ``a_list``
was previously pointing to, but the assignment still happens.

Thus, in our tuple example what is happening is equivalent to::

   >>> result = a_tuple[0].__iadd__(['item'])
   >>> a_tuple[0] = result
   Traceback (most recent call last):
     ...
   TypeError: 'tuple' object does not support item assignment

The ``__iadd__`` succeeds, and thus the list is extended, but even though
``result`` points to the same object that ``a_tuple[0]`` already points to,
that final assignment still results in an error, because tuples are immutable.


I want to do a complicated sort: can you do a Schwartzian Transform in Python?
------------------------------------------------------------------------------

The technique, attributed to Randal Schwartz of the Perl community, sorts the
elements of a list by a metric which maps each element to its "sort value". In
Python, use the ``key`` argument for the :meth:`list.sort` method::

   Isorted = L[:]
   Isorted.sort(key=lambda s: int(s[10:15]))


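To make this concrete, here is a sketch with hypothetical fixed-width records
in which columns 10 to 14 hold a zero-padded number. The second half spells out
the decorate, sort, undecorate steps that the ``key`` argument performs for you:

.. code-block:: python

   # Build sample records: a name padded to 10 columns, then a 5-digit number.
   L = [f'{name:<10}{number:05d} {color}'
        for name, number, color in [('widget', 42, 'blue'),
                                    ('gadget', 7, 'red'),
                                    ('gizmo', 100, 'green')]]

   # Sort a copy by the integer stored in columns 10-14.
   Isorted = sorted(L, key=lambda s: int(s[10:15]))

   # The same result, written as an explicit transform.  The index i breaks
   # ties so that equal sort values keep their original relative order.
   decorated = [(int(s[10:15]), i, s) for i, s in enumerate(L)]
   decorated.sort()
   undecorated = [s for _, _, s in decorated]

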
How can I sort one list by values from another list?
----------------------------------------------------

Merge them into an iterator of tuples, sort the resulting list, and then pick
out the element you want. ::

   >>> list1 = ["what", "I'm", "sorting", "by"]
   >>> list2 = ["something", "else", "to", "sort"]
   >>> pairs = zip(list1, list2)
   >>> pairs = sorted(pairs)
   >>> pairs
   [("I'm", 'else'), ('by', 'sort'), ('sorting', 'to'), ('what', 'something')]
   >>> result = [x[1] for x in pairs]
   >>> result
   ['else', 'sort', 'to', 'something']


Objects
=======

What is a class?
----------------

A class is the particular object type created by executing a class statement.
Class objects are used as templates to create instance objects, which embody
both the data (attributes) and code (methods) specific to a datatype.

A class can be based on one or more other classes, called its base class(es). It
then inherits the attributes and methods of its base classes. This allows an
object model to be successively refined by inheritance. You might have a
generic ``Mailbox`` class that provides basic accessor methods for a mailbox,
and subclasses such as ``MboxMailbox``, ``MaildirMailbox``, ``OutlookMailbox``
that handle various specific mailbox formats.


What is a method?
-----------------

A method is a function on some object ``x`` that you normally call as
``x.name(arguments...)``. Methods are defined as functions inside the class
definition::

   class C:
       def meth(self, arg):
           return arg * 2 + self.attribute


What is self?
-------------

Self is merely a conventional name for the first argument of a method. A method
defined as ``meth(self, a, b, c)`` should be called as ``x.meth(a, b, c)`` for
some instance ``x`` of the class in which the definition occurs; the called
method will think it is called as ``meth(x, a, b, c)``.

See also :ref:`why-self`.


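The two ways of calling a method can be compared directly. This sketch reuses
the class ``C`` from the previous answer and adds an ``__init__`` (an
assumption made here so that ``self.attribute`` exists):

.. code-block:: python

   class C:
       def __init__(self, attribute):
           self.attribute = attribute

       def meth(self, arg):
           return arg * 2 + self.attribute

   x = C(1)

   bound = x.meth(10)        # the instance is passed as self automatically
   explicit = C.meth(x, 10)  # the same call with self passed by hand

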
How do I check if an object is an instance of a given class or of a subclass of it?
-----------------------------------------------------------------------------------

Use the built-in function ``isinstance(obj, cls)``. You can check if an object
is an instance of any of a number of classes by providing a tuple instead of a
single class, e.g. ``isinstance(obj, (class1, class2, ...))``, and can also
check whether an object is one of Python's built-in types, e.g.
``isinstance(obj, str)`` or ``isinstance(obj, (int, float, complex))``.

Note that :func:`isinstance` also checks for virtual inheritance from an
:term:`abstract base class`. So, the test will return ``True`` for a
registered class even if it hasn't directly or indirectly inherited from it. To
test for "true inheritance", scan the :term:`MRO` of the class:

.. testcode::

   from collections.abc import Mapping

   class P:
       pass

   class C(P):
       pass

   Mapping.register(P)

.. doctest::

   >>> c = C()
   >>> isinstance(c, C)        # direct
   True
   >>> isinstance(c, P)        # indirect
   True
   >>> isinstance(c, Mapping)  # virtual
   True

   # Actual inheritance chain
   >>> type(c).__mro__
   (<class 'C'>, <class 'P'>, <class 'object'>)

   # Test for "true inheritance"
   >>> Mapping in type(c).__mro__
   False

Note that most programs do not use :func:`isinstance` on user-defined classes
very often. If you are developing the classes yourself, a more proper
object-oriented style is to define methods on the classes that encapsulate a
particular behaviour, instead of checking the object's class and doing a
different thing based on what class it is. For example, if you have a function
that does something::

   def search(obj):
       if isinstance(obj, Mailbox):
           ...  # code to search a mailbox
       elif isinstance(obj, Document):
           ...  # code to search a document
       elif ...

A better approach is to define a ``search()`` method on all the classes and just
call it::

   class Mailbox:
       def search(self):
           ...  # code to search a mailbox

   class Document:
       def search(self):
           ...  # code to search a document

   obj.search()


What is delegation?
-------------------

Delegation is an object oriented technique (also called a design pattern).
Let's say you have an object ``x`` and want to change the behaviour of just one
of its methods. You can create a new class that provides a new implementation
of the method you're interested in changing and delegates all other methods to
the corresponding method of ``x``.

Python programmers can easily implement delegation. For example, the following
class behaves like a file but converts all written data to uppercase::

   class UpperOut:

       def __init__(self, outfile):
           self._outfile = outfile

       def write(self, s):
           self._outfile.write(s.upper())

       def __getattr__(self, name):
           return getattr(self._outfile, name)

Here the ``UpperOut`` class redefines the ``write()`` method to convert the
argument string to uppercase before calling the underlying
``self._outfile.write()`` method. All other methods are delegated to the
underlying ``self._outfile`` object. The delegation is accomplished via the
``__getattr__`` method; consult :ref:`the language reference <attribute-access>`
for more information about controlling attribute access.

Note that for more general cases delegation can get trickier. When attributes
must be set as well as retrieved, the class must define a :meth:`__setattr__`
method too, and it must do so carefully. The basic implementation of
:meth:`__setattr__` is roughly equivalent to the following::

   class X:
       ...
       def __setattr__(self, name, value):
           self.__dict__[name] = value
       ...

Most :meth:`__setattr__` implementations must modify ``self.__dict__`` to store
local state for self without causing an infinite recursion.


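The following is one possible sketch of a wrapper that delegates both reading
and setting of attributes. It is a variation on ``UpperOut``, not the only way
to do it, and the ``Sink`` class with its ``label`` attribute is hypothetical:

.. code-block:: python

   import io

   class UpperOut:

       def __init__(self, outfile):
           # Store local state directly in the instance dict so that the
           # __setattr__ defined below is not triggered for it.
           self.__dict__['_outfile'] = outfile

       def write(self, s):
           self._outfile.write(s.upper())

       def __getattr__(self, name):
           # Only called when normal attribute lookup fails.
           return getattr(self._outfile, name)

       def __setattr__(self, name, value):
           # Forward every assignment to the wrapped object.
           setattr(self._outfile, name, value)

   class Sink(io.StringIO):
       """Hypothetical file-like object with an ordinary attribute."""
       label = 'unnamed'

   sink = Sink()
   out = UpperOut(sink)
   out.write('hello ')        # handled by UpperOut.write
   out.writelines(['world'])  # delegated unchanged to the Sink object
   out.label = 'log'          # forwarded by __setattr__
   text = out.getvalue()      # delegated by __getattr__

Only ``write()`` is overridden, so the text passed through ``writelines()``
reaches the underlying object without being converted.

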
How do I call a method defined in a base class from a derived class that extends it?
------------------------------------------------------------------------------------

Use the built-in :func:`super` function::

   class Derived(Base):
       def meth(self):
           super().meth()  # calls Base.meth

In the example, :func:`super` will automatically determine the instance from
which it was called (the ``self`` value), look up the :term:`method resolution
order` (MRO) with ``type(self).__mro__``, and return the next in line after
``Derived`` in the MRO: ``Base``.


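A self-contained version of the example, with a ``Base`` class filled in for
illustration (the return values are made up so that the call order is visible):

.. code-block:: python

   class Base:
       def meth(self):
           return ['Base.meth']

   class Derived(Base):
       def meth(self):
           # Extend, rather than replace, the inherited behaviour.
           return super().meth() + ['Derived.meth']

   calls = Derived().meth()
   mro_names = [cls.__name__ for cls in Derived.__mro__]

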
How can I organize my code to make it easier to change the base class?
----------------------------------------------------------------------

You could assign the base class to an alias and derive from the alias. Then all
you have to change is the value assigned to the alias. Incidentally, this trick
is also handy if you want to decide dynamically (e.g. depending on availability
of resources) which base class to use. Example::

   class Base:
       ...

   BaseAlias = Base

   class Derived(BaseAlias):
       ...


How do I create static class data and static class methods?
-----------------------------------------------------------

Both static data and static methods (in the sense of C++ or Java) are supported
in Python.

For static data, simply define a class attribute. To assign a new value to the
attribute, you have to explicitly use the class name in the assignment::

   class C:
       count = 0   # number of times C.__init__ called

       def __init__(self):
           C.count = C.count + 1

       def getcount(self):
           return C.count  # or return self.count

``c.count`` also refers to ``C.count`` for any ``c`` such that ``isinstance(c,
C)`` holds, unless overridden by ``c`` itself or by some class on the base-class
search path from ``c.__class__`` back to ``C``.

Caution: within a method of ``C``, an assignment like ``self.count = 42``
creates a new and unrelated instance attribute named "count" in ``self``'s own
dict. Rebinding of a class-static data name must always specify the class
whether inside a method or not::

   C.count = 314

Static methods are possible::

   class C:
       @staticmethod
       def static(arg1, arg2, arg3):
           # No 'self' parameter!
           ...

However, a far more straightforward way to get the effect of a static method is
via a simple module-level function::

   def getcount():
       return C.count

If your code is structured so as to define one class (or tightly related class
hierarchy) per module, this supplies the desired encapsulation.


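The caution about rebinding through ``self`` can be seen in a short session.
This is a sketch; the ``describe`` static method is a hypothetical addition:

.. code-block:: python

   class C:
       count = 0   # shared by all instances

       def __init__(self):
           C.count = C.count + 1

       @staticmethod
       def describe(n):
           # No 'self' parameter: callable on the class or on an instance.
           return f'{n} instance(s) created'

   a = C()
   b = C()
   shared = (a.count, b.count, C.count)    # all three read the class attribute

   a.count = 42    # creates an instance attribute on a only
   shadowed = (a.count, b.count, C.count)

   message = C.describe(C.count)

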
How can I overload constructors (or methods) in Python?
-------------------------------------------------------

This answer actually applies to all methods, but the question usually comes up
first in the context of constructors.

In C++ you'd write

.. code-block:: c

   class C {
   public:
       C() { cout << "No arguments\n"; }
       C(int i) { cout << "Argument is " << i << "\n"; }
   };

In Python you have to write a single constructor that catches all cases using
default arguments. For example::

   class C:
       def __init__(self, i=None):
           if i is None:
               print("No arguments")
           else:
               print("Argument is", i)

This is not entirely equivalent, but close enough in practice.

You could also try a variable-length argument list, e.g. ::

   def __init__(self, *args):
       ...

The same approach works for all method definitions.


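One way the variable-length form might be filled in is sketched below. The
``description`` attribute and the limit of one argument are assumptions chosen
to mirror the C++ example:

.. code-block:: python

   class C:
       def __init__(self, *args):
           # args is a tuple holding whatever positional arguments were passed.
           if not args:
               self.description = 'No arguments'
           elif len(args) == 1:
               self.description = f'Argument is {args[0]}'
           else:
               raise TypeError(
                   f'expected at most 1 argument, got {len(args)}')

   no_args = C().description
   one_arg = C(7).description

Unlike the default-argument version, this form can tell ``C()`` apart from
``C(None)``.

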
I try to use __spam and I get an error about _SomeClassName__spam.
------------------------------------------------------------------

Variable names with double leading underscores are "mangled" to provide a simple
but effective way to define class private variables. Any identifier of the form
``__spam`` (at least two leading underscores, at most one trailing underscore)
is textually replaced with ``_classname__spam``, where ``classname`` is the
current class name with any leading underscores stripped.

This doesn't guarantee privacy: an outside user can still deliberately access
the ``_classname__spam`` attribute, and private values are visible in the
object's ``__dict__``. Many Python programmers never bother to use private
variable names at all.


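The mangling can be observed directly. In this sketch the class and its
``get_spam`` method are made up for illustration:

.. code-block:: python

   class SomeClassName:
       def __init__(self):
           self.__spam = 'eggs'    # stored as _SomeClassName__spam

       def get_spam(self):
           return self.__spam      # mangled the same way inside the class

   obj = SomeClassName()

   inside = obj.get_spam()
   stored_names = list(vars(obj))
   outside = obj._SomeClassName__spam    # still reachable from outside

   try:
       obj.__spam    # not mangled to the stored name here, so lookup fails
   except AttributeError:
       unmangled_lookup_failed = True
   else:
       unmangled_lookup_failed = False

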
My class defines __del__ but it is not called when I delete the object.
-----------------------------------------------------------------------

There are several possible reasons for this.

The ``del`` statement does not necessarily call :meth:`__del__` -- it simply
decrements the object's reference count, and if this reaches zero
:meth:`__del__` is called.

If your data structures contain circular links (e.g. a tree where each child has
a parent reference and each parent has a list of children) the reference counts
will never go back to zero. Once in a while Python runs an algorithm to detect
such cycles, but the garbage collector might run some time after the last
reference to your data structure vanishes, so your :meth:`__del__` method may be
called at an inconvenient and random time. This is inconvenient if you're trying
to reproduce a problem. Worse, the order in which objects' :meth:`__del__`
methods are executed is arbitrary. You can run :func:`gc.collect` to force a
collection, but there *are* pathological cases where objects will never be
collected.

Despite the cycle collector, it's still a good idea to define an explicit
``close()`` method on objects to be called whenever you're done with them. The
``close()`` method can then remove attributes that refer to subobjects. Don't
call :meth:`__del__` directly -- :meth:`__del__` should call ``close()`` and
``close()`` should make sure that it can be called more than once for the same
object.

Another way to avoid cyclical references is to use the :mod:`weakref` module,
which allows you to point to objects without incrementing their reference count.
Tree data structures, for instance, should use weak references for their parent
and sibling references (if they need them!).

.. XXX relevant for Python 3?

   If the object has ever been a local variable in a function that caught an
   expression in an except clause, chances are that a reference to the object
   still exists in that function's stack frame as contained in the stack trace.
   Normally, calling :func:`sys.exc_clear` will take care of this by clearing
   the last recorded exception.

Finally, if your :meth:`__del__` method raises an exception, a warning message
is printed to :data:`sys.stderr`.


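As a sketch of the weak-reference approach for trees, the hypothetical ``Node``
class below keeps strong references from parent to child and only a weak
reference from child to parent:

.. code-block:: python

   import weakref

   class Node:
       """Hypothetical tree node; the parent link is a weak reference."""

       def __init__(self, name, parent=None):
           self.name = name
           self.children = []
           # weakref.ref does not keep the parent alive.
           self._parent = weakref.ref(parent) if parent is not None else None
           if parent is not None:
               parent.children.append(self)

       @property
       def parent(self):
           # Calling the weak reference returns the object, or None once
           # the parent no longer exists.
           return self._parent() if self._parent is not None else None

   root = Node('root')
   child = Node('child', parent=root)
   parent_name = child.parent.name

Because the only strong references run from parent to child, no reference
cycle is formed between the two nodes.

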
How do I get a list of all instances of a given class?
------------------------------------------------------

Python does not keep track of all instances of a class (or of a built-in type).
You can program the class's constructor to keep track of all instances by
keeping a list of weak references to each instance.


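A minimal sketch of that idea follows. The ``Tracked`` class, its
``_instances`` list and the ``instances()`` helper are all hypothetical names:

.. code-block:: python

   import weakref

   class Tracked:
       """Hypothetical class that records a weak reference to each instance."""

       _instances = []

       def __init__(self, name):
           self.name = name
           Tracked._instances.append(weakref.ref(self))

       @classmethod
       def instances(cls):
           # Dereference each weak reference and skip the dead ones.
           alive = []
           for ref in cls._instances:
               obj = ref()
               if obj is not None:
                   alive.append(obj)
           return alive

   a = Tracked('a')
   b = Tracked('b')
   names = [obj.name for obj in Tracked.instances()]

Dead references stay in ``_instances`` until they are pruned, and instances of
subclasses end up in the same list. ``weakref.WeakSet`` is an alternative
container that discards entries automatically.

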
Georg Brandl | d8ede4f | 2013-10-12 18:14:25 +0200 | [diff] [blame] | 1717 | Why does the result of ``id()`` appear to be not unique? |
| 1718 | -------------------------------------------------------- |
| 1719 | |
| 1720 | The :func:`id` builtin returns an integer that is guaranteed to be unique during |
| 1721 | the lifetime of the object. Since in CPython, this is the object's memory |
| 1722 | address, it happens frequently that after an object is deleted from memory, the |
| 1723 | next freshly created object is allocated at the same position in memory. This |
| 1724 | is illustrated by this example: |
| 1725 | |
Senthil Kumaran | 7749320 | 2016-06-04 20:07:34 -0700 | [diff] [blame] | 1726 | >>> id(1000) # doctest: +SKIP |
Georg Brandl | d8ede4f | 2013-10-12 18:14:25 +0200 | [diff] [blame] | 1727 | 13901272 |
Senthil Kumaran | 7749320 | 2016-06-04 20:07:34 -0700 | [diff] [blame] | 1728 | >>> id(2000) # doctest: +SKIP |
Georg Brandl | d8ede4f | 2013-10-12 18:14:25 +0200 | [diff] [blame] | 1729 | 13901272 |
| 1730 | |
| 1731 | The two ids belong to different integer objects that are created before, and |
| 1732 | deleted immediately after execution of the ``id()`` call. To be sure that |
| 1733 | objects whose id you want to examine are still alive, create another reference |
| 1734 | to the object: |
| 1735 | |
| 1736 | >>> a = 1000; b = 2000 |
Senthil Kumaran | 7749320 | 2016-06-04 20:07:34 -0700 | [diff] [blame] | 1737 | >>> id(a) # doctest: +SKIP |
Georg Brandl | d8ede4f | 2013-10-12 18:14:25 +0200 | [diff] [blame] | 1738 | 13901272 |
Senthil Kumaran | 7749320 | 2016-06-04 20:07:34 -0700 | [diff] [blame] | 1739 | >>> id(b) # doctest: +SKIP |
Georg Brandl | d8ede4f | 2013-10-12 18:14:25 +0200 | [diff] [blame] | 1740 | 13891296 |


When can I rely on identity tests with the *is* operator?
---------------------------------------------------------

The ``is`` operator tests for object identity. The test ``a is b`` is
equivalent to ``id(a) == id(b)``.

The most important property of an identity test is that an object is always
identical to itself: ``a is a`` always returns ``True``. Identity tests are
usually faster than equality tests. And unlike equality tests, identity tests
are guaranteed to return a boolean ``True`` or ``False``.

However, identity tests can *only* be substituted for equality tests when
object identity is assured. Generally, there are three circumstances where
identity is guaranteed:

1) Assignments create new names but do not change object identity. After the
   assignment ``new = old``, it is guaranteed that ``new is old``.

2) Putting an object in a container that stores object references does not
   change object identity. After the list assignment ``s[0] = x``, it is
   guaranteed that ``s[0] is x``.

3) If an object is a singleton, it means that only one instance of that object
   can exist. After the assignments ``a = None`` and ``b = None``, it is
   guaranteed that ``a is b`` because ``None`` is a singleton.

In most other circumstances, identity tests are inadvisable and equality tests
are preferred. In particular, identity tests should not be used to check
constants such as :class:`int` and :class:`str` which aren't guaranteed to be
singletons::

    >>> a = 1000
    >>> b = 500
    >>> c = b + 500
    >>> a is c
    False

    >>> a = 'Python'
    >>> b = 'Py'
    >>> c = b + 'thon'
    >>> a is c
    False

Likewise, new instances of mutable containers are never identical::

    >>> a = []
    >>> b = []
    >>> a is b
    False

In the standard library code, you will see several common patterns for
correctly using identity tests:

1) As recommended by :pep:`8`, an identity test is the preferred way to check
   for ``None``. This reads like plain English in code and avoids confusion with
   other objects that may have boolean values that evaluate to false.

2) Detecting optional arguments can be tricky when ``None`` is a valid input
   value. In those situations, you can create a singleton sentinel object
   guaranteed to be distinct from other objects. For example, here is how
   to implement a method that behaves like :meth:`dict.pop`::

      _sentinel = object()

      def pop(self, key, default=_sentinel):
          if key in self:
              value = self[key]
              del self[key]
              return value
          if default is _sentinel:
              raise KeyError(key)
          return default

3) Container implementations sometimes need to augment equality tests with
   identity tests. This prevents the code from being confused by objects such as
   ``float('NaN')`` that are not equal to themselves.

For example, here is the implementation of
:meth:`collections.abc.Sequence.__contains__`::

    def __contains__(self, value):
        for v in self:
            if v is value or v == value:
                return True
        return False
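
The effect of the identity check is easy to observe with a built-in list,
which applies the same identity-or-equality rule for membership tests. A short
demonstration:

```python
nan = float('NaN')

print(nan == nan)        # False: NaN is not equal to itself
print(nan in [nan])      # True: the identity check succeeds first
print([nan].count(nan))  # 1
```

Without the ``v is value`` shortcut, a container could not find a NaN that it
demonstrably holds.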


How do I cache method calls?
----------------------------

The two principal tools for caching methods are
:func:`functools.cached_property` and :func:`functools.lru_cache`. The
former stores results at the instance level and the latter at the class
level.

The *cached_property* approach only works with methods that do not take
any arguments. It does not create a reference to the instance. The
cached method result will be kept only as long as the instance is alive.

The advantage is that when an instance is no longer used, the cached
method result will be released right away. The disadvantage is that if
instances accumulate, so too will the cached method results. They
can grow without bound.

The *lru_cache* approach works with methods that have hashable
arguments. It creates a reference to the instance unless special
efforts are made to pass in weak references.

The advantage of the least recently used algorithm is that the cache is
bounded by the specified *maxsize*. The disadvantage is that instances
are kept alive until they age out of the cache or until the cache is
cleared.

This example shows the various techniques::

    from functools import cached_property, lru_cache

    class Weather:
        "Lookup weather information on a government website"

        def __init__(self, station_id):
            self._station_id = station_id
            # The _station_id is private and immutable

        def current_temperature(self):
            "Latest hourly observation"
            # Do not cache this because old results
            # can be out of date.

        @cached_property
        def location(self):
            "Return the longitude/latitude coordinates of the station"
            # Result only depends on the station_id

        @lru_cache(maxsize=20)
        def historic_rainfall(self, date, units='mm'):
            "Rainfall on a given date"
            # Depends on the station_id, date, and units.

The above example assumes that the *station_id* never changes. If the
relevant instance attributes are mutable, the *cached_property* approach
can't be made to work because it cannot detect changes to the
attributes.

The *lru_cache* approach can be made to work, but the class needs to define the
*__eq__* and *__hash__* methods so the cache can detect relevant attribute
updates::

    from functools import lru_cache

    class Weather:
        "Example with a mutable station identifier"

        def __init__(self, station_id):
            self.station_id = station_id

        def change_station(self, station_id):
            self.station_id = station_id

        def __eq__(self, other):
            return self.station_id == other.station_id

        def __hash__(self):
            return hash(self.station_id)

        @lru_cache(maxsize=20)
        def historic_rainfall(self, date, units='cm'):
            'Rainfall on a given date'
            # Depends on the station_id, date, and units.
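
To see the cache react to an attribute change, here is a small runnable
variant. The ``Station`` class, its ``lookups`` counter, and the placeholder
return value are hypothetical and exist only to make cache hits and misses
visible.

```python
from functools import lru_cache


class Station:
    """Hypothetical stand-in for the Weather class, with a call counter."""

    lookups = 0  # counts how often the method body actually runs

    def __init__(self, station_id):
        self.station_id = station_id

    def __eq__(self, other):
        if not isinstance(other, Station):
            return NotImplemented
        return self.station_id == other.station_id

    def __hash__(self):
        return hash(self.station_id)

    @lru_cache(maxsize=20)
    def historic_rainfall(self, date, units='mm'):
        Station.lookups += 1
        return f'{self.station_id}:{date}:{units}'  # placeholder result


s = Station('KSFO')
s.historic_rainfall('2021-06-01')
s.historic_rainfall('2021-06-01')   # served from the cache
print(Station.lookups)              # 1
s.station_id = 'KJFK'               # the instance now hashes differently
s.historic_rainfall('2021-06-01')   # cache miss, the body runs again
print(Station.lookups)              # 2
```

The second lookup after the change misses because the cache key includes the
instance, and the instance's hash is derived from ``station_id``.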


Modules
=======

How do I create a .pyc file?
----------------------------

When a module is imported for the first time (or when the source file has
changed since the current compiled file was created) a ``.pyc`` file containing
the compiled code should be created in a ``__pycache__`` subdirectory of the
directory containing the ``.py`` file. The ``.pyc`` file will have a
filename that starts with the same name as the ``.py`` file, and ends with
``.pyc``, with a middle component that depends on the particular ``python``
binary that created it. (See :pep:`3147` for details.)
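
If you want to see the name the running interpreter would use, the following
sketch asks :mod:`importlib.util` for it. The file name ``foo.py`` is only a
placeholder; the file does not need to exist, and the exact output depends on
your interpreter and configuration.

```python
import importlib.util
import sys

# Path where the running interpreter would cache the bytecode for foo.py.
pyc_path = importlib.util.cache_from_source('foo.py')
print(pyc_path)

# The middle component of the file name is the interpreter's cache tag.
print(sys.implementation.cache_tag)
```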

One reason that a ``.pyc`` file may not be created is a permissions problem
with the directory containing the source file, meaning that the ``__pycache__``
subdirectory cannot be created. This can happen, for example, if you develop as
one user but run as another, such as if you are testing with a web server.

Unless the :envvar:`PYTHONDONTWRITEBYTECODE` environment variable is set,
creation of a ``.pyc`` file is automatic if you're importing a module and Python
has the ability (permissions, free space, etc.) to create a ``__pycache__``
subdirectory and write the compiled module to that subdirectory.

Running Python on a top-level script is not considered an import and no
``.pyc`` will be created. For example, if you have a top-level module
``foo.py`` that imports another module ``xyz.py``, when you run ``foo`` (by
typing ``python foo.py`` as a shell command), a ``.pyc`` will be created for
``xyz`` because ``xyz`` is imported, but no ``.pyc`` file will be created for
``foo`` since ``foo.py`` isn't being imported.

If you need to create a ``.pyc`` file for ``foo`` -- that is, to create a
``.pyc`` file for a module that is not imported -- you can, using the
:mod:`py_compile` and :mod:`compileall` modules.

The :mod:`py_compile` module can manually compile any module. One way is to use
the ``compile()`` function in that module interactively::

    >>> import py_compile
    >>> py_compile.compile('foo.py')                 # doctest: +SKIP

This will write the ``.pyc`` to a ``__pycache__`` subdirectory in the same
location as ``foo.py`` (or you can override that with the optional parameter
``cfile``).
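
Both variants can be tried safely in a temporary directory. In this sketch the
module ``foo.py`` and its contents are made up for the demonstration.

```python
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, 'foo.py')  # hypothetical module
    with open(source, 'w') as f:
        f.write('x = 1\n')

    # Default location: a __pycache__ subdirectory next to the source file.
    default_pyc = py_compile.compile(source)

    # Explicit location chosen through the cfile parameter.
    custom_pyc = py_compile.compile(source, cfile=os.path.join(tmp, 'foo.pyc'))

    print(default_pyc)
    print(custom_pyc)
    default_exists = os.path.exists(default_pyc)
    custom_exists = os.path.exists(custom_pyc)
```

``py_compile.compile()`` returns the path of the file it wrote, which is handy
when you let it pick the default location.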

You can also automatically compile all files in a directory or directories using
the :mod:`compileall` module. You can do it from the shell prompt by running
the module with ``python -m compileall`` and providing the path of a directory
containing Python files to compile::

    python -m compileall .


How do I find the current module name?
--------------------------------------

A module can find out its own module name by looking at the predefined global
variable ``__name__``. If this has the value ``'__main__'``, the program is
running as a script. Many modules that are usually used by importing them also
provide a command-line interface or a self-test, and only execute this code
after checking ``__name__``::

    def main():
        print('Running test...')
        ...

    if __name__ == '__main__':
        main()


How can I have modules that mutually import each other?
-------------------------------------------------------

Suppose you have the following modules:

:file:`foo.py`::

    from bar import bar_var
    foo_var = 1

:file:`bar.py`::

    from foo import foo_var
    bar_var = 2

The problem is that the interpreter will perform the following steps:

* main imports ``foo``
* Empty globals for ``foo`` are created
* ``foo`` is compiled and starts executing
* ``foo`` imports ``bar``
* Empty globals for ``bar`` are created
* ``bar`` is compiled and starts executing
* ``bar`` imports ``foo`` (which is a no-op since there already is a module
  named ``foo``)
* The import mechanism tries to read ``foo_var`` from ``foo`` globals, to set
  ``bar.foo_var = foo.foo_var``

The last step fails, because Python isn't done with interpreting ``foo`` yet and
the global symbol dictionary for ``foo`` is still empty.

The same thing happens when you use ``import foo``, and then try to access
``foo.foo_var`` in global code.

There are (at least) three possible workarounds for this problem.

Guido van Rossum recommends avoiding all uses of ``from <module> import ...``,
and placing all code inside functions. Initializations of global variables and
class variables should use constants or built-in functions only. This means
everything from an imported module is referenced as ``<module>.<name>``.

Jim Roskind suggests performing steps in the following order in each module:

* exports (globals, functions, and classes that don't need imported base
  classes)
* ``import`` statements
* active code (including globals that are initialized from imported values).

van Rossum doesn't like this approach much because the imports appear in a
strange place, but it does work.

Matthias Urlichs recommends restructuring your code so that the recursive import
is not necessary in the first place.

These solutions are not mutually exclusive.
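
The first workaround can be tried out with the following sketch, which writes
two mutually importing modules to a temporary directory. The module names
``faq_foo`` and ``faq_bar`` and the ``total()`` functions are hypothetical and
were chosen only to avoid clashing with real modules.

```python
import importlib
import os
import sys
import tempfile
import textwrap

FOO = textwrap.dedent('''
    import faq_bar            # plain import: no names are copied yet

    foo_var = 1

    def total():
        # faq_bar.bar_var is looked up at call time, after both imports finish.
        return foo_var + faq_bar.bar_var
''')
BAR = textwrap.dedent('''
    import faq_foo

    bar_var = 2

    def total():
        return bar_var + faq_foo.foo_var
''')

with tempfile.TemporaryDirectory() as tmp:
    for name, text in (('faq_foo', FOO), ('faq_bar', BAR)):
        with open(os.path.join(tmp, name + '.py'), 'w') as f:
            f.write(text)
    sys.path.insert(0, tmp)
    try:
        faq_foo = importlib.import_module('faq_foo')
        result = faq_foo.total()
    finally:
        sys.path.remove(tmp)

print(result)  # 3
```

The circular import succeeds here because neither module reads an attribute of
the other while it is still being executed; all cross-module lookups happen
inside functions.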


__import__('x.y.z') returns <module 'x'>; how do I get z?
---------------------------------------------------------

Consider using the convenience function :func:`~importlib.import_module` from
:mod:`importlib` instead::

    import importlib

    z = importlib.import_module('x.y.z')
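
The difference between the two calls can be checked with a package from the
standard library; ``xml.dom.minidom`` is used here purely as a convenient
example of a dotted name.

```python
import importlib

top = __import__('xml.dom.minidom')
leaf = importlib.import_module('xml.dom.minidom')

print(top.__name__)   # xml
print(leaf.__name__)  # xml.dom.minidom

# Starting from the top-level package, the submodule is reachable by attribute.
print(top.dom.minidom is leaf)  # True
```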


When I edit an imported module and reimport it, the changes don't show up. Why does this happen?
-------------------------------------------------------------------------------------------------

For reasons of efficiency as well as consistency, Python only reads the module
file the first time a module is imported. If it didn't, in a program
consisting of many modules where each one imports the same basic module, the
basic module would be parsed and re-parsed many times. To force re-reading of a
changed module, do this::

    import importlib
    import modname
    importlib.reload(modname)

Warning: this technique is not 100% fool-proof. In particular, modules
containing statements like ::

    from modname import some_objects

will continue to work with the old version of the imported objects. If the
module contains class definitions, existing class instances will *not* be
updated to use the new class definition. This can result in the following
paradoxical behaviour::

    >>> import importlib
    >>> import cls
    >>> c = cls.C()                # Create an instance of C
    >>> importlib.reload(cls)
    <module 'cls' from 'cls.py'>
    >>> isinstance(c, cls.C)       # isinstance is false?!?
    False

The nature of the problem is made clear if you print out the "identity" of the
class objects::

    >>> hex(id(c.__class__))
    '0x7352a0'
    >>> hex(id(cls.C))
    '0x4198d0'