:tocdepth: 2

===============
Programming FAQ
===============

.. contents::

General Questions
=================

Is there a source code level debugger with breakpoints, single-stepping, etc.?
------------------------------------------------------------------------------

Yes.

The pdb module is a simple but adequate console-mode debugger for Python. It is
part of the standard Python library, and is :mod:`documented in the Library
Reference Manual <pdb>`. You can also write your own debugger by using the code
for pdb as an example.

The IDLE interactive development environment, which is part of the standard
Python distribution (normally available as Tools/scripts/idle), includes a
graphical debugger. There is documentation for the IDLE debugger at
http://www.python.org/idle/doc/idle2.html#Debugger.

PythonWin is a Python IDE that includes a GUI debugger based on pdb. The
Pythonwin debugger colors breakpoints and has quite a few cool features such as
debugging non-Pythonwin programs. Pythonwin is available as part of the `Python
for Windows Extensions <http://sourceforge.net/projects/pywin32/>`__ project and
as a part of the ActivePython distribution (see
http://www.activestate.com/Products/ActivePython/index.html).

`Boa Constructor <http://boa-constructor.sourceforge.net/>`_ is an IDE and GUI
builder that uses wxWidgets. It offers visual frame creation and manipulation,
an object inspector, many views on the source like object browsers, inheritance
hierarchies, doc string generated html documentation, an advanced debugger,
integrated help, and Zope support.

`Eric <http://www.die-offenbachs.de/eric/index.html>`_ is an IDE built on PyQt
and the Scintilla editing component.

Pydb is a version of the standard Python debugger pdb, modified for use with DDD
(Data Display Debugger), a popular graphical debugger front end. Pydb can be
found at http://bashdb.sourceforge.net/pydb/ and DDD can be found at
http://www.gnu.org/software/ddd.

There are a number of commercial Python IDEs that include graphical debuggers.
They include:

* Wing IDE (http://wingware.com/)
* Komodo IDE (http://www.activestate.com/Products/Komodo)


Is there a tool to help find bugs or perform static analysis?
-------------------------------------------------------------

Yes.

PyChecker is a static analysis tool that finds bugs in Python source code and
warns about code complexity and style. You can get PyChecker from
http://pychecker.sf.net.

`Pylint <http://www.logilab.org/projects/pylint>`_ is another tool that checks
if a module satisfies a coding standard, and also makes it possible to write
plug-ins to add a custom feature. In addition to the bug checking that
PyChecker performs, Pylint offers some additional features such as checking line
length, whether variable names are well-formed according to your coding
standard, whether declared interfaces are fully implemented, and more.
http://www.logilab.org/card/pylint_manual provides a full list of Pylint's
features.


How can I create a stand-alone binary from a Python script?
-----------------------------------------------------------

You don't need the ability to compile Python to C code if all you want is a
stand-alone program that users can download and run without having to install
the Python distribution first. There are a number of tools that determine the
set of modules required by a program and bind these modules together with a
Python binary to produce a single executable.

One is the freeze tool, which is included in the Python source tree as
``Tools/freeze``. It converts Python byte code to C arrays; with a C compiler
you can embed all your modules into a new program, which is then linked with
the standard Python modules.

It works by scanning your source recursively for import statements (in both
forms) and looking for the modules in the standard Python path as well as in the
source directory (for built-in modules). It then turns the bytecode for modules
written in Python into C code (array initializers that can be turned into code
objects using the marshal module) and creates a custom-made config file that
only contains those built-in modules which are actually used in the program. It
then compiles the generated C code and links it with the rest of the Python
interpreter to form a self-contained binary which acts exactly like your script.

Obviously, freeze requires a C compiler. There are several other utilities
which don't. One is Thomas Heller's py2exe (Windows only) at

    http://www.py2exe.org/

Another is Christian Tismer's `SQFREEZE <http://starship.python.net/crew/pirx>`_
which appends the byte code to a specially-prepared Python interpreter that can
find the byte code in the executable.

Other tools include Fredrik Lundh's `Squeeze
<http://www.pythonware.com/products/python/squeeze>`_ and Anthony Tuininga's
`cx_Freeze <http://starship.python.net/crew/atuining/cx_Freeze/index.html>`_.


Are there coding standards or a style guide for Python programs?
----------------------------------------------------------------

Yes. The coding style required for standard library modules is documented as
:pep:`8`.


My program is too slow. How do I speed it up?
---------------------------------------------

That's a tough one, in general. There are many tricks to speed up Python code;
consider rewriting parts in C as a last resort.

In some cases it's possible to automatically translate Python to C or x86
assembly language, meaning that you don't have to modify your code to gain
increased speed.

.. XXX seems to have overlap with other questions!

`Pyrex <http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/>`_ can compile a
slightly modified version of Python code into a C extension, and can be used on
many different platforms.

`Psyco <http://psyco.sourceforge.net>`_ is a just-in-time compiler that
translates Python code into x86 assembly language. If you can use it, Psyco can
provide dramatic speedups for critical functions.

The rest of this answer will discuss various tricks for squeezing a bit more
speed out of Python code. *Never* apply any optimization tricks unless you know
you need them, after profiling has indicated that a particular function is the
heavily executed hot spot in the code. Optimizations almost always make the
code less clear, and you shouldn't pay the costs of reduced clarity (increased
development time, greater likelihood of bugs) unless the resulting performance
benefit is worth it.

There is a page on the wiki devoted to `performance tips
<http://wiki.python.org/moin/PythonSpeed/PerformanceTips>`_.

Guido van Rossum has written up an anecdote related to optimization at
http://www.python.org/doc/essays/list2str.html.

One thing to notice is that function and (especially) method calls are rather
expensive; if you have designed a purely OO interface with lots of tiny
functions that don't do much more than get or set an instance variable or call
another method, you might consider using a more direct way such as directly
accessing instance variables. Also see the standard module :mod:`profile` which
makes it possible to find out where your program is spending most of its time
(if you have some patience -- the profiling itself can slow your program down by
an order of magnitude).

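As a rough sketch of how profiling can point at a hot spot, the following uses
the standard :mod:`cProfile` and :mod:`pstats` modules. The function
``slow_sum`` and the helper ``profile_call`` are made up for illustration, and
the example is written for Python 3 (under Python 2, ``StringIO.StringIO``
would be needed in place of ``io.StringIO``).

```python
import cProfile
import io
import pstats


def slow_sum(n):
    # Hypothetical hot spot: sums squares in a plain Python loop.
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    # Run func under the profiler and return (result, report text).
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


result, report = profile_call(slow_sum, 10000)
```

The text in ``report`` lists the functions that took the most time; only once a
function shows up there is it worth applying the tricks discussed here.
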
Remember that many standard optimization heuristics you may know from other
programming experience may well apply to Python. For example it may be faster
to send output to output devices using larger writes rather than smaller ones in
order to reduce the overhead of kernel system calls. Thus CGI scripts that
write all output in "one shot" may be faster than those that write lots of small
pieces of output.

Also, be sure to use Python's core features where appropriate. For example,
slicing allows programs to chop up lists and other sequence objects in a single
tick of the interpreter's mainloop using highly optimized C implementations.
Thus to get the same effect as::

   L2 = []
   for i in range(3):
       L2.append(L1[i])

it is much shorter and far faster to use ::

   L2 = list(L1[:3])  # "list" is redundant if L1 is a list.

Note that the functionally-oriented built-in functions such as :func:`map`,
:func:`zip`, and friends can be a convenient accelerator for loops that
perform a single task. For example to pair the elements of two lists
together::

   >>> zip([1, 2, 3], [4, 5, 6])
   [(1, 4), (2, 5), (3, 6)]

or to compute a number of sines::

   >>> map(math.sin, (1, 2, 3, 4))
   [0.841470984808, 0.909297426826, 0.14112000806, -0.756802495308]

The operation completes very quickly in such cases.

Other examples include the ``join()`` and ``split()`` :ref:`methods
of string objects <string-methods>`.
For example if s1..s7 are large (10K+) strings then
``"".join([s1,s2,s3,s4,s5,s6,s7])`` may be far faster than the more obvious
``s1+s2+s3+s4+s5+s6+s7``, since the "summation" will compute many
subexpressions, whereas ``join()`` does all the copying in one pass. For
manipulating strings, use the ``replace()`` and the ``format()`` :ref:`methods
on string objects <string-methods>`. Use regular expressions only when you're
not dealing with constant string patterns. You may still use :ref:`the old %
operations <string-formatting>` ``string % tuple`` and ``string % dictionary``.

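To make the difference concrete, here is a small sketch that builds the same
string both ways. The helper names ``concat_plus`` and ``concat_join`` are
invented for this example; the two produce identical results, but the second
avoids the intermediate strings.

```python
def concat_plus(pieces):
    # Repeated "+": each step builds a new intermediate string.
    result = ""
    for piece in pieces:
        result = result + piece
    return result


def concat_join(pieces):
    # join() assembles the final string from all pieces in one call.
    return "".join(pieces)


pieces = ["spam" * 1000 for _ in range(7)]
joined = concat_join(pieces)
```

If you want to measure the difference on your own data, the standard
:mod:`timeit` module is a reasonable tool for the job.
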
Be sure to use the :meth:`list.sort` built-in method to do sorting, and see the
`sorting mini-HOWTO <http://wiki.python.org/moin/HowTo/Sorting>`_ for examples
of moderately advanced usage. :meth:`list.sort` beats other techniques for
sorting in all but the most extreme circumstances.

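A minimal illustration of :meth:`list.sort` with the ``key`` and ``reverse``
arguments follows; the ``records`` data is invented for the example.

```python
records = [("carol", 31), ("alice", 25), ("bob", 31)]

# Sort in place by age, oldest first.  The sort is stable, so records
# with equal ages keep their original relative order.
records.sort(key=lambda record: record[1], reverse=True)
```

After the call, ``records`` holds carol, bob and alice in that order.
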
Another common trick is to "push loops into functions or methods." For example
suppose you have a program that runs slowly and you use the profiler to
determine that a Python function ``ff()`` is being called lots of times. If you
notice that ``ff()``::

   def ff(x):
       ... # do something with x computing result...
       return result

tends to be called in loops like::

   list = map(ff, oldlist)

or::

   for x in sequence:
       value = ff(x)
       ... # do something with value...

then you can often eliminate function call overhead by rewriting ``ff()`` to::

   def ffseq(seq):
       resultseq = []
       for x in seq:
           ... # do something with x computing result...
           resultseq.append(result)
       return resultseq

and rewrite the two examples to ``list = ffseq(oldlist)`` and to::

   for value in ffseq(sequence):
       ... # do something with value...

Single calls to ``ff(x)`` translate to ``ffseq([x])[0]`` with little penalty.
Of course this technique is not always appropriate and there are other variants
which you can figure out.

You can gain some performance by explicitly storing the results of a function or
method lookup into a local variable. A loop like::

   for key in token:
       dict[key] = dict.get(key, 0) + 1

resolves ``dict.get`` every iteration. If the method isn't going to change, a
slightly faster implementation is::

   dict_get = dict.get  # look up the method once
   for key in token:
       dict[key] = dict_get(key, 0) + 1

Default arguments can be used to determine values once, when the function is
defined, instead of on every call. This can only be done for functions or
objects which will not be changed during program execution, such as replacing ::

   def degree_sin(deg):
       return math.sin(deg * math.pi / 180.0)

with ::

   def degree_sin(deg, factor=math.pi/180.0, sin=math.sin):
       return sin(deg * factor)

Because this trick uses default arguments for terms which should not be changed,
it should only be used when you are not concerned with presenting a possibly
confusing API to your users.

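The following sketch shows that a default value is computed only once, when the
``def`` statement runs. The names ``expensive_constant``, ``scaled`` and
``calls`` are hypothetical and exist only to make the effect visible.

```python
calls = []


def expensive_constant():
    # Record every evaluation so we can count them afterwards.
    calls.append(1)
    return 42


def scaled(x, factor=expensive_constant()):
    # "factor" was evaluated when the def statement executed,
    # not when scaled() is called.
    return x * factor


first = scaled(1)
second = scaled(2)
evaluations = len(calls)
```

Although ``scaled()`` is called twice, ``expensive_constant()`` runs a single
time, which is exactly why the trick saves work in a hot loop.
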

Core Language
=============

Why am I getting an UnboundLocalError when the variable has a value?
--------------------------------------------------------------------

It can be a surprise to get the UnboundLocalError in previously working
code when it is modified by adding an assignment statement somewhere in
the body of a function.

This code:

   >>> x = 10
   >>> def bar():
   ...     print x
   >>> bar()
   10

works, but this code:

   >>> x = 10
   >>> def foo():
   ...     print x
   ...     x += 1

results in an UnboundLocalError:

   >>> foo()
   Traceback (most recent call last):
     ...
   UnboundLocalError: local variable 'x' referenced before assignment

This is because when you make an assignment to a variable in a scope, that
variable becomes local to that scope and shadows any similarly named variable
in the outer scope. Since the last statement in foo assigns a new value to
``x``, the compiler recognizes it as a local variable. Consequently, when the
earlier ``print x`` attempts to print the uninitialized local variable, an
error results.

In the example above you can access the outer scope variable by declaring it
global:

   >>> x = 10
   >>> def foobar():
   ...     global x
   ...     print x
   ...     x += 1
   >>> foobar()
   10

This explicit declaration is required in order to remind you that (unlike the
superficially analogous situation with class and instance variables) you are
actually modifying the value of the variable in the outer scope:

   >>> print x
   11


What are the rules for local and global variables in Python?
------------------------------------------------------------

In Python, variables that are only referenced inside a function are implicitly
global. If a variable is assigned a new value anywhere within the function's
body, it's assumed to be local unless you explicitly declare it as 'global'.

Though a bit surprising at first, a moment's consideration explains this. On
one hand, requiring :keyword:`global` for assigned variables provides a bar
against unintended side-effects. On the other hand, if ``global`` was required
for all global references, you'd be using ``global`` all the time. You'd have
to declare as global every reference to a built-in function or to a component of
an imported module. This clutter would defeat the usefulness of the ``global``
declaration for identifying side-effects.


How do I share global variables across modules?
------------------------------------------------

The canonical way to share information across modules within a single program is
to create a special module (often called config or cfg). Just import the config
module in all modules of your application; the module then becomes available as
a global name. Because there is only one instance of each module, any changes
made to the module object get reflected everywhere. For example:

config.py::

   x = 0   # Default value of the 'x' configuration setting

mod.py::

   import config
   config.x = 1

main.py::

   import config
   import mod
   print config.x

Note that using a module is also the basis for implementing the Singleton design
pattern, for the same reason.


What are the "best practices" for using import in a module?
-----------------------------------------------------------

In general, don't use ``from modulename import *``. Doing so clutters the
importer's namespace. Some people avoid this idiom even with the few modules
that were designed to be imported in this manner. Modules designed in this
manner include :mod:`Tkinter` and :mod:`threading`.

Import modules at the top of a file. Doing so makes it clear what other modules
your code requires and avoids questions of whether the module name is in scope.
Using one import per line makes it easy to add and delete module imports, but
using multiple imports per line uses less screen space.

It's good practice to import modules in the following order:

1. standard library modules -- e.g. ``sys``, ``os``, ``getopt``, ``re``
2. third-party library modules (anything installed in Python's site-packages
   directory) -- e.g. mx.DateTime, ZODB, PIL.Image, etc.
3. locally-developed modules

Never use implicit relative package imports. If you're writing code that's in
the ``package.sub.m1`` module and want to import ``package.sub.m2``, do not just
write ``import m2``, even though it's legal. Write ``from package.sub import
m2`` instead. Implicit relative imports can lead to a module being initialized
twice, leading to confusing bugs. See :pep:`328` for details.

It is sometimes necessary to move imports to a function or class to avoid
problems with circular imports. Gordon McMillan says:

   Circular imports are fine where both modules use the "import <module>" form
   of import. They fail when the 2nd module wants to grab a name out of the
   first ("from module import name") and the import is at the top level. That's
   because names in the 1st are not yet available, because the first module is
   busy importing the 2nd.

In this case, if the second module is only used in one function, then the import
can easily be moved into that function. By the time the import is called, the
first module will have finished initializing, and the second module can do its
import.

It may also be necessary to move imports out of the top level of code if some of
the modules are platform-specific. In that case, it may not even be possible to
import all of the modules at the top of the file. In this case, importing the
correct modules in the corresponding platform-specific code is a good option.

Only move imports into a local scope, such as inside a function definition, if
it's necessary to solve a problem such as avoiding a circular import, or if you
are trying to reduce the initialization time of a module. This technique is
especially helpful if many of the imports are unnecessary depending on how the
program executes. You may also want to move imports into a function if the
modules are only ever used in that function. Note that loading a module the
first time may be expensive because of the one time initialization of the
module, but loading a module multiple times is virtually free, costing only a
couple of dictionary lookups. Even if the module name has gone out of scope,
the module is probably available in :data:`sys.modules`.

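A function-level import might look like the sketch below. The function
``to_json`` is a made-up example; :mod:`json` simply stands in for whatever
module you want to load lazily.

```python
import sys


def to_json(value):
    # The import statement runs on every call, but only the first
    # execution actually loads the module; later ones find it in
    # sys.modules.
    import json
    return json.dumps(value)


encoded = to_json([1, 2, 3])
loaded = "json" in sys.modules
```

Once ``to_json()`` has been called, the module is cached in
:data:`sys.modules`, so repeated calls pay only for the lookup.
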
If only instances of a specific class use a module, then it is reasonable to
import the module in the class's ``__init__`` method and then assign the module
to an instance variable so that the module is always available (via that
instance variable) during the life of the object. Note that to delay an import
until the class is instantiated, the import must be inside a method. Putting
the import inside the class but outside of any method still causes the import to
occur when the module is initialized.


How can I pass optional or keyword parameters from one function to another?
---------------------------------------------------------------------------

Collect the arguments using the ``*`` and ``**`` specifiers in the function's
parameter list; this gives you the positional arguments as a tuple and the
keyword arguments as a dictionary. You can then pass these arguments when
calling another function by using ``*`` and ``**``::

   def f(x, *args, **kwargs):
       ...
       kwargs['width'] = '14.3c'
       ...
       g(x, *args, **kwargs)

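For a version you can run as-is, here is a small sketch in which ``g`` simply
returns what it received, so the forwarding is visible. The body of ``g`` and
the sample arguments are assumptions made for the example.

```python
def g(x, *args, **kwargs):
    # Stand-in target function: report exactly what arrived.
    return x, args, kwargs


def f(x, *args, **kwargs):
    # Add or override a keyword argument, then forward everything.
    kwargs["width"] = "14.3c"
    return g(x, *args, **kwargs)


result = f(1, 2, 3, color="red")
```

Here ``result`` contains ``1``, the tuple ``(2, 3)`` and a dictionary holding
both ``color`` and the added ``width`` entry.
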
In the unlikely case that you care about Python versions older than 2.0, use
:func:`apply`::

   def f(x, *args, **kwargs):
       ...
       kwargs['width'] = '14.3c'
       ...
       apply(g, (x,)+args, kwargs)


How do I write a function with output parameters (call by reference)?
---------------------------------------------------------------------

Remember that arguments are passed by assignment in Python. Since assignment
just creates references to objects, there's no alias between an argument name in
the caller and callee, and so no call-by-reference per se. You can achieve the
desired effect in a number of ways.

1) By returning a tuple of the results::

      def func2(a, b):
          a = 'new-value'        # a and b are local names
          b = b + 1              # assigned to new objects
          return a, b            # return new values

      x, y = 'old-value', 99
      x, y = func2(x, y)
      print x, y                 # output: new-value 100

   This is almost always the clearest solution.

2) By using global variables. This isn't thread-safe, and is not recommended.

3) By passing a mutable (changeable in-place) object::

      def func1(a):
          a[0] = 'new-value'     # 'a' references a mutable list
          a[1] = a[1] + 1        # changes a shared object

      args = ['old-value', 99]
      func1(args)
      print args[0], args[1]     # output: new-value 100

4) By passing in a dictionary that gets mutated::

      def func3(args):
          args['a'] = 'new-value'     # args is a mutable dictionary
          args['b'] = args['b'] + 1   # change it in-place

      args = {'a': 'old-value', 'b': 99}
      func3(args)
      print args['a'], args['b']

5) Or bundle up values in a class instance::

      class callByRef:
          def __init__(self, **args):
              for (key, value) in args.items():
                  setattr(self, key, value)

      def func4(args):
          args.a = 'new-value'        # args is a mutable callByRef
          args.b = args.b + 1         # change object in-place

      args = callByRef(a='old-value', b=99)
      func4(args)
      print args.a, args.b


   There's almost never a good reason to get this complicated.

Your best choice is to return a tuple containing the multiple results.


How do you make a higher order function in Python?
--------------------------------------------------

You have two choices: you can use nested scopes or you can use callable objects.
For example, suppose you wanted to define ``linear(a,b)`` which returns a
function ``f(x)`` that computes the value ``a*x+b``. Using nested scopes::

   def linear(a, b):
       def result(x):
           return a * x + b
       return result

Or using a callable object::

   class linear:

       def __init__(self, a, b):
           self.a, self.b = a, b

       def __call__(self, x):
           return self.a * x + self.b

In both cases, ::

   taxes = linear(0.3, 2)

gives a callable object where ``taxes(10e6) == 0.3 * 10e6 + 2``.

The callable object approach has the disadvantage that it is a bit slower and
results in slightly longer code. However, note that a collection of callables
can share their signature via inheritance::

   class exponential(linear):
       # __init__ inherited
       def __call__(self, x):
           return self.a * (x ** self.b)

Objects can encapsulate state for several methods::

   class counter:

       value = 0

       def set(self, x):
           self.value = x

       def up(self):
           self.value = self.value + 1

       def down(self):
           self.value = self.value - 1

   count = counter()
   inc, dec, reset = count.up, count.down, count.set

Here ``inc()``, ``dec()`` and ``reset()`` act like functions which share the
same counting variable.


How do I copy an object in Python?
----------------------------------

Try :func:`copy.copy` or :func:`copy.deepcopy` for the general case.
Not all objects can be copied, but most can.

Some objects can be copied more easily. Dictionaries have a :meth:`~dict.copy`
method::

   newdict = olddict.copy()

Sequences can be copied by slicing::

   new_l = l[:]

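The difference between a shallow and a deep copy shows up as soon as the
object contains other mutable objects. The sketch below uses an invented
dictionary named ``original`` to illustrate it.

```python
import copy

original = {"name": "spam", "tags": ["a", "b"]}

shallow = copy.copy(original)      # new dict, same inner list
deep = copy.deepcopy(original)     # new dict and a new inner list

# Mutate the list held by the original dictionary.
original["tags"].append("c")
```

The shallow copy sees the appended element because it shares the inner list
with ``original``; the deep copy does not.
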

How can I find the methods or attributes of an object?
------------------------------------------------------

For an instance x of a user-defined class, ``dir(x)`` returns an alphabetized
list of names containing the instance attributes as well as the methods and
attributes defined by its class.

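As a quick illustration, the sketch below calls :func:`dir` on an instance of
a made-up ``Point`` class and filters out the names that start with an
underscore.

```python
class Point:
    dimensions = 2                 # class attribute

    def __init__(self, x, y):
        self.x = x                 # instance attributes
        self.y = y

    def norm(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5


p = Point(3, 4)

# dir() returns a sorted list; keep only the "public" names.
public_names = [name for name in dir(p) if not name.startswith("_")]
```

The resulting list mixes the instance attributes ``x`` and ``y`` with the
class-level ``dimensions`` and ``norm``.
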

How can my code discover the name of an object?
-----------------------------------------------

Generally speaking, it can't, because objects don't really have names.
Essentially, assignment always binds a name to a value; the same is true of
``def`` and ``class`` statements, but in that case the value is a
callable. Consider the following code::

   class A:
       pass

   B = A

   a = B()
   b = a
   print b
   <__main__.A instance at 0x16D07CC>
   print a
   <__main__.A instance at 0x16D07CC>

Arguably the class has a name: even though it is bound to two names and invoked
through the name B the created instance is still reported as an instance of
class A. However, it is impossible to say whether the instance's name is a or
b, since both names are bound to the same value.

Generally speaking it should not be necessary for your code to "know the names"
of particular values. Unless you are deliberately writing introspective
programs, this is usually an indication that a change of approach might be
beneficial.

In comp.lang.python, Fredrik Lundh once gave an excellent analogy in answer to
this question:

   The same way as you get the name of that cat you found on your porch: the cat
   (object) itself cannot tell you its name, and it doesn't really care -- so
   the only way to find out what it's called is to ask all your neighbours
   (namespaces) if it's their cat (object)...

   ....and don't be surprised if you'll find that it's known by many names, or
   no name at all!

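If you really do need to "ask the neighbours", the sketch below searches one
namespace (a plain dictionary) for every name bound to a given object. The
helper ``names_of`` is hypothetical, not a standard library function.

```python
def names_of(obj, namespace):
    # Return, in sorted order, every name in the namespace that is
    # bound to this exact object (identity, not equality).
    return sorted(name for name, value in namespace.items() if value is obj)


class A:
    pass


a = A()
b = a
namespace = {"a": a, "b": b, "other": A()}

found = names_of(a, namespace)
missing = names_of(A(), namespace)
```

The same instance is found under both ``a`` and ``b``, while a fresh instance
has no name in the namespace at all. In real code you might pass ``globals()``
or ``vars(module)`` as the namespace.
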

What's up with the comma operator's precedence?
-----------------------------------------------

Comma is not an operator in Python. Consider this session::

   >>> "a" in "b", "a"
   (False, 'a')

Since the comma is not an operator, but a separator between expressions the
above is evaluated as if you had entered::

   >>> ("a" in "b"), "a"

not::

   >>> "a" in ("b", "a")

The same is true of the various assignment operators (``=``, ``+=`` etc). They
are not truly operators but syntactic delimiters in assignment statements.


Is there an equivalent of C's "?:" ternary operator?
----------------------------------------------------

Yes, this feature was added in Python 2.5. The syntax would be as follows::

   [on_true] if [expression] else [on_false]

   x, y = 50, 25

   small = x if x < y else y

For versions previous to 2.5 the answer would be 'No'.

.. XXX remove rest?

In many cases you can mimic ``a ? b : c`` with ``a and b or c``, but there's a
flaw: if *b* is zero (or empty, or ``None`` -- anything that tests false) then
*c* will be selected instead. In many cases you can prove by looking at the
code that this can't happen (e.g. because *b* is a constant or has a type that
can never be false), but in general this can be a problem.

Tim Peters (who wishes it was Steve Majewski) suggested the following solution:
``(a and [b] or [c])[0]``. Because ``[b]`` is a singleton list it is never
false, so the wrong path is never taken; then applying ``[0]`` to the whole
thing gets the *b* or *c* that you really wanted. Ugly, but it gets you there
in the rare cases where it is really inconvenient to rewrite your code using
'if'.

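The flaw and the workaround are easy to see side by side. In the sketch below
the three helper functions are invented names that wrap each spelling so the
results can be compared.

```python
def pick_and_or(a, b, c):
    # Breaks when b tests false: falls through to c.
    return a and b or c


def pick_list_trick(a, b, c):
    # [b] is a one-element list, which is always true.
    return (a and [b] or [c])[0]


def pick_conditional(a, b, c):
    # The conditional expression available since Python 2.5.
    return b if a else c


wrong = pick_and_or(True, 0, 5)
trick = pick_list_trick(True, 0, 5)
right = pick_conditional(True, 0, 5)
```

With *b* equal to zero, the ``and``/``or`` spelling returns *c* by mistake,
while the list trick and the conditional expression both return *b*.
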
The best course is usually to write a simple ``if...else`` statement. Another
solution is to implement the ``?:`` operator as a function::

   from inspect import isfunction

   def q(cond, on_true, on_false):
       if cond:
           if not isfunction(on_true):
               return on_true
           else:
               return on_true()
       else:
           if not isfunction(on_false):
               return on_false
           else:
               return on_false()

In most cases you'll pass b and c directly: ``q(a, b, c)``. To avoid evaluating
b or c when they shouldn't be, encapsulate them within a lambda function, e.g.:
``q(a, lambda: b, lambda: c)``.

It has been asked *why* Python has no if-then-else expression. There are
several answers: many languages do just fine without one; it can easily lead to
less readable code; no sufficiently "Pythonic" syntax has been discovered; a
search of the standard library found remarkably few places where using an
if-then-else expression would make the code more understandable.

In 2002, :pep:`308` was written proposing several possible syntaxes and the
community was asked to vote on the issue. The vote was inconclusive. Most
people liked one of the syntaxes, but also hated other syntaxes; many votes
implied that people preferred no ternary operator rather than having a syntax
they hated.


Is it possible to write obfuscated one-liners in Python?
--------------------------------------------------------

Yes. Usually this is done by nesting :keyword:`lambda` within
:keyword:`lambda`. See the following three examples, due to Ulf Bartelt::

   # Primes < 1000
   print filter(None,map(lambda y:y*reduce(lambda x,y:x*y!=0,
   map(lambda x,y=y:y%x,range(2,int(pow(y,0.5)+1))),1),range(2,1000)))

   # First 10 Fibonacci numbers
   print map(lambda x,f=lambda x,f:(f(x-1,f)+f(x-2,f)) if x>1 else 1: f(x,f),
   range(10))

   # Mandelbrot set
   print (lambda Ru,Ro,Iu,Io,IM,Sx,Sy:reduce(lambda x,y:x+y,map(lambda y,
   Iu=Iu,Io=Io,Ru=Ru,Ro=Ro,Sy=Sy,L=lambda yc,Iu=Iu,Io=Io,Ru=Ru,Ro=Ro,i=IM,
   Sx=Sx,Sy=Sy:reduce(lambda x,y:x+y,map(lambda x,xc=Ru,yc=yc,Ru=Ru,Ro=Ro,
   i=i,Sx=Sx,F=lambda xc,yc,x,y,k,f=lambda xc,yc,x,y,k,f:(k<=0)or (x*x+y*y
   >=4.0) or 1+f(xc,yc,x*x-y*y+xc,2.0*x*y+yc,k-1,f):f(xc,yc,x,y,k,f):chr(
   64+F(Ru+x*(Ro-Ru)/Sx,yc,0,0,i)),range(Sx))):L(Iu+y*(Io-Iu)/Sy),range(Sy
   ))))(-2.1, 0.7, -1.2, 1.2, 30, 80, 24)
   #    \___ ___/  \___ ___/  |   |   |__ lines on screen
   #        V          V      |   |______ columns on screen
   #        |          |      |__________ maximum of "iterations"
   #        |          |_________________ range on y axis
   #        |____________________________ range on x axis

Don't try this at home, kids!


773Numbers and strings
774===================
775
776How do I specify hexadecimal and octal integers?
777------------------------------------------------

To specify an octal integer, precede the octal value with a zero, and then a
lower or uppercase "o". For example, to set the variable "a" to the octal value
"10" (8 in decimal), type::

   >>> a = 0o10
   >>> a
   8

Hexadecimal is just as easy. Simply precede the hexadecimal number with a zero,
and then a lower or uppercase "x". Hexadecimal digits can be specified in lower
or uppercase. For example, in the Python interpreter::

   >>> a = 0xa5
   >>> a
   165
   >>> b = 0XB2
   >>> b
   178


Why does -22 // 10 return -3?
-----------------------------

It's primarily driven by the desire that ``i % j`` have the same sign as ``j``.
If you want that, and also want::

   i == (i // j) * j + (i % j)

then integer division has to return the floor. C also requires that identity to
hold, and then compilers that truncate ``i // j`` need to make ``i % j`` have
the same sign as ``i``.

There are few real use cases for ``i % j`` when ``j`` is negative. When ``j``
is positive, there are many, and in virtually all of them it's more useful for
``i % j`` to be ``>= 0``. If the clock says 10 now, what did it say 200 hours
ago? ``-190 % 12 == 2`` is useful; ``-190 % 12 == -10`` is a bug waiting to
bite.

.. note::

   On Python 2, ``a / b`` returns the same as ``a // b`` for integer operands
   if ``__future__.division`` is not in effect. This is also known as
   "classic" division.
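
The identity and the sign rule above can be checked mechanically. The following
is only an illustrative sketch: the sample pairs are arbitrary, and it relies on
the built-in ``divmod()``, which returns ``(i // j, i % j)``:

```python
# Floor division and modulo always satisfy i == (i // j) * j + (i % j),
# and a non-zero remainder takes the sign of the divisor j.
samples = [(-22, 10), (22, -10), (-190, 12), (7, 3)]

for i, j in samples:
    q, r = divmod(i, j)                  # same as (i // j, i % j)
    assert i == q * j + r                # the identity from the text
    assert r == 0 or (r > 0) == (j > 0)  # remainder has the sign of j

# The clock example: 200 hours before 10 o'clock on a 12-hour dial.
clock = (10 - 200) % 12
```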


How do I convert a string to a number?
--------------------------------------

For integers, use the built-in :func:`int` type constructor, e.g. ``int('144')
== 144``. Similarly, :func:`float` converts to floating-point,
e.g. ``float('144') == 144.0``.

By default, these interpret the number as decimal, so that ``int('0144') ==
144`` and ``int('0x144')`` raises :exc:`ValueError`. ``int(string, base)`` takes
the base to convert from as a second optional argument, so ``int('0x144', 16) ==
324``. If the base is specified as 0, the number is interpreted using Python's
rules: a leading '0' indicates octal, and '0x' indicates a hex number.

Do not use the built-in function :func:`eval` if all you need is to convert
strings to numbers. :func:`eval` will be significantly slower and it presents a
security risk: someone could pass you a Python expression that might have
unwanted side effects. For example, someone could pass
``__import__('os').system("rm -rf $HOME")`` which would erase your home
directory.

:func:`eval` also has the effect of interpreting numbers as Python expressions,
so that e.g. ``eval('09')`` gives a syntax error because Python regards numbers
starting with '0' as octal (base 8).
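
When the text comes from an untrusted source, a small wrapper around
:func:`int` avoids both :func:`eval` and uncaught exceptions. This is a minimal
sketch; ``to_int`` and its ``default`` parameter are hypothetical names, not
part of the standard library:

```python
def to_int(text, base=10, default=None):
    """Convert text to an int, returning default if it is not a valid number."""
    try:
        return int(text, base)
    except ValueError:
        return default

decimal = to_int("144")
hexadecimal = to_int("0x144", 16)
rejected = to_int("__import__('os')")   # never evaluated, just rejected
```

Returning a default rather than raising is a design choice for this sketch;
code that must distinguish bad input from a legitimate value should let the
:exc:`ValueError` propagate instead.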


How do I convert a number to a string?
--------------------------------------

To convert, e.g., the number 144 to the string '144', use the built-in type
constructor :func:`str`. If you want a hexadecimal or octal representation, use
the built-in functions :func:`hex` or :func:`oct`. For fancy formatting, see
the :ref:`formatstrings` section, e.g. ``"{:04d}".format(144)`` yields
``'0144'`` and ``"{:.3f}".format(1.0/3)`` yields ``'0.333'``. You may also use
:ref:`the % operator <string-formatting>` on strings. See the library reference
manual for details.


How do I modify a string in place?
----------------------------------

You can't, because strings are immutable. If you need an object with this
ability, try converting the string to a list or use the array module::

   >>> s = "Hello, world"
   >>> a = list(s)
   >>> print a
   ['H', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd']
   >>> a[7:] = list("there!")
   >>> ''.join(a)
   'Hello, there!'

   >>> import array
   >>> a = array.array('c', s)
   >>> print a
   array('c', 'Hello, world')
   >>> a[0] = 'y'; print a
   array('c', 'yello, world')
   >>> a.tostring()
   'yello, world'


How do I use strings to call functions/methods?
-----------------------------------------------

There are various techniques.

* The best is to use a dictionary that maps strings to functions. The primary
  advantage of this technique is that the strings do not need to match the names
  of the functions. This is also the primary technique used to emulate a case
  construct::

     def a():
         pass

     def b():
         pass

     dispatch = {'go': a, 'stop': b}  # Note lack of parens for funcs

     dispatch[get_input()]()  # Note trailing parens to call function

* Use the built-in function :func:`getattr`::

     import foo
     getattr(foo, 'bar')()

  Note that :func:`getattr` works on any object, including classes, class
  instances, modules, and so on.

  This is used in several places in the standard library, like this::

     class Foo:
         def do_foo(self):
             ...

         def do_bar(self):
             ...

     f = getattr(foo_instance, 'do_' + opname)
     f()


* Use :func:`locals` or :func:`eval` to resolve the function name::

     def myFunc():
         print "hello"

     fname = "myFunc"

     f = locals()[fname]
     f()

     f = eval(fname)
     f()

  Note: Using :func:`eval` is slow and dangerous. If you don't have absolute
  control over the contents of the string, someone could pass a string that
  resulted in an arbitrary function being executed.
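
The dictionary technique from the first item above also gives a natural place
to handle unrecognized strings. A minimal sketch, in which every function name
(``start``, ``stop``, ``unknown``, ``run``) is made up for illustration:

```python
def start():
    return "starting"

def stop():
    return "stopping"

def unknown():
    return "unknown command"

# Map command strings to functions; note there are no parentheses here.
dispatch = {"go": start, "stop": stop}

def run(command):
    # dict.get() falls back to a default handler instead of raising KeyError.
    handler = dispatch.get(command, unknown)
    return handler()
```

With these definitions ``run("go")`` calls ``start()``, while ``run("bogus")``
falls through to ``unknown()``; no :func:`eval` is involved.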


Is there an equivalent to Perl's chomp() for removing trailing newlines from strings?
-------------------------------------------------------------------------------------

Starting with Python 2.2, you can use ``S.rstrip("\r\n")`` to remove all
occurrences of any line terminator from the end of the string ``S`` without
removing other trailing whitespace. If the string ``S`` represents more than
one line, with several empty lines at the end, the line terminators for all the
blank lines will be removed::

   >>> lines = ("line 1 \r\n"
   ...          "\r\n"
   ...          "\r\n")
   >>> lines.rstrip("\n\r")
   'line 1 '

Since this is typically only desired when reading text one line at a time, using
``S.rstrip()`` this way works well.

For older versions of Python, there are two partial substitutes:

- If you want to remove all trailing whitespace, use the ``rstrip()`` method of
  string objects. This removes all trailing whitespace, not just a single
  newline.

- Otherwise, if there is only one line in the string ``S``, use
  ``S.splitlines()[0]``.


Is there a scanf() or sscanf() equivalent?
------------------------------------------

Not as such.

For simple input parsing, the easiest approach is usually to split the line into
whitespace-delimited words using the :meth:`~str.split` method of string objects
and then convert decimal strings to numeric values using :func:`int` or
:func:`float`. ``split()`` supports an optional "sep" parameter which is useful
if the line uses something other than whitespace as a separator.

For more complicated input parsing, regular expressions are more powerful
than C's :cfunc:`sscanf` and better suited for the task.
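
Both approaches can be sketched briefly. The record layout below (a word, an
integer, and a float) and the names ``RECORD`` and ``parse_record`` are
assumptions made up for this example; the regular expression is only a rough
stand-in for a ``"%s %d %f"`` format, not an exact equivalent:

```python
import re

# Rough analogue of sscanf(line, "%s %d %f", ...): a word, an integer, a float.
RECORD = re.compile(r"(\S+)\s+([-+]?\d+)\s+([-+]?\d*\.?\d+)")

def parse_record(line):
    """Return (name, count, price), or None if the line does not match."""
    match = RECORD.match(line)
    if match is None:
        return None
    name, count, price = match.groups()
    return name, int(count), float(price)

# The split() approach handles the simple whitespace-delimited case.
fields = "3 4 5".split()
numbers = [int(field) for field in fields]
```

Returning ``None`` for a non-matching line plays the role of checking the
return value of ``sscanf()`` in C.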


What does 'UnicodeError: ASCII [decoding,encoding] error: ordinal not in range(128)' mean?
------------------------------------------------------------------------------------------

This error indicates that your Python installation can handle only 7-bit ASCII
strings. There are a couple of ways to fix or work around the problem.

If your programs must handle data in arbitrary character set encodings, the
environment the application runs in will generally identify the encoding of the
data it is handing you. You need to convert the input to Unicode data using
that encoding. For example, a program that handles email or web input will
typically find character set encoding information in Content-Type headers. This
can then be used to properly convert input data to Unicode. Assuming the string
referred to by ``value`` is encoded as UTF-8::

   value = unicode(value, "utf-8")

will return a Unicode object. If the data is not correctly encoded as UTF-8,
the above call will raise a :exc:`UnicodeError` exception.

If you only want strings converted to Unicode which have non-ASCII data, you can
try converting them first assuming an ASCII encoding, and then generate Unicode
objects if that fails::

   try:
       x = unicode(value, "ascii")
   except UnicodeError:
       value = unicode(value, "utf-8")
   else:
       # value was valid ASCII data
       pass

It's possible to set a default encoding in a file called ``sitecustomize.py``
that's part of the Python library. However, this isn't recommended because
changing the Python-wide default encoding may cause third-party extension
modules to fail.

Note that on Windows, there is an encoding known as "mbcs", which uses an
encoding specific to your current locale. In many cases, and particularly when
working with COM, this may be an appropriate default encoding to use.


Sequences (Tuples/Lists)
========================

How do I convert between tuples and lists?
------------------------------------------

The type constructor ``tuple(seq)`` converts any sequence (actually, any
iterable) into a tuple with the same items in the same order.

For example, ``tuple([1, 2, 3])`` yields ``(1, 2, 3)`` and ``tuple('abc')``
yields ``('a', 'b', 'c')``. If the argument is a tuple, it does not make a copy
but returns the same object, so it is cheap to call :func:`tuple` when you
aren't sure that an object is already a tuple.

The type constructor ``list(seq)`` converts any sequence or iterable into a list
with the same items in the same order. For example, ``list((1, 2, 3))`` yields
``[1, 2, 3]`` and ``list('abc')`` yields ``['a', 'b', 'c']``. If the argument
is a list, it makes a copy just like ``seq[:]`` would.


What's a negative index?
------------------------

Python sequences are indexed with positive numbers and negative numbers. For
positive numbers, 0 is the first index, 1 is the second index, and so forth.
For negative indices, -1 is the last index, -2 is the penultimate (next to
last) index, and so forth. Think of ``seq[-n]`` as the same as
``seq[len(seq)-n]``.

Using negative indices can be very convenient. For example, ``S[:-1]`` is all
of the string except for its last character, which is useful for removing the
trailing newline from a string.
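
The equivalence can be checked directly. In this small sketch the sample list
and string are arbitrary; note that the rule applies for ``n`` from 1 up to
``len(seq)``, since ``seq[-0]`` is simply ``seq[0]``:

```python
seq = ["a", "b", "c", "d"]

# For n from 1 to len(seq), seq[-n] and seq[len(seq) - n] name the same item.
for n in range(1, len(seq) + 1):
    assert seq[-n] == seq[len(seq) - n]

last = seq[-1]
penultimate = seq[-2]

# Dropping the final character, e.g. a trailing newline.
line = "some text\n"
without_newline = line[:-1]
```

Keep in mind that ``line[:-1]`` drops the last character whatever it is, so it
is only appropriate when the string is known to end with a newline.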


How do I iterate over a sequence in reverse order?
--------------------------------------------------

Use the :func:`reversed` built-in function, which is new in Python 2.4::

   for x in reversed(sequence):
       ... # do something with x...

This won't touch your original sequence; it returns an iterator that walks the
sequence in reverse order without building a reversed copy.

With Python 2.3, you can use an extended slice syntax::

   for x in sequence[::-1]:
       ... # do something with x...


How do you remove duplicates from a list?
-----------------------------------------

See the Python Cookbook for a long discussion of many ways to do this:

    http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52560

If you don't mind reordering the list, sort it and then scan from the end of the
list, deleting duplicates as you go::

   if mylist:
       mylist.sort()
       last = mylist[-1]
       for i in range(len(mylist)-2, -1, -1):
           if last == mylist[i]:
               del mylist[i]
           else:
               last = mylist[i]

If all elements of the list may be used as dictionary keys (i.e. they are all
hashable) this is often faster ::

   d = {}
   for x in mylist:
       d[x] = 1
   mylist = list(d.keys())

In Python 2.5 and later, the following is possible instead::

   mylist = list(set(mylist))

This converts the list into a set, thereby removing duplicates, and then back
into a list.
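
None of the approaches above is guaranteed to keep the elements in their
original order. If order matters and the elements are hashable, one common
pattern is to remember what has already been seen. This is a sketch only, and
``unique`` is a hypothetical helper name:

```python
def unique(items):
    """Return a new list without duplicates, keeping first occurrences."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:      # set membership tests are fast on average
            seen.add(item)
            result.append(item)
    return result

deduplicated = unique([3, 1, 3, 2, 1])
```

Unlike the sort-based version, this leaves the input list untouched and returns
a new one.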


How do you make an array in Python?
-----------------------------------

Use a list::

   ["this", 1, "is", "an", "array"]

Lists are equivalent to C or Pascal arrays in their time complexity; the primary
difference is that a Python list can contain objects of many different types.

The ``array`` module also provides methods for creating arrays of fixed types
with compact representations, but they are slower to index than lists. Also
note that the Numeric extensions and others define array-like structures with
various characteristics as well.

To get Lisp-style linked lists, you can emulate cons cells using tuples::

   lisp_list = ("like", ("this", ("example", None) ) )

If mutability is desired, you could use lists instead of tuples. Here the
analogue of lisp car is ``lisp_list[0]`` and the analogue of cdr is
``lisp_list[1]``. Only do this if you're sure you really need to, because it's
usually a lot slower than using Python lists.


How do I create a multidimensional list?
----------------------------------------

You probably tried to make a multidimensional array like this::

   A = [[None] * 2] * 3

This looks correct if you print it::

   >>> A
   [[None, None], [None, None], [None, None]]

But when you assign a value, it shows up in multiple places:

   >>> A[0][0] = 5
   >>> A
   [[5, None], [5, None], [5, None]]

The reason is that replicating a list with ``*`` doesn't create copies, it only
creates references to the existing objects. The ``*3`` creates a list
containing 3 references to the same list of length two. Changes to one row will
show in all rows, which is almost certainly not what you want.

The suggested approach is to create a list of the desired length first and then
fill in each element with a newly created list::

   A = [None] * 3
   for i in range(3):
       A[i] = [None] * 2

This generates a list containing 3 different lists of length two. You can also
use a list comprehension::

   w, h = 2, 3
   A = [[None] * w for i in range(h)]

Or, you can use an extension that provides a matrix datatype; `Numeric Python
<http://numpy.scipy.org/>`_ is the best known.
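
The difference between the two constructions can be made visible with the
``is`` operator, which tests object identity. The variable names in this short
sketch are illustrative only:

```python
# Replication copies references: every row is the very same list object.
shared = [[None] * 2] * 3

# The comprehension evaluates [None] * 2 once per row: three distinct lists.
separate = [[None] * 2 for _ in range(3)]

rows_are_shared = shared[0] is shared[1]
rows_are_separate = separate[0] is not separate[1]

# Assigning into one row of the second structure leaves the others alone.
separate[0][0] = 5
```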


How do I apply a method to a sequence of objects?
-------------------------------------------------

Use a list comprehension::

   result = [obj.method() for obj in mylist]

More generically, you can try the following function::

   def method_map(objects, method, arguments):
       """method_map([a,b], "meth", (1,2)) gives [a.meth(1,2), b.meth(1,2)]"""
       nobjects = len(objects)
       methods = map(getattr, objects, [method]*nobjects)
       return map(apply, methods, [arguments]*nobjects)


Dictionaries
============

How can I get a dictionary to display its keys in a consistent order?
---------------------------------------------------------------------

You can't. Dictionaries store their keys in an unpredictable order, so the
display order of a dictionary's elements will be similarly unpredictable.

This can be frustrating if you want to save a printable version to a file, make
some changes and then compare it with some other printed dictionary. In this
case, use the ``pprint`` module to pretty-print the dictionary; the items will
be presented in order sorted by the key.

A more complicated solution is to subclass ``dict`` to create a
``SortedDict`` class that prints itself in a predictable order. Here's one
simpleminded implementation of such a class::

   class SortedDict(dict):
       def __repr__(self):
           keys = sorted(self.keys())
           result = ("{!r}: {!r}".format(k, self[k]) for k in keys)
           return "{{{}}}".format(", ".join(result))

       __str__ = __repr__

This will work for many common situations you might encounter, though it's far
from a perfect solution. The largest flaw is that if some values in the
dictionary are also dictionaries, their values won't be presented in any
particular order.


I want to do a complicated sort: can you do a Schwartzian Transform in Python?
------------------------------------------------------------------------------

The technique, attributed to Randal Schwartz of the Perl community, sorts the
elements of a list by a metric which maps each element to its "sort value". In
Python, just use the ``key`` argument for the ``sort()`` method::

   Isorted = L[:]
   Isorted.sort(key=lambda s: int(s[10:15]))

The ``key`` argument is new in Python 2.4; for older versions this kind of
sorting is quite simple to do with list comprehensions. To sort a list of
strings by their uppercase values::

   tmp1 = [(x.upper(), x) for x in L]  # Schwartzian transform
   tmp1.sort()
   Usorted = [x[1] for x in tmp1]

To sort by the integer value of a subfield extending from positions 10-15 in
each string::

   tmp2 = [(int(s[10:15]), s) for s in L]  # Schwartzian transform
   tmp2.sort()
   Isorted = [x[1] for x in tmp2]

Note that Isorted may also be computed by ::

   def intfield(s):
       return int(s[10:15])

   def Icmp(s1, s2):
       return cmp(intfield(s1), intfield(s2))

   Isorted = L[:]
   Isorted.sort(Icmp)

but since this method calls ``intfield()`` many times for each element of L, it
is slower than the Schwartzian Transform.
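
Where the ``key`` argument is available, it and the decorate-sort-undecorate
idiom give the same ordering as long as no two elements share a sort value. A
small sketch with made-up sample data:

```python
words = ["banana", "Cherry", "apple"]

# Decorate each item with its sort value, sort the pairs, then undecorate.
decorated = [(word.upper(), word) for word in words]
decorated.sort()
by_transform = [word for (_, word) in decorated]

# The key argument computes the sort value once per item internally.
by_key = sorted(words, key=lambda word: word.upper())
```

The two can differ on ties: the tuple sort falls back to comparing the original
items, whereas a ``key`` sort keeps tied items in their original order.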


How can I sort one list by values from another list?
----------------------------------------------------

Merge them into a single list of tuples, sort the resulting list, and then pick
out the element you want. ::

   >>> list1 = ["what", "I'm", "sorting", "by"]
   >>> list2 = ["something", "else", "to", "sort"]
   >>> pairs = zip(list1, list2)
   >>> pairs
   [('what', 'something'), ("I'm", 'else'), ('sorting', 'to'), ('by', 'sort')]
   >>> pairs.sort()
   >>> result = [ x[1] for x in pairs ]
   >>> result
   ['else', 'sort', 'to', 'something']

An alternative for the last step is::

   >>> result = []
   >>> for p in pairs: result.append(p[1])

If you find this more legible, you might prefer to use this instead of the final
list comprehension. However, it is almost twice as slow for long lists. Why?
First, the ``append()`` operation has to reallocate memory, and while it uses
some tricks to avoid doing that each time, it still has to do it occasionally,
and that costs quite a bit. Second, the expression "result.append" requires an
extra attribute lookup, and third, there's a speed reduction from having to make
all those function calls.
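
The merge, sort, and pick steps can also be collapsed into a single expression
with the built-in ``sorted()``. This sketch reuses the sample lists from the
session above:

```python
list1 = ["what", "I'm", "sorting", "by"]
list2 = ["something", "else", "to", "sort"]

# Pair the items up, sort the pairs by the list1 value, keep the list2 value.
result = [item for _, item in sorted(zip(list1, list2))]
```

As with the step-by-step version, equal values in ``list1`` are ordered by the
corresponding ``list2`` values, because whole tuples are compared.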


Objects
=======

What is a class?
----------------

A class is the particular object type created by executing a class statement.
Class objects are used as templates to create instance objects, which embody
both the data (attributes) and code (methods) specific to a datatype.

A class can be based on one or more other classes, called its base class(es). It
then inherits the attributes and methods of its base classes. This allows an
object model to be successively refined by inheritance. You might have a
generic ``Mailbox`` class that provides basic accessor methods for a mailbox,
and subclasses such as ``MboxMailbox``, ``MaildirMailbox``, ``OutlookMailbox``
that handle various specific mailbox formats.


What is a method?
-----------------

A method is a function on some object ``x`` that you normally call as
``x.name(arguments...)``. Methods are defined as functions inside the class
definition::

   class C:
       def meth(self, arg):
           return arg * 2 + self.attribute


What is self?
-------------

Self is merely a conventional name for the first argument of a method. A method
defined as ``meth(self, a, b, c)`` should be called as ``x.meth(a, b, c)`` for
some instance ``x`` of the class in which the definition occurs; the called
method will think it is called as ``meth(x, a, b, c)``.

See also :ref:`why-self`.


How do I check if an object is an instance of a given class or of a subclass of it?
-----------------------------------------------------------------------------------

Use the built-in function ``isinstance(obj, cls)``. You can check if an object
is an instance of any of a number of classes by providing a tuple instead of a
single class, e.g. ``isinstance(obj, (class1, class2, ...))``, and can also
check whether an object is one of Python's built-in types, e.g.
``isinstance(obj, str)`` or ``isinstance(obj, (int, long, float, complex))``.

Note that most programs do not use :func:`isinstance` on user-defined classes
very often. If you are developing the classes yourself, a more proper
object-oriented style is to define methods on the classes that encapsulate a
particular behaviour, instead of checking the object's class and doing a
different thing based on what class it is. For example, if you have a function
that does something::

   def search(obj):
       if isinstance(obj, Mailbox):
           # ... code to search a mailbox
       elif isinstance(obj, Document):
           # ... code to search a document
       elif ...

A better approach is to define a ``search()`` method on all the classes and just
call it::

   class Mailbox:
       def search(self):
           # ... code to search a mailbox

   class Document:
       def search(self):
           # ... code to search a document

   obj.search()


What is delegation?
-------------------

Delegation is an object oriented technique (also called a design pattern).
Let's say you have an object ``x`` and want to change the behaviour of just one
of its methods. You can create a new class that provides a new implementation
of the method you're interested in changing and delegates all other methods to
the corresponding method of ``x``.

Python programmers can easily implement delegation. For example, the following
class behaves like a file but converts all written data to uppercase::

   class UpperOut:

       def __init__(self, outfile):
           self._outfile = outfile

       def write(self, s):
           self._outfile.write(s.upper())

       def __getattr__(self, name):
           return getattr(self._outfile, name)

Here the ``UpperOut`` class redefines the ``write()`` method to convert the
argument string to uppercase before calling the underlying
``self._outfile.write()`` method. All other methods are delegated to the
underlying ``self._outfile`` object. The delegation is accomplished via the
``__getattr__`` method; consult :ref:`the language reference <attribute-access>`
for more information about controlling attribute access.

Note that for more general cases delegation can get trickier. When attributes
must be set as well as retrieved, the class must define a :meth:`__setattr__`
method too, and it must do so carefully. The basic implementation of
:meth:`__setattr__` is roughly equivalent to the following::

   class X:
       ...
       def __setattr__(self, name, value):
           self.__dict__[name] = value
       ...

Most :meth:`__setattr__` implementations must modify ``self.__dict__`` to store
local state for self without causing an infinite recursion.
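
To see the delegation at work, the wrapper can be pointed at an in-memory file.
This is a sketch under a few assumptions: it wraps ``io.StringIO`` purely for
demonstration, and it restates ``UpperOut`` so that the example is
self-contained:

```python
import io

class UpperOut(object):
    def __init__(self, outfile):
        self._outfile = outfile

    def write(self, s):
        # The one overridden method: upper-case the data before writing.
        self._outfile.write(s.upper())

    def __getattr__(self, name):
        # Only called when normal lookup fails, so everything that is not
        # defined on UpperOut is forwarded to the wrapped object.
        return getattr(self._outfile, name)

buffer = io.StringIO()
out = UpperOut(buffer)
out.write(u"hello, world")
contents = out.getvalue()   # getvalue() is delegated to the StringIO object
```

Because ``__getattr__`` is consulted only after normal attribute lookup fails,
``write`` and ``_outfile`` are found directly and do not recurse.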


How do I call a method defined in a base class from a derived class that overrides it?
--------------------------------------------------------------------------------------

If you're using new-style classes, use the built-in :func:`super` function::

   class Derived(Base):
       def meth(self):
           super(Derived, self).meth()

If you're using classic classes: For a class definition such as ``class
Derived(Base): ...`` you can call method ``meth()`` defined in ``Base`` (or one
of ``Base``'s base classes) as ``Base.meth(self, arguments...)``. Here,
``Base.meth`` is an unbound method, so you need to provide the ``self``
argument.


How can I organize my code to make it easier to change the base class?
----------------------------------------------------------------------

You could define an alias for the base class, assign the real base class to it
before your class definition, and use the alias throughout your class. Then all
you have to change is the value assigned to the alias. Incidentally, this trick
is also handy if you want to decide dynamically (e.g. depending on availability
of resources) which base class to use. Example::

   BaseAlias = <real base class>

   class Derived(BaseAlias):
       def meth(self):
           BaseAlias.meth(self)
           ...


How do I create static class data and static class methods?
-----------------------------------------------------------

Both static data and static methods (in the sense of C++ or Java) are supported
in Python.

For static data, simply define a class attribute. To assign a new value to the
attribute, you have to explicitly use the class name in the assignment::

   class C:
       count = 0   # number of times C.__init__ called

       def __init__(self):
           C.count = C.count + 1

       def getcount(self):
           return C.count  # or return self.count

``c.count`` also refers to ``C.count`` for any ``c`` such that ``isinstance(c,
C)`` holds, unless overridden by ``c`` itself or by some class on the base-class
search path from ``c.__class__`` back to ``C``.

Caution: within a method of C, an assignment like ``self.count = 42`` creates a
new and unrelated instance attribute named "count" in ``self``'s own dict.
Rebinding of a class-static data name must always specify the class, whether
inside a method or not::

   C.count = 314

Static methods are possible since Python 2.2::

   class C:
       def static(arg1, arg2, arg3):
           # No 'self' parameter!
           ...
       static = staticmethod(static)

With Python 2.4's decorators, this can also be written as ::

   class C:
       @staticmethod
       def static(arg1, arg2, arg3):
           # No 'self' parameter!
           ...

However, a far more straightforward way to get the effect of a static method is
via a simple module-level function::

   def getcount():
       return C.count

If your code is structured so as to define one class (or tightly related class
hierarchy) per module, this supplies the desired encapsulation.
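
The caution about rebinding through ``self`` is easy to demonstrate. In this
sketch the class name ``Counter`` is made up for illustration:

```python
class Counter(object):
    count = 0                   # shared, class-level data

    def __init__(self):
        Counter.count += 1      # rebinding goes through the class

first = Counter()
second = Counter()

# Assigning through an instance creates a separate instance attribute
# that shadows the class attribute for that one object only.
first.count = 42
```

After the last line, ``Counter.count`` and ``second.count`` are still 2, and
only ``first`` has a ``count`` entry in its own ``__dict__``.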


How can I overload constructors (or methods) in Python?
-------------------------------------------------------

This answer actually applies to all methods, but the question usually comes up
first in the context of constructors.

In C++ you'd write

.. code-block:: c

    class C {
    public:
        C() { cout << "No arguments\n"; }
        C(int i) { cout << "Argument is " << i << "\n"; }
    };

In Python you have to write a single constructor that catches all cases using
default arguments. For example::

   class C:
       def __init__(self, i=None):
           if i is None:
               print "No arguments"
           else:
               print "Argument is", i

This is not entirely equivalent, but close enough in practice.

You could also try a variable-length argument list, e.g. ::

   def __init__(self, *args):
       ...

The same approach works for all method definitions.


I try to use __spam and I get an error about _SomeClassName__spam.
------------------------------------------------------------------
1544
1545Variable names with double leading underscores are "mangled" to provide a simple
1546but effective way to define class private variables. Any identifier of the form
1547``__spam`` (at least two leading underscores, at most one trailing underscore)
1548is textually replaced with ``_classname__spam``, where ``classname`` is the
1549current class name with any leading underscores stripped.
1550
1551This doesn't guarantee privacy: an outside user can still deliberately access
1552the "_classname__spam" attribute, and private values are visible in the object's
1553``__dict__``. Many Python programmers never bother to use private variable
1554names at all.
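
A short sketch of the mangling in action. The class ``Account`` and its
``__balance`` attribute are hypothetical names chosen for the example::

   class Account:
       def __init__(self):
           self.__balance = 100       # stored as _Account__balance

       def balance(self):
           return self.__balance      # mangled the same way inside the class

   acct = Account()
   names = sorted(vars(acct))         # attribute names as actually stored
   inside = acct.balance()
   outside = acct._Account__balance   # still reachable from outside

   try:
       acct.__balance                 # not mangled outside a class body
       found = True
   except AttributeError:
       found = False

The ``AttributeError`` in the last step is the kind of error the question
describes: the name that really exists is the mangled one.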


My class defines __del__ but it is not called when I delete the object.
-----------------------------------------------------------------------

There are several possible reasons for this.

The ``del`` statement does not necessarily call :meth:`__del__` -- it simply
decrements the object's reference count, and if this reaches zero
:meth:`__del__` is called.

If your data structures contain circular links (e.g. a tree where each child has
a parent reference and each parent has a list of children) the reference counts
will never go back to zero. Once in a while Python runs an algorithm to detect
such cycles, but the garbage collector might run some time after the last
reference to your data structure vanishes, so your :meth:`__del__` method may be
called at an unpredictable time. This is inconvenient if you're trying to
reproduce a problem. Worse, the order in which objects' :meth:`__del__` methods
are executed is arbitrary. You can run :func:`gc.collect` to force a
collection, but there *are* pathological cases where objects will never be
collected.

Despite the cycle collector, it's still a good idea to define an explicit
``close()`` method on objects to be called whenever you're done with them. The
``close()`` method can then remove attributes that refer to subobjects. Don't
call :meth:`__del__` directly -- :meth:`__del__` should call ``close()`` and
``close()`` should make sure that it can be called more than once for the same
object.
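
A minimal sketch of such a ``close()`` method, assuming a hypothetical class
``Resource`` that keeps its subobjects in a ``children`` attribute and uses a
``closed`` flag to make repeated calls harmless::

   class Resource:
       def __init__(self, children):
           self.children = children
           self.closed = False

       def close(self):
           # Safe to call more than once: later calls do nothing.
           if self.closed:
               return
           self.closed = True
           self.children = None       # drop references to subobjects

       def __del__(self):
           self.close()

   r = Resource([object(), object()])
   r.close()
   r.close()                          # second call is a no-op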

Another way to avoid cyclical references is to use the :mod:`weakref` module,
which allows you to point to objects without incrementing their reference count.
Tree data structures, for instance, should use weak references for their parent
and sibling references (if they need them!).
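
For a tree, this could look roughly like the following. The ``Node`` class and
its attribute names are assumptions made for the example; only
:func:`weakref.ref` comes from the standard library::

   import weakref

   class Node(object):
       def __init__(self, name, parent=None):
           self.name = name
           self.children = []
           # Hold the parent weakly so parent <-> child is not a cycle.
           self._parent = weakref.ref(parent) if parent is not None else None
           if parent is not None:
               parent.children.append(self)

       @property
       def parent(self):
           # None if there is no parent or it has already been collected.
           return self._parent() if self._parent is not None else None

   root = Node("root")
   child = Node("child", root)
   parent_name = child.parent.name

Because the child holds only a weak reference, the parent's reference count is
not affected by its children, and callers must be prepared for ``parent`` to
return ``None``.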

If the object has ever been a local variable in a function that caught an
exception in an except clause, chances are that a reference to the object still
exists in that function's stack frame as contained in the stack trace.
Normally, calling :func:`sys.exc_clear` will take care of this by clearing the
last recorded exception.

Finally, if your :meth:`__del__` method raises an exception, a warning message
is printed to :data:`sys.stderr`.


How do I get a list of all instances of a given class?
------------------------------------------------------

Python does not keep track of all instances of a class (or of a built-in type).
You can program the class's constructor to keep track of all instances by
keeping a list of weak references to each instance.
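
One possible shape for this, as a sketch only. The class ``Tracked``, the
``_instances`` list and the ``instances()`` helper are hypothetical names::

   import weakref

   class Tracked(object):
       _instances = []                # weak references to every instance

       def __init__(self, name):
           self.name = name
           Tracked._instances.append(weakref.ref(self))

       @classmethod
       def instances(cls):
           # Dereference each weak reference and skip the dead ones.
           alive = [ref() for ref in cls._instances]
           return [obj for obj in alive if obj is not None]

   a = Tracked("a")
   b = Tracked("b")
   names = [obj.name for obj in Tracked.instances()]

Since the list holds only weak references, it does not keep instances alive;
dead references simply return ``None`` and are filtered out.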


Modules
=======

How do I create a .pyc file?
----------------------------

When a module is imported for the first time (or when the source is more recent
than the current compiled file) a ``.pyc`` file containing the compiled code
should be created in the same directory as the ``.py`` file.

One reason that a ``.pyc`` file may not be created is permissions problems with
the directory. This can happen, for example, if you develop as one user but run
as another, such as if you are testing with a web server. Creation of a
``.pyc`` file is automatic if you're importing a module and Python has the
ability (permissions, free space, etc.) to write the compiled module back to
the directory.

Running Python on a top level script is not considered an import and no ``.pyc``
will be created. For example, if you have a top-level module ``abc.py`` that
imports another module ``xyz.py``, when you run ``abc``, ``xyz.pyc`` will be
created since ``xyz`` is imported, but no ``abc.pyc`` file will be created
since ``abc.py`` isn't being imported.

If you need to create ``abc.pyc`` -- that is, to create a ``.pyc`` file for a
module that is not imported -- you can, using the :mod:`py_compile` and
:mod:`compileall` modules.

The :mod:`py_compile` module can manually compile any module. One way is to use
the ``compile()`` function in that module interactively::

   >>> import py_compile
   >>> py_compile.compile('abc.py')

This will write the ``.pyc`` to the same location as ``abc.py`` (or you can
override that with the optional parameter ``cfile``).
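
A small sketch of the ``cfile`` override. The file names and the temporary
directory are placeholders chosen for the example, not anything the module
requires::

   import os
   import py_compile
   import tempfile

   workdir = tempfile.mkdtemp()
   source = os.path.join(workdir, 'example.py')            # hypothetical module
   target = os.path.join(workdir, 'example.compiled.pyc')  # chosen output path

   with open(source, 'w') as f:
       f.write('answer = 42\n')

   # cfile overrides where the byte-compiled file is written.
   py_compile.compile(source, cfile=target)
   created = os.path.exists(target)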

You can also automatically compile all files in a directory or directories using
the :mod:`compileall` module. You can do it from the shell prompt by running
the ``compileall`` module and providing the path of a directory containing
Python files to compile::

   python -m compileall .


How do I find the current module name?
--------------------------------------

A module can find out its own module name by looking at the predefined global
variable ``__name__``. If this has the value ``'__main__'``, the program is
running as a script. Many modules that are usually used by importing them also
provide a command-line interface or a self-test, and only execute this code
after checking ``__name__``::

   def main():
       print 'Running test...'
       ...

   if __name__ == '__main__':
       main()


How can I have modules that mutually import each other?
-------------------------------------------------------

Suppose you have the following modules:

foo.py::

   from bar import bar_var
   foo_var = 1

bar.py::

   from foo import foo_var
   bar_var = 2

The problem is that the interpreter will perform the following steps:

* main imports foo
* Empty globals for foo are created
* foo is compiled and starts executing
* foo imports bar
* Empty globals for bar are created
* bar is compiled and starts executing
* bar imports foo (which is a no-op since there already is a module named foo)
* bar.foo_var = foo.foo_var

The last step fails, because Python isn't done with interpreting ``foo`` yet and
the global symbol dictionary for ``foo`` is still empty.

The same thing happens when you use ``import foo``, and then try to access
``foo.foo_var`` in global code.

There are (at least) three possible workarounds for this problem.

Guido van Rossum recommends avoiding all uses of ``from <module> import ...``,
and placing all code inside functions. Initializations of global variables and
class variables should use constants or built-in functions only. This means
everything from an imported module is referenced as ``<module>.<name>``.

Jim Roskind suggests performing steps in the following order in each module:

* exports (globals, functions, and classes that don't need imported base
  classes)
* ``import`` statements
* active code (including globals that are initialized from imported values).

van Rossum doesn't like this approach much because the imports appear in a
strange place, but it does work.

Matthias Urlichs recommends restructuring your code so that the recursive import
is not necessary in the first place.

These solutions are not mutually exclusive.
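
The first workaround can be tried out with the following sketch. It writes two
throwaway modules to a temporary directory; the module names ``faq_foo`` and
``faq_bar`` and the ``total()`` functions are invented for the example. Each
module uses plain ``import`` and only touches the other module's names inside a
function, that is, after both modules have finished loading::

   import os
   import sys
   import tempfile

   workdir = tempfile.mkdtemp()

   # Hypothetical modules that import each other.
   sources = {
       'faq_foo.py': (
           "import faq_bar\n"
           "foo_var = 1\n"
           "def total():\n"
           "    return foo_var + faq_bar.bar_var\n"
       ),
       'faq_bar.py': (
           "import faq_foo\n"
           "bar_var = 2\n"
           "def total():\n"
           "    return bar_var + faq_foo.foo_var\n"
       ),
   }
   for filename, text in sources.items():
       with open(os.path.join(workdir, filename), 'w') as f:
           f.write(text)

   sys.path.insert(0, workdir)
   import faq_foo
   import faq_bar

   result = (faq_foo.total(), faq_bar.total())

Neither module reads the other's globals at import time, so the partially
initialized module that the inner ``import`` sees causes no error.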


__import__('x.y.z') returns <module 'x'>; how do I get z?
---------------------------------------------------------

Try::

   __import__('x.y.z').y.z

For more realistic situations, you may have to do something like ::

   m = __import__(s)
   for i in s.split(".")[1:]:
       m = getattr(m, i)

See :mod:`importlib` for a convenience function called
:func:`~importlib.import_module`.
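
The three approaches can be compared side by side. The sketch below uses the
standard library package ``xml.dom.minidom`` purely as a convenient three-level
example::

   import importlib

   s = 'xml.dom.minidom'

   top = __import__(s)                # the top-level package, xml
   leaf = importlib.import_module(s)  # the submodule itself

   # Walking the dotted name by hand reaches the same module object.
   m = __import__(s)
   for part in s.split('.')[1:]:
       m = getattr(m, part)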


When I edit an imported module and reimport it, the changes don't show up. Why does this happen?
-------------------------------------------------------------------------------------------------

For reasons of efficiency as well as consistency, Python only reads the module
file the first time a module is imported. If it didn't, in a program
consisting of many modules where each one imports the same basic module, the
basic module would be parsed and re-parsed many times. To force rereading of a
changed module, do this::

   import modname
   reload(modname)

Warning: this technique is not 100% fool-proof. In particular, modules
containing statements like ::

   from modname import some_objects

will continue to work with the old version of the imported objects. If the
module contains class definitions, existing class instances will *not* be
updated to use the new class definition. This can result in the following
paradoxical behaviour:

   >>> import cls
   >>> c = cls.C()                # Create an instance of C
   >>> reload(cls)
   <module 'cls' from 'cls.pyc'>
   >>> isinstance(c, cls.C)       # isinstance is false?!?
   False

The nature of the problem is made clear if you print out the class objects:

   >>> c.__class__
   <class cls.C at 0x7352a0>
   >>> cls.C
   <class cls.C at 0x4198d0>
