blob: 792344dac2cf75be20f82b0155125babb828f3ae [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001********************************
2 Functional Programming HOWTO
3********************************
4
Georg Brandl09a7fe62008-03-22 11:00:48 +00005:Author: A. M. Kuchling
Andrew M. Kuchling90921cc2007-12-14 22:52:36 +00006:Release: 0.31
Georg Brandl8ec7f652007-08-15 14:28:01 +00007
8(This is a first draft. Please send comments/error reports/suggestions to
Andrew M. Kuchlingdf2d7452007-12-14 22:41:18 +00009amk@amk.ca.)
Georg Brandl8ec7f652007-08-15 14:28:01 +000010
11In this document, we'll take a tour of Python's features suitable for
12implementing programs in a functional style. After an introduction to the
13concepts of functional programming, we'll look at language features such as
Georg Brandle7a09902007-10-21 12:10:28 +000014:term:`iterator`\s and :term:`generator`\s and relevant library modules such as
Georg Brandlcf3fb252007-10-21 10:52:38 +000015:mod:`itertools` and :mod:`functools`.
Georg Brandl8ec7f652007-08-15 14:28:01 +000016
17
18Introduction
19============
20
21This section explains the basic concept of functional programming; if you're
22just interested in learning about Python language features, skip to the next
23section.
24
25Programming languages support decomposing problems in several different ways:
26
27* Most programming languages are **procedural**: programs are lists of
28 instructions that tell the computer what to do with the program's input. C,
29 Pascal, and even Unix shells are procedural languages.
30
31* In **declarative** languages, you write a specification that describes the
32 problem to be solved, and the language implementation figures out how to
33 perform the computation efficiently. SQL is the declarative language you're
34 most likely to be familiar with; a SQL query describes the data set you want
35 to retrieve, and the SQL engine decides whether to scan tables or use indexes,
36 which subclauses should be performed first, etc.
37
38* **Object-oriented** programs manipulate collections of objects. Objects have
39 internal state and support methods that query or modify this internal state in
40 some way. Smalltalk and Java are object-oriented languages. C++ and Python
41 are languages that support object-oriented programming, but don't force the
42 use of object-oriented features.
43
44* **Functional** programming decomposes a problem into a set of functions.
45 Ideally, functions only take inputs and produce outputs, and don't have any
46 internal state that affects the output produced for a given input. Well-known
47 functional languages include the ML family (Standard ML, OCaml, and other
48 variants) and Haskell.
49
Andrew M. Kuchling90921cc2007-12-14 22:52:36 +000050The designers of some computer languages choose to emphasize one
51particular approach to programming. This often makes it difficult to
52write programs that use a different approach. Other languages are
53multi-paradigm languages that support several different approaches.
54Lisp, C++, and Python are multi-paradigm; you can write programs or
55libraries that are largely procedural, object-oriented, or functional
56in all of these languages. In a large program, different sections
57might be written using different approaches; the GUI might be
58object-oriented while the processing logic is procedural or
59functional, for example.
Georg Brandl8ec7f652007-08-15 14:28:01 +000060
61In a functional program, input flows through a set of functions. Each function
Andrew M. Kuchling90921cc2007-12-14 22:52:36 +000062operates on its input and produces some output. Functional style discourages
Georg Brandl8ec7f652007-08-15 14:28:01 +000063functions with side effects that modify internal state or make other changes
64that aren't visible in the function's return value. Functions that have no side
65effects at all are called **purely functional**. Avoiding side effects means
66not using data structures that get updated as a program runs; every function's
67output must only depend on its input.
68
69Some languages are very strict about purity and don't even have assignment
70statements such as ``a=3`` or ``c = a + b``, but it's difficult to avoid all
71side effects. Printing to the screen or writing to a disk file are side
72effects, for example. For example, in Python a ``print`` statement or a
73``time.sleep(1)`` both return no useful value; they're only called for their
74side effects of sending some text to the screen or pausing execution for a
75second.
76
77Python programs written in functional style usually won't go to the extreme of
78avoiding all I/O or all assignments; instead, they'll provide a
79functional-appearing interface but will use non-functional features internally.
80For example, the implementation of a function will still use assignments to
81local variables, but won't modify global variables or have other side effects.
82
83Functional programming can be considered the opposite of object-oriented
84programming. Objects are little capsules containing some internal state along
85with a collection of method calls that let you modify this state, and programs
86consist of making the right set of state changes. Functional programming wants
87to avoid state changes as much as possible and works with data flowing between
88functions. In Python you might combine the two approaches by writing functions
89that take and return instances representing objects in your application (e-mail
90messages, transactions, etc.).
91
92Functional design may seem like an odd constraint to work under. Why should you
93avoid objects and side effects? There are theoretical and practical advantages
94to the functional style:
95
96* Formal provability.
97* Modularity.
98* Composability.
99* Ease of debugging and testing.
100
Georg Brandl09a7fe62008-03-22 11:00:48 +0000101
Georg Brandl8ec7f652007-08-15 14:28:01 +0000102Formal provability
103------------------
104
105A theoretical benefit is that it's easier to construct a mathematical proof that
106a functional program is correct.
107
108For a long time researchers have been interested in finding ways to
109mathematically prove programs correct. This is different from testing a program
110on numerous inputs and concluding that its output is usually correct, or reading
111a program's source code and concluding that the code looks right; the goal is
112instead a rigorous proof that a program produces the right result for all
113possible inputs.
114
115The technique used to prove programs correct is to write down **invariants**,
116properties of the input data and of the program's variables that are always
117true. For each line of code, you then show that if invariants X and Y are true
118**before** the line is executed, the slightly different invariants X' and Y' are
119true **after** the line is executed. This continues until you reach the end of
120the program, at which point the invariants should match the desired conditions
121on the program's output.
122
123Functional programming's avoidance of assignments arose because assignments are
124difficult to handle with this technique; assignments can break invariants that
125were true before the assignment without producing any new invariants that can be
126propagated onward.
127
128Unfortunately, proving programs correct is largely impractical and not relevant
129to Python software. Even trivial programs require proofs that are several pages
130long; the proof of correctness for a moderately complicated program would be
131enormous, and few or none of the programs you use daily (the Python interpreter,
132your XML parser, your web browser) could be proven correct. Even if you wrote
133down or generated a proof, there would then be the question of verifying the
134proof; maybe there's an error in it, and you wrongly believe you've proved the
135program correct.
136
Georg Brandl09a7fe62008-03-22 11:00:48 +0000137
Georg Brandl8ec7f652007-08-15 14:28:01 +0000138Modularity
139----------
140
141A more practical benefit of functional programming is that it forces you to
142break apart your problem into small pieces. Programs are more modular as a
143result. It's easier to specify and write a small function that does one thing
144than a large function that performs a complicated transformation. Small
145functions are also easier to read and to check for errors.
146
147
148Ease of debugging and testing
149-----------------------------
150
151Testing and debugging a functional-style program is easier.
152
153Debugging is simplified because functions are generally small and clearly
154specified. When a program doesn't work, each function is an interface point
155where you can check that the data are correct. You can look at the intermediate
156inputs and outputs to quickly isolate the function that's responsible for a bug.
157
158Testing is easier because each function is a potential subject for a unit test.
159Functions don't depend on system state that needs to be replicated before
160running a test; instead you only have to synthesize the right input and then
161check that the output matches expectations.
162
163
Georg Brandl8ec7f652007-08-15 14:28:01 +0000164Composability
165-------------
166
167As you work on a functional-style program, you'll write a number of functions
168with varying inputs and outputs. Some of these functions will be unavoidably
169specialized to a particular application, but others will be useful in a wide
170variety of programs. For example, a function that takes a directory path and
171returns all the XML files in the directory, or a function that takes a filename
172and returns its contents, can be applied to many different situations.
173
174Over time you'll form a personal library of utilities. Often you'll assemble
175new programs by arranging existing functions in a new configuration and writing
176a few functions specialized for the current task.
177
178
Georg Brandl8ec7f652007-08-15 14:28:01 +0000179Iterators
180=========
181
182I'll start by looking at a Python language feature that's an important
183foundation for writing functional-style programs: iterators.
184
185An iterator is an object representing a stream of data; this object returns the
186data one element at a time. A Python iterator must support a method called
187``next()`` that takes no arguments and always returns the next element of the
188stream. If there are no more elements in the stream, ``next()`` must raise the
189``StopIteration`` exception. Iterators don't have to be finite, though; it's
190perfectly reasonable to write an iterator that produces an infinite stream of
191data.
192
193The built-in :func:`iter` function takes an arbitrary object and tries to return
194an iterator that will return the object's contents or elements, raising
195:exc:`TypeError` if the object doesn't support iteration. Several of Python's
196built-in data types support iteration, the most common being lists and
197dictionaries. An object is called an **iterable** object if you can get an
198iterator for it.
199
Georg Brandl09a7fe62008-03-22 11:00:48 +0000200You can experiment with the iteration interface manually:
Georg Brandl8ec7f652007-08-15 14:28:01 +0000201
202 >>> L = [1,2,3]
203 >>> it = iter(L)
204 >>> print it
Georg Brandl09a7fe62008-03-22 11:00:48 +0000205 <...iterator object at ...>
Georg Brandl8ec7f652007-08-15 14:28:01 +0000206 >>> it.next()
207 1
208 >>> it.next()
209 2
210 >>> it.next()
211 3
212 >>> it.next()
213 Traceback (most recent call last):
214 File "<stdin>", line 1, in ?
215 StopIteration
216 >>>
217
218Python expects iterable objects in several different contexts, the most
219important being the ``for`` statement. In the statement ``for X in Y``, Y must
220be an iterator or some object for which ``iter()`` can create an iterator.
221These two statements are equivalent::
222
Georg Brandl09a7fe62008-03-22 11:00:48 +0000223 for i in iter(obj):
224 print i
Georg Brandl8ec7f652007-08-15 14:28:01 +0000225
Georg Brandl09a7fe62008-03-22 11:00:48 +0000226 for i in obj:
227 print i
Georg Brandl8ec7f652007-08-15 14:28:01 +0000228
229Iterators can be materialized as lists or tuples by using the :func:`list` or
Georg Brandl09a7fe62008-03-22 11:00:48 +0000230:func:`tuple` constructor functions:
Georg Brandl8ec7f652007-08-15 14:28:01 +0000231
232 >>> L = [1,2,3]
233 >>> iterator = iter(L)
234 >>> t = tuple(iterator)
235 >>> t
236 (1, 2, 3)
237
238Sequence unpacking also supports iterators: if you know an iterator will return
Georg Brandl09a7fe62008-03-22 11:00:48 +0000239N elements, you can unpack them into an N-tuple:
Georg Brandl8ec7f652007-08-15 14:28:01 +0000240
241 >>> L = [1,2,3]
242 >>> iterator = iter(L)
243 >>> a,b,c = iterator
244 >>> a,b,c
245 (1, 2, 3)
246
247Built-in functions such as :func:`max` and :func:`min` can take a single
248iterator argument and will return the largest or smallest element. The ``"in"``
249and ``"not in"`` operators also support iterators: ``X in iterator`` is true if
250X is found in the stream returned by the iterator. You'll run into obvious
251problems if the iterator is infinite; ``max()``, ``min()``, and ``"not in"``
252will never return, and if the element X never appears in the stream, the
253``"in"`` operator won't return either.
254
255Note that you can only go forward in an iterator; there's no way to get the
256previous element, reset the iterator, or make a copy of it. Iterator objects
257can optionally provide these additional capabilities, but the iterator protocol
258only specifies the ``next()`` method. Functions may therefore consume all of
259the iterator's output, and if you need to do something different with the same
260stream, you'll have to create a new iterator.
261
262
263
264Data Types That Support Iterators
265---------------------------------
266
267We've already seen how lists and tuples support iterators. In fact, any Python
268sequence type, such as strings, will automatically support creation of an
269iterator.
270
271Calling :func:`iter` on a dictionary returns an iterator that will loop over the
Georg Brandl09a7fe62008-03-22 11:00:48 +0000272dictionary's keys:
Georg Brandl8ec7f652007-08-15 14:28:01 +0000273
Georg Brandl9f662322008-03-22 11:47:10 +0000274.. not a doctest since dict ordering varies across Pythons
275
276::
277
Georg Brandl8ec7f652007-08-15 14:28:01 +0000278 >>> m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
279 ... 'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
280 >>> for key in m:
281 ... print key, m[key]
282 Mar 3
283 Feb 2
284 Aug 8
285 Sep 9
Georg Brandl09a7fe62008-03-22 11:00:48 +0000286 Apr 4
Georg Brandl8ec7f652007-08-15 14:28:01 +0000287 Jun 6
288 Jul 7
289 Jan 1
Georg Brandl09a7fe62008-03-22 11:00:48 +0000290 May 5
Georg Brandl8ec7f652007-08-15 14:28:01 +0000291 Nov 11
292 Dec 12
293 Oct 10
294
295Note that the order is essentially random, because it's based on the hash
296ordering of the objects in the dictionary.
297
298Applying ``iter()`` to a dictionary always loops over the keys, but dictionaries
299have methods that return other iterators. If you want to iterate over keys,
300values, or key/value pairs, you can explicitly call the ``iterkeys()``,
301``itervalues()``, or ``iteritems()`` methods to get an appropriate iterator.
302
303The :func:`dict` constructor can accept an iterator that returns a finite stream
Georg Brandl09a7fe62008-03-22 11:00:48 +0000304of ``(key, value)`` tuples:
Georg Brandl8ec7f652007-08-15 14:28:01 +0000305
306 >>> L = [('Italy', 'Rome'), ('France', 'Paris'), ('US', 'Washington DC')]
307 >>> dict(iter(L))
308 {'Italy': 'Rome', 'US': 'Washington DC', 'France': 'Paris'}
309
310Files also support iteration by calling the ``readline()`` method until there
311are no more lines in the file. This means you can read each line of a file like
312this::
313
314 for line in file:
315 # do something for each line
316 ...
317
318Sets can take their contents from an iterable and let you iterate over the set's
319elements::
320
321 S = set((2, 3, 5, 7, 11, 13))
322 for i in S:
323 print i
324
325
326
327Generator expressions and list comprehensions
328=============================================
329
330Two common operations on an iterator's output are 1) performing some operation
331for every element, 2) selecting a subset of elements that meet some condition.
332For example, given a list of strings, you might want to strip off trailing
333whitespace from each line or extract all the strings containing a given
334substring.
335
336List comprehensions and generator expressions (short form: "listcomps" and
337"genexps") are a concise notation for such operations, borrowed from the
338functional programming language Haskell (http://www.haskell.org). You can strip
339all the whitespace from a stream of strings with the following code::
340
Georg Brandl09a7fe62008-03-22 11:00:48 +0000341 line_list = [' line 1\n', 'line 2 \n', ...]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000342
Georg Brandl09a7fe62008-03-22 11:00:48 +0000343 # Generator expression -- returns iterator
344 stripped_iter = (line.strip() for line in line_list)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000345
Georg Brandl09a7fe62008-03-22 11:00:48 +0000346 # List comprehension -- returns list
347 stripped_list = [line.strip() for line in line_list]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000348
349You can select only certain elements by adding an ``"if"`` condition::
350
Georg Brandl09a7fe62008-03-22 11:00:48 +0000351 stripped_list = [line.strip() for line in line_list
352 if line != ""]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000353
354With a list comprehension, you get back a Python list; ``stripped_list`` is a
355list containing the resulting lines, not an iterator. Generator expressions
356return an iterator that computes the values as necessary, not needing to
357materialize all the values at once. This means that list comprehensions aren't
358useful if you're working with iterators that return an infinite stream or a very
359large amount of data. Generator expressions are preferable in these situations.
360
361Generator expressions are surrounded by parentheses ("()") and list
362comprehensions are surrounded by square brackets ("[]"). Generator expressions
363have the form::
364
365 ( expression for expr in sequence1
366 if condition1
367 for expr2 in sequence2
368 if condition2
369 for expr3 in sequence3 ...
370 if condition3
371 for exprN in sequenceN
372 if conditionN )
373
374Again, for a list comprehension only the outside brackets are different (square
375brackets instead of parentheses).
376
377The elements of the generated output will be the successive values of
378``expression``. The ``if`` clauses are all optional; if present, ``expression``
379is only evaluated and added to the result when ``condition`` is true.
380
381Generator expressions always have to be written inside parentheses, but the
382parentheses signalling a function call also count. If you want to create an
383iterator that will be immediately passed to a function you can write::
384
Georg Brandl09a7fe62008-03-22 11:00:48 +0000385 obj_total = sum(obj.count for obj in list_all_objects())
Georg Brandl8ec7f652007-08-15 14:28:01 +0000386
387The ``for...in`` clauses contain the sequences to be iterated over. The
388sequences do not have to be the same length, because they are iterated over from
389left to right, **not** in parallel. For each element in ``sequence1``,
390``sequence2`` is looped over from the beginning. ``sequence3`` is then looped
391over for each resulting pair of elements from ``sequence1`` and ``sequence2``.
392
393To put it another way, a list comprehension or generator expression is
394equivalent to the following Python code::
395
396 for expr1 in sequence1:
397 if not (condition1):
398 continue # Skip this element
399 for expr2 in sequence2:
400 if not (condition2):
401 continue # Skip this element
402 ...
403 for exprN in sequenceN:
404 if not (conditionN):
405 continue # Skip this element
406
407 # Output the value of
408 # the expression.
409
410This means that when there are multiple ``for...in`` clauses but no ``if``
411clauses, the length of the resulting output will be equal to the product of the
412lengths of all the sequences. If you have two lists of length 3, the output
Georg Brandl09a7fe62008-03-22 11:00:48 +0000413list is 9 elements long:
Georg Brandl8ec7f652007-08-15 14:28:01 +0000414
Georg Brandl09a7fe62008-03-22 11:00:48 +0000415.. doctest::
416 :options: +NORMALIZE_WHITESPACE
417
418 >>> seq1 = 'abc'
419 >>> seq2 = (1,2,3)
420 >>> [(x,y) for x in seq1 for y in seq2]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000421 [('a', 1), ('a', 2), ('a', 3),
422 ('b', 1), ('b', 2), ('b', 3),
423 ('c', 1), ('c', 2), ('c', 3)]
424
425To avoid introducing an ambiguity into Python's grammar, if ``expression`` is
426creating a tuple, it must be surrounded with parentheses. The first list
427comprehension below is a syntax error, while the second one is correct::
428
429 # Syntax error
430 [ x,y for x in seq1 for y in seq2]
431 # Correct
432 [ (x,y) for x in seq1 for y in seq2]
433
434
435Generators
436==========
437
438Generators are a special class of functions that simplify the task of writing
439iterators. Regular functions compute a value and return it, but generators
440return an iterator that returns a stream of values.
441
442You're doubtless familiar with how regular function calls work in Python or C.
443When you call a function, it gets a private namespace where its local variables
444are created. When the function reaches a ``return`` statement, the local
445variables are destroyed and the value is returned to the caller. A later call
446to the same function creates a new private namespace and a fresh set of local
447variables. But, what if the local variables weren't thrown away on exiting a
448function? What if you could later resume the function where it left off? This
449is what generators provide; they can be thought of as resumable functions.
450
Georg Brandl09a7fe62008-03-22 11:00:48 +0000451Here's the simplest example of a generator function:
452
Georg Brandl838b4b02008-03-22 13:07:06 +0000453.. testcode::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000454
455 def generate_ints(N):
456 for i in range(N):
457 yield i
458
459Any function containing a ``yield`` keyword is a generator function; this is
Georg Brandl5e52db02007-10-21 10:45:46 +0000460detected by Python's :term:`bytecode` compiler which compiles the function
461specially as a result.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000462
463When you call a generator function, it doesn't return a single value; instead it
464returns a generator object that supports the iterator protocol. On executing
465the ``yield`` expression, the generator outputs the value of ``i``, similar to a
466``return`` statement. The big difference between ``yield`` and a ``return``
467statement is that on reaching a ``yield`` the generator's state of execution is
468suspended and local variables are preserved. On the next call to the
469generator's ``.next()`` method, the function will resume executing.
470
Georg Brandl09a7fe62008-03-22 11:00:48 +0000471Here's a sample usage of the ``generate_ints()`` generator:
Georg Brandl8ec7f652007-08-15 14:28:01 +0000472
473 >>> gen = generate_ints(3)
474 >>> gen
Georg Brandl09a7fe62008-03-22 11:00:48 +0000475 <generator object at ...>
Georg Brandl8ec7f652007-08-15 14:28:01 +0000476 >>> gen.next()
477 0
478 >>> gen.next()
479 1
480 >>> gen.next()
481 2
482 >>> gen.next()
483 Traceback (most recent call last):
484 File "stdin", line 1, in ?
485 File "stdin", line 2, in generate_ints
486 StopIteration
487
488You could equally write ``for i in generate_ints(5)``, or ``a,b,c =
489generate_ints(3)``.
490
491Inside a generator function, the ``return`` statement can only be used without a
492value, and signals the end of the procession of values; after executing a
493``return`` the generator cannot return any further values. ``return`` with a
494value, such as ``return 5``, is a syntax error inside a generator function. The
495end of the generator's results can also be indicated by raising
496``StopIteration`` manually, or by just letting the flow of execution fall off
497the bottom of the function.
498
499You could achieve the effect of generators manually by writing your own class
500and storing all the local variables of the generator as instance variables. For
501example, returning a list of integers could be done by setting ``self.count`` to
5020, and having the ``next()`` method increment ``self.count`` and return it.
503However, for a moderately complicated generator, writing a corresponding class
504can be much messier.
505
506The test suite included with Python's library, ``test_generators.py``, contains
507a number of more interesting examples. Here's one generator that implements an
Georg Brandl09a7fe62008-03-22 11:00:48 +0000508in-order traversal of a tree using generators recursively. ::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000509
510 # A recursive generator that generates Tree leaves in in-order.
511 def inorder(t):
512 if t:
513 for x in inorder(t.left):
514 yield x
515
516 yield t.label
517
518 for x in inorder(t.right):
519 yield x
520
521Two other examples in ``test_generators.py`` produce solutions for the N-Queens
522problem (placing N queens on an NxN chess board so that no queen threatens
523another) and the Knight's Tour (finding a route that takes a knight to every
524square of an NxN chessboard without visiting any square twice).
525
526
527
528Passing values into a generator
529-------------------------------
530
531In Python 2.4 and earlier, generators only produced output. Once a generator's
532code was invoked to create an iterator, there was no way to pass any new
533information into the function when its execution is resumed. You could hack
534together this ability by making the generator look at a global variable or by
535passing in some mutable object that callers then modify, but these approaches
536are messy.
537
538In Python 2.5 there's a simple way to pass values into a generator.
539:keyword:`yield` became an expression, returning a value that can be assigned to
540a variable or otherwise operated on::
541
542 val = (yield i)
543
544I recommend that you **always** put parentheses around a ``yield`` expression
545when you're doing something with the returned value, as in the above example.
546The parentheses aren't always necessary, but it's easier to always add them
547instead of having to remember when they're needed.
548
549(PEP 342 explains the exact rules, which are that a ``yield``-expression must
550always be parenthesized except when it occurs at the top-level expression on the
551right-hand side of an assignment. This means you can write ``val = yield i``
552but have to use parentheses when there's an operation, as in ``val = (yield i)
553+ 12``.)
554
555Values are sent into a generator by calling its ``send(value)`` method. This
556method resumes the generator's code and the ``yield`` expression returns the
557specified value. If the regular ``next()`` method is called, the ``yield``
558returns ``None``.
559
560Here's a simple counter that increments by 1 and allows changing the value of
561the internal counter.
562
Georg Brandl838b4b02008-03-22 13:07:06 +0000563.. testcode::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000564
565 def counter (maximum):
566 i = 0
567 while i < maximum:
568 val = (yield i)
569 # If value provided, change counter
570 if val is not None:
571 i = val
572 else:
573 i += 1
574
575And here's an example of changing the counter:
576
577 >>> it = counter(10)
578 >>> print it.next()
579 0
580 >>> print it.next()
581 1
582 >>> print it.send(8)
583 8
584 >>> print it.next()
585 9
586 >>> print it.next()
587 Traceback (most recent call last):
588 File ``t.py'', line 15, in ?
589 print it.next()
590 StopIteration
591
592Because ``yield`` will often be returning ``None``, you should always check for
593this case. Don't just use its value in expressions unless you're sure that the
594``send()`` method will be the only method used resume your generator function.
595
596In addition to ``send()``, there are two other new methods on generators:
597
598* ``throw(type, value=None, traceback=None)`` is used to raise an exception
599 inside the generator; the exception is raised by the ``yield`` expression
600 where the generator's execution is paused.
601
602* ``close()`` raises a :exc:`GeneratorExit` exception inside the generator to
603 terminate the iteration. On receiving this exception, the generator's code
604 must either raise :exc:`GeneratorExit` or :exc:`StopIteration`; catching the
605 exception and doing anything else is illegal and will trigger a
606 :exc:`RuntimeError`. ``close()`` will also be called by Python's garbage
607 collector when the generator is garbage-collected.
608
609 If you need to run cleanup code when a :exc:`GeneratorExit` occurs, I suggest
610 using a ``try: ... finally:`` suite instead of catching :exc:`GeneratorExit`.
611
612The cumulative effect of these changes is to turn generators from one-way
613producers of information into both producers and consumers.
614
615Generators also become **coroutines**, a more generalized form of subroutines.
616Subroutines are entered at one point and exited at another point (the top of the
617function, and a ``return`` statement), but coroutines can be entered, exited,
618and resumed at many different points (the ``yield`` statements).
619
620
621Built-in functions
622==================
623
624Let's look in more detail at built-in functions often used with iterators.
625
Andrew M. Kuchling90921cc2007-12-14 22:52:36 +0000626Two of Python's built-in functions, :func:`map` and :func:`filter`, are somewhat
Georg Brandl8ec7f652007-08-15 14:28:01 +0000627obsolete; they duplicate the features of list comprehensions but return actual
628lists instead of iterators.
629
630``map(f, iterA, iterB, ...)`` returns a list containing ``f(iterA[0], iterB[0]),
631f(iterA[1], iterB[1]), f(iterA[2], iterB[2]), ...``.
632
Georg Brandl09a7fe62008-03-22 11:00:48 +0000633 >>> def upper(s):
634 ... return s.upper()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000635
Georg Brandl09a7fe62008-03-22 11:00:48 +0000636 >>> map(upper, ['sentence', 'fragment'])
637 ['SENTENCE', 'FRAGMENT']
Georg Brandl8ec7f652007-08-15 14:28:01 +0000638
Georg Brandl09a7fe62008-03-22 11:00:48 +0000639 >>> [upper(s) for s in ['sentence', 'fragment']]
640 ['SENTENCE', 'FRAGMENT']
Georg Brandl8ec7f652007-08-15 14:28:01 +0000641
642As shown above, you can achieve the same effect with a list comprehension. The
643:func:`itertools.imap` function does the same thing but can handle infinite
644iterators; it'll be discussed later, in the section on the :mod:`itertools` module.
645
646``filter(predicate, iter)`` returns a list that contains all the sequence
647elements that meet a certain condition, and is similarly duplicated by list
648comprehensions. A **predicate** is a function that returns the truth value of
649some condition; for use with :func:`filter`, the predicate must take a single
650value.
651
Georg Brandl09a7fe62008-03-22 11:00:48 +0000652 >>> def is_even(x):
653 ... return (x % 2) == 0
Georg Brandl8ec7f652007-08-15 14:28:01 +0000654
Georg Brandl09a7fe62008-03-22 11:00:48 +0000655 >>> filter(is_even, range(10))
656 [0, 2, 4, 6, 8]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000657
Georg Brandl09a7fe62008-03-22 11:00:48 +0000658This can also be written as a list comprehension:
Georg Brandl8ec7f652007-08-15 14:28:01 +0000659
660 >>> [x for x in range(10) if is_even(x)]
661 [0, 2, 4, 6, 8]
662
663:func:`filter` also has a counterpart in the :mod:`itertools` module,
664:func:`itertools.ifilter`, that returns an iterator and can therefore handle
665infinite sequences just as :func:`itertools.imap` can.
666
667``reduce(func, iter, [initial_value])`` doesn't have a counterpart in the
668:mod:`itertools` module because it cumulatively performs an operation on all the
669iterable's elements and therefore can't be applied to infinite iterables.
670``func`` must be a function that takes two elements and returns a single value.
671:func:`reduce` takes the first two elements A and B returned by the iterator and
672calculates ``func(A, B)``. It then requests the third element, C, calculates
673``func(func(A, B), C)``, combines this result with the fourth element returned,
674and continues until the iterable is exhausted. If the iterable returns no
675values at all, a :exc:`TypeError` exception is raised. If the initial value is
676supplied, it's used as a starting point and ``func(initial_value, A)`` is the
677first calculation.
678
Georg Brandl09a7fe62008-03-22 11:00:48 +0000679 >>> import operator
680 >>> reduce(operator.concat, ['A', 'BB', 'C'])
681 'ABBC'
682 >>> reduce(operator.concat, [])
683 Traceback (most recent call last):
684 ...
685 TypeError: reduce() of empty sequence with no initial value
686 >>> reduce(operator.mul, [1,2,3], 1)
687 6
688 >>> reduce(operator.mul, [], 1)
689 1
Georg Brandl8ec7f652007-08-15 14:28:01 +0000690
691If you use :func:`operator.add` with :func:`reduce`, you'll add up all the
692elements of the iterable. This case is so common that there's a special
Georg Brandl09a7fe62008-03-22 11:00:48 +0000693built-in called :func:`sum` to compute it:
Georg Brandl8ec7f652007-08-15 14:28:01 +0000694
Georg Brandl09a7fe62008-03-22 11:00:48 +0000695 >>> reduce(operator.add, [1,2,3,4], 0)
696 10
697 >>> sum([1,2,3,4])
698 10
699 >>> sum([])
700 0
Georg Brandl8ec7f652007-08-15 14:28:01 +0000701
702For many uses of :func:`reduce`, though, it can be clearer to just write the
703obvious :keyword:`for` loop::
704
705 # Instead of:
706 product = reduce(operator.mul, [1,2,3], 1)
707
708 # You can write:
709 product = 1
710 for i in [1,2,3]:
711 product *= i
712
713
714``enumerate(iter)`` counts off the elements in the iterable, returning 2-tuples
715containing the count and each element.
716
Georg Brandl09a7fe62008-03-22 11:00:48 +0000717 >>> for item in enumerate(['subject', 'verb', 'object']):
718 ... print item
719 (0, 'subject')
720 (1, 'verb')
721 (2, 'object')
Georg Brandl8ec7f652007-08-15 14:28:01 +0000722
723:func:`enumerate` is often used when looping through a list and recording the
724indexes at which certain conditions are met::
725
726 f = open('data.txt', 'r')
727 for i, line in enumerate(f):
728 if line.strip() == '':
729 print 'Blank line at line #%i' % i
730
731``sorted(iterable, [cmp=None], [key=None], [reverse=False)`` collects all the
732elements of the iterable into a list, sorts the list, and returns the sorted
733result. The ``cmp``, ``key``, and ``reverse`` arguments are passed through to
Georg Brandl09a7fe62008-03-22 11:00:48 +0000734the constructed list's ``.sort()`` method. ::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000735
Georg Brandl09a7fe62008-03-22 11:00:48 +0000736 >>> import random
737 >>> # Generate 8 random numbers between [0, 10000)
738 >>> rand_list = random.sample(range(10000), 8)
739 >>> rand_list
740 [769, 7953, 9828, 6431, 8442, 9878, 6213, 2207]
741 >>> sorted(rand_list)
742 [769, 2207, 6213, 6431, 7953, 8442, 9828, 9878]
743 >>> sorted(rand_list, reverse=True)
744 [9878, 9828, 8442, 7953, 6431, 6213, 2207, 769]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000745
746(For a more detailed discussion of sorting, see the Sorting mini-HOWTO in the
747Python wiki at http://wiki.python.org/moin/HowTo/Sorting.)
748
749The ``any(iter)`` and ``all(iter)`` built-ins look at the truth values of an
750iterable's contents. :func:`any` returns True if any element in the iterable is
751a true value, and :func:`all` returns True if all of the elements are true
Georg Brandl09a7fe62008-03-22 11:00:48 +0000752values:
Georg Brandl8ec7f652007-08-15 14:28:01 +0000753
Georg Brandl09a7fe62008-03-22 11:00:48 +0000754 >>> any([0,1,0])
755 True
756 >>> any([0,0,0])
757 False
758 >>> any([1,1,1])
759 True
760 >>> all([0,1,0])
761 False
762 >>> all([0,0,0])
763 False
764 >>> all([1,1,1])
765 True
Georg Brandl8ec7f652007-08-15 14:28:01 +0000766
767
768Small functions and the lambda expression
769=========================================
770
771When writing functional-style programs, you'll often need little functions that
772act as predicates or that combine elements in some way.
773
774If there's a Python built-in or a module function that's suitable, you don't
775need to define a new function at all::
776
Georg Brandl09a7fe62008-03-22 11:00:48 +0000777 stripped_lines = [line.strip() for line in lines]
778 existing_files = filter(os.path.exists, file_list)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000779
780If the function you need doesn't exist, you need to write it. One way to write
781small functions is to use the ``lambda`` statement. ``lambda`` takes a number
782of parameters and an expression combining these parameters, and creates a small
783function that returns the value of the expression::
784
Georg Brandl09a7fe62008-03-22 11:00:48 +0000785 lowercase = lambda x: x.lower()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000786
Georg Brandl09a7fe62008-03-22 11:00:48 +0000787 print_assign = lambda name, value: name + '=' + str(value)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000788
Georg Brandl09a7fe62008-03-22 11:00:48 +0000789 adder = lambda x, y: x+y
Georg Brandl8ec7f652007-08-15 14:28:01 +0000790
791An alternative is to just use the ``def`` statement and define a function in the
792usual way::
793
Georg Brandl09a7fe62008-03-22 11:00:48 +0000794 def lowercase(x):
795 return x.lower()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000796
Georg Brandl09a7fe62008-03-22 11:00:48 +0000797 def print_assign(name, value):
798 return name + '=' + str(value)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000799
Georg Brandl09a7fe62008-03-22 11:00:48 +0000800 def adder(x,y):
801 return x + y
Georg Brandl8ec7f652007-08-15 14:28:01 +0000802
803Which alternative is preferable? That's a style question; my usual course is to
804avoid using ``lambda``.
805
806One reason for my preference is that ``lambda`` is quite limited in the
807functions it can define. The result has to be computable as a single
808expression, which means you can't have multiway ``if... elif... else``
809comparisons or ``try... except`` statements. If you try to do too much in a
810``lambda`` statement, you'll end up with an overly complicated expression that's
811hard to read. Quick, what's the following code doing?
812
813::
814
815 total = reduce(lambda a, b: (0, a[1] + b[1]), items)[1]
816
817You can figure it out, but it takes time to disentangle the expression to figure
818out what's going on. Using a short nested ``def`` statements makes things a
819little bit better::
820
821 def combine (a, b):
822 return 0, a[1] + b[1]
823
824 total = reduce(combine, items)[1]
825
826But it would be best of all if I had simply used a ``for`` loop::
827
828 total = 0
829 for a, b in items:
830 total += b
831
832Or the :func:`sum` built-in and a generator expression::
833
834 total = sum(b for a,b in items)
835
836Many uses of :func:`reduce` are clearer when written as ``for`` loops.
837
838Fredrik Lundh once suggested the following set of rules for refactoring uses of
839``lambda``:
840
8411) Write a lambda function.
8422) Write a comment explaining what the heck that lambda does.
8433) Study the comment for a while, and think of a name that captures the essence
844 of the comment.
8454) Convert the lambda to a def statement, using that name.
8465) Remove the comment.
847
Andrew M. Kuchling90921cc2007-12-14 22:52:36 +0000848I really like these rules, but you're free to disagree
849about whether this lambda-free style is better.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000850
851
852The itertools module
853====================
854
855The :mod:`itertools` module contains a number of commonly-used iterators as well
856as functions for combining several iterators. This section will introduce the
857module's contents by showing small examples.
858
859The module's functions fall into a few broad classes:
860
861* Functions that create a new iterator based on an existing iterator.
862* Functions for treating an iterator's elements as function arguments.
863* Functions for selecting portions of an iterator's output.
864* A function for grouping an iterator's output.
865
866Creating new iterators
867----------------------
868
869``itertools.count(n)`` returns an infinite stream of integers, increasing by 1
870each time. You can optionally supply the starting number, which defaults to 0::
871
Georg Brandl09a7fe62008-03-22 11:00:48 +0000872 itertools.count() =>
873 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
874 itertools.count(10) =>
875 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000876
877``itertools.cycle(iter)`` saves a copy of the contents of a provided iterable
878and returns a new iterator that returns its elements from first to last. The
Georg Brandl09a7fe62008-03-22 11:00:48 +0000879new iterator will repeat these elements infinitely. ::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000880
Georg Brandl09a7fe62008-03-22 11:00:48 +0000881 itertools.cycle([1,2,3,4,5]) =>
882 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000883
884``itertools.repeat(elem, [n])`` returns the provided element ``n`` times, or
Georg Brandl09a7fe62008-03-22 11:00:48 +0000885returns the element endlessly if ``n`` is not provided. ::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000886
887 itertools.repeat('abc') =>
888 abc, abc, abc, abc, abc, abc, abc, abc, abc, abc, ...
889 itertools.repeat('abc', 5) =>
890 abc, abc, abc, abc, abc
891
892``itertools.chain(iterA, iterB, ...)`` takes an arbitrary number of iterables as
893input, and returns all the elements of the first iterator, then all the elements
Georg Brandl09a7fe62008-03-22 11:00:48 +0000894of the second, and so on, until all of the iterables have been exhausted. ::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000895
896 itertools.chain(['a', 'b', 'c'], (1, 2, 3)) =>
897 a, b, c, 1, 2, 3
898
899``itertools.izip(iterA, iterB, ...)`` takes one element from each iterable and
900returns them in a tuple::
901
902 itertools.izip(['a', 'b', 'c'], (1, 2, 3)) =>
903 ('a', 1), ('b', 2), ('c', 3)
904
Georg Brandl907a7202008-02-22 12:31:45 +0000905It's similar to the built-in :func:`zip` function, but doesn't construct an
Georg Brandl8ec7f652007-08-15 14:28:01 +0000906in-memory list and exhaust all the input iterators before returning; instead
907tuples are constructed and returned only if they're requested. (The technical
908term for this behaviour is `lazy evaluation
909<http://en.wikipedia.org/wiki/Lazy_evaluation>`__.)
910
911This iterator is intended to be used with iterables that are all of the same
912length. If the iterables are of different lengths, the resulting stream will be
Georg Brandl09a7fe62008-03-22 11:00:48 +0000913the same length as the shortest iterable. ::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000914
915 itertools.izip(['a', 'b'], (1, 2, 3)) =>
916 ('a', 1), ('b', 2)
917
918You should avoid doing this, though, because an element may be taken from the
919longer iterators and discarded. This means you can't go on to use the iterators
920further because you risk skipping a discarded element.
921
922``itertools.islice(iter, [start], stop, [step])`` returns a stream that's a
923slice of the iterator. With a single ``stop`` argument, it will return the
924first ``stop`` elements. If you supply a starting index, you'll get
925``stop-start`` elements, and if you supply a value for ``step``, elements will
926be skipped accordingly. Unlike Python's string and list slicing, you can't use
Georg Brandl09a7fe62008-03-22 11:00:48 +0000927negative values for ``start``, ``stop``, or ``step``. ::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000928
929 itertools.islice(range(10), 8) =>
930 0, 1, 2, 3, 4, 5, 6, 7
931 itertools.islice(range(10), 2, 8) =>
932 2, 3, 4, 5, 6, 7
933 itertools.islice(range(10), 2, 8, 2) =>
934 2, 4, 6
935
936``itertools.tee(iter, [n])`` replicates an iterator; it returns ``n``
937independent iterators that will all return the contents of the source iterator.
938If you don't supply a value for ``n``, the default is 2. Replicating iterators
939requires saving some of the contents of the source iterator, so this can consume
940significant memory if the iterator is large and one of the new iterators is
Georg Brandl09a7fe62008-03-22 11:00:48 +0000941consumed more than the others. ::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000942
943 itertools.tee( itertools.count() ) =>
944 iterA, iterB
945
946 where iterA ->
947 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
948
949 and iterB ->
950 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
951
952
953Calling functions on elements
954-----------------------------
955
956Two functions are used for calling other functions on the contents of an
957iterable.
958
959``itertools.imap(f, iterA, iterB, ...)`` returns a stream containing
960``f(iterA[0], iterB[0]), f(iterA[1], iterB[1]), f(iterA[2], iterB[2]), ...``::
961
962 itertools.imap(operator.add, [5, 6, 5], [1, 2, 3]) =>
963 6, 8, 8
964
965The ``operator`` module contains a set of functions corresponding to Python's
966operators. Some examples are ``operator.add(a, b)`` (adds two values),
967``operator.ne(a, b)`` (same as ``a!=b``), and ``operator.attrgetter('id')``
968(returns a callable that fetches the ``"id"`` attribute).
969
970``itertools.starmap(func, iter)`` assumes that the iterable will return a stream
971of tuples, and calls ``f()`` using these tuples as the arguments::
972
973 itertools.starmap(os.path.join,
974 [('/usr', 'bin', 'java'), ('/bin', 'python'),
975 ('/usr', 'bin', 'perl'),('/usr', 'bin', 'ruby')])
976 =>
977 /usr/bin/java, /bin/python, /usr/bin/perl, /usr/bin/ruby
978
979
980Selecting elements
981------------------
982
983Another group of functions chooses a subset of an iterator's elements based on a
984predicate.
985
986``itertools.ifilter(predicate, iter)`` returns all the elements for which the
987predicate returns true::
988
989 def is_even(x):
990 return (x % 2) == 0
991
992 itertools.ifilter(is_even, itertools.count()) =>
993 0, 2, 4, 6, 8, 10, 12, 14, ...
994
995``itertools.ifilterfalse(predicate, iter)`` is the opposite, returning all
996elements for which the predicate returns false::
997
998 itertools.ifilterfalse(is_even, itertools.count()) =>
999 1, 3, 5, 7, 9, 11, 13, 15, ...
1000
1001``itertools.takewhile(predicate, iter)`` returns elements for as long as the
1002predicate returns true. Once the predicate returns false, the iterator will
1003signal the end of its results.
1004
1005::
1006
1007 def less_than_10(x):
1008 return (x < 10)
1009
1010 itertools.takewhile(less_than_10, itertools.count()) =>
1011 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
1012
1013 itertools.takewhile(is_even, itertools.count()) =>
1014 0
1015
1016``itertools.dropwhile(predicate, iter)`` discards elements while the predicate
1017returns true, and then returns the rest of the iterable's results.
1018
1019::
1020
1021 itertools.dropwhile(less_than_10, itertools.count()) =>
1022 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ...
1023
1024 itertools.dropwhile(is_even, itertools.count()) =>
1025 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...
1026
1027
1028Grouping elements
1029-----------------
1030
1031The last function I'll discuss, ``itertools.groupby(iter, key_func=None)``, is
1032the most complicated. ``key_func(elem)`` is a function that can compute a key
1033value for each element returned by the iterable. If you don't supply a key
1034function, the key is simply each element itself.
1035
1036``groupby()`` collects all the consecutive elements from the underlying iterable
1037that have the same key value, and returns a stream of 2-tuples containing a key
1038value and an iterator for the elements with that key.
1039
1040::
1041
1042 city_list = [('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL'),
1043 ('Anchorage', 'AK'), ('Nome', 'AK'),
1044 ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ'),
1045 ...
1046 ]
1047
1048 def get_state ((city, state)):
1049 return state
1050
1051 itertools.groupby(city_list, get_state) =>
1052 ('AL', iterator-1),
1053 ('AK', iterator-2),
1054 ('AZ', iterator-3), ...
1055
1056 where
1057 iterator-1 =>
1058 ('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL')
1059 iterator-2 =>
1060 ('Anchorage', 'AK'), ('Nome', 'AK')
1061 iterator-3 =>
1062 ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ')
1063
1064``groupby()`` assumes that the underlying iterable's contents will already be
1065sorted based on the key. Note that the returned iterators also use the
1066underlying iterable, so you have to consume the results of iterator-1 before
1067requesting iterator-2 and its corresponding key.
1068
1069
1070The functools module
1071====================
1072
1073The :mod:`functools` module in Python 2.5 contains some higher-order functions.
1074A **higher-order function** takes one or more functions as input and returns a
1075new function. The most useful tool in this module is the
1076:func:`functools.partial` function.
1077
1078For programs written in a functional style, you'll sometimes want to construct
1079variants of existing functions that have some of the parameters filled in.
1080Consider a Python function ``f(a, b, c)``; you may wish to create a new function
1081``g(b, c)`` that's equivalent to ``f(1, b, c)``; you're filling in a value for
1082one of ``f()``'s parameters. This is called "partial function application".
1083
1084The constructor for ``partial`` takes the arguments ``(function, arg1, arg2,
1085... kwarg1=value1, kwarg2=value2)``. The resulting object is callable, so you
1086can just call it to invoke ``function`` with the filled-in arguments.
1087
1088Here's a small but realistic example::
1089
1090 import functools
1091
1092 def log (message, subsystem):
1093 "Write the contents of 'message' to the specified subsystem."
1094 print '%s: %s' % (subsystem, message)
1095 ...
1096
1097 server_log = functools.partial(log, subsystem='server')
1098 server_log('Unable to open socket')
1099
1100
1101The operator module
1102-------------------
1103
1104The :mod:`operator` module was mentioned earlier. It contains a set of
1105functions corresponding to Python's operators. These functions are often useful
1106in functional-style code because they save you from writing trivial functions
1107that perform a single operation.
1108
1109Some of the functions in this module are:
1110
1111* Math operations: ``add()``, ``sub()``, ``mul()``, ``div()``, ``floordiv()``,
1112 ``abs()``, ...
1113* Logical operations: ``not_()``, ``truth()``.
1114* Bitwise operations: ``and_()``, ``or_()``, ``invert()``.
1115* Comparisons: ``eq()``, ``ne()``, ``lt()``, ``le()``, ``gt()``, and ``ge()``.
1116* Object identity: ``is_()``, ``is_not()``.
1117
1118Consult the operator module's documentation for a complete list.
1119
1120
1121
1122The functional module
1123---------------------
1124
1125Collin Winter's `functional module <http://oakwinter.com/code/functional/>`__
1126provides a number of more advanced tools for functional programming. It also
1127reimplements several Python built-ins, trying to make them more intuitive to
1128those used to functional programming in other languages.
1129
1130This section contains an introduction to some of the most important functions in
1131``functional``; full documentation can be found at `the project's website
1132<http://oakwinter.com/code/functional/documentation/>`__.
1133
1134``compose(outer, inner, unpack=False)``
1135
1136The ``compose()`` function implements function composition. In other words, it
1137returns a wrapper around the ``outer`` and ``inner`` callables, such that the
Georg Brandl09a7fe62008-03-22 11:00:48 +00001138return value from ``inner`` is fed directly to ``outer``. That is, ::
Georg Brandl8ec7f652007-08-15 14:28:01 +00001139
Georg Brandl09a7fe62008-03-22 11:00:48 +00001140 >>> def add(a, b):
1141 ... return a + b
1142 ...
1143 >>> def double(a):
1144 ... return 2 * a
1145 ...
1146 >>> compose(double, add)(5, 6)
1147 22
Georg Brandl8ec7f652007-08-15 14:28:01 +00001148
Georg Brandl09a7fe62008-03-22 11:00:48 +00001149is equivalent to ::
Georg Brandl8ec7f652007-08-15 14:28:01 +00001150
Georg Brandl09a7fe62008-03-22 11:00:48 +00001151 >>> double(add(5, 6))
1152 22
Georg Brandl8ec7f652007-08-15 14:28:01 +00001153
1154The ``unpack`` keyword is provided to work around the fact that Python functions
1155are not always `fully curried <http://en.wikipedia.org/wiki/Currying>`__. By
1156default, it is expected that the ``inner`` function will return a single object
1157and that the ``outer`` function will take a single argument. Setting the
1158``unpack`` argument causes ``compose`` to expect a tuple from ``inner`` which
Georg Brandl09a7fe62008-03-22 11:00:48 +00001159will be expanded before being passed to ``outer``. Put simply, ::
Georg Brandl8ec7f652007-08-15 14:28:01 +00001160
Georg Brandl09a7fe62008-03-22 11:00:48 +00001161 compose(f, g)(5, 6)
Georg Brandl8ec7f652007-08-15 14:28:01 +00001162
1163is equivalent to::
1164
Georg Brandl09a7fe62008-03-22 11:00:48 +00001165 f(g(5, 6))
Georg Brandl8ec7f652007-08-15 14:28:01 +00001166
Georg Brandl09a7fe62008-03-22 11:00:48 +00001167while ::
Georg Brandl8ec7f652007-08-15 14:28:01 +00001168
Georg Brandl09a7fe62008-03-22 11:00:48 +00001169 compose(f, g, unpack=True)(5, 6)
Georg Brandl8ec7f652007-08-15 14:28:01 +00001170
1171is equivalent to::
1172
Georg Brandl09a7fe62008-03-22 11:00:48 +00001173 f(*g(5, 6))
Georg Brandl8ec7f652007-08-15 14:28:01 +00001174
1175Even though ``compose()`` only accepts two functions, it's trivial to build up a
1176version that will compose any number of functions. We'll use ``reduce()``,
1177``compose()`` and ``partial()`` (the last of which is provided by both
Georg Brandl09a7fe62008-03-22 11:00:48 +00001178``functional`` and ``functools``). ::
Georg Brandl8ec7f652007-08-15 14:28:01 +00001179
Georg Brandl09a7fe62008-03-22 11:00:48 +00001180 from functional import compose, partial
Georg Brandl8ec7f652007-08-15 14:28:01 +00001181
Georg Brandl09a7fe62008-03-22 11:00:48 +00001182 multi_compose = partial(reduce, compose)
Georg Brandl8ec7f652007-08-15 14:28:01 +00001183
1184
1185We can also use ``map()``, ``compose()`` and ``partial()`` to craft a version of
1186``"".join(...)`` that converts its arguments to string::
1187
Georg Brandl09a7fe62008-03-22 11:00:48 +00001188 from functional import compose, partial
Georg Brandl8ec7f652007-08-15 14:28:01 +00001189
Georg Brandl09a7fe62008-03-22 11:00:48 +00001190 join = compose("".join, partial(map, str))
Georg Brandl8ec7f652007-08-15 14:28:01 +00001191
1192
1193``flip(func)``
1194
1195``flip()`` wraps the callable in ``func`` and causes it to receive its
Georg Brandl09a7fe62008-03-22 11:00:48 +00001196non-keyword arguments in reverse order. ::
Georg Brandl8ec7f652007-08-15 14:28:01 +00001197
Georg Brandl09a7fe62008-03-22 11:00:48 +00001198 >>> def triple(a, b, c):
1199 ... return (a, b, c)
1200 ...
1201 >>> triple(5, 6, 7)
1202 (5, 6, 7)
1203 >>>
1204 >>> flipped_triple = flip(triple)
1205 >>> flipped_triple(5, 6, 7)
1206 (7, 6, 5)
Georg Brandl8ec7f652007-08-15 14:28:01 +00001207
1208``foldl(func, start, iterable)``
1209
1210``foldl()`` takes a binary function, a starting value (usually some kind of
1211'zero'), and an iterable. The function is applied to the starting value and the
1212first element of the list, then the result of that and the second element of the
1213list, then the result of that and the third element of the list, and so on.
1214
1215This means that a call such as::
1216
Georg Brandl09a7fe62008-03-22 11:00:48 +00001217 foldl(f, 0, [1, 2, 3])
Georg Brandl8ec7f652007-08-15 14:28:01 +00001218
1219is equivalent to::
1220
Georg Brandl09a7fe62008-03-22 11:00:48 +00001221 f(f(f(0, 1), 2), 3)
Georg Brandl8ec7f652007-08-15 14:28:01 +00001222
1223
1224``foldl()`` is roughly equivalent to the following recursive function::
1225
Georg Brandl09a7fe62008-03-22 11:00:48 +00001226 def foldl(func, start, seq):
1227 if len(seq) == 0:
1228 return start
Georg Brandl8ec7f652007-08-15 14:28:01 +00001229
Georg Brandl09a7fe62008-03-22 11:00:48 +00001230 return foldl(func, func(start, seq[0]), seq[1:])
Georg Brandl8ec7f652007-08-15 14:28:01 +00001231
1232Speaking of equivalence, the above ``foldl`` call can be expressed in terms of
1233the built-in ``reduce`` like so::
1234
Georg Brandl09a7fe62008-03-22 11:00:48 +00001235 reduce(f, [1, 2, 3], 0)
Georg Brandl8ec7f652007-08-15 14:28:01 +00001236
1237
1238We can use ``foldl()``, ``operator.concat()`` and ``partial()`` to write a
1239cleaner, more aesthetically-pleasing version of Python's ``"".join(...)``
1240idiom::
1241
Georg Brandl09a7fe62008-03-22 11:00:48 +00001242 from functional import foldl, partial from operator import concat
1243
1244 join = partial(foldl, concat, "")
Georg Brandl8ec7f652007-08-15 14:28:01 +00001245
1246
1247Revision History and Acknowledgements
1248=====================================
1249
1250The author would like to thank the following people for offering suggestions,
1251corrections and assistance with various drafts of this article: Ian Bicking,
1252Nick Coghlan, Nick Efford, Raymond Hettinger, Jim Jewett, Mike Krell, Leandro
1253Lameiro, Jussi Salmela, Collin Winter, Blake Winton.
1254
1255Version 0.1: posted June 30 2006.
1256
1257Version 0.11: posted July 1 2006. Typo fixes.
1258
1259Version 0.2: posted July 10 2006. Merged genexp and listcomp sections into one.
1260Typo fixes.
1261
1262Version 0.21: Added more references suggested on the tutor mailing list.
1263
1264Version 0.30: Adds a section on the ``functional`` module written by Collin
1265Winter; adds short section on the operator module; a few other edits.
1266
1267
1268References
1269==========
1270
1271General
1272-------
1273
1274**Structure and Interpretation of Computer Programs**, by Harold Abelson and
1275Gerald Jay Sussman with Julie Sussman. Full text at
1276http://mitpress.mit.edu/sicp/. In this classic textbook of computer science,
1277chapters 2 and 3 discuss the use of sequences and streams to organize the data
1278flow inside a program. The book uses Scheme for its examples, but many of the
1279design approaches described in these chapters are applicable to functional-style
1280Python code.
1281
1282http://www.defmacro.org/ramblings/fp.html: A general introduction to functional
1283programming that uses Java examples and has a lengthy historical introduction.
1284
1285http://en.wikipedia.org/wiki/Functional_programming: General Wikipedia entry
1286describing functional programming.
1287
1288http://en.wikipedia.org/wiki/Coroutine: Entry for coroutines.
1289
1290http://en.wikipedia.org/wiki/Currying: Entry for the concept of currying.
1291
1292Python-specific
1293---------------
1294
1295http://gnosis.cx/TPiP/: The first chapter of David Mertz's book
1296:title-reference:`Text Processing in Python` discusses functional programming
1297for text processing, in the section titled "Utilizing Higher-Order Functions in
1298Text Processing".
1299
1300Mertz also wrote a 3-part series of articles on functional programming
1301for IBM's DeveloperWorks site; see
1302`part 1 <http://www-128.ibm.com/developerworks/library/l-prog.html>`__,
1303`part 2 <http://www-128.ibm.com/developerworks/library/l-prog2.html>`__, and
1304`part 3 <http://www-128.ibm.com/developerworks/linux/library/l-prog3.html>`__,
1305
1306
1307Python documentation
1308--------------------
1309
1310Documentation for the :mod:`itertools` module.
1311
1312Documentation for the :mod:`operator` module.
1313
1314:pep:`289`: "Generator Expressions"
1315
1316:pep:`342`: "Coroutines via Enhanced Generators" describes the new generator
1317features in Python 2.5.
1318
1319.. comment
1320
1321 Topics to place
1322 -----------------------------
1323
1324 XXX os.walk()
1325
1326 XXX Need a large example.
1327
1328 But will an example add much? I'll post a first draft and see
1329 what the comments say.
1330
1331.. comment
1332
1333 Original outline:
1334 Introduction
1335 Idea of FP
1336 Programs built out of functions
1337 Functions are strictly input-output, no internal state
1338 Opposed to OO programming, where objects have state
1339
1340 Why FP?
1341 Formal provability
1342 Assignment is difficult to reason about
1343 Not very relevant to Python
1344 Modularity
1345 Small functions that do one thing
1346 Debuggability:
1347 Easy to test due to lack of state
1348 Easy to verify output from intermediate steps
1349 Composability
1350 You assemble a toolbox of functions that can be mixed
1351
1352 Tackling a problem
1353 Need a significant example
1354
1355 Iterators
1356 Generators
1357 The itertools module
1358 List comprehensions
1359 Small functions and the lambda statement
1360 Built-in functions
1361 map
1362 filter
1363 reduce
1364
1365.. comment
1366
1367 Handy little function for printing part of an iterator -- used
1368 while writing this document.
1369
1370 import itertools
1371 def print_iter(it):
1372 slice = itertools.islice(it, 10)
1373 for elem in slice[:-1]:
1374 sys.stdout.write(str(elem))
1375 sys.stdout.write(', ')
1376 print elem[-1]
1377
1378