blob: 341c041c9be9db953650c539325ca06d5507287a [file] [log] [blame]
Andrew M. Kuchling22145072006-08-22 23:13:43 +00001Functional Programming HOWTO
2================================
3
Andrew M. Kuchlingf4dcd1d2006-11-08 13:35:34 +00004**Version 0.30**
Andrew M. Kuchling22145072006-08-22 23:13:43 +00005
6(This is a first draft. Please send comments/error
7reports/suggestions to amk@amk.ca. This URL is probably not going to
8be the final location of the document, so be careful about linking to
9it -- you may want to add a disclaimer.)
10
11In this document, we'll take a tour of Python's features suitable for
12implementing programs in a functional style. After an introduction to
13the concepts of functional programming, we'll look at language
14features such as iterators and generators and relevant library modules
15such as ``itertools`` and ``functools``.
16
17
Andrew M. Kuchling0acdb932006-11-08 14:24:03 +000018.. contents::
19
Andrew M. Kuchling22145072006-08-22 23:13:43 +000020Introduction
21----------------------
22
23This section explains the basic concept of functional programming; if
24you're just interested in learning about Python language features,
25skip to the next section.
26
27Programming languages support decomposing problems in several different
28ways:
29
30* Most programming languages are **procedural**:
31 programs are lists of instructions that tell the computer what to
32 do with the program's input.
33 C, Pascal, and even Unix shells are procedural languages.
34
35* In **declarative** languages, you write a specification that describes
36 the problem to be solved, and the language implementation figures out
37 how to perform the computation efficiently. SQL is the declarative
38 language you're most likely to be familiar with; a SQL query describes
39 the data set you want to retrieve, and the SQL engine decides whether to
40 scan tables or use indexes, which subclauses should be performed first,
41 etc.
42
43* **Object-oriented** programs manipulate collections of objects.
44 Objects have internal state and support methods that query or modify
45 this internal state in some way. Smalltalk and Java are
46 object-oriented languages. C++ and Python are languages that
47 support object-oriented programming, but don't force the use
48 of object-oriented features.
49
50* **Functional** programming decomposes a problem into a set of functions.
51 Ideally, functions only take inputs and produce outputs, and don't have any
52 internal state that affects the output produced for a given input.
53 Well-known functional languages include the ML family (Standard ML,
54 OCaml, and other variants) and Haskell.
55
56The designers of some computer languages have chosen one approach to
57programming that's emphasized. This often makes it difficult to
58write programs that use a different approach. Other languages are
59multi-paradigm languages that support several different approaches. Lisp,
60C++, and Python are multi-paradigm; you can write programs or
61libraries that are largely procedural, object-oriented, or functional
62in all of these languages. In a large program, different sections
63might be written using different approaches; the GUI might be object-oriented
64while the processing logic is procedural or functional, for example.
65
66In a functional program, input flows through a set of functions. Each
67function operates on its input and produces some output. Functional
68style frowns upon functions with side effects that modify internal
69state or make other changes that aren't visible in the function's
70return value. Functions that have no side effects at all are
71called **purely functional**.
72Avoiding side effects means not using data structures
73that get updated as a program runs; every function's output
74must only depend on its input.
75
76Some languages are very strict about purity and don't even have
77assignment statements such as ``a=3`` or ``c = a + b``, but it's
78difficult to avoid all side effects. Printing to the screen or
79writing to a disk file are side effects, for example. For example, in
80Python a ``print`` statement or a ``time.sleep(1)`` both return no
81useful value; they're only called for their side effects of sending
82some text to the screen or pausing execution for a second.
83
84Python programs written in functional style usually won't go to the
85extreme of avoiding all I/O or all assignments; instead, they'll
86provide a functional-appearing interface but will use non-functional
87features internally. For example, the implementation of a function
88will still use assignments to local variables, but won't modify global
89variables or have other side effects.
90
91Functional programming can be considered the opposite of
92object-oriented programming. Objects are little capsules containing
93some internal state along with a collection of method calls that let
94you modify this state, and programs consist of making the right set of
95state changes. Functional programming wants to avoid state changes as
96much as possible and works with data flowing between functions. In
97Python you might combine the two approaches by writing functions that
98take and return instances representing objects in your application
99(e-mail messages, transactions, etc.).
100
101Functional design may seem like an odd constraint to work under. Why
102should you avoid objects and side effects? There are theoretical and
103practical advantages to the functional style:
104
105* Formal provability.
106* Modularity.
107* Composability.
108* Ease of debugging and testing.
109
110Formal provability
111''''''''''''''''''''''
112
113A theoretical benefit is that it's easier to construct a mathematical proof
114that a functional program is correct.
115
116For a long time researchers have been interested in finding ways to
117mathematically prove programs correct. This is different from testing
118a program on numerous inputs and concluding that its output is usually
119correct, or reading a program's source code and concluding that the
120code looks right; the goal is instead a rigorous proof that a program
121produces the right result for all possible inputs.
122
123The technique used to prove programs correct is to write down
124**invariants**, properties of the input data and of the program's
125variables that are always true. For each line of code, you then show
126that if invariants X and Y are true **before** the line is executed,
127the slightly different invariants X' and Y' are true **after**
128the line is executed. This continues until you reach the end of the
129program, at which point the invariants should match the desired
130conditions on the program's output.
131
132Functional programming's avoidance of assignments arose because
133assignments are difficult to handle with this technique;
134assignments can break invariants that were true before the assignment
135without producing any new invariants that can be propagated onward.
136
137Unfortunately, proving programs correct is largely impractical and not
138relevant to Python software. Even trivial programs require proofs that
139are several pages long; the proof of correctness for a moderately
140complicated program would be enormous, and few or none of the programs
141you use daily (the Python interpreter, your XML parser, your web
142browser) could be proven correct. Even if you wrote down or generated
143a proof, there would then be the question of verifying the proof;
144maybe there's an error in it, and you wrongly believe you've proved
145the program correct.
146
147Modularity
148''''''''''''''''''''''
149
150A more practical benefit of functional programming is that it forces
151you to break apart your problem into small pieces. Programs are more
152modular as a result. It's easier to specify and write a small
153function that does one thing than a large function that performs a
154complicated transformation. Small functions are also easier to read
155and to check for errors.
156
157
158Ease of debugging and testing
159''''''''''''''''''''''''''''''''''
160
161Testing and debugging a functional-style program is easier.
162
163Debugging is simplified because functions are generally small and
164clearly specified. When a program doesn't work, each function is an
165interface point where you can check that the data are correct. You
166can look at the intermediate inputs and outputs to quickly isolate the
167function that's responsible for a bug.
168
169Testing is easier because each function is a potential subject for a
170unit test. Functions don't depend on system state that needs to be
171replicated before running a test; instead you only have to synthesize
172the right input and then check that the output matches expectations.
173
174
175
176Composability
177''''''''''''''''''''''
178
179As you work on a functional-style program, you'll write a number of
180functions with varying inputs and outputs. Some of these functions
181will be unavoidably specialized to a particular application, but
182others will be useful in a wide variety of programs. For example, a
183function that takes a directory path and returns all the XML files in
184the directory, or a function that takes a filename and returns its
185contents, can be applied to many different situations.
186
187Over time you'll form a personal library of utilities. Often you'll
188assemble new programs by arranging existing functions in a new
189configuration and writing a few functions specialized for the current
190task.
191
192
193
194Iterators
195-----------------------
196
197I'll start by looking at a Python language feature that's an important
198foundation for writing functional-style programs: iterators.
199
200An iterator is an object representing a stream of data; this object
201returns the data one element at a time. A Python iterator must
202support a method called ``next()`` that takes no arguments and always
203returns the next element of the stream. If there are no more elements
204in the stream, ``next()`` must raise the ``StopIteration`` exception.
205Iterators don't have to be finite, though; it's perfectly reasonable
206to write an iterator that produces an infinite stream of data.
207
208The built-in ``iter()`` function takes an arbitrary object and tries
209to return an iterator that will return the object's contents or
210elements, raising ``TypeError`` if the object doesn't support
211iteration. Several of Python's built-in data types support iteration,
212the most common being lists and dictionaries. An object is called
213an **iterable** object if you can get an iterator for it.
214
215You can experiment with the iteration interface manually::
216
217 >>> L = [1,2,3]
218 >>> it = iter(L)
219 >>> print it
220 <iterator object at 0x8116870>
221 >>> it.next()
222 1
223 >>> it.next()
224 2
225 >>> it.next()
226 3
227 >>> it.next()
228 Traceback (most recent call last):
229 File "<stdin>", line 1, in ?
230 StopIteration
231 >>>
232
233Python expects iterable objects in several different contexts, the
234most important being the ``for`` statement. In the statement ``for X in Y``,
235Y must be an iterator or some object for which ``iter()`` can create
236an iterator. These two statements are equivalent::
237
238 for i in iter(obj):
239 print i
240
241 for i in obj:
242 print i
243
244Iterators can be materialized as lists or tuples by using the
245``list()`` or ``tuple()`` constructor functions::
246
247 >>> L = [1,2,3]
248 >>> iterator = iter(L)
249 >>> t = tuple(iterator)
250 >>> t
251 (1, 2, 3)
252
253Sequence unpacking also supports iterators: if you know an iterator
254will return N elements, you can unpack them into an N-tuple::
255
256 >>> L = [1,2,3]
257 >>> iterator = iter(L)
258 >>> a,b,c = iterator
259 >>> a,b,c
260 (1, 2, 3)
261
262Built-in functions such as ``max()`` and ``min()`` can take a single
263iterator argument and will return the largest or smallest element.
264The ``"in"`` and ``"not in"`` operators also support iterators: ``X in
265iterator`` is true if X is found in the stream returned by the
266iterator. You'll run into obvious problems if the iterator is
267infinite; ``max()``, ``min()``, and ``"not in"`` will never return, and
268if the element X never appears in the stream, the ``"in"`` operator
269won't return either.
270
271Note that you can only go forward in an iterator; there's no way to
272get the previous element, reset the iterator, or make a copy of it.
273Iterator objects can optionally provide these additional capabilities,
274but the iterator protocol only specifies the ``next()`` method.
275Functions may therefore consume all of the iterator's output, and if
276you need to do something different with the same stream, you'll have
277to create a new iterator.
278
279
280
281Data Types That Support Iterators
282'''''''''''''''''''''''''''''''''''
283
284We've already seen how lists and tuples support iterators. In fact,
285any Python sequence type, such as strings, will automatically support
286creation of an iterator.
287
288Calling ``iter()`` on a dictionary returns an iterator that will loop
289over the dictionary's keys::
290
291 >>> m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
292 ... 'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
293 >>> for key in m:
294 ... print key, m[key]
295 Mar 3
296 Feb 2
297 Aug 8
298 Sep 9
299 May 5
300 Jun 6
301 Jul 7
302 Jan 1
303 Apr 4
304 Nov 11
305 Dec 12
306 Oct 10
307
308Note that the order is essentially random, because it's based on the
309hash ordering of the objects in the dictionary.
310
311Applying ``iter()`` to a dictionary always loops over the keys, but
312dictionaries have methods that return other iterators. If you want to
313iterate over keys, values, or key/value pairs, you can explicitly call
314the ``iterkeys()``, ``itervalues()``, or ``iteritems()`` methods to
315get an appropriate iterator.
316
317The ``dict()`` constructor can accept an iterator that returns a
318finite stream of ``(key, value)`` tuples::
319
320 >>> L = [('Italy', 'Rome'), ('France', 'Paris'), ('US', 'Washington DC')]
321 >>> dict(iter(L))
322 {'Italy': 'Rome', 'US': 'Washington DC', 'France': 'Paris'}
323
324Files also support iteration by calling the ``readline()``
325method until there are no more lines in the file. This means you can
326read each line of a file like this::
327
328 for line in file:
329 # do something for each line
330 ...
331
332Sets can take their contents from an iterable and let you iterate over
333the set's elements::
334
335 S = set((2, 3, 5, 7, 11, 13))
336 for i in S:
337 print i
338
339
340
341Generator expressions and list comprehensions
342----------------------------------------------------
343
Andrew M. Kuchling9efdd782006-11-08 14:14:30 +0000344Two common operations on an iterator's output are 1) performing some
345operation for every element, 2) selecting a subset of elements that
346meet some condition. For example, given a list of strings, you might
347want to strip off trailing whitespace from each line or extract all
348the strings containing a given substring.
Andrew M. Kuchling22145072006-08-22 23:13:43 +0000349
350List comprehensions and generator expressions (short form: "listcomps"
351and "genexps") are a concise notation for such operations, borrowed
352from the functional programming language Haskell
353(http://www.haskell.org). You can strip all the whitespace from a
354stream of strings with the following code::
355
356 line_list = [' line 1\n', 'line 2 \n', ...]
357
358 # Generator expression -- returns iterator
359 stripped_iter = (line.strip() for line in line_list)
360
361 # List comprehension -- returns list
362 stripped_list = [line.strip() for line in line_list]
363
364You can select only certain elements by adding an ``"if"`` condition::
365
366 stripped_list = [line.strip() for line in line_list
367 if line != ""]
368
369With a list comprehension, you get back a Python list;
370``stripped_list`` is a list containing the resulting lines, not an
371iterator. Generator expressions return an iterator that computes the
372values as necessary, not needing to materialize all the values at
373once. This means that list comprehensions aren't useful if you're
374working with iterators that return an infinite stream or a very large
375amount of data. Generator expressions are preferable in these
376situations.
377
378Generator expressions are surrounded by parentheses ("()") and list
379comprehensions are surrounded by square brackets ("[]"). Generator
380expressions have the form::
381
382 ( expression for expr in sequence1
383 if condition1
384 for expr2 in sequence2
385 if condition2
386 for expr3 in sequence3 ...
387 if condition3
388 for exprN in sequenceN
389 if conditionN )
390
391Again, for a list comprehension only the outside brackets are
392different (square brackets instead of parentheses).
393
394The elements of the generated output will be the successive values of
395``expression``. The ``if`` clauses are all optional; if present,
396``expression`` is only evaluated and added to the result when
397``condition`` is true.
398
399Generator expressions always have to be written inside parentheses,
400but the parentheses signalling a function call also count. If you
401want to create an iterator that will be immediately passed to a
402function you can write::
403
404 obj_total = sum(obj.count for obj in list_all_objects())
405
406The ``for...in`` clauses contain the sequences to be iterated over.
407The sequences do not have to be the same length, because they are
408iterated over from left to right, **not** in parallel. For each
409element in ``sequence1``, ``sequence2`` is looped over from the
410beginning. ``sequence3`` is then looped over for each
411resulting pair of elements from ``sequence1`` and ``sequence2``.
412
413To put it another way, a list comprehension or generator expression is
414equivalent to the following Python code::
415
416 for expr1 in sequence1:
417 if not (condition1):
418 continue # Skip this element
419 for expr2 in sequence2:
420 if not (condition2):
421 continue # Skip this element
422 ...
423 for exprN in sequenceN:
424 if not (conditionN):
425 continue # Skip this element
426
427 # Output the value of
428 # the expression.
429
430This means that when there are multiple ``for...in`` clauses but no
431``if`` clauses, the length of the resulting output will be equal to
432the product of the lengths of all the sequences. If you have two
433lists of length 3, the output list is 9 elements long::
434
435 seq1 = 'abc'
436 seq2 = (1,2,3)
437 >>> [ (x,y) for x in seq1 for y in seq2]
438 [('a', 1), ('a', 2), ('a', 3),
439 ('b', 1), ('b', 2), ('b', 3),
440 ('c', 1), ('c', 2), ('c', 3)]
441
442To avoid introducing an ambiguity into Python's grammar, if
443``expression`` is creating a tuple, it must be surrounded with
444parentheses. The first list comprehension below is a syntax error,
445while the second one is correct::
446
447 # Syntax error
448 [ x,y for x in seq1 for y in seq2]
449 # Correct
450 [ (x,y) for x in seq1 for y in seq2]
451
452
453Generators
454-----------------------
455
456Generators are a special class of functions that simplify the task of
457writing iterators. Regular functions compute a value and return it,
458but generators return an iterator that returns a stream of values.
459
460You're doubtless familiar with how regular function calls work in
461Python or C. When you call a function, it gets a private namespace
462where its local variables are created. When the function reaches a
463``return`` statement, the local variables are destroyed and the
464value is returned to the caller. A later call to the same function
465creates a new private namespace and a fresh set of local
466variables. But, what if the local variables weren't thrown away on
467exiting a function? What if you could later resume the function where
468it left off? This is what generators provide; they can be thought of
469as resumable functions.
470
471Here's the simplest example of a generator function::
472
473 def generate_ints(N):
474 for i in range(N):
475 yield i
476
477Any function containing a ``yield`` keyword is a generator function;
478this is detected by Python's bytecode compiler which compiles the
479function specially as a result.
480
481When you call a generator function, it doesn't return a single value;
482instead it returns a generator object that supports the iterator
483protocol. On executing the ``yield`` expression, the generator
484outputs the value of ``i``, similar to a ``return``
485statement. The big difference between ``yield`` and a
486``return`` statement is that on reaching a ``yield`` the
487generator's state of execution is suspended and local variables are
488preserved. On the next call to the generator's ``.next()`` method,
489the function will resume executing.
490
491Here's a sample usage of the ``generate_ints()`` generator::
492
493 >>> gen = generate_ints(3)
494 >>> gen
495 <generator object at 0x8117f90>
496 >>> gen.next()
497 0
498 >>> gen.next()
499 1
500 >>> gen.next()
501 2
502 >>> gen.next()
503 Traceback (most recent call last):
504 File "stdin", line 1, in ?
505 File "stdin", line 2, in generate_ints
506 StopIteration
507
508You could equally write ``for i in generate_ints(5)``, or
509``a,b,c = generate_ints(3)``.
510
511Inside a generator function, the ``return`` statement can only be used
512without a value, and signals the end of the procession of values;
513after executing a ``return`` the generator cannot return any further
514values. ``return`` with a value, such as ``return 5``, is a syntax
515error inside a generator function. The end of the generator's results
516can also be indicated by raising ``StopIteration`` manually, or by
517just letting the flow of execution fall off the bottom of the
518function.
519
520You could achieve the effect of generators manually by writing your
521own class and storing all the local variables of the generator as
522instance variables. For example, returning a list of integers could
523be done by setting ``self.count`` to 0, and having the
524``next()`` method increment ``self.count`` and return it.
525However, for a moderately complicated generator, writing a
526corresponding class can be much messier.
527
528The test suite included with Python's library, ``test_generators.py``,
529contains a number of more interesting examples. Here's one generator
530that implements an in-order traversal of a tree using generators
531recursively.
532
533::
534
535 # A recursive generator that generates Tree leaves in in-order.
536 def inorder(t):
537 if t:
538 for x in inorder(t.left):
539 yield x
540
541 yield t.label
542
543 for x in inorder(t.right):
544 yield x
545
546Two other examples in ``test_generators.py`` produce
547solutions for the N-Queens problem (placing N queens on an NxN
548chess board so that no queen threatens another) and the Knight's Tour
549(finding a route that takes a knight to every square of an NxN chessboard
550without visiting any square twice).
551
552
553
554Passing values into a generator
555''''''''''''''''''''''''''''''''''''''''''''''
556
557In Python 2.4 and earlier, generators only produced output. Once a
558generator's code was invoked to create an iterator, there was no way to
559pass any new information into the function when its execution is
560resumed. You could hack together this ability by making the
561generator look at a global variable or by passing in some mutable object
562that callers then modify, but these approaches are messy.
563
564In Python 2.5 there's a simple way to pass values into a generator.
565``yield`` became an expression, returning a value that can be assigned
566to a variable or otherwise operated on::
567
568 val = (yield i)
569
570I recommend that you **always** put parentheses around a ``yield``
571expression when you're doing something with the returned value, as in
572the above example. The parentheses aren't always necessary, but it's
573easier to always add them instead of having to remember when they're
574needed.
575
576(PEP 342 explains the exact rules, which are that a
577``yield``-expression must always be parenthesized except when it
578occurs at the top-level expression on the right-hand side of an
579assignment. This means you can write ``val = yield i`` but have to
580use parentheses when there's an operation, as in ``val = (yield i)
581+ 12``.)
582
583Values are sent into a generator by calling its
584``send(value)`` method. This method resumes the
585generator's code and the ``yield`` expression returns the specified
586value. If the regular ``next()`` method is called, the
587``yield`` returns ``None``.
588
589Here's a simple counter that increments by 1 and allows changing the
590value of the internal counter.
591
592::
593
594 def counter (maximum):
595 i = 0
596 while i < maximum:
597 val = (yield i)
598 # If value provided, change counter
599 if val is not None:
600 i = val
601 else:
602 i += 1
603
604And here's an example of changing the counter:
605
606 >>> it = counter(10)
607 >>> print it.next()
608 0
609 >>> print it.next()
610 1
611 >>> print it.send(8)
612 8
613 >>> print it.next()
614 9
615 >>> print it.next()
616 Traceback (most recent call last):
617 File ``t.py'', line 15, in ?
618 print it.next()
619 StopIteration
620
621Because ``yield`` will often be returning ``None``, you
622should always check for this case. Don't just use its value in
623expressions unless you're sure that the ``send()`` method
624will be the only method used resume your generator function.
625
626In addition to ``send()``, there are two other new methods on
627generators:
628
629* ``throw(type, value=None, traceback=None)`` is used to raise an exception inside the
630 generator; the exception is raised by the ``yield`` expression
631 where the generator's execution is paused.
632
633* ``close()`` raises a ``GeneratorExit``
634 exception inside the generator to terminate the iteration.
635 On receiving this
636 exception, the generator's code must either raise
637 ``GeneratorExit`` or ``StopIteration``; catching the
638 exception and doing anything else is illegal and will trigger
639 a ``RuntimeError``. ``close()`` will also be called by
640 Python's garbage collector when the generator is garbage-collected.
641
642 If you need to run cleanup code when a ``GeneratorExit`` occurs,
643 I suggest using a ``try: ... finally:`` suite instead of
644 catching ``GeneratorExit``.
645
646The cumulative effect of these changes is to turn generators from
647one-way producers of information into both producers and consumers.
648
649Generators also become **coroutines**, a more generalized form of
650subroutines. Subroutines are entered at one point and exited at
651another point (the top of the function, and a ``return``
652statement), but coroutines can be entered, exited, and resumed at
653many different points (the ``yield`` statements).
654
655
656Built-in functions
657----------------------------------------------
658
659Let's look in more detail at built-in functions often used with iterators.
660
661Two Python's built-in functions, ``map()`` and ``filter()``, are
662somewhat obsolete; they duplicate the features of list comprehensions
Andrew M. Kuchling0acdb932006-11-08 14:24:03 +0000663but return actual lists instead of iterators.
Andrew M. Kuchling22145072006-08-22 23:13:43 +0000664
665``map(f, iterA, iterB, ...)`` returns a list containing ``f(iterA[0],
666iterB[0]), f(iterA[1], iterB[1]), f(iterA[2], iterB[2]), ...``.
667
668::
669
670 def upper(s):
671 return s.upper()
672 map(upper, ['sentence', 'fragment']) =>
673 ['SENTENCE', 'FRAGMENT']
674
675 [upper(s) for s in ['sentence', 'fragment']] =>
676 ['SENTENCE', 'FRAGMENT']
677
678As shown above, you can achieve the same effect with a list
679comprehension. The ``itertools.imap()`` function does the same thing
Andrew M. Kuchling0acdb932006-11-08 14:24:03 +0000680but can handle infinite iterators; it'll be discussed later, in the section on
Andrew M. Kuchling22145072006-08-22 23:13:43 +0000681the ``itertools`` module.
682
683``filter(predicate, iter)`` returns a list
684that contains all the sequence elements that meet a certain condition,
685and is similarly duplicated by list comprehensions.
686A **predicate** is a function that returns the truth value of
687some condition; for use with ``filter()``, the predicate must take a
688single value.
689
690::
691
692 def is_even(x):
693 return (x % 2) == 0
694
695 filter(is_even, range(10)) =>
696 [0, 2, 4, 6, 8]
697
698This can also be written as a list comprehension::
699
700 >>> [x for x in range(10) if is_even(x)]
701 [0, 2, 4, 6, 8]
702
703``filter()`` also has a counterpart in the ``itertools`` module,
704``itertools.ifilter()``, that returns an iterator and
705can therefore handle infinite sequences just as ``itertools.imap()`` can.
706
707``reduce(func, iter, [initial_value])`` doesn't have a counterpart in
708the ``itertools`` module because it cumulatively performs an operation
709on all the iterable's elements and therefore can't be applied to
Andrew M. Kuchling0acdb932006-11-08 14:24:03 +0000710infinite iterables. ``func`` must be a function that takes two elements
Andrew M. Kuchling22145072006-08-22 23:13:43 +0000711and returns a single value. ``reduce()`` takes the first two elements
712A and B returned by the iterator and calculates ``func(A, B)``. It
713then requests the third element, C, calculates ``func(func(A, B),
714C)``, combines this result with the fourth element returned, and
715continues until the iterable is exhausted. If the iterable returns no
716values at all, a ``TypeError`` exception is raised. If the initial
717value is supplied, it's used as a starting point and
718``func(initial_value, A)`` is the first calculation.
719
720::
721
722 import operator
723 reduce(operator.concat, ['A', 'BB', 'C']) =>
724 'ABBC'
725 reduce(operator.concat, []) =>
726 TypeError: reduce() of empty sequence with no initial value
727 reduce(operator.mul, [1,2,3], 1) =>
728 6
729 reduce(operator.mul, [], 1) =>
730 1
731
732If you use ``operator.add`` with ``reduce()``, you'll add up all the
733elements of the iterable. This case is so common that there's a special
734built-in called ``sum()`` to compute it::
735
736 reduce(operator.add, [1,2,3,4], 0) =>
737 10
738 sum([1,2,3,4]) =>
739 10
740 sum([]) =>
741 0
742
743For many uses of ``reduce()``, though, it can be clearer to just write
744the obvious ``for`` loop::
745
746 # Instead of:
747 product = reduce(operator.mul, [1,2,3], 1)
748
749 # You can write:
750 product = 1
751 for i in [1,2,3]:
752 product *= i
753
754
755``enumerate(iter)`` counts off the elements in the iterable, returning
7562-tuples containing the count and each element.
757
758::
759
760 enumerate(['subject', 'verb', 'object']) =>
761 (0, 'subject'), (1, 'verb'), (2, 'object')
762
763``enumerate()`` is often used when looping through a list
764and recording the indexes at which certain conditions are met::
765
766 f = open('data.txt', 'r')
767 for i, line in enumerate(f):
768 if line.strip() == '':
769 print 'Blank line at line #%i' % i
770
771``sorted(iterable, [cmp=None], [key=None], [reverse=False)``
772collects all the elements of the iterable into a list, sorts
773the list, and returns the sorted result. The ``cmp``, ``key``,
774and ``reverse`` arguments are passed through to the
775constructed list's ``.sort()`` method.
776
777::
778
779 import random
780 # Generate 8 random numbers between [0, 10000)
781 rand_list = random.sample(range(10000), 8)
782 rand_list =>
783 [769, 7953, 9828, 6431, 8442, 9878, 6213, 2207]
784 sorted(rand_list) =>
785 [769, 2207, 6213, 6431, 7953, 8442, 9828, 9878]
786 sorted(rand_list, reverse=True) =>
787 [9878, 9828, 8442, 7953, 6431, 6213, 2207, 769]
788
789(For a more detailed discussion of sorting, see the Sorting mini-HOWTO
790in the Python wiki at http://wiki.python.org/moin/HowTo/Sorting.)
791
792The ``any(iter)`` and ``all(iter)`` built-ins look at
793the truth values of an iterable's contents. ``any()`` returns
794True if any element in the iterable is a true value, and ``all()``
795returns True if all of the elements are true values::
796
797 any([0,1,0]) =>
798 True
799 any([0,0,0]) =>
800 False
801 any([1,1,1]) =>
802 True
803 all([0,1,0]) =>
804 False
805 all([0,0,0]) =>
806 False
807 all([1,1,1]) =>
808 True
809
810
811Small functions and the lambda statement
812----------------------------------------------
813
814When writing functional-style programs, you'll often need little
815functions that act as predicates or that combine elements in some way.
816
817If there's a Python built-in or a module function that's suitable, you
818don't need to define a new function at all::
819
820 stripped_lines = [line.strip() for line in lines]
821 existing_files = filter(os.path.exists, file_list)
822
823If the function you need doesn't exist, you need to write it. One way
824to write small functions is to use the ``lambda`` statement. ``lambda``
825takes a number of parameters and an expression combining these parameters,
Andrew M. Kuchling0acdb932006-11-08 14:24:03 +0000826and creates a small function that returns the value of the expression::
Andrew M. Kuchling22145072006-08-22 23:13:43 +0000827
828 lowercase = lambda x: x.lower()
829
830 print_assign = lambda name, value: name + '=' + str(value)
831
832 adder = lambda x, y: x+y
833
834An alternative is to just use the ``def`` statement and define a
835function in the usual way::
836
837 def lowercase(x):
838 return x.lower()
839
840 def print_assign(name, value):
841 return name + '=' + str(value)
842
843 def adder(x,y):
844 return x + y
845
846Which alternative is preferable? That's a style question; my usual
Andrew M. Kuchling0acdb932006-11-08 14:24:03 +0000847course is to avoid using ``lambda``.
Andrew M. Kuchling22145072006-08-22 23:13:43 +0000848
Andrew M. Kuchling0acdb932006-11-08 14:24:03 +0000849One reason for my preference is that ``lambda`` is quite limited in
850the functions it can define. The result has to be computable as a
851single expression, which means you can't have multiway
852``if... elif... else`` comparisons or ``try... except`` statements.
853If you try to do too much in a ``lambda`` statement, you'll end up
854with an overly complicated expression that's hard to read. Quick,
855what's the following code doing?
Andrew M. Kuchling22145072006-08-22 23:13:43 +0000856
857::
858
859 total = reduce(lambda a, b: (0, a[1] + b[1]), items)[1]
860
861You can figure it out, but it takes time to disentangle the expression
862to figure out what's going on. Using a short nested
863``def`` statements makes things a little bit better::
864
865 def combine (a, b):
866 return 0, a[1] + b[1]
867
868 total = reduce(combine, items)[1]
869
870But it would be best of all if I had simply used a ``for`` loop::
871
872 total = 0
873 for a, b in items:
874 total += b
875
876Or the ``sum()`` built-in and a generator expression::
877
878 total = sum(b for a,b in items)
879
880Many uses of ``reduce()`` are clearer when written as ``for`` loops.
881
882Fredrik Lundh once suggested the following set of rules for refactoring
883uses of ``lambda``:
884
8851) Write a lambda function.
8862) Write a comment explaining what the heck that lambda does.
8873) Study the comment for a while, and think of a name that captures
888 the essence of the comment.
8894) Convert the lambda to a def statement, using that name.
8905) Remove the comment.
891
Andrew M. Kuchling0acdb932006-11-08 14:24:03 +0000892I really like these rules, but you're free to disagree that this
893lambda-free style is better.
Andrew M. Kuchling22145072006-08-22 23:13:43 +0000894
895
896The itertools module
897-----------------------
898
899The ``itertools`` module contains a number of commonly-used iterators
900as well as functions for combining several iterators. This section
901will introduce the module's contents by showing small examples.
902
903``itertools.count(n)`` returns an infinite stream of
904integers, increasing by 1 each time. You can optionally supply the
905starting number, which defaults to 0::
906
907 itertools.count() =>
908 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
909 itertools.count(10) =>
910 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ...
911
912``itertools.cycle(iter)`` saves a copy of the contents of a provided
913iterable and returns a new iterator that returns its elements from
914first to last. The new iterator will repeat these elements infinitely.
915
916::
917
918 itertools.cycle([1,2,3,4,5]) =>
919 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, ...
920
921``itertools.repeat(elem, [n])`` returns the provided element ``n``
922times, or returns the element endlessly if ``n`` is not provided.
923
924::
925
926 itertools.repeat('abc') =>
927 abc, abc, abc, abc, abc, abc, abc, abc, abc, abc, ...
928 itertools.repeat('abc', 5) =>
929 abc, abc, abc, abc, abc
930
931``itertools.chain(iterA, iterB, ...)`` takes an arbitrary number of
932iterables as input, and returns all the elements of the first
933iterator, then all the elements of the second, and so on, until all of
934the iterables have been exhausted.
935
936::
937
938 itertools.chain(['a', 'b', 'c'], (1, 2, 3)) =>
939 a, b, c, 1, 2, 3
940
941``itertools.izip(iterA, iterB, ...)`` takes one element from each iterable
942and returns them in a tuple::
943
944 itertools.izip(['a', 'b', 'c'], (1, 2, 3)) =>
945 ('a', 1), ('b', 2), ('c', 3)
946
Andrew M. Kuchling9efdd782006-11-08 14:14:30 +0000947It's similiar to the built-in ``zip()`` function, but doesn't
948construct an in-memory list and exhaust all the input iterators before
949returning; instead tuples are constructed and returned only if they're
950requested. (The technical term for this behaviour is
951`lazy evaluation <http://en.wikipedia.org/wiki/Lazy_evaluation>`__.)
952
Andrew M. Kuchling22145072006-08-22 23:13:43 +0000953This iterator is intended to be used with iterables that are all of
954the same length. If the iterables are of different lengths, the
955resulting stream will be the same length as the shortest iterable.
956
957::
958
959 itertools.izip(['a', 'b'], (1, 2, 3)) =>
960 ('a', 1), ('b', 2)
961
962You should avoid doing this, though, because an element may be taken
963from the longer iterators and discarded. This means you can't go on
964to use the iterators further because you risk skipping a discarded
965element.
966
967``itertools.islice(iter, [start], stop, [step])`` returns a stream
Andrew M. Kuchling0acdb932006-11-08 14:24:03 +0000968that's a slice of the iterator. With a single ``stop`` argument,
969it will return the first ``stop``
Andrew M. Kuchling22145072006-08-22 23:13:43 +0000970elements. If you supply a starting index, you'll get ``stop-start``
Andrew M. Kuchling0acdb932006-11-08 14:24:03 +0000971elements, and if you supply a value for ``step`, elements will be
Andrew M. Kuchling22145072006-08-22 23:13:43 +0000972skipped accordingly. Unlike Python's string and list slicing, you
973can't use negative values for ``start``, ``stop``, or ``step``.
974
975::
976
977 itertools.islice(range(10), 8) =>
978 0, 1, 2, 3, 4, 5, 6, 7
979 itertools.islice(range(10), 2, 8) =>
980 2, 3, 4, 5, 6, 7
981 itertools.islice(range(10), 2, 8, 2) =>
982 2, 4, 6
983
984``itertools.tee(iter, [n])`` replicates an iterator; it returns ``n``
985independent iterators that will all return the contents of the source
986iterator. If you don't supply a value for ``n``, the default is 2.
987Replicating iterators requires saving some of the contents of the source
988iterator, so this can consume significant memory if the iterator is large
989and one of the new iterators is consumed more than the others.
990
991::
992
993 itertools.tee( itertools.count() ) =>
994 iterA, iterB
995
996 where iterA ->
997 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
998
999 and iterB ->
1000 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
1001
1002
1003Two functions are used for calling other functions on the contents of an
1004iterable.
1005
1006``itertools.imap(f, iterA, iterB, ...)`` returns
1007a stream containing ``f(iterA[0], iterB[0]), f(iterA[1], iterB[1]),
1008f(iterA[2], iterB[2]), ...``::
1009
1010 itertools.imap(operator.add, [5, 6, 5], [1, 2, 3]) =>
1011 6, 8, 8
1012
1013The ``operator`` module contains a set of functions
1014corresponding to Python's operators. Some examples are
1015``operator.add(a, b)`` (adds two values),
1016``operator.ne(a, b)`` (same as ``a!=b``),
1017and
1018``operator.attrgetter('id')`` (returns a callable that
1019fetches the ``"id"`` attribute).
1020
1021``itertools.starmap(func, iter)`` assumes that the iterable will
1022return a stream of tuples, and calls ``f()`` using these tuples as the
1023arguments::
1024
1025 itertools.starmap(os.path.join,
1026 [('/usr', 'bin', 'java'), ('/bin', 'python'),
1027 ('/usr', 'bin', 'perl'),('/usr', 'bin', 'ruby')])
1028 =>
1029 /usr/bin/java, /bin/python, /usr/bin/perl, /usr/bin/ruby
1030
1031Another group of functions chooses a subset of an iterator's elements
1032based on a predicate.
1033
1034``itertools.ifilter(predicate, iter)`` returns all the elements for
1035which the predicate returns true::
1036
1037 def is_even(x):
1038 return (x % 2) == 0
1039
1040 itertools.ifilter(is_even, itertools.count()) =>
1041 0, 2, 4, 6, 8, 10, 12, 14, ...
1042
1043``itertools.ifilterfalse(predicate, iter)`` is the opposite,
1044returning all elements for which the predicate returns false::
1045
1046 itertools.ifilterfalse(is_even, itertools.count()) =>
1047 1, 3, 5, 7, 9, 11, 13, 15, ...
1048
1049``itertools.takewhile(predicate, iter)`` returns elements for as long
1050as the predicate returns true. Once the predicate returns false,
1051the iterator will signal the end of its results.
1052
1053::
1054
1055 def less_than_10(x):
1056 return (x < 10)
1057
1058 itertools.takewhile(less_than_10, itertools.count()) =>
1059 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
1060
1061 itertools.takewhile(is_even, itertools.count()) =>
1062 0
1063
1064``itertools.dropwhile(predicate, iter)`` discards elements while the
1065predicate returns true, and then returns the rest of the iterable's
1066results.
1067
1068::
1069
1070 itertools.dropwhile(less_than_10, itertools.count()) =>
1071 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ...
1072
1073 itertools.dropwhile(is_even, itertools.count()) =>
1074 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...
1075
1076
1077The last function I'll discuss, ``itertools.groupby(iter,
1078key_func=None)``, is the most complicated. ``key_func(elem)`` is a
1079function that can compute a key value for each element returned by the
1080iterable. If you don't supply a key function, the key is simply each
1081element itself.
1082
1083``groupby()`` collects all the consecutive elements from the
1084underlying iterable that have the same key value, and returns a stream
1085of 2-tuples containing a key value and an iterator for the elements
1086with that key.
1087
1088::
1089
1090 city_list = [('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL'),
1091 ('Anchorage', 'AK'), ('Nome', 'AK'),
1092 ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ'),
1093 ...
1094 ]
1095
1096 def get_state ((city, state)):
1097 return state
1098
1099 itertools.groupby(city_list, get_state) =>
1100 ('AL', iterator-1),
1101 ('AK', iterator-2),
1102 ('AZ', iterator-3), ...
1103
1104 where
1105 iterator-1 =>
1106 ('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL')
1107 iterator-2 =>
1108 ('Anchorage', 'AK'), ('Nome', 'AK')
1109 iterator-3 =>
1110 ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ')
1111
1112``groupby()`` assumes that the underlying iterable's contents will
1113already be sorted based on the key. Note that the returned iterators
1114also use the underlying iterable, so you have to consume the results
1115of iterator-1 before requesting iterator-2 and its corresponding key.
1116
1117
1118The functools module
1119----------------------------------------------
1120
1121The ``functools`` module in Python 2.5 contains some higher-order
1122functions. A **higher-order function** takes functions as input and
1123returns new functions. The most useful tool in this module is the
1124``partial()`` function.
1125
1126For programs written in a functional style, you'll sometimes want to
1127construct variants of existing functions that have some of the
1128parameters filled in. Consider a Python function ``f(a, b, c)``; you
1129may wish to create a new function ``g(b, c)`` that was equivalent to
1130``f(1, b, c)``. This is called "partial function application".
1131
1132The constructor for ``partial`` takes the arguments ``(function, arg1,
1133arg2, ... kwarg1=value1, kwarg2=value2)``. The resulting object is
1134callable, so you can just call it to invoke ``function`` with the
1135filled-in arguments.
1136
1137Here's a small but realistic example::
1138
1139 import functools
1140
1141 def log (message, subsystem):
1142 "Write the contents of 'message' to the specified subsystem."
1143 print '%s: %s' % (subsystem, message)
1144 ...
1145
1146 server_log = functools.partial(log, subsystem='server')
1147 server_log('Unable to open socket')
1148
1149There are also third-party modules, such as Collin Winter's
1150`functional package <http://cheeseshop.python.org/pypi/functional>`__,
Andrew M. Kuchling9efdd782006-11-08 14:14:30 +00001151that are intended for use in functional-style programs. See below
1152for a section describing the ``functional`` mdoule.
1153
1154
1155The operator module
Andrew M. Kuchling0acdb932006-11-08 14:24:03 +00001156-------------------
Andrew M. Kuchling9efdd782006-11-08 14:14:30 +00001157
1158The ``operator`` module was mentioned earlier. It contains a set of
1159functions corresponding to Python's operators. These functions
1160are often useful in functional-style code because they save you
1161from writing trivial functions that perform a single operation.
1162
1163Some of the functions in this module are:
1164
1165* Math operations: ``add()``, ``sub()``, ``mul()``, ``div()``, ``floordiv()``,
1166 ``abs()``, ...
1167* Logical operations: ``not_()``, ``truth()``.
1168* Bitwise operations: ``and_()``, ``or_()``, ``invert()``.
1169* Comparisons: ``eq()``, ``ne()``, ``lt()``, ``le()``, ``gt()``, and ``ge()``.
1170* Object identity: ``is_()``, ``is_not()``.
1171
1172Consult `the operator module's documentation <http://docs.python.org/lib/module-operator.html>`__ for a complete
1173list.
1174
Andrew M. Kuchling22145072006-08-22 23:13:43 +00001175
1176
Andrew M. Kuchlingf4dcd1d2006-11-08 13:35:34 +00001177The functional module
Andrew M. Kuchling0acdb932006-11-08 14:24:03 +00001178---------------------
Andrew M. Kuchlingf4dcd1d2006-11-08 13:35:34 +00001179
1180Collin Winter's `functional module <http://oakwinter.com/code/functional/>`__
1181provides a number of more
1182advanced tools for functional programming. It also reimplements
1183several Python built-ins, trying to make them more intuitive to those
1184used to functional programming in other languages.
1185
1186This section contains an introduction to some of the most important
1187functions in ``functional``; full documentation can be found at `the
1188project's website <http://oakwinter.com/code/functional/documentation/>`__.
1189
1190``compose(outer, inner, unpack=False)``
1191
1192The ``compose()`` function implements function composition.
1193In other words, it returns a wrapper around the ``outer`` and ``inner`` callables, such
1194that the return value from ``inner`` is fed directly to ``outer``. That is,
1195
1196::
1197
1198 >>> def add(a, b):
1199 ... return a + b
1200 ...
1201 >>> def double(a):
1202 ... return 2 * a
1203 ...
1204 >>> compose(double, add)(5, 6)
1205 22
1206
1207is equivalent to
1208
1209::
1210
1211 >>> double(add(5, 6))
1212 22
1213
1214The ``unpack`` keyword is provided to work around the fact that Python functions are not always
1215`fully curried <http://en.wikipedia.org/wiki/Currying>`__.
1216By default, it is expected that the ``inner`` function will return a single object and that the ``outer``
1217function will take a single argument. Setting the ``unpack`` argument causes ``compose`` to expect a
1218tuple from ``inner`` which will be expanded before being passed to ``outer``. Put simply,
1219
1220::
1221
1222 compose(f, g)(5, 6)
1223
1224is equivalent to::
1225
1226 f(g(5, 6))
1227
1228while
1229
1230::
1231
1232 compose(f, g, unpack=True)(5, 6)
1233
1234is equivalent to::
1235
1236 f(*g(5, 6))
1237
1238Even though ``compose()`` only accepts two functions, it's trivial to
1239build up a version that will compose any number of functions. We'll
1240use ``reduce()``, ``compose()`` and ``partial()`` (the last of which
1241is provided by both ``functional`` and ``functools``).
1242
1243::
1244
1245 from functional import compose, partial
1246
1247 multi_compose = partial(reduce, compose)
1248
1249
1250We can also use ``map()``, ``compose()`` and ``partial()`` to craft a
1251version of ``"".join(...)`` that converts its arguments to string::
1252
1253 from functional import compose, partial
1254
1255 join = compose("".join, partial(map, str))
1256
1257
1258``flip(func)``
1259
1260``flip()`` wraps the callable in ``func`` and
1261causes it to receive its non-keyword arguments in reverse order.
1262
1263::
1264
1265 >>> def triple(a, b, c):
1266 ... return (a, b, c)
1267 ...
1268 >>> triple(5, 6, 7)
1269 (5, 6, 7)
1270 >>>
1271 >>> flipped_triple = flip(triple)
1272 >>> flipped_triple(5, 6, 7)
1273 (7, 6, 5)
1274
1275``foldl(func, start, iterable)``
1276
1277``foldl()`` takes a binary function, a starting value (usually some kind of 'zero'), and an iterable.
1278The function is applied to the starting value and the first element of the list, then the result of
1279that and the second element of the list, then the result of that and the third element of the list,
1280and so on.
1281
1282This means that a call such as::
1283
1284 foldl(f, 0, [1, 2, 3])
1285
1286is equivalent to::
1287
1288 f(f(f(0, 1), 2), 3)
1289
1290
1291``foldl()`` is roughly equivalent to the following recursive function::
1292
1293 def foldl(func, start, seq):
1294 if len(seq) == 0:
1295 return start
1296
1297 return foldl(func, func(start, seq[0]), seq[1:])
1298
1299Speaking of equivalence, the above ``foldl`` call can be expressed in terms of the built-in ``reduce`` like
1300so::
1301
1302 reduce(f, [1, 2, 3], 0)
1303
1304
1305We can use ``foldl()``, ``operator.concat()`` and ``partial()`` to
1306write a cleaner, more aesthetically-pleasing version of Python's
1307``"".join(...)`` idiom::
1308
1309 from functional import foldl, partial
1310 from operator import concat
1311
1312 join = partial(foldl, concat, "")
1313
1314
Andrew M. Kuchling22145072006-08-22 23:13:43 +00001315Revision History and Acknowledgements
1316------------------------------------------------
1317
1318The author would like to thank the following people for offering
1319suggestions, corrections and assistance with various drafts of this
1320article: Ian Bicking, Nick Coghlan, Nick Efford, Raymond Hettinger,
1321Jim Jewett, Mike Krell, Leandro Lameiro, Jussi Salmela,
1322Collin Winter, Blake Winton.
1323
1324Version 0.1: posted June 30 2006.
1325
1326Version 0.11: posted July 1 2006. Typo fixes.
1327
1328Version 0.2: posted July 10 2006. Merged genexp and listcomp
1329sections into one. Typo fixes.
1330
1331Version 0.21: Added more references suggested on the tutor mailing list.
1332
Andrew M. Kuchling9efdd782006-11-08 14:14:30 +00001333Version 0.30: Adds a section on the ``functional`` module written by
1334Collin Winter; adds short section on the operator module; a few other
1335edits.
Andrew M. Kuchlingf4dcd1d2006-11-08 13:35:34 +00001336
Andrew M. Kuchling22145072006-08-22 23:13:43 +00001337
1338References
1339--------------------
1340
1341General
1342'''''''''''''''
1343
1344**Structure and Interpretation of Computer Programs**, by
1345Harold Abelson and Gerald Jay Sussman with Julie Sussman.
1346Full text at http://mitpress.mit.edu/sicp/.
1347In this classic textbook of computer science, chapters 2 and 3 discuss the
1348use of sequences and streams to organize the data flow inside a
1349program. The book uses Scheme for its examples, but many of the
1350design approaches described in these chapters are applicable to
1351functional-style Python code.
1352
1353http://www.defmacro.org/ramblings/fp.html: A general
1354introduction to functional programming that uses Java examples
1355and has a lengthy historical introduction.
1356
1357http://en.wikipedia.org/wiki/Functional_programming:
1358General Wikipedia entry describing functional programming.
1359
1360http://en.wikipedia.org/wiki/Coroutine:
1361Entry for coroutines.
1362
Andrew M. Kuchlingf4dcd1d2006-11-08 13:35:34 +00001363http://en.wikipedia.org/wiki/Currying:
1364Entry for the concept of currying.
Andrew M. Kuchling22145072006-08-22 23:13:43 +00001365
1366Python-specific
1367'''''''''''''''''''''''''''
1368
1369http://gnosis.cx/TPiP/:
1370The first chapter of David Mertz's book :title-reference:`Text Processing in Python`
1371discusses functional programming for text processing, in the section titled
1372"Utilizing Higher-Order Functions in Text Processing".
1373
1374Mertz also wrote a 3-part series of articles on functional programming
1375for IBM's DeveloperWorks site; see
1376`part 1 <http://www-128.ibm.com/developerworks/library/l-prog.html>`__,
1377`part 2 <http://www-128.ibm.com/developerworks/library/l-prog2.html>`__, and
1378`part 3 <http://www-128.ibm.com/developerworks/linux/library/l-prog3.html>`__,
1379
1380
1381Python documentation
1382'''''''''''''''''''''''''''
1383
1384http://docs.python.org/lib/module-itertools.html:
1385Documentation ``for the itertools`` module.
1386
1387http://docs.python.org/lib/module-operator.html:
1388Documentation ``for the operator`` module.
1389
1390http://www.python.org/dev/peps/pep-0289/:
1391PEP 289: "Generator Expressions"
1392
1393http://www.python.org/dev/peps/pep-0342/
1394PEP 342: "Coroutines via Enhanced Generators" describes the new generator
1395features in Python 2.5.
1396
1397.. comment
1398
1399 Topics to place
1400 -----------------------------
1401
1402 XXX os.walk()
1403
1404 XXX Need a large example.
1405
1406 But will an example add much? I'll post a first draft and see
1407 what the comments say.
1408
1409.. comment
1410
1411 Original outline:
1412 Introduction
1413 Idea of FP
1414 Programs built out of functions
1415 Functions are strictly input-output, no internal state
1416 Opposed to OO programming, where objects have state
1417
1418 Why FP?
1419 Formal provability
1420 Assignment is difficult to reason about
1421 Not very relevant to Python
1422 Modularity
1423 Small functions that do one thing
1424 Debuggability:
1425 Easy to test due to lack of state
1426 Easy to verify output from intermediate steps
1427 Composability
1428 You assemble a toolbox of functions that can be mixed
1429
1430 Tackling a problem
1431 Need a significant example
1432
1433 Iterators
1434 Generators
1435 The itertools module
1436 List comprehensions
1437 Small functions and the lambda statement
1438 Built-in functions
1439 map
1440 filter
1441 reduce
1442
1443.. comment
1444
1445 Handy little function for printing part of an iterator -- used
1446 while writing this document.
1447
1448 import itertools
1449 def print_iter(it):
1450 slice = itertools.islice(it, 10)
1451 for elem in slice[:-1]:
1452 sys.stdout.write(str(elem))
1453 sys.stdout.write(', ')
1454 print elem[-1]
1455
1456