Blame - Doc/library/difflib.rst - platform/external/python/cpython2

				2	:mod:`difflib` --- Helpers for computing deltas
				3	===============================================
				4
				5	.. module:: difflib
				6	:synopsis: Helpers for computing differences between objects.
				7	.. moduleauthor:: Tim Peters <tim_one@users.sourceforge.net>
				8	.. sectionauthor:: Tim Peters <tim_one@users.sourceforge.net>
				9
				10
				11	.. % LaTeXification by Fred L. Drake, Jr. <fdrake@acm.org>.
				12
				13	.. versionadded:: 2.1
				14
This module provides classes and functions for comparing sequences. It
can be used, for example, to compare files, and can produce difference
information in various formats, including HTML and context and unified
diffs. For comparing directories and files, see also the :mod:`filecmp` module.

.. class:: SequenceMatcher
				21
				22	This is a flexible class for comparing pairs of sequences of any type, so long
				23	as the sequence elements are hashable. The basic algorithm predates, and is a
				24	little fancier than, an algorithm published in the late 1980's by Ratcliff and
				25	Obershelp under the hyperbolic name "gestalt pattern matching." The idea is to
				26	find the longest contiguous matching subsequence that contains no "junk"
				27	elements (the Ratcliff and Obershelp algorithm doesn't address junk). The same
				28	idea is then applied recursively to the pieces of the sequences to the left and
				29	to the right of the matching subsequence. This does not yield minimal edit
				30	sequences, but does tend to yield matches that "look right" to people.
				31
				32	Timing: The basic Ratcliff-Obershelp algorithm is cubic time in the worst
				33	case and quadratic time in the expected case. :class:`SequenceMatcher` is
				34	quadratic time for the worst case and has expected-case behavior dependent in a
				35	complicated way on how many elements the sequences have in common; best case
				36	time is linear.
				37
				38
				39	.. class:: Differ
				40
				41	This is a class for comparing sequences of lines of text, and producing
				42	human-readable differences or deltas. Differ uses :class:`SequenceMatcher`
				43	both to compare sequences of lines, and to compare sequences of characters
				44	within similar (near-matching) lines.
				45
Each line of a :class:`Differ` delta begins with a two-character code:

+----------+-------------------------------------------+
| Code     | Meaning                                   |
+==========+===========================================+
| ``'- '`` | line unique to sequence 1                 |
+----------+-------------------------------------------+
| ``'+ '`` | line unique to sequence 2                 |
+----------+-------------------------------------------+
| ``'  '`` | line common to both sequences             |
+----------+-------------------------------------------+
| ``'? '`` | line not present in either input sequence |
+----------+-------------------------------------------+
				59
				60	Lines beginning with '``?``' attempt to guide the eye to intraline differences,
				61	and were not present in either input sequence. These lines can be confusing if
				62	the sequences contain tab characters.
				63
				64
				65	.. class:: HtmlDiff
				66
				67	This class can be used to create an HTML table (or a complete HTML file
				68	containing the table) showing a side by side, line by line comparison of text
				69	with inter-line and intra-line change highlights. The table can be generated in
				70	either full or contextual difference mode.
				71
				72	The constructor for this class is:
				73
				74
				75	.. function:: __init__([tabsize][, wrapcolumn][, linejunk][, charjunk])
				76
Initializes an instance of :class:`HtmlDiff`.

*tabsize* is an optional keyword argument to specify tab stop spacing and
defaults to ``8``.

*wrapcolumn* is an optional keyword argument to specify the column number where
lines are broken and wrapped; it defaults to ``None``, in which case lines are
not wrapped.
				84
				85	linejunk and charjunk are optional keyword arguments passed into ``ndiff()``
				86	(used by :class:`HtmlDiff` to generate the side by side HTML differences). See
				87	``ndiff()`` documentation for argument default values and descriptions.
				88
				89	The following methods are public:
				90
				91
				92	.. function:: make_file(fromlines, tolines [, fromdesc][, todesc][, context][, numlines])
				93
				94	Compares fromlines and tolines (lists of strings) and returns a string which
				95	is a complete HTML file containing a table showing line by line differences with
				96	inter-line and intra-line changes highlighted.
				97
				98	fromdesc and todesc are optional keyword arguments to specify from/to file
				99	column header strings (both default to an empty string).
				100
				101	context and numlines are both optional keyword arguments. Set context to
				102	``True`` when contextual differences are to be shown, else the default is
				103	``False`` to show the full files. numlines defaults to ``5``. When context
				104	is ``True`` numlines controls the number of context lines which surround the
				105	difference highlights. When context is ``False`` numlines controls the
				106	number of lines which are shown before a difference highlight when using the
				107	"next" hyperlinks (setting to zero would cause the "next" hyperlinks to place
				108	the next difference highlight at the top of the browser without any leading
				109	context).
				110
				111
				112	.. function:: make_table(fromlines, tolines [, fromdesc][, todesc][, context][, numlines])
				113
				114	Compares fromlines and tolines (lists of strings) and returns a string which
				115	is a complete HTML table showing line by line differences with inter-line and
				116	intra-line changes highlighted.
				117
				118	The arguments for this method are the same as those for the :meth:`make_file`
				119	method.
				120
				121	:file:`Tools/scripts/diff.py` is a command-line front-end to this class and
				122	contains a good example of its use.
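
As a rough sketch of how the two methods relate, the following builds both a
bare table and a complete page from two short line lists. The sample lines and
the column headers ``old`` and ``new`` are made up for illustration:

```python
from difflib import HtmlDiff

# Made-up sample input: two versions of a short file, as lists of lines.
fromlines = ["alpha\n", "beta\n", "gamma\n"]
tolines = ["alpha\n", "beta!\n", "gamma\n", "delta\n"]

differ = HtmlDiff(tabsize=4, wrapcolumn=60)

# make_table() returns only the table markup, for embedding in a page.
table = differ.make_table(fromlines, tolines, fromdesc="old", todesc="new")

# make_file() wraps the same kind of table in a complete HTML document;
# context=True with numlines=1 keeps one line of context around each change.
page = differ.make_file(fromlines, tolines, fromdesc="old", todesc="new",
                        context=True, numlines=1)
```

The string returned by :meth:`make_file` can be written straight to a file and
opened in a browser.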
				123
				124	.. versionadded:: 2.4
				125
				126
				127	.. function:: context_diff(a, b[, fromfile][, tofile][, fromfiledate][, tofiledate][, n][, lineterm])
				128
Compare *a* and *b* (lists of strings); return a delta (a :term:`generator`
generating the delta lines) in context diff format.

Context diffs are a compact way of showing just the lines that have changed plus
a few lines of context. The changes are shown in a before/after style. The
number of context lines is set by *n*, which defaults to three.
				135
				136	By default, the diff control lines (those with ``***`` or ``---``) are created
				137	with a trailing newline. This is helpful so that inputs created from
				138	:func:`file.readlines` result in diffs that are suitable for use with
				139	:func:`file.writelines` since both the inputs and outputs have trailing
				140	newlines.
				141
				142	For inputs that do not have trailing newlines, set the lineterm argument to
				143	``""`` so that the output will be uniformly newline free.
				144
				145	The context diff format normally has a header for filenames and modification
				146	times. Any or all of these may be specified using strings for fromfile,
				147	tofile, fromfiledate, and tofiledate. The modification times are normally
				148	expressed in the format returned by :func:`time.ctime`. If not specified, the
				149	strings default to blanks.
				150
				151	:file:`Tools/scripts/diff.py` is a command-line front-end for this function.
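
A minimal sketch of a call, assuming two made-up line lists and the
hypothetical file names ``before.py`` and ``after.py``:

```python
import sys
from difflib import context_diff

# Made-up sample input; every line keeps its trailing newline.
before = ["bacon\n", "eggs\n", "ham\n", "guido\n"]
after = ["python\n", "eggy\n", "hamster\n", "guido\n"]

# n=1 keeps one line of context around each change.
delta = list(context_diff(before, after,
                          fromfile="before.py", tofile="after.py", n=1))

# The delta lines already end in newlines, so writelines() prints them as-is.
sys.stdout.writelines(delta)
```

Changed lines are marked with ``!`` and unchanged context lines are indented by
two spaces.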
				152
				153	.. versionadded:: 2.3
				154
				155
				156	.. function:: get_close_matches(word, possibilities[, n][, cutoff])
				157
				158	Return a list of the best "good enough" matches. word is a sequence for which
				159	close matches are desired (typically a string), and possibilities is a list of
				160	sequences against which to match word (typically a list of strings).
				161
				162	Optional argument n (default ``3``) is the maximum number of close matches to
				163	return; n must be greater than ``0``.
				164
				165	Optional argument cutoff (default ``0.6``) is a float in the range [0, 1].
				166	Possibilities that don't score at least that similar to word are ignored.
				167
				168	The best (no more than n) matches among the possibilities are returned in a
				169	list, sorted by similarity score, most similar first. ::
				170
				171	>>> get_close_matches('appel', ['ape', 'apple', 'peach', 'puppy'])
				172	['apple', 'ape']
				173	>>> import keyword
				174	>>> get_close_matches('wheel', keyword.kwlist)
				175	['while']
				176	>>> get_close_matches('apple', keyword.kwlist)
				177	[]
				178	>>> get_close_matches('accept', keyword.kwlist)
				179	['except']
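
The optional arguments can be tried on the first example above. This is only a
sketch of their effect; the cutoff value ``0.9`` is an arbitrary choice:

```python
from difflib import get_close_matches

words = ['ape', 'apple', 'peach', 'puppy']

# n=1 keeps only the single best match.
best = get_close_matches('appel', words, n=1)

# A stricter cutoff rejects candidates that the default 0.6 would accept.
strict = get_close_matches('appel', words, cutoff=0.9)

print(best)    # ['apple']
print(strict)  # []
```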
				180
				181
				182	.. function:: ndiff(a, b[, linejunk][, charjunk])
				183
Compare *a* and *b* (lists of strings); return a :class:`Differ`\ -style
delta (a :term:`generator` generating the delta lines).

Optional keyword parameters *linejunk* and *charjunk* are for filter functions
(or ``None``):

*linejunk*: A function that accepts a single string argument, and returns true
if the string is junk, or false if not. The default is ``None``, starting with
Python 2.3. Before then, the default was the module-level function
:func:`IS_LINE_JUNK`, which filters out lines without visible characters, except
for at most one pound character (``'#'``). As of Python 2.3, the underlying
:class:`SequenceMatcher` class does a dynamic analysis of which lines are so
frequent as to constitute noise, and this usually works better than the pre-2.3
default.

*charjunk*: A function that accepts a character (a string of length 1), and
returns true if the character is junk, or false if not. The default is the
module-level function :func:`IS_CHARACTER_JUNK`, which filters out whitespace
characters (a blank or tab; note: it is a bad idea to include newline in this!).
				203
				204	:file:`Tools/scripts/ndiff.py` is a command-line front-end to this function. ::
				205
   >>> diff = ndiff('one\ntwo\nthree\n'.splitlines(1),
   ...              'ore\ntree\nemu\n'.splitlines(1))
   >>> print ''.join(diff),
   - one
   ?  ^
   + ore
   ?  ^
   - two
   - three
   ?  -
   + tree
   + emu
				218
				219
				220	.. function:: restore(sequence, which)
				221
				222	Return one of the two sequences that generated a delta.
				223
				224	Given a sequence produced by :meth:`Differ.compare` or :func:`ndiff`, extract
				225	lines originating from file 1 or 2 (parameter which), stripping off line
				226	prefixes.
				227
				228	Example::
				229
				230	>>> diff = ndiff('one\ntwo\nthree\n'.splitlines(1),
				231	... 'ore\ntree\nemu\n'.splitlines(1))
				232	>>> diff = list(diff) # materialize the generated delta into a list
				233	>>> print ''.join(restore(diff, 1)),
				234	one
				235	two
				236	three
				237	>>> print ''.join(restore(diff, 2)),
				238	ore
				239	tree
				240	emu
				241
				242
				243	.. function:: unified_diff(a, b[, fromfile][, tofile][, fromfiledate][, tofiledate][, n][, lineterm])
				244
Compare *a* and *b* (lists of strings); return a delta (a :term:`generator`
generating the delta lines) in unified diff format.

Unified diffs are a compact way of showing just the lines that have changed plus
a few lines of context. The changes are shown in an inline style (instead of
separate before/after blocks). The number of context lines is set by *n*, which
defaults to three.
				252
				253	By default, the diff control lines (those with ``---``, ``+++``, or ``@@``) are
				254	created with a trailing newline. This is helpful so that inputs created from
				255	:func:`file.readlines` result in diffs that are suitable for use with
				256	:func:`file.writelines` since both the inputs and outputs have trailing
				257	newlines.
				258
				259	For inputs that do not have trailing newlines, set the lineterm argument to
				260	``""`` so that the output will be uniformly newline free.
				261
The unified diff format normally has a header for filenames and modification
times. Any or all of these may be specified using strings for *fromfile*,
*tofile*, *fromfiledate*, and *tofiledate*. The modification times are normally
expressed in the format returned by :func:`time.ctime`. If not specified, the
strings default to blanks.
				267
				268	:file:`Tools/scripts/diff.py` is a command-line front-end for this function.
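
A minimal sketch of the newline-free case, assuming made-up input lines and the
hypothetical file names ``before.py`` and ``after.py``:

```python
from difflib import unified_diff

# Made-up sample input without trailing newlines, hence lineterm="".
before = ["bacon", "eggs", "ham", "guido"]
after = ["python", "eggy", "hamster", "guido"]

delta = list(unified_diff(before, after,
                          fromfile="before.py", tofile="after.py",
                          lineterm=""))

# No line in the delta carries a newline, so join them explicitly.
print("\n".join(delta))
```

Removed lines start with ``-``, added lines with ``+``, and context lines with
a single space.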
				269
				270	.. versionadded:: 2.3
				271
				272
				273	.. function:: IS_LINE_JUNK(line)
				274
Return true for ignorable lines. The line *line* is ignorable if *line* is
blank or contains a single ``'#'``, otherwise it is not ignorable. Used as a
default for parameter *linejunk* in :func:`ndiff` before Python 2.3.
				278
				279
				280	.. function:: IS_CHARACTER_JUNK(ch)
				281
				282	Return true for ignorable characters. The character ch is ignorable if ch
				283	is a space or tab, otherwise it is not ignorable. Used as a default for
				284	parameter charjunk in :func:`ndiff`.
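
The two predicates can be tried directly. The sample strings below are made up
for illustration:

```python
from difflib import IS_LINE_JUNK, IS_CHARACTER_JUNK

# Blank lines and lines holding only a '#' count as junk.
print(IS_LINE_JUNK("\n"))          # True
print(IS_LINE_JUNK("  #   \n"))    # True
print(IS_LINE_JUNK("x = 1\n"))     # False

# Only space and tab are junk characters; newline is not.
print(IS_CHARACTER_JUNK(" "))      # True
print(IS_CHARACTER_JUNK("\t"))     # True
print(IS_CHARACTER_JUNK("\n"))     # False
```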
				285
				286
				287	.. seealso::
				288
				289	`Pattern Matching: The Gestalt Approach <http://www.ddj.com/184407970?pgno=5>`_
				290	Discussion of a similar algorithm by John W. Ratcliff and D. E. Metzener. This
				291	was published in `Dr. Dobb's Journal <http://www.ddj.com/>`_ in July, 1988.
				292
				293
				294	.. _sequence-matcher:
				295
				296	SequenceMatcher Objects
				297	-----------------------
				298
				299	The :class:`SequenceMatcher` class has this constructor:
				300
				301
				302	.. class:: SequenceMatcher([isjunk[, a[, b]]])
				303
				304	Optional argument isjunk must be ``None`` (the default) or a one-argument
				305	function that takes a sequence element and returns true if and only if the
				306	element is "junk" and should be ignored. Passing ``None`` for isjunk is
				307	equivalent to passing ``lambda x: 0``; in other words, no elements are ignored.
				308	For example, pass::
				309
				310	lambda x: x in " \t"
				311
				312	if you're comparing lines as sequences of characters, and don't want to synch up
				313	on blanks or hard tabs.
				314
				315	The optional arguments a and b are sequences to be compared; both default to
				316	empty strings. The elements of both sequences must be hashable.
				317
				318	:class:`SequenceMatcher` objects have the following methods:
				319
				320
				321	.. method:: SequenceMatcher.set_seqs(a, b)
				322
				323	Set the two sequences to be compared.
				324
				325	:class:`SequenceMatcher` computes and caches detailed information about the
				326	second sequence, so if you want to compare one sequence against many sequences,
				327	use :meth:`set_seq2` to set the commonly used sequence once and call
				328	:meth:`set_seq1` repeatedly, once for each of the other sequences.
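
The one-against-many pattern described above might look like this. The query
string and the candidate list are hypothetical data:

```python
from difflib import SequenceMatcher

# Hypothetical data: one query string scored against several candidates.
query = "difflib"
candidates = ["diff", "filecmp", "difflib2", "zlib"]

matcher = SequenceMatcher()
matcher.set_seq2(query)          # analysed once, then cached

scores = {}
for candidate in candidates:
    matcher.set_seq1(candidate)  # only the first sequence changes
    scores[candidate] = matcher.ratio()

best = max(scores, key=scores.get)
print(best)  # difflib2
```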
				329
				330
				331	.. method:: SequenceMatcher.set_seq1(a)
				332
				333	Set the first sequence to be compared. The second sequence to be compared is
				334	not changed.
				335
				336
				337	.. method:: SequenceMatcher.set_seq2(b)
				338
				339	Set the second sequence to be compared. The first sequence to be compared is
				340	not changed.
				341
				342
				343	.. method:: SequenceMatcher.find_longest_match(alo, ahi, blo, bhi)
				344
				345	Find longest matching block in ``a[alo:ahi]`` and ``b[blo:bhi]``.
				346
If *isjunk* was omitted or ``None``, :meth:`find_longest_match` returns ``(i, j,
k)`` such that ``a[i:i+k]`` is equal to ``b[j:j+k]``, where ``alo <= i <= i+k <=
ahi`` and ``blo <= j <= j+k <= bhi``. For all ``(i', j', k')`` meeting those
conditions, the additional conditions ``k >= k'``, ``i <= i'``, and if ``i ==
i'``, ``j <= j'`` are also met. In other words, of all maximal matching blocks,
return one that starts earliest in *a*, and of all those maximal matching blocks
that start earliest in *a*, return the one that starts earliest in *b*. ::
				354
				355	>>> s = SequenceMatcher(None, " abcd", "abcd abcd")
				356	>>> s.find_longest_match(0, 5, 0, 9)
				357	(0, 4, 5)
				358
				359	If isjunk was provided, first the longest matching block is determined as
				360	above, but with the additional restriction that no junk element appears in the
				361	block. Then that block is extended as far as possible by matching (only) junk
				362	elements on both sides. So the resulting block never matches on junk except as
				363	identical junk happens to be adjacent to an interesting match.
				364
				365	Here's the same example as before, but considering blanks to be junk. That
				366	prevents ``' abcd'`` from matching the ``' abcd'`` at the tail end of the second
				367	sequence directly. Instead only the ``'abcd'`` can match, and matches the
				368	leftmost ``'abcd'`` in the second sequence::
				369
				370	>>> s = SequenceMatcher(lambda x: x==" ", " abcd", "abcd abcd")
				371	>>> s.find_longest_match(0, 5, 0, 9)
				372	(1, 0, 4)
				373
				374	If no blocks match, this returns ``(alo, blo, 0)``.
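
The no-match case and the effect of the slice bounds can be sketched as
follows; the strings are chosen only for illustration:

```python
from difflib import SequenceMatcher

s = SequenceMatcher(None, "abcd", "wxyz")

# Nothing matches, so the result is (alo, blo, 0) for the requested slices.
i, j, k = s.find_longest_match(1, 4, 2, 4)
print((i, j, k))  # (1, 2, 0)

# Restricting the slice of b also limits where a match may be found.
t = SequenceMatcher(None, " abcd", "abcd abcd")
restricted = tuple(t.find_longest_match(0, 5, 5, 9))
print(restricted)  # (1, 5, 4)
```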
				375
				376
				377	.. method:: SequenceMatcher.get_matching_blocks()
				378
				379	Return list of triples describing matching subsequences. Each triple is of the
				380	form ``(i, j, n)``, and means that ``a[i:i+n] == b[j:j+n]``. The triples are
				381	monotonically increasing in i and j.
				382
				383	The last triple is a dummy, and has the value ``(len(a), len(b), 0)``. It is
				384	the only triple with ``n == 0``. If ``(i, j, n)`` and ``(i', j', n')`` are
				385	adjacent triples in the list, and the second is not the last triple in the list,
				386	then ``i+n != i'`` or ``j+n != j'``; in other words, adjacent triples always
				387	describe non-adjacent equal blocks.
				388
				389	.. % Explain why a dummy is used!
				390
				391	.. versionchanged:: 2.5
				392	The guarantee that adjacent triples always describe non-adjacent blocks was
				393	implemented.
				394
				395	::
				396
				397	>>> s = SequenceMatcher(None, "abxcd", "abcd")
				398	>>> s.get_matching_blocks()
				399	[(0, 0, 2), (3, 2, 2), (5, 4, 0)]
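
One simple use of the triples, sketched here on the same strings, is to slice
out the runs that the two sequences share while skipping the final dummy:

```python
from difflib import SequenceMatcher

a, b = "abxcd", "abcd"
s = SequenceMatcher(None, a, b)

# Skip the final dummy triple (n == 0) and slice out each shared run.
shared = [a[i:i + n] for i, j, n in s.get_matching_blocks() if n]
print(shared)  # ['ab', 'cd']
```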
				400
				401
				402	.. method:: SequenceMatcher.get_opcodes()
				403
				404	Return list of 5-tuples describing how to turn a into b. Each tuple is of
				405	the form ``(tag, i1, i2, j1, j2)``. The first tuple has ``i1 == j1 == 0``, and
				406	remaining tuples have i1 equal to the i2 from the preceding tuple, and,
				407	likewise, j1 equal to the previous j2.
				408
				409	The tag values are strings, with these meanings:
				410
+---------------+---------------------------------------------+
| Value         | Meaning                                     |
+===============+=============================================+
| ``'replace'`` | ``a[i1:i2]`` should be replaced by          |
|               | ``b[j1:j2]``.                               |
+---------------+---------------------------------------------+
| ``'delete'``  | ``a[i1:i2]`` should be deleted. Note that   |
|               | ``j1 == j2`` in this case.                  |
+---------------+---------------------------------------------+
| ``'insert'``  | ``b[j1:j2]`` should be inserted at          |
|               | ``a[i1:i1]``. Note that ``i1 == i2`` in     |
|               | this case.                                  |
+---------------+---------------------------------------------+
| ``'equal'``   | ``a[i1:i2] == b[j1:j2]`` (the sub-sequences |
|               | are equal).                                 |
+---------------+---------------------------------------------+
				427
				428	For example::
				429
   >>> a = "qabxcd"
   >>> b = "abycdf"
   >>> s = SequenceMatcher(None, a, b)
   >>> for tag, i1, i2, j1, j2 in s.get_opcodes():
   ...    print ("%7s a[%d:%d] (%s) b[%d:%d] (%s)" %
   ...           (tag, i1, i2, a[i1:i2], j1, j2, b[j1:j2]))
    delete a[0:1] (q) b[0:0] ()
     equal a[1:3] (ab) b[0:2] (ab)
   replace a[3:4] (x) b[2:3] (y)
     equal a[4:6] (cd) b[3:5] (cd)
    insert a[6:6] () b[5:6] (f)
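
To show that the opcodes really describe how to turn *a* into *b*, they can be
replayed. ``apply_opcodes`` below is a hypothetical helper written for this
sketch, not part of :mod:`difflib`:

```python
from difflib import SequenceMatcher

def apply_opcodes(a, b):
    """Rebuild b from a by replaying the opcodes (illustrative helper)."""
    out = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag == 'equal':
            out.append(a[i1:i2])      # keep the shared run from a
        elif tag in ('replace', 'insert'):
            out.append(b[j1:j2])      # take the new material from b
        # 'delete' contributes nothing to the result
    return ''.join(out)

rebuilt = apply_opcodes("qabxcd", "abycdf")
print(rebuilt)  # abycdf
```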
				441
				442
				443	.. method:: SequenceMatcher.get_grouped_opcodes([n])
				444
Return a :term:`generator` of groups with up to *n* lines of context.

Starting with the groups returned by :meth:`get_opcodes`, this method splits out
smaller change clusters and eliminates intervening ranges which have no changes.
				449
				450	The groups are returned in the same format as :meth:`get_opcodes`.
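
A small sketch of the grouping, assuming a made-up input of 30 distinct lines
with two edits that lie far apart:

```python
from difflib import SequenceMatcher

# Made-up input: 30 distinct lines with two edits far apart.
a = ['line %d\n' % n for n in range(30)]
b = list(a)
b[2] = 'changed\n'
b[25] = 'changed too\n'

s = SequenceMatcher(None, a, b)

# With n=1, each change keeps one line of context on either side and
# the long unchanged stretch between the two edits is dropped.
groups = list(s.get_grouped_opcodes(1))
for group in groups:
    print(group)
```

Each group is a list of opcode tuples, so the two edits end up in two separate
groups instead of one long run.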
				451
				452	.. versionadded:: 2.3
				453
				454
				455	.. method:: SequenceMatcher.ratio()
				456
				457	Return a measure of the sequences' similarity as a float in the range [0, 1].
				458
				459	Where T is the total number of elements in both sequences, and M is the number
				460	of matches, this is 2.0\*M / T. Note that this is ``1.0`` if the sequences are
				461	identical, and ``0.0`` if they have nothing in common.
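
The formula can be checked by hand against :meth:`get_matching_blocks`. This is
only a worked example on two short strings:

```python
from difflib import SequenceMatcher

a, b = "abcd", "bcde"
s = SequenceMatcher(None, a, b)

# M: total size of all matching blocks; T: combined length of both sequences.
M = sum(n for i, j, n in s.get_matching_blocks())
T = len(a) + len(b)

print(2.0 * M / T)   # 0.75
print(s.ratio())     # 0.75
```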
				462
				463	This is expensive to compute if :meth:`get_matching_blocks` or
				464	:meth:`get_opcodes` hasn't already been called, in which case you may want to
				465	try :meth:`quick_ratio` or :meth:`real_quick_ratio` first to get an upper bound.
				466
				467
				468	.. method:: SequenceMatcher.quick_ratio()
				469
				470	Return an upper bound on :meth:`ratio` relatively quickly.
				471
				472	This isn't defined beyond that it is an upper bound on :meth:`ratio`, and is
				473	faster to compute.
				474
				475
				476	.. method:: SequenceMatcher.real_quick_ratio()
				477
				478	Return an upper bound on :meth:`ratio` very quickly.
				479
				480	This isn't defined beyond that it is an upper bound on :meth:`ratio`, and is
				481	faster to compute than either :meth:`ratio` or :meth:`quick_ratio`.
				482
				483	The three methods that return the ratio of matching to total characters can give
				484	different results due to differing levels of approximation, although
				485	:meth:`quick_ratio` and :meth:`real_quick_ratio` are always at least as large as
				486	:meth:`ratio`::
				487
				488	>>> s = SequenceMatcher(None, "abcd", "bcde")
				489	>>> s.ratio()
				490	0.75
				491	>>> s.quick_ratio()
				492	0.75
				493	>>> s.real_quick_ratio()
				494	1.0
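
Because both quick methods return upper bounds, they can serve as cheap
filters before the full computation. ``close_enough`` is a hypothetical helper
written for this sketch, and the cutoff ``0.6`` is an arbitrary choice:

```python
from difflib import SequenceMatcher

def close_enough(a, b, cutoff=0.6):
    """Return True if ratio() >= cutoff, trying the cheap bounds first.

    Illustrative helper, not part of difflib.
    """
    s = SequenceMatcher(None, a, b)
    # Each check is an upper bound on ratio(), so failing one settles it.
    return (s.real_quick_ratio() >= cutoff and
            s.quick_ratio() >= cutoff and
            s.ratio() >= cutoff)

print(close_enough("abcd", "bcde"))   # True
print(close_enough("abcd", "wxyz"))   # False
```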
				495
				496
				497	.. _sequencematcher-examples:
				498
				499	SequenceMatcher Examples
				500	------------------------
				501
				502	This example compares two strings, considering blanks to be "junk:" ::
				503
				504	>>> s = SequenceMatcher(lambda x: x == " ",
				505	... "private Thread currentThread;",
				506	... "private volatile Thread currentThread;")
				507
				508	:meth:`ratio` returns a float in [0, 1], measuring the similarity of the
				509	sequences. As a rule of thumb, a :meth:`ratio` value over 0.6 means the
				510	sequences are close matches::
				511
				512	>>> print round(s.ratio(), 3)
				513	0.866
				514
				515	If you're only interested in where the sequences match,
				516	:meth:`get_matching_blocks` is handy::
				517
   >>> for block in s.get_matching_blocks():
   ...     print "a[%d] and b[%d] match for %d elements" % block
   a[0] and b[0] match for 8 elements
   a[8] and b[17] match for 21 elements
   a[29] and b[38] match for 0 elements
				524
				525	Note that the last tuple returned by :meth:`get_matching_blocks` is always a
				526	dummy, ``(len(a), len(b), 0)``, and this is the only case in which the last
				527	tuple element (number of elements matched) is ``0``.
				528
				529	If you want to know how to change the first sequence into the second, use
				530	:meth:`get_opcodes`::
				531
   >>> for opcode in s.get_opcodes():
   ...     print "%6s a[%d:%d] b[%d:%d]" % opcode
    equal a[0:8] b[0:8]
   insert a[8:8] b[8:17]
    equal a[8:29] b[17:38]
				538
				539	See also the function :func:`get_close_matches` in this module, which shows how
				540	simple code building on :class:`SequenceMatcher` can be used to do useful work.
				541
				542
				543	.. _differ-objects:
				544
				545	Differ Objects
				546	--------------
				547
				548	Note that :class:`Differ`\ -generated deltas make no claim to be minimal
				549	diffs. To the contrary, minimal diffs are often counter-intuitive, because they
				550	synch up anywhere possible, sometimes accidental matches 100 pages apart.
				551	Restricting synch points to contiguous matches preserves some notion of
				552	locality, at the occasional cost of producing a longer diff.
				553
				554	The :class:`Differ` class has this constructor:
				555
				556
				557	.. class:: Differ([linejunk[, charjunk]])
				558
				559	Optional keyword parameters linejunk and charjunk are for filter functions
				560	(or ``None``):
				561
				562	linejunk: A function that accepts a single string argument, and returns true
				563	if the string is junk. The default is ``None``, meaning that no line is
				564	considered junk.
				565
				566	charjunk: A function that accepts a single character argument (a string of
				567	length 1), and returns true if the character is junk. The default is ``None``,
				568	meaning that no character is considered junk.
				569
				570	:class:`Differ` objects are used (deltas generated) via a single method:
				571
				572
				573	.. method:: Differ.compare(a, b)
				574
				575	Compare two sequences of lines, and generate the delta (a sequence of lines).
				576
				577	Each sequence must contain individual single-line strings ending with newlines.
				578	Such sequences can be obtained from the :meth:`readlines` method of file-like
				579	objects. The delta generated also consists of newline-terminated strings, ready
				580	to be printed as-is via the :meth:`writelines` method of a file-like object.
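
A minimal sketch of that round trip, with ``io.StringIO`` objects standing in
for real files and made-up text as input:

```python
import io
import sys
from difflib import Differ

# io.StringIO stands in for real files here; the text is made up.
old = io.StringIO(u"one\ntwo\nthree\n")
new = io.StringIO(u"ore\ntree\nemu\n")

# readlines() yields newline-terminated strings, as compare() expects.
delta = list(Differ().compare(old.readlines(), new.readlines()))

# The delta lines are newline-terminated too, so writelines() suffices.
sys.stdout.writelines(delta)
```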
				581
				582
				583	.. _differ-examples:
				584
				585	Differ Example
				586	--------------
				587
				588	This example compares two texts. First we set up the texts, sequences of
				589	individual single-line strings ending with newlines (such sequences can also be
				590	obtained from the :meth:`readlines` method of file-like objects)::
				591
   >>> text1 = '''  1. Beautiful is better than ugly.
   ...   2. Explicit is better than implicit.
   ...   3. Simple is better than complex.
   ...   4. Complex is better than complicated.
   ... '''.splitlines(1)
   >>> len(text1)
   4
   >>> text1[0][-1]
   '\n'
   >>> text2 = '''  1. Beautiful is better than ugly.
   ...   3.   Simple is better than complex.
   ...   4. Complicated is better than complex.
   ...   5. Flat is better than nested.
   ... '''.splitlines(1)
				606
				607	Next we instantiate a Differ object::
				608
				609	>>> d = Differ()
				610
Note that when instantiating a :class:`Differ` object we may pass functions to
filter out line and character "junk." See the :class:`Differ` constructor for
details.
				614
				615	Finally, we compare the two::
				616
				617	>>> result = list(d.compare(text1, text2))
				618
				619	``result`` is a list of strings, so let's pretty-print it::
				620
				621	>>> from pprint import pprint
				622	>>> pprint(result)
				623	[' 1. Beautiful is better than ugly.\n',
				624	'- 2. Explicit is better than implicit.\n',
				625	'- 3. Simple is better than complex.\n',
				626	'+ 3. Simple is better than complex.\n',
				627	'? ++ \n',
				628	'- 4. Complex is better than complicated.\n',
				629	'? ^ ---- ^ \n',
				630	'+ 4. Complicated is better than complex.\n',
				631	'? ++++ ^ ^ \n',
				632	'+ 5. Flat is better than nested.\n']
				633
				634	As a single multi-line string it looks like this::
				635
				636	>>> import sys
				637	>>> sys.stdout.writelines(result)
				638	1. Beautiful is better than ugly.
				639	- 2. Explicit is better than implicit.
				640	- 3. Simple is better than complex.
				641	+ 3. Simple is better than complex.
				642	? ++
				643	- 4. Complex is better than complicated.
				644	? ^ ---- ^
				645	+ 4. Complicated is better than complex.
				646	? ++++ ^ ^
				647	+ 5. Flat is better than nested.
				648