Blame - Doc/library/difflib.rst - platform/external/python/cpython3

blob: d3724ddc777451a8d4d862a3a24a465d17089305

Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1
				2	:mod:`difflib` --- Helpers for computing deltas
				3	===============================================
				4
				5	.. module:: difflib
				6	:synopsis: Helpers for computing differences between objects.
				7	.. moduleauthor:: Tim Peters <tim_one@users.sourceforge.net>
				8	.. sectionauthor:: Tim Peters <tim_one@users.sourceforge.net>
				9
				10
				11	.. % LaTeXification by Fred L. Drake, Jr. <fdrake@acm.org>.
				12
This module provides classes and functions for comparing sequences. It
can be used, for example, to compare files, and can produce difference
information in various formats, including HTML and context and unified
diffs. For comparing directories and files, see also the :mod:`filecmp` module.
				17
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	18	.. class:: SequenceMatcher
				19
				20	This is a flexible class for comparing pairs of sequences of any type, so long
				21	as the sequence elements are hashable. The basic algorithm predates, and is a
				22	little fancier than, an algorithm published in the late 1980's by Ratcliff and
				23	Obershelp under the hyperbolic name "gestalt pattern matching." The idea is to
				24	find the longest contiguous matching subsequence that contains no "junk"
				25	elements (the Ratcliff and Obershelp algorithm doesn't address junk). The same
				26	idea is then applied recursively to the pieces of the sequences to the left and
				27	to the right of the matching subsequence. This does not yield minimal edit
				28	sequences, but does tend to yield matches that "look right" to people.
				29
				30	Timing: The basic Ratcliff-Obershelp algorithm is cubic time in the worst
				31	case and quadratic time in the expected case. :class:`SequenceMatcher` is
				32	quadratic time for the worst case and has expected-case behavior dependent in a
				33	complicated way on how many elements the sequences have in common; best case
				34	time is linear.
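
The recursive idea described above can be sketched in a few lines. This is a
deliberately naive illustration, not the module's implementation: the helper
name ``matching_blocks`` is hypothetical, junk handling is left out, and the
brute-force inner search is much slower than what :class:`SequenceMatcher`
does.

```python
from difflib import SequenceMatcher


def matching_blocks(a, b):
    """Hypothetical helper: longest contiguous match first, then recurse."""

    def longest(alo, ahi, blo, bhi):
        # Brute-force search for the longest common run in a[alo:ahi], b[blo:bhi].
        best = (alo, blo, 0)
        for i in range(alo, ahi):
            for j in range(blo, bhi):
                k = 0
                while i + k < ahi and j + k < bhi and a[i + k] == b[j + k]:
                    k += 1
                if k > best[2]:  # strict '>' keeps the earliest match on ties
                    best = (i, j, k)
        return best

    def recurse(alo, ahi, blo, bhi):
        i, j, k = longest(alo, ahi, blo, bhi)
        if k == 0:
            return []
        # Apply the same idea to the pieces left and right of the match.
        return recurse(alo, i, blo, j) + [(i, j, k)] + recurse(i + k, ahi, j + k, bhi)

    return recurse(0, len(a), 0, len(b))


blocks = matching_blocks("abxcd", "abcd")
reference = [tuple(m) for m in SequenceMatcher(None, "abxcd", "abcd").get_matching_blocks()]
print(blocks)          # blocks found by the sketch
print(reference[:-1])  # the library's blocks, minus the trailing dummy triple
```

For this small input the sketch and the library agree; on longer inputs the
library is the one to trust.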
				35
				36
				37	.. class:: Differ
				38
				39	This is a class for comparing sequences of lines of text, and producing
				40	human-readable differences or deltas. Differ uses :class:`SequenceMatcher`
				41	both to compare sequences of lines, and to compare sequences of characters
				42	within similar (near-matching) lines.
				43
				44	Each line of a :class:`Differ` delta begins with a two-letter code:
				45
+----------+-------------------------------------------+
| Code     | Meaning                                   |
+==========+===========================================+
| ``'- '`` | line unique to sequence 1                 |
+----------+-------------------------------------------+
| ``'+ '`` | line unique to sequence 2                 |
+----------+-------------------------------------------+
| ``'  '`` | line common to both sequences             |
+----------+-------------------------------------------+
| ``'? '`` | line not present in either input sequence |
+----------+-------------------------------------------+
				57
				58	Lines beginning with '``?``' attempt to guide the eye to intraline differences,
				59	and were not present in either input sequence. These lines can be confusing if
				60	the sequences contain tab characters.
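
As a small illustration of the codes above, the following sketch compares two
short lists of lines and collects the two-character prefix of every delta
line. The input data is made up for the example.

```python
from difflib import Differ

before = ["one\n", "two\n", "three\n"]
after = ["ore\n", "tree\n", "emu\n"]

delta = list(Differ().compare(before, after))

# The first two characters of each delta line are the code.
codes = [line[:2] for line in delta]

# Dropping '+ ' and '? ' lines and stripping the code recovers sequence 1.
recovered = [line[2:] for line in delta if line[:2] in ("- ", "  ")]

for line in delta:
    print(line, end="")
```

Lines coded ``'? '`` exist only in the delta, which is why they are skipped
when recovering either input.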
				61
				62
				63	.. class:: HtmlDiff
				64
				65	This class can be used to create an HTML table (or a complete HTML file
				66	containing the table) showing a side by side, line by line comparison of text
				67	with inter-line and intra-line change highlights. The table can be generated in
				68	either full or contextual difference mode.
				69
				70	The constructor for this class is:
				71
				72
				73	.. function:: __init__([tabsize][, wrapcolumn][, linejunk][, charjunk])
				74
Initializes an instance of :class:`HtmlDiff`.

tabsize is an optional keyword argument to specify tab stop spacing and
defaults to ``8``.

wrapcolumn is an optional keyword argument to specify the column number at
which lines are broken and wrapped. It defaults to ``None``, in which case
lines are not wrapped.
				82
				83	linejunk and charjunk are optional keyword arguments passed into ``ndiff()``
				84	(used by :class:`HtmlDiff` to generate the side by side HTML differences). See
				85	``ndiff()`` documentation for argument default values and descriptions.
				86
				87	The following methods are public:
				88
				89
				90	.. function:: make_file(fromlines, tolines [, fromdesc][, todesc][, context][, numlines])
				91
				92	Compares fromlines and tolines (lists of strings) and returns a string which
				93	is a complete HTML file containing a table showing line by line differences with
				94	inter-line and intra-line changes highlighted.
				95
				96	fromdesc and todesc are optional keyword arguments to specify from/to file
				97	column header strings (both default to an empty string).
				98
				99	context and numlines are both optional keyword arguments. Set context to
				100	``True`` when contextual differences are to be shown, else the default is
				101	``False`` to show the full files. numlines defaults to ``5``. When context
				102	is ``True`` numlines controls the number of context lines which surround the
				103	difference highlights. When context is ``False`` numlines controls the
				104	number of lines which are shown before a difference highlight when using the
				105	"next" hyperlinks (setting to zero would cause the "next" hyperlinks to place
				106	the next difference highlight at the top of the browser without any leading
				107	context).
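
A minimal sketch of calling :meth:`make_file` with the keyword arguments
described above. The line lists and the ``old.txt`` / ``new.txt`` descriptions
are invented for the example.

```python
from difflib import HtmlDiff

fromlines = ["alpha\n", "beta\n", "gamma\n"]
tolines = ["alpha\n", "BETA\n", "gamma\n", "delta\n"]

differ = HtmlDiff(tabsize=4, wrapcolumn=60)
html = differ.make_file(
    fromlines,
    tolines,
    fromdesc="old.txt",   # column header for the "from" side
    todesc="new.txt",     # column header for the "to" side
    context=True,         # show only changed regions plus context
    numlines=1,           # one line of context around each change
)

print(len(html), "characters of HTML")
```

The returned string is a whole HTML document, so it can be written straight to
a file and opened in a browser.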
				108
				109
				110	.. function:: make_table(fromlines, tolines [, fromdesc][, todesc][, context][, numlines])
				111
				112	Compares fromlines and tolines (lists of strings) and returns a string which
				113	is a complete HTML table showing line by line differences with inter-line and
				114	intra-line changes highlighted.
				115
				116	The arguments for this method are the same as those for the :meth:`make_file`
				117	method.
				118
				119	:file:`Tools/scripts/diff.py` is a command-line front-end to this class and
				120	contains a good example of its use.
				121
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	122
				123	.. function:: context_diff(a, b[, fromfile][, tofile][, fromfiledate][, tofiledate][, n][, lineterm])
				124
Georg Brandl	9afde1c	2007-11-01 20:32:30 +0000	[diff] [blame]	125	Compare a and b (lists of strings); return a delta (a :term:`generator`
				126	generating the delta lines) in context diff format.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	127
				128	Context diffs are a compact way of showing just the lines that have changed plus
				129	a few lines of context. The changes are shown in a before/after style. The
				130	number of context lines is set by n which defaults to three.
				131
				132	By default, the diff control lines (those with ``***`` or ``---``) are created
				133	with a trailing newline. This is helpful so that inputs created from
				134	:func:`file.readlines` result in diffs that are suitable for use with
				135	:func:`file.writelines` since both the inputs and outputs have trailing
				136	newlines.
				137
				138	For inputs that do not have trailing newlines, set the lineterm argument to
				139	``""`` so that the output will be uniformly newline free.
				140
				141	The context diff format normally has a header for filenames and modification
				142	times. Any or all of these may be specified using strings for fromfile,
				143	tofile, fromfiledate, and tofiledate. The modification times are normally
				144	expressed in the format returned by :func:`time.ctime`. If not specified, the
				145	strings default to blanks.
				146
				147	:file:`Tools/scripts/diff.py` is a command-line front-end for this function.
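
A short sketch of the behavior described above, using inputs that already end
in newlines so that the default lineterm fits. The file names are
placeholders.

```python
import sys
from difflib import context_diff

a = ["bacon\n", "eggs\n", "ham\n", "guido\n"]
b = ["python\n", "eggy\n", "hamster\n", "guido\n"]

# n=1 keeps a single line of context around each change.
delta = list(context_diff(a, b, fromfile="before.py", tofile="after.py", n=1))

# Every line already ends in a newline, so writelines() needs no joining.
sys.stdout.writelines(delta)
```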
				148
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	149
				150	.. function:: get_close_matches(word, possibilities[, n][, cutoff])
				151
				152	Return a list of the best "good enough" matches. word is a sequence for which
				153	close matches are desired (typically a string), and possibilities is a list of
				154	sequences against which to match word (typically a list of strings).
				155
				156	Optional argument n (default ``3``) is the maximum number of close matches to
				157	return; n must be greater than ``0``.
				158
				159	Optional argument cutoff (default ``0.6``) is a float in the range [0, 1].
				160	Possibilities that don't score at least that similar to word are ignored.
				161
				162	The best (no more than n) matches among the possibilities are returned in a
				163	list, sorted by similarity score, most similar first. ::
				164
				165	>>> get_close_matches('appel', ['ape', 'apple', 'peach', 'puppy'])
				166	['apple', 'ape']
				167	>>> import keyword
				168	>>> get_close_matches('wheel', keyword.kwlist)
				169	['while']
				170	>>> get_close_matches('apple', keyword.kwlist)
				171	[]
				172	>>> get_close_matches('accept', keyword.kwlist)
				173	['except']
				174
				175
				176	.. function:: ndiff(a, b[, linejunk][, charjunk])
				177
Georg Brandl	9afde1c	2007-11-01 20:32:30 +0000	[diff] [blame]	178	Compare a and b (lists of strings); return a :class:`Differ`\ -style
				179	delta (a :term:`generator` generating the delta lines).
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	180
				181	Optional keyword parameters linejunk and charjunk are for filter functions
				182	(or ``None``):
				183
				184	linejunk: A function that accepts a single string argument, and returns true
				185	if the string is junk, or false if not. The default is (``None``), starting with
				186	Python 2.3. Before then, the default was the module-level function
				187	:func:`IS_LINE_JUNK`, which filters out lines without visible characters, except
				188	for at most one pound character (``'#'``). As of Python 2.3, the underlying
				189	:class:`SequenceMatcher` class does a dynamic analysis of which lines are so
				190	frequent as to constitute noise, and this usually works better than the pre-2.3
				191	default.
				192
charjunk: A function that accepts a character (a string of length 1), and
returns true if the character is junk, or false if not. The default is the
module-level function :func:`IS_CHARACTER_JUNK`, which filters out whitespace
characters (a blank or tab; including newline here is a bad idea).
				197
				198	:file:`Tools/scripts/ndiff.py` is a command-line front-end to this function. ::
				199
   >>> diff = ndiff('one\ntwo\nthree\n'.splitlines(1),
   ...              'ore\ntree\nemu\n'.splitlines(1))
   >>> print(''.join(diff), end="")
   - one
   ?  ^
   + ore
   ?  ^
   - two
   - three
   ?  -
   + tree
   + emu
				212
				213
				214	.. function:: restore(sequence, which)
				215
				216	Return one of the two sequences that generated a delta.
				217
				218	Given a sequence produced by :meth:`Differ.compare` or :func:`ndiff`, extract
				219	lines originating from file 1 or 2 (parameter which), stripping off line
				220	prefixes.
				221
				222	Example::
				223
				224	>>> diff = ndiff('one\ntwo\nthree\n'.splitlines(1),
				225	... 'ore\ntree\nemu\n'.splitlines(1))
				226	>>> diff = list(diff) # materialize the generated delta into a list
Georg Brandl	6911e3c	2007-09-04 07:15:32 +0000	[diff] [blame]	227	>>> print(''.join(restore(diff, 1)), end="")
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	228	one
				229	two
				230	three
Georg Brandl	6911e3c	2007-09-04 07:15:32 +0000	[diff] [blame]	231	>>> print(''.join(restore(diff, 2)), end="")
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	232	ore
				233	tree
				234	emu
				235
				236
				237	.. function:: unified_diff(a, b[, fromfile][, tofile][, fromfiledate][, tofiledate][, n][, lineterm])
				238
Georg Brandl	9afde1c	2007-11-01 20:32:30 +0000	[diff] [blame]	239	Compare a and b (lists of strings); return a delta (a :term:`generator`
				240	generating the delta lines) in unified diff format.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	241
Unified diffs are a compact way of showing just the lines that have changed plus
a few lines of context. The changes are shown in an inline style (instead of
separate before/after blocks). The number of context lines is set by n which
defaults to three.
				246
				247	By default, the diff control lines (those with ``---``, ``+++``, or ``@@``) are
				248	created with a trailing newline. This is helpful so that inputs created from
				249	:func:`file.readlines` result in diffs that are suitable for use with
				250	:func:`file.writelines` since both the inputs and outputs have trailing
				251	newlines.
				252
				253	For inputs that do not have trailing newlines, set the lineterm argument to
				254	``""`` so that the output will be uniformly newline free.
				255
The unified diff format normally has a header for filenames and modification
times. Any or all of these may be specified using strings for fromfile,
tofile, fromfiledate, and tofiledate. The modification times are normally
expressed in the format returned by :func:`time.ctime`. If not specified, the
strings default to blanks.
				261
				262	:file:`Tools/scripts/diff.py` is a command-line front-end for this function.
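
The lineterm advice above can be illustrated with inputs that carry no
trailing newlines. This is a sketch with made-up data and placeholder file
names.

```python
from difflib import unified_diff

# Lists of words, none of which ends in a newline.
a = "one two three four".split()
b = "one too three four five".split()

# lineterm="" keeps the control lines newline-free as well.
delta = list(unified_diff(a, b, fromfile="a.txt", tofile="b.txt",
                          n=1, lineterm=""))

print("\n".join(delta))
```

Because no line in the delta ends in a newline, joining with ``"\n"`` gives
uniformly terminated output.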
				263
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	264
				265	.. function:: IS_LINE_JUNK(line)
				266
Return true for ignorable lines. A line is ignorable if it is blank or
contains a single ``'#'``; otherwise it is not ignorable. Used as a
default for parameter linejunk in :func:`ndiff` before Python 2.3.
				270
				271
				272	.. function:: IS_CHARACTER_JUNK(ch)
				273
				274	Return true for ignorable characters. The character ch is ignorable if ch
				275	is a space or tab, otherwise it is not ignorable. Used as a default for
				276	parameter charjunk in :func:`ndiff`.
				277
				278
				279	.. seealso::
				280
				281	`Pattern Matching: The Gestalt Approach <http://www.ddj.com/184407970?pgno=5>`_
				282	Discussion of a similar algorithm by John W. Ratcliff and D. E. Metzener. This
				283	was published in `Dr. Dobb's Journal <http://www.ddj.com/>`_ in July, 1988.
				284
				285
				286	.. _sequence-matcher:
				287
				288	SequenceMatcher Objects
				289	-----------------------
				290
				291	The :class:`SequenceMatcher` class has this constructor:
				292
				293
				294	.. class:: SequenceMatcher([isjunk[, a[, b]]])
				295
				296	Optional argument isjunk must be ``None`` (the default) or a one-argument
				297	function that takes a sequence element and returns true if and only if the
				298	element is "junk" and should be ignored. Passing ``None`` for isjunk is
				299	equivalent to passing ``lambda x: 0``; in other words, no elements are ignored.
				300	For example, pass::
				301
				302	lambda x: x in " \t"
				303
				304	if you're comparing lines as sequences of characters, and don't want to synch up
				305	on blanks or hard tabs.
				306
				307	The optional arguments a and b are sequences to be compared; both default to
				308	empty strings. The elements of both sequences must be hashable.
				309
				310	:class:`SequenceMatcher` objects have the following methods:
				311
				312
				313	.. method:: SequenceMatcher.set_seqs(a, b)
				314
				315	Set the two sequences to be compared.
				316
				317	:class:`SequenceMatcher` computes and caches detailed information about the
				318	second sequence, so if you want to compare one sequence against many sequences,
				319	use :meth:`set_seq2` to set the commonly used sequence once and call
				320	:meth:`set_seq1` repeatedly, once for each of the other sequences.
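
The one-against-many pattern described above might look like the following
sketch. The candidate strings are invented for the example.

```python
from difflib import SequenceMatcher

target = "private volatile Thread currentThread;"
candidates = [
    "private Thread currentThread;",
    "public Thread currentThread;",
    "int counter = 0;",
]

matcher = SequenceMatcher()
matcher.set_seq2(target)            # analyzed once and cached

scores = {}
for candidate in candidates:
    matcher.set_seq1(candidate)     # only the first sequence changes
    scores[candidate] = matcher.ratio()

best = max(scores, key=scores.get)
print(best)
```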
				321
				322
				323	.. method:: SequenceMatcher.set_seq1(a)
				324
				325	Set the first sequence to be compared. The second sequence to be compared is
				326	not changed.
				327
				328
				329	.. method:: SequenceMatcher.set_seq2(b)
				330
				331	Set the second sequence to be compared. The first sequence to be compared is
				332	not changed.
				333
				334
				335	.. method:: SequenceMatcher.find_longest_match(alo, ahi, blo, bhi)
				336
				337	Find longest matching block in ``a[alo:ahi]`` and ``b[blo:bhi]``.
				338
If isjunk was omitted or ``None``, :meth:`find_longest_match` returns
``(i, j, k)`` such that ``a[i:i+k]`` is equal to ``b[j:j+k]``, where
``alo <= i <= i+k <= ahi`` and ``blo <= j <= j+k <= bhi``. For all
``(i', j', k')`` meeting those conditions, the additional conditions
``k >= k'``, ``i <= i'``, and if ``i == i'``, ``j <= j'`` are also met. In other
words, of all maximal matching blocks, return one that starts earliest in a,
and of all those maximal matching blocks that start earliest in a, return the
one that starts earliest in b. ::
				346
				347	>>> s = SequenceMatcher(None, " abcd", "abcd abcd")
				348	>>> s.find_longest_match(0, 5, 0, 9)
				349	(0, 4, 5)
				350
				351	If isjunk was provided, first the longest matching block is determined as
				352	above, but with the additional restriction that no junk element appears in the
				353	block. Then that block is extended as far as possible by matching (only) junk
				354	elements on both sides. So the resulting block never matches on junk except as
				355	identical junk happens to be adjacent to an interesting match.
				356
				357	Here's the same example as before, but considering blanks to be junk. That
				358	prevents ``' abcd'`` from matching the ``' abcd'`` at the tail end of the second
				359	sequence directly. Instead only the ``'abcd'`` can match, and matches the
				360	leftmost ``'abcd'`` in the second sequence::
				361
				362	>>> s = SequenceMatcher(lambda x: x==" ", " abcd", "abcd abcd")
				363	>>> s.find_longest_match(0, 5, 0, 9)
				364	(1, 0, 4)
				365
				366	If no blocks match, this returns ``(alo, blo, 0)``.
				367
				368
				369	.. method:: SequenceMatcher.get_matching_blocks()
				370
				371	Return list of triples describing matching subsequences. Each triple is of the
				372	form ``(i, j, n)``, and means that ``a[i:i+n] == b[j:j+n]``. The triples are
				373	monotonically increasing in i and j.
				374
				375	The last triple is a dummy, and has the value ``(len(a), len(b), 0)``. It is
				376	the only triple with ``n == 0``. If ``(i, j, n)`` and ``(i', j', n')`` are
				377	adjacent triples in the list, and the second is not the last triple in the list,
				378	then ``i+n != i'`` or ``j+n != j'``; in other words, adjacent triples always
				379	describe non-adjacent equal blocks.
				380
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	381	::
				382
				383	>>> s = SequenceMatcher(None, "abxcd", "abcd")
				384	>>> s.get_matching_blocks()
				385	[(0, 0, 2), (3, 2, 2), (5, 4, 0)]
				386
				387
				388	.. method:: SequenceMatcher.get_opcodes()
				389
				390	Return list of 5-tuples describing how to turn a into b. Each tuple is of
				391	the form ``(tag, i1, i2, j1, j2)``. The first tuple has ``i1 == j1 == 0``, and
				392	remaining tuples have i1 equal to the i2 from the preceding tuple, and,
				393	likewise, j1 equal to the previous j2.
				394
				395	The tag values are strings, with these meanings:
				396
				397	+---------------+---------------------------------------------+
				398	\| Value \| Meaning \|
				399	+===============+=============================================+
				400	\| ``'replace'`` \| ``a[i1:i2]`` should be replaced by \|
				401	\| \| ``b[j1:j2]``. \|
				402	+---------------+---------------------------------------------+
				403	\| ``'delete'`` \| ``a[i1:i2]`` should be deleted. Note that \|
				404	\| \| ``j1 == j2`` in this case. \|
				405	+---------------+---------------------------------------------+
				406	\| ``'insert'`` \| ``b[j1:j2]`` should be inserted at \|
				407	\| \| ``a[i1:i1]``. Note that ``i1 == i2`` in \|
				408	\| \| this case. \|
				409	+---------------+---------------------------------------------+
				410	\| ``'equal'`` \| ``a[i1:i2] == b[j1:j2]`` (the sub-sequences \|
				411	\| \| are equal). \|
				412	+---------------+---------------------------------------------+
				413
				414	For example::
				415
   >>> a = "qabxcd"
   >>> b = "abycdf"
   >>> s = SequenceMatcher(None, a, b)
   >>> for tag, i1, i2, j1, j2 in s.get_opcodes():
   ...     print(("%7s a[%d:%d] (%s) b[%d:%d] (%s)" %
   ...            (tag, i1, i2, a[i1:i2], j1, j2, b[j1:j2])))
    delete a[0:1] (q) b[0:0] ()
     equal a[1:3] (ab) b[0:2] (ab)
   replace a[3:4] (x) b[2:3] (y)
     equal a[4:6] (cd) b[3:5] (cd)
    insert a[6:6] () b[5:6] (f)
				427
				428
				429	.. method:: SequenceMatcher.get_grouped_opcodes([n])
				430
Georg Brandl	9afde1c	2007-11-01 20:32:30 +0000	[diff] [blame]	431	Return a :term:`generator` of groups with up to n lines of context.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	432
				433	Starting with the groups returned by :meth:`get_opcodes`, this method splits out
				434	smaller change clusters and eliminates intervening ranges which have no changes.
				435
				436	The groups are returned in the same format as :meth:`get_opcodes`.
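
A sketch of the grouping, using two made-up lists of 20 elements that differ
in two places far apart. With ``n=2`` the long unchanged stretch between the
changes is dropped and two groups result.

```python
from difflib import SequenceMatcher

a = [str(i) for i in range(20)]
b = a[:]
b[2] = "two"          # first change
b[17] = "seventeen"   # second change, far from the first

s = SequenceMatcher(None, a, b)
groups = list(s.get_grouped_opcodes(n=2))

for group in groups:
    for tag, i1, i2, j1, j2 in group:
        print("%7s a[%d:%d] b[%d:%d]" % (tag, i1, i2, j1, j2))
    print("--")
```

Each group is a list of opcodes, which is the shape that the unified and
context diff generators need for one hunk.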
				437
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	438
				439	.. method:: SequenceMatcher.ratio()
				440
				441	Return a measure of the sequences' similarity as a float in the range [0, 1].
				442
				443	Where T is the total number of elements in both sequences, and M is the number
				444	of matches, this is 2.0\*M / T. Note that this is ``1.0`` if the sequences are
				445	identical, and ``0.0`` if they have nothing in common.
				446
				447	This is expensive to compute if :meth:`get_matching_blocks` or
				448	:meth:`get_opcodes` hasn't already been called, in which case you may want to
				449	try :meth:`quick_ratio` or :meth:`real_quick_ratio` first to get an upper bound.
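
The formula can be checked by hand from the matching blocks. In this sketch
the names ``M``, ``T`` and ``manual`` are only local variables that mirror the
symbols used above.

```python
from difflib import SequenceMatcher

a, b = "abcd", "bcde"
s = SequenceMatcher(None, a, b)

# M: total size of all matching blocks; T: combined length of both sequences.
M = sum(size for _, _, size in s.get_matching_blocks())
T = len(a) + len(b)
manual = 2.0 * M / T

print(M, T, manual, s.ratio())
```

Here the only match is ``"bcd"``, so M is 3, T is 8, and the ratio is 0.75.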
				450
				451
				452	.. method:: SequenceMatcher.quick_ratio()
				453
				454	Return an upper bound on :meth:`ratio` relatively quickly.
				455
				456	This isn't defined beyond that it is an upper bound on :meth:`ratio`, and is
				457	faster to compute.
				458
				459
				460	.. method:: SequenceMatcher.real_quick_ratio()
				461
				462	Return an upper bound on :meth:`ratio` very quickly.
				463
				464	This isn't defined beyond that it is an upper bound on :meth:`ratio`, and is
				465	faster to compute than either :meth:`ratio` or :meth:`quick_ratio`.
				466
				467	The three methods that return the ratio of matching to total characters can give
				468	different results due to differing levels of approximation, although
				469	:meth:`quick_ratio` and :meth:`real_quick_ratio` are always at least as large as
				470	:meth:`ratio`::
				471
				472	>>> s = SequenceMatcher(None, "abcd", "bcde")
				473	>>> s.ratio()
				474	0.75
				475	>>> s.quick_ratio()
				476	0.75
				477	>>> s.real_quick_ratio()
				478	1.0
				479
				480
				481	.. _sequencematcher-examples:
				482
				483	SequenceMatcher Examples
				484	------------------------
				485
				486	This example compares two strings, considering blanks to be "junk:" ::
				487
				488	>>> s = SequenceMatcher(lambda x: x == " ",
				489	... "private Thread currentThread;",
				490	... "private volatile Thread currentThread;")
				491
				492	:meth:`ratio` returns a float in [0, 1], measuring the similarity of the
				493	sequences. As a rule of thumb, a :meth:`ratio` value over 0.6 means the
				494	sequences are close matches::
				495
Georg Brandl	6911e3c	2007-09-04 07:15:32 +0000	[diff] [blame]	496	>>> print(round(s.ratio(), 3))
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	497	0.866
				498
				499	If you're only interested in where the sequences match,
				500	:meth:`get_matching_blocks` is handy::
				501
   >>> for block in s.get_matching_blocks():
   ...     print("a[%d] and b[%d] match for %d elements" % block)
   a[0] and b[0] match for 8 elements
   a[8] and b[17] match for 6 elements
   a[14] and b[23] match for 15 elements
   a[29] and b[38] match for 0 elements
				508
				509	Note that the last tuple returned by :meth:`get_matching_blocks` is always a
				510	dummy, ``(len(a), len(b), 0)``, and this is the only case in which the last
				511	tuple element (number of elements matched) is ``0``.
				512
				513	If you want to know how to change the first sequence into the second, use
				514	:meth:`get_opcodes`::
				515
   >>> for opcode in s.get_opcodes():
   ...     print("%6s a[%d:%d] b[%d:%d]" % opcode)
    equal a[0:8] b[0:8]
   insert a[8:8] b[8:17]
    equal a[8:14] b[17:23]
    equal a[14:29] b[23:38]
				522
				523	See also the function :func:`get_close_matches` in this module, which shows how
				524	simple code building on :class:`SequenceMatcher` can be used to do useful work.
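
As a rough sketch of what such code can look like, the hypothetical function
``close_matches`` below filters candidates through the three ratio methods,
cheapest first, and keeps the best scores. It is written for illustration and
is not claimed to be the module's own source.

```python
import heapq
from difflib import SequenceMatcher, get_close_matches


def close_matches(word, possibilities, n=3, cutoff=0.6):
    """Hypothetical re-implementation for illustration only."""
    matcher = SequenceMatcher()
    matcher.set_seq2(word)  # the word is compared against every candidate
    scored = []
    for candidate in possibilities:
        matcher.set_seq1(candidate)
        # The cheap upper bounds rule out most candidates before ratio() runs.
        if (matcher.real_quick_ratio() >= cutoff
                and matcher.quick_ratio() >= cutoff
                and matcher.ratio() >= cutoff):
            scored.append((matcher.ratio(), candidate))
    # Highest score first, at most n results.
    return [candidate for _, candidate in heapq.nlargest(n, scored)]


words = ["ape", "apple", "peach", "puppy"]
mine = close_matches("appel", words)
theirs = get_close_matches("appel", words)
print(mine, theirs)
```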
				525
				526
				527	.. _differ-objects:
				528
				529	Differ Objects
				530	--------------
				531
				532	Note that :class:`Differ`\ -generated deltas make no claim to be minimal
				533	diffs. To the contrary, minimal diffs are often counter-intuitive, because they
				534	synch up anywhere possible, sometimes accidental matches 100 pages apart.
				535	Restricting synch points to contiguous matches preserves some notion of
				536	locality, at the occasional cost of producing a longer diff.
				537
				538	The :class:`Differ` class has this constructor:
				539
				540
				541	.. class:: Differ([linejunk[, charjunk]])
				542
				543	Optional keyword parameters linejunk and charjunk are for filter functions
				544	(or ``None``):
				545
				546	linejunk: A function that accepts a single string argument, and returns true
				547	if the string is junk. The default is ``None``, meaning that no line is
				548	considered junk.
				549
				550	charjunk: A function that accepts a single character argument (a string of
				551	length 1), and returns true if the character is junk. The default is ``None``,
				552	meaning that no character is considered junk.
				553
				554	:class:`Differ` objects are used (deltas generated) via a single method:
				555
				556
				557	.. method:: Differ.compare(a, b)
				558
				559	Compare two sequences of lines, and generate the delta (a sequence of lines).
				560
				561	Each sequence must contain individual single-line strings ending with newlines.
				562	Such sequences can be obtained from the :meth:`readlines` method of file-like
				563	objects. The delta generated also consists of newline-terminated strings, ready
				564	to be printed as-is via the :meth:`writelines` method of a file-like object.
				565
				566
				567	.. _differ-examples:
				568
				569	Differ Example
				570	--------------
				571
				572	This example compares two texts. First we set up the texts, sequences of
				573	individual single-line strings ending with newlines (such sequences can also be
				574	obtained from the :meth:`readlines` method of file-like objects)::
				575
   >>> text1 = '''  1. Beautiful is better than ugly.
   ...   2. Explicit is better than implicit.
   ...   3. Simple is better than complex.
   ...   4. Complex is better than complicated.
   ... '''.splitlines(1)
   >>> len(text1)
   4
   >>> text1[0][-1]
   '\n'
   >>> text2 = '''  1. Beautiful is better than ugly.
   ...   3.   Simple is better than complex.
   ...   4. Complicated is better than complex.
   ...   5. Flat is better than nested.
   ... '''.splitlines(1)
				590
				591	Next we instantiate a Differ object::
				592
				593	>>> d = Differ()
				594
Note that when instantiating a :class:`Differ` object we may pass functions to
filter out line and character "junk." See the :class:`Differ` constructor for
details.
				598
				599	Finally, we compare the two::
				600
				601	>>> result = list(d.compare(text1, text2))
				602
				603	``result`` is a list of strings, so let's pretty-print it::
				604
   >>> from pprint import pprint
   >>> pprint(result)
   ['    1. Beautiful is better than ugly.\n',
    '-   2. Explicit is better than implicit.\n',
    '-   3. Simple is better than complex.\n',
    '+   3.   Simple is better than complex.\n',
    '?     ++\n',
    '-   4. Complex is better than complicated.\n',
    '?            ^                     ---- ^\n',
    '+   4. Complicated is better than complex.\n',
    '?           ++++ ^                      ^\n',
    '+   5. Flat is better than nested.\n']
				617
				618	As a single multi-line string it looks like this::
				619
   >>> import sys
   >>> sys.stdout.writelines(result)
       1. Beautiful is better than ugly.
   -   2. Explicit is better than implicit.
   -   3. Simple is better than complex.
   +   3.   Simple is better than complex.
   ?     ++
   -   4. Complex is better than complicated.
   ?            ^                     ---- ^
   +   4. Complicated is better than complex.
   ?           ++++ ^                      ^
   +   5. Flat is better than nested.
				632