:mod:`filecmp` --- File and Directory Comparisons
=================================================

.. module:: filecmp
   :synopsis: Compare files efficiently.

.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il>

**Source code:** :source:`Lib/filecmp.py`

--------------

The :mod:`filecmp` module defines functions to compare files and directories,
with various optional time/correctness trade-offs. For comparing files,
see also the :mod:`difflib` module.

The :mod:`filecmp` module defines the following functions:


.. function:: cmp(f1, f2, shallow=True)

   Compare the files named *f1* and *f2*, returning ``True`` if they seem equal,
   ``False`` otherwise.

   If *shallow* is true, files with identical :func:`os.stat` signatures are
   taken to be equal.  Otherwise, the contents of the files are compared.

   Note that no external programs are called from this function, giving it
   portability and efficiency.

   This function uses a cache for past comparisons and the results,
   with cache entries invalidated if the :func:`os.stat` information for the
   file changes.  The entire cache may be cleared using :func:`clear_cache`.


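As a minimal sketch of how :func:`cmp` might be called, the snippet below
writes three small files into a temporary directory and compares them.  The
file names and contents are made up for illustration only.

.. code-block:: python

   import filecmp
   import os
   import tempfile

   with tempfile.TemporaryDirectory() as tmp:
       # Hypothetical file names, chosen only for this illustration.
       first = os.path.join(tmp, "first.txt")
       second = os.path.join(tmp, "second.txt")
       third = os.path.join(tmp, "third.txt")
       samples = ((first, "spam\n"), (second, "spam\n"), (third, "eggs and ham\n"))
       for path, text in samples:
           with open(path, "w") as f:
               f.write(text)

       # shallow=False asks for the file contents to be compared.
       same_contents = filecmp.cmp(first, second, shallow=False)
       different_contents = filecmp.cmp(first, third, shallow=False)

       # A file has the same os.stat() signature as itself, so the default
       # shallow comparison is enough to report equality here.
       same_file = filecmp.cmp(first, first)

   print(same_contents, different_contents, same_file)

Passing ``shallow=False`` is the more conservative choice when two different
files could share an :func:`os.stat` signature, at the cost of reading them.
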
.. function:: cmpfiles(dir1, dir2, common, shallow=True)

   Compare the files in the two directories *dir1* and *dir2* whose names are
   given by *common*.

   Returns three lists of file names: *match*, *mismatch*,
   *errors*.  *match* contains the list of files that match, *mismatch* contains
   the names of those that don't, and *errors* lists the names of files which
   could not be compared.  Files are listed in *errors* if they don't exist in
   one of the directories, the user lacks permission to read them or if the
   comparison could not be done for some other reason.

   The *shallow* parameter has the same meaning and default value as for
   :func:`filecmp.cmp`.

   For example, ``cmpfiles('a', 'b', ['c', 'd/e'])`` will compare ``a/c`` with
   ``b/c`` and ``a/d/e`` with ``b/d/e``.  ``'c'`` and ``'d/e'`` will each be in
   one of the three returned lists.


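The three lists returned by :func:`cmpfiles` can be illustrated with a small
sketch.  The directory layout, the file names and the ``write`` helper below
are hypothetical and exist only for this example.

.. code-block:: python

   import filecmp
   import os
   import tempfile


   def write(path, text):
       """Hypothetical helper: create a small text file."""
       with open(path, "w") as f:
           f.write(text)


   with tempfile.TemporaryDirectory() as tmp:
       dir1 = os.path.join(tmp, "dir1")
       dir2 = os.path.join(tmp, "dir2")
       os.mkdir(dir1)
       os.mkdir(dir2)

       # Present in both directories with the same contents.
       write(os.path.join(dir1, "same.txt"), "identical\n")
       write(os.path.join(dir2, "same.txt"), "identical\n")
       # Present in both directories with different contents.
       write(os.path.join(dir1, "changed.txt"), "old\n")
       write(os.path.join(dir2, "changed.txt"), "new and longer\n")
       # Present in dir1 only, so it cannot be compared.
       write(os.path.join(dir1, "missing.txt"), "only in dir1\n")

       names = ["same.txt", "changed.txt", "missing.txt"]
       match, mismatch, errors = filecmp.cmpfiles(dir1, dir2, names,
                                                  shallow=False)

   print(match, mismatch, errors)

Each name passed in *common* ends up in exactly one of the three lists; the
file that exists in only one directory is reported in *errors*.
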
.. function:: clear_cache()

   Clear the filecmp cache. This may be useful if a file is compared so quickly
   after it is modified that it is within the mtime resolution of
   the underlying filesystem.

   .. versionadded:: 3.4


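The situation that :func:`clear_cache` addresses can be sketched as follows.
This is an illustration under assumed conditions: the file names are
hypothetical, and whether a stale cache entry would actually be reused
without the call depends on the timestamp resolution of the filesystem.

.. code-block:: python

   import filecmp
   import os
   import tempfile

   with tempfile.TemporaryDirectory() as tmp:
       # Hypothetical file names, chosen only for this illustration.
       original = os.path.join(tmp, "original.txt")
       copy = os.path.join(tmp, "copy.txt")
       for path in (original, copy):
           with open(path, "w") as f:
               f.write("v1\n")

       # The result of this comparison is stored in the module's cache.
       before = filecmp.cmp(original, copy, shallow=False)

       # Rewrite one file immediately, keeping its size unchanged.  If the
       # modification time also stays the same, the os.stat() signature does
       # not change and the cached result could otherwise be returned.
       with open(copy, "w") as f:
           f.write("v2\n")

       filecmp.clear_cache()
       after = filecmp.cmp(original, copy, shallow=False)

   print(before, after)
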
.. _dircmp-objects:

The :class:`dircmp` class
-------------------------

.. class:: dircmp(a, b, ignore=None, hide=None)

   Construct a new directory comparison object, to compare the directories *a*
   and *b*.  *ignore* is a list of names to ignore, and defaults to
   :attr:`filecmp.DEFAULT_IGNORES`.  *hide* is a list of names to hide, and
   defaults to ``[os.curdir, os.pardir]``.

   The :class:`dircmp` class compares files by doing *shallow* comparisons
   as described for :func:`filecmp.cmp`.

   The :class:`dircmp` class provides the following methods:

   .. method:: report()

      Print (to :data:`sys.stdout`) a comparison between *a* and *b*.

   .. method:: report_partial_closure()

      Print a comparison between *a* and *b* and common immediate
      subdirectories.

   .. method:: report_full_closure()

      Print a comparison between *a* and *b* and common subdirectories
      (recursively).

   The :class:`dircmp` class offers a number of interesting attributes that may be
   used to get various bits of information about the directory trees being
   compared.

   Note that via :meth:`__getattr__` hooks, all attributes are computed lazily,
   so there is no speed penalty if only those attributes which are lightweight
   to compute are used.


   .. attribute:: left

      The directory *a*.


   .. attribute:: right

      The directory *b*.


   .. attribute:: left_list

      Files and subdirectories in *a*, filtered by *hide* and *ignore*.


   .. attribute:: right_list

      Files and subdirectories in *b*, filtered by *hide* and *ignore*.


   .. attribute:: common

      Files and subdirectories in both *a* and *b*.


   .. attribute:: left_only

      Files and subdirectories only in *a*.


   .. attribute:: right_only

      Files and subdirectories only in *b*.


   .. attribute:: common_dirs

      Subdirectories in both *a* and *b*.


   .. attribute:: common_files

      Files in both *a* and *b*.


   .. attribute:: common_funny

      Names in both *a* and *b*, such that the type differs between the
      directories, or names for which :func:`os.stat` reports an error.


   .. attribute:: same_files

      Files which are identical in both *a* and *b*, using the class's
      file comparison operator.


   .. attribute:: diff_files

      Files which are in both *a* and *b*, whose contents differ according
      to the class's file comparison operator.


   .. attribute:: funny_files

      Files which are in both *a* and *b*, but could not be compared.


   .. attribute:: subdirs

      A dictionary mapping names in :attr:`common_dirs` to :class:`dircmp`
      objects.

.. attribute:: DEFAULT_IGNORES

   .. versionadded:: 3.4

   List of directories ignored by :class:`dircmp` by default.


Here is a simplified example of using the ``subdirs`` attribute to search
recursively through two directories and show the common files that differ::

    >>> from filecmp import dircmp
    >>> def print_diff_files(dcmp):
    ...     for name in dcmp.diff_files:
    ...         print("diff_file %s found in %s and %s" % (name, dcmp.left,
    ...               dcmp.right))
    ...     for sub_dcmp in dcmp.subdirs.values():
    ...         print_diff_files(sub_dcmp)
    ...
    >>> dcmp = dircmp('dir1', 'dir2') # doctest: +SKIP
    >>> print_diff_files(dcmp) # doctest: +SKIP

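The other :class:`dircmp` attributes can be explored in the same way.  The
following sketch builds two small directory trees in a temporary location and
collects several attributes into a dictionary.  The directory names, file
names and the ``write`` helper are hypothetical and serve only as an
illustration.

.. code-block:: python

   import filecmp
   import os
   import tempfile


   def write(path, text):
       """Hypothetical helper: create a small text file."""
       with open(path, "w") as f:
           f.write(text)


   with tempfile.TemporaryDirectory() as tmp:
       left = os.path.join(tmp, "left")
       right = os.path.join(tmp, "right")
       os.makedirs(os.path.join(left, "sub"))
       os.makedirs(os.path.join(right, "sub"))

       write(os.path.join(left, "shared.txt"), "same\n")
       write(os.path.join(right, "shared.txt"), "same\n")
       # Different sizes, so even a shallow comparison reports a difference.
       write(os.path.join(left, "edited.txt"), "short\n")
       write(os.path.join(right, "edited.txt"), "a longer version\n")
       write(os.path.join(left, "old.txt"), "only on the left\n")
       write(os.path.join(right, "new.txt"), "only on the right\n")

       dcmp = filecmp.dircmp(left, right)
       # Attributes are computed lazily, so read them while the
       # directories still exist.
       summary = {
           "common_dirs": sorted(dcmp.common_dirs),
           "common_files": sorted(dcmp.common_files),
           "left_only": sorted(dcmp.left_only),
           "right_only": sorted(dcmp.right_only),
           "same_files": sorted(dcmp.same_files),
           "diff_files": sorted(dcmp.diff_files),
       }

   for name, value in summary.items():
       print(name, value)

The lists are sorted here only to make the printed output stable; the
attributes themselves are plain lists of names.
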