:mod:`filecmp` --- File and Directory Comparisons
=================================================

.. module:: filecmp
   :synopsis: Compare files efficiently.

.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il>

**Source code:** :source:`Lib/filecmp.py`

--------------

The :mod:`filecmp` module defines functions to compare files and directories,
with various optional time/correctness trade-offs. For comparing files,
see also the :mod:`difflib` module.

The :mod:`filecmp` module defines the following functions:


.. function:: cmp(f1, f2, shallow=True)

   Compare the files named *f1* and *f2*, returning ``True`` if they seem equal,
   ``False`` otherwise.

   If *shallow* is true and the :func:`os.stat` signatures (file type, size, and
   modification time) of both files are identical, the files are taken to be
   equal.

   Otherwise, the files are treated as different if their sizes or contents differ.

   Note that no external programs are called from this function, giving it
   portability and efficiency.

   This function caches the results of past comparisons; a cache entry is
   invalidated if the :func:`os.stat` information for either file changes.
   The entire cache may be cleared using :func:`clear_cache`.

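The difference between a shallow and a full comparison can be sketched as
follows. This is a minimal illustration rather than part of the module: the
helper ``compare_both_ways`` and the file names are hypothetical, and the
timestamps are forced to match with :func:`os.utime` so that the two
:func:`os.stat` signatures are identical even though the contents differ.

```python
import filecmp
import os
import tempfile


def compare_both_ways(path_a, path_b):
    """Return (shallow_result, deep_result) for two paths (hypothetical helper)."""
    shallow_result = filecmp.cmp(path_a, path_b, shallow=True)
    deep_result = filecmp.cmp(path_a, path_b, shallow=False)
    return shallow_result, deep_result


with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, "a.txt")
    path_b = os.path.join(tmp, "b.txt")
    # Same size (4 bytes each) but different contents.
    with open(path_a, "w") as f:
        f.write("spam")
    with open(path_b, "w") as f:
        f.write("eggs")
    # Give both files the same access and modification times.
    os.utime(path_a, (1_000_000_000, 1_000_000_000))
    os.utime(path_b, (1_000_000_000, 1_000_000_000))

    shallow_equal, deep_equal = compare_both_ways(path_a, path_b)

print(shallow_equal)  # True: file type, size and mtime all match
print(deep_equal)     # False: the contents were actually read
```

A shallow comparison is usually the cheaper choice; passing ``shallow=False``
is worth considering when files of equal size may be rewritten without their
modification time changing.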

.. function:: cmpfiles(dir1, dir2, common, shallow=True)

   Compare the files in the two directories *dir1* and *dir2* whose names are
   given by *common*.

   Returns three lists of file names: *match*, *mismatch*,
   *errors*.  *match* contains the list of files that match, *mismatch* contains
   the names of those that don't, and *errors* lists the names of files which
   could not be compared.  Files are listed in *errors* if they don't exist in
   one of the directories, the user lacks permission to read them or if the
   comparison could not be done for some other reason.

   The *shallow* parameter has the same meaning and default value as for
   :func:`filecmp.cmp`.

   For example, ``cmpfiles('a', 'b', ['c', 'd/e'])`` will compare ``a/c`` with
   ``b/c`` and ``a/d/e`` with ``b/d/e``.  ``'c'`` and ``'d/e'`` will each be in
   one of the three returned lists.

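As a rough sketch of how the three lists returned by :func:`cmpfiles` are
filled, the following builds two temporary directories and compares three
names. The helper ``write_file`` and all file names are assumptions made for
this illustration only.

```python
import filecmp
import os
import tempfile


def write_file(directory, name, text):
    """Create a small text file (hypothetical helper for this sketch)."""
    with open(os.path.join(directory, name), "w") as f:
        f.write(text)


with tempfile.TemporaryDirectory() as dir1, tempfile.TemporaryDirectory() as dir2:
    write_file(dir1, "same.txt", "identical")
    write_file(dir2, "same.txt", "identical")
    write_file(dir1, "diff.txt", "short")
    write_file(dir2, "diff.txt", "much longer")
    write_file(dir1, "missing.txt", "only in dir1")

    names = ["same.txt", "diff.txt", "missing.txt"]
    # shallow=False forces a content comparison when the sizes agree.
    match, mismatch, errors = filecmp.cmpfiles(dir1, dir2, names, shallow=False)

print(match)     # ['same.txt']
print(mismatch)  # ['diff.txt']
print(errors)    # ['missing.txt']
```

Every name passed in *common* ends up in exactly one of the three lists, so
their combined length equals ``len(names)``.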

.. function:: clear_cache()

   Clear the filecmp cache. This may be useful if a file is compared so quickly
   after it is modified that it is within the mtime resolution of
   the underlying filesystem.

   .. versionadded:: 3.4

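One situation where :func:`clear_cache` may help is sketched below: a file is
rewritten immediately after a comparison, keeping the same size, so its
:func:`os.stat` signature might not change. The helper ``overwrite`` and the
file names are hypothetical, and whether a stale result would really be
returned without the call depends on the filesystem's timestamp resolution.

```python
import filecmp
import os
import tempfile


def overwrite(path, text):
    """Replace the contents of *path* (hypothetical helper for this sketch)."""
    with open(path, "w") as f:
        f.write(text)


with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, "a.txt")
    path_b = os.path.join(tmp, "b.txt")
    overwrite(path_a, "spam")
    overwrite(path_b, "spam")
    before = filecmp.cmp(path_a, path_b, shallow=False)

    # Rewrite b.txt right away with different contents of the same size.
    # The size is unchanged and the mtime may not have advanced, so drop
    # any cached result before comparing again.
    overwrite(path_b, "eggs")
    filecmp.clear_cache()
    after = filecmp.cmp(path_a, path_b, shallow=False)

print(before)  # True
print(after)   # False
```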

.. _dircmp-objects:

The :class:`dircmp` class
-------------------------

.. class:: dircmp(a, b, ignore=None, hide=None)

   Construct a new directory comparison object, to compare the directories *a*
   and *b*.  *ignore* is a list of names to ignore, and defaults to
   :attr:`filecmp.DEFAULT_IGNORES`.  *hide* is a list of names to hide, and
   defaults to ``[os.curdir, os.pardir]``.

   The :class:`dircmp` class compares files by doing *shallow* comparisons
   as described for :func:`filecmp.cmp`.

   The :class:`dircmp` class provides the following methods:

   .. method:: report()

      Print (to :data:`sys.stdout`) a comparison between *a* and *b*.

   .. method:: report_partial_closure()

      Print a comparison between *a* and *b* and common immediate
      subdirectories.

   .. method:: report_full_closure()

      Print a comparison between *a* and *b* and common subdirectories
      (recursively).

   The :class:`dircmp` class offers a number of interesting attributes that may be
   used to get various bits of information about the directory trees being
   compared.

   Note that via :meth:`__getattr__` hooks, all attributes are computed lazily,
   so there is no speed penalty if only those attributes which are lightweight
   to compute are used.


   .. attribute:: left

      The directory *a*.


   .. attribute:: right

      The directory *b*.


   .. attribute:: left_list

      Files and subdirectories in *a*, filtered by *hide* and *ignore*.


   .. attribute:: right_list

      Files and subdirectories in *b*, filtered by *hide* and *ignore*.


   .. attribute:: common

      Files and subdirectories in both *a* and *b*.


   .. attribute:: left_only

      Files and subdirectories only in *a*.


   .. attribute:: right_only

      Files and subdirectories only in *b*.


   .. attribute:: common_dirs

      Subdirectories in both *a* and *b*.


   .. attribute:: common_files

      Files in both *a* and *b*.


   .. attribute:: common_funny

      Names in both *a* and *b*, such that the type differs between the
      directories, or names for which :func:`os.stat` reports an error.


   .. attribute:: same_files

      Files which are identical in both *a* and *b*, using the class's
      file comparison operator.


   .. attribute:: diff_files

      Files which are in both *a* and *b*, whose contents differ according
      to the class's file comparison operator.


   .. attribute:: funny_files

      Files which are in both *a* and *b*, but could not be compared.


   .. attribute:: subdirs

      A dictionary mapping names in :attr:`common_dirs` to :class:`dircmp`
      instances (or ``MyDirCmp`` instances if this instance is of type
      ``MyDirCmp``, a subclass of :class:`dircmp`).

      .. versionchanged:: 3.10
         Previously entries were always :class:`dircmp` instances. Now entries
         are the same type as *self*, if *self* is a subclass of
         :class:`dircmp`.

.. attribute:: DEFAULT_IGNORES

   .. versionadded:: 3.4

   List of directories ignored by :class:`dircmp` by default.

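A possible way to extend the default ignore list, rather than replace it, is
sketched below. The extra name ``"build"`` and the directory layout are
assumptions chosen for illustration; concatenating the lists creates a new
list and leaves :attr:`DEFAULT_IGNORES` itself untouched.

```python
import filecmp
import os
import tempfile

# "build" is a hypothetical extra name to skip, on top of the defaults.
extended_ignores = filecmp.DEFAULT_IGNORES + ["build"]

with tempfile.TemporaryDirectory() as left, tempfile.TemporaryDirectory() as right:
    os.mkdir(os.path.join(left, "build"))
    for directory in (left, right):
        with open(os.path.join(directory, "keep.txt"), "w") as f:
            f.write("data")

    # Attributes are computed lazily, so read them while the directories exist.
    default_left_only = filecmp.dircmp(left, right).left_only
    filtered_left_only = filecmp.dircmp(
        left, right, ignore=extended_ignores).left_only

print(default_left_only)   # ['build']
print(filtered_left_only)  # []
```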

Here is a simplified example of using the ``subdirs`` attribute to search
recursively through two directories and show the common files that differ::

    >>> from filecmp import dircmp
    >>> def print_diff_files(dcmp):
    ...     for name in dcmp.diff_files:
    ...         print("diff_file %s found in %s and %s" % (name, dcmp.left,
    ...               dcmp.right))
    ...     for sub_dcmp in dcmp.subdirs.values():
    ...         print_diff_files(sub_dcmp)
    ...
    >>> dcmp = dircmp('dir1', 'dir2') # doctest: +SKIP
    >>> print_diff_files(dcmp) # doctest: +SKIP

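For a self-contained variant that does not depend on existing directories, the
following sketch builds two small temporary trees and reads several
:class:`dircmp` attributes from them. The helper ``make_tree`` and every file
name are hypothetical; the attributes are collected inside the ``with`` block
because they are computed lazily, on first access.

```python
import filecmp
import os
import tempfile


def make_tree(root, files):
    """Create files under *root* from a {relative_path: text} mapping.

    Hypothetical helper used only for this sketch.
    """
    for relative_path, text in files.items():
        path = os.path.join(root, relative_path)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(text)


with tempfile.TemporaryDirectory() as left, tempfile.TemporaryDirectory() as right:
    make_tree(left, {"only_left.txt": "l", "shared.txt": "short",
                     "sub/nested.txt": "same"})
    make_tree(right, {"only_right.txt": "r", "shared.txt": "much longer",
                      "sub/nested.txt": "same"})

    dcmp = filecmp.dircmp(left, right)
    # Collect the lazy attributes before the directories are removed.
    summary = {
        "left_only": dcmp.left_only,
        "right_only": dcmp.right_only,
        "common_dirs": dcmp.common_dirs,
        "common_files": dcmp.common_files,
        "diff_files": dcmp.diff_files,
        "sub_same_files": dcmp.subdirs["sub"].same_files,
    }

for name, value in summary.items():
    print(name, value)
```

Here ``shared.txt`` is reported in ``diff_files`` because the two copies have
different sizes, while ``nested.txt`` is reported as identical by the
comparison object for the common subdirectory ``sub``.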