:mod:`filecmp` --- File and Directory Comparisons
=================================================

.. module:: filecmp
   :synopsis: Compare files efficiently.
.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il>

**Source code:** :source:`Lib/filecmp.py`

--------------

The :mod:`filecmp` module defines functions to compare files and directories,
with various optional time/correctness trade-offs.  For comparing files,
see also the :mod:`difflib` module.

The :mod:`filecmp` module defines the following functions:


.. function:: cmp(f1, f2, shallow=True)

   Compare the files named *f1* and *f2*, returning ``True`` if they seem equal,
   ``False`` otherwise.

   If *shallow* is true, files with identical :func:`os.stat` signatures (file
   type, size, and modification time) are taken to be equal without being
   read.  Otherwise, the contents of the files are compared.

   Note that no external programs are called from this function, giving it
   portability and efficiency.

   This function caches the results of past comparisons, with cache entries
   invalidated if the :func:`os.stat` information for the file changes.  The
   entire cache may be cleared using :func:`clear_cache`.

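
The difference between shallow and full comparisons of :func:`cmp` can be
sketched as follows.  This is only an illustration: the file names are
hypothetical, and the modification times are forced to match so that both
files end up with the same :func:`os.stat` signature::

    import filecmp
    import os
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "first.txt")    # hypothetical file names
        second = os.path.join(tmp, "second.txt")
        with open(first, "w") as f:
            f.write("spam")
        with open(second, "w") as f:
            f.write("eggs")                       # same size, other contents

        # Copy the timestamps so that type, size and mtime all agree.
        stamp = os.stat(first)
        os.utime(second, ns=(stamp.st_atime_ns, stamp.st_mtime_ns))

        # Shallow: the matching signatures are accepted without reading.
        shallow_result = filecmp.cmp(first, second)
        # Full: the contents are read and found to differ.
        deep_result = filecmp.cmp(first, second, shallow=False)

Under these assumptions the shallow call reports the files as equal while the
full comparison does not, which is the time/correctness trade-off the
*shallow* parameter controls.
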

.. function:: cmpfiles(dir1, dir2, common, shallow=True)

   Compare the files in the two directories *dir1* and *dir2* whose names are
   given by *common*.

   Returns three lists of file names: *match*, *mismatch*,
   *errors*.  *match* contains the list of files that match, *mismatch* contains
   the names of those that don't, and *errors* lists the names of files which
   could not be compared.  Files are listed in *errors* if they don't exist in
   one of the directories, the user lacks permission to read them or if the
   comparison could not be done for some other reason.

   The *shallow* parameter has the same meaning and default value as for
   :func:`filecmp.cmp`.

   For example, ``cmpfiles('a', 'b', ['c', 'd/e'])`` will compare ``a/c`` with
   ``b/c`` and ``a/d/e`` with ``b/d/e``.  ``'c'`` and ``'d/e'`` will each be in
   one of the three returned lists.

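
A small, self-contained sketch of how the three lists returned by
:func:`cmpfiles` are filled in.  The directory and file names, as well as the
``write`` helper, are made up for this illustration::

    import filecmp
    import os
    import tempfile

    def write(directory, name, text):
        # Hypothetical helper: create a small text file.
        with open(os.path.join(directory, name), "w") as f:
            f.write(text)

    with tempfile.TemporaryDirectory() as tmp:
        dir1 = os.path.join(tmp, "a")
        dir2 = os.path.join(tmp, "b")
        os.mkdir(dir1)
        os.mkdir(dir2)

        write(dir1, "same.txt", "identical")
        write(dir2, "same.txt", "identical")
        write(dir1, "changed.txt", "old contents")
        write(dir2, "changed.txt", "new")
        write(dir1, "missing.txt", "only in a")   # absent from dir2

        match, mismatch, errors = filecmp.cmpfiles(
            dir1, dir2, ["same.txt", "changed.txt", "missing.txt"],
            shallow=False)

Each name passed in *common* lands in exactly one list: ``same.txt`` in
*match*, ``changed.txt`` in *mismatch*, and ``missing.txt`` in *errors*
because it does not exist in the second directory.
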

.. function:: clear_cache()

   Clear the filecmp cache.  This may be useful if a file is compared so quickly
   after it is modified that it is within the mtime resolution of
   the underlying filesystem.

   .. versionadded:: 3.4

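
The situation :func:`clear_cache` is meant for can be imitated as shown here.
This is a sketch that assumes cached results are looked up by :func:`os.stat`
signature, as described for :func:`cmp`; the file names are hypothetical, and
:func:`os.utime` is used to simulate a modification that falls within the
mtime resolution of the filesystem::

    import filecmp
    import os
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        left = os.path.join(tmp, "left.txt")      # hypothetical file names
        right = os.path.join(tmp, "right.txt")
        with open(left, "w") as f:
            f.write("aaaa")
        with open(right, "w") as f:
            f.write("bbbb")

        # First comparison: the contents differ and the result is cached.
        before = filecmp.cmp(left, right, shallow=False)

        # Rewrite the file with same-sized contents, then restore the old
        # timestamps so that its os.stat() signature looks unchanged.
        old = os.stat(right)
        with open(right, "w") as f:
            f.write("aaaa")
        os.utime(right, ns=(old.st_atime_ns, old.st_mtime_ns))

        # The unchanged signature lets the cached result be reused.
        stale = filecmp.cmp(left, right, shallow=False)

        # After clearing the cache the contents are read again.
        filecmp.clear_cache()
        fresh = filecmp.cmp(left, right, shallow=False)
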

.. _dircmp-objects:

The :class:`dircmp` class
-------------------------

.. class:: dircmp(a, b, ignore=None, hide=None)

   Construct a new directory comparison object, to compare the directories *a*
   and *b*.  *ignore* is a list of names to ignore, and defaults to
   :attr:`filecmp.DEFAULT_IGNORES`.  *hide* is a list of names to hide, and
   defaults to ``[os.curdir, os.pardir]``.

   The :class:`dircmp` class compares files by doing *shallow* comparisons
   as described for :func:`filecmp.cmp`.

   The :class:`dircmp` class provides the following methods:

   .. method:: report()

      Print (to :data:`sys.stdout`) a comparison between *a* and *b*.

   .. method:: report_partial_closure()

      Print a comparison between *a* and *b* and common immediate
      subdirectories.

   .. method:: report_full_closure()

      Print a comparison between *a* and *b* and common subdirectories
      (recursively).

   The :class:`dircmp` class offers a number of interesting attributes that may be
   used to get various bits of information about the directory trees being
   compared.

   Note that via :meth:`__getattr__` hooks, all attributes are computed lazily,
   so there is no speed penalty if only those attributes which are lightweight
   to compute are used.


   .. attribute:: left

      The directory *a*.


   .. attribute:: right

      The directory *b*.


   .. attribute:: left_list

      Files and subdirectories in *a*, filtered by *hide* and *ignore*.


   .. attribute:: right_list

      Files and subdirectories in *b*, filtered by *hide* and *ignore*.


   .. attribute:: common

      Files and subdirectories in both *a* and *b*.


   .. attribute:: left_only

      Files and subdirectories only in *a*.


   .. attribute:: right_only

      Files and subdirectories only in *b*.


   .. attribute:: common_dirs

      Subdirectories in both *a* and *b*.


   .. attribute:: common_files

      Files in both *a* and *b*.


   .. attribute:: common_funny

      Names in both *a* and *b*, such that the type differs between the
      directories, or names for which :func:`os.stat` reports an error.


   .. attribute:: same_files

      Files which are identical in both *a* and *b*, using the class's
      file comparison operator.


   .. attribute:: diff_files

      Files which are in both *a* and *b*, whose contents differ according
      to the class's file comparison operator.


   .. attribute:: funny_files

      Files which are in both *a* and *b*, but could not be compared.


   .. attribute:: subdirs

      A dictionary mapping names in :attr:`common_dirs` to :class:`dircmp`
      objects.

.. attribute:: DEFAULT_IGNORES

   .. versionadded:: 3.4

   List of directories ignored by :class:`dircmp` by default.


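
As a sketch of how the *ignore* argument and :attr:`DEFAULT_IGNORES` fit
together, the default list can be extended with additional names.  The
directory names used here (``build``, ``src``) are purely illustrative::

    import filecmp
    import os
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "a")
        b = os.path.join(tmp, "b")
        os.makedirs(os.path.join(a, "build"))   # hypothetical name to skip
        os.makedirs(os.path.join(a, "src"))
        os.makedirs(b)

        # Keep the default ignores and add one more name.
        ignore = filecmp.DEFAULT_IGNORES + ["build"]

        default_view = filecmp.dircmp(a, b).left_only
        filtered_view = filecmp.dircmp(a, b, ignore=ignore).left_only

With the default settings both subdirectories are reported as present only in
*a*; with the extended list, ``build`` is filtered out before the comparison.
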
Here is a simplified example of using the ``subdirs`` attribute to search
recursively through two directories and show the files that are present in
both but differ::

    >>> from filecmp import dircmp
    >>> def print_diff_files(dcmp):
    ...     for name in dcmp.diff_files:
    ...         print("diff_file %s found in %s and %s" % (name, dcmp.left,
    ...               dcmp.right))
    ...     for sub_dcmp in dcmp.subdirs.values():
    ...         print_diff_files(sub_dcmp)
    ...
    >>> dcmp = dircmp('dir1', 'dir2') # doctest: +SKIP
    >>> print_diff_files(dcmp) # doctest: +SKIP

R David Murray2b209cd2012-08-14 21:40:13 -0400197