:mod:`shutil` --- High-level file operations
============================================

.. module:: shutil
   :synopsis: High-level file operations, including copying.
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
.. partly based on the docstrings

.. index::
   single: file; copying
   single: copying files

The :mod:`shutil` module offers a number of high-level operations on files and
collections of files. In particular, functions are provided which support file
copying and removal. For operations on individual files, see also the
:mod:`os` module.

.. warning::

   Even the higher-level file copying functions (:func:`copy`, :func:`copy2`)
   cannot copy all file metadata.

   On POSIX platforms, this means that file owner and group are lost as well
   as ACLs. On Mac OS, the resource fork and other metadata are not used.
   This means that resources will be lost and file type and creator codes will
   not be correct. On Windows, file owners, ACLs and alternate data streams
   are not copied.

Directory and file operations
-----------------------------

.. function:: copyfileobj(fsrc, fdst[, length])

   Copy the contents of the file-like object *fsrc* to the file-like object *fdst*.
   The integer *length*, if given, is the buffer size. In particular, a negative
   *length* value means to copy the data without looping over the source data in
   chunks; by default the data is read in chunks to avoid uncontrolled memory
   consumption. Note that if the current file position of the *fsrc* object is not
   0, only the contents from the current file position to the end of the file will
   be copied.

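As a rough illustration of the chunked copy and of the file-position caveat,
the following sketch uses in-memory :class:`io.BytesIO` objects in place of
real files. The sample bytes and the 4-byte buffer size are arbitrary choices
made for this example.

```python
import io
import shutil

# In-memory stand-ins for real file objects (an assumption of this sketch).
source = io.BytesIO(b"header:payload")
source.seek(len(b"header:"))    # advance past the part we do not want

target = io.BytesIO()
shutil.copyfileobj(source, target, 4)    # read and write in 4-byte chunks

# Only the bytes from the current position of `source` onward were copied.
copied = target.getvalue()
```

Rewinding with ``source.seek(0)`` before the call would copy the whole
contents instead.
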

.. function:: copyfile(src, dst)

   Copy the contents (no metadata) of the file named *src* to a file named *dst*.
   *dst* must be the complete target file name; look at :func:`copy` for a copy that
   accepts a target directory path. If *src* and *dst* are the same file,
   :exc:`Error` is raised.
   The destination location must be writable; otherwise, an :exc:`IOError` exception
   will be raised. If *dst* already exists, it will be replaced. Special files
   such as character or block devices and pipes cannot be copied with this
   function. *src* and *dst* are path names given as strings.

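The two behaviours described for :func:`copyfile`, replacing an existing
destination and refusing to copy a file onto itself, can be sketched as
follows. The file names are made up for the example, and everything happens
in a throwaway temporary directory.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "source.txt")    # hypothetical file names
dst = os.path.join(workdir, "target.txt")

with open(src, "w") as f:
    f.write("new contents")
with open(dst, "w") as f:
    f.write("old contents")

shutil.copyfile(src, dst)    # dst already exists, so it is replaced
with open(dst) as f:
    replaced = f.read()

try:
    shutil.copyfile(src, src)    # the same file on both sides
    same_file_rejected = False
except shutil.Error:
    same_file_rejected = True

shutil.rmtree(workdir)    # clean up the temporary directory
```
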

.. function:: copymode(src, dst)

   Copy the permission bits from *src* to *dst*. The file contents, owner, and
   group are unaffected. *src* and *dst* are path names given as strings.


.. function:: copystat(src, dst)

   Copy the permission bits, last access time, last modification time, and flags
   from *src* to *dst*. The file contents, owner, and group are unaffected. *src*
   and *dst* are path names given as strings.


.. function:: copy(src, dst)

   Copy the file *src* to the file or directory *dst*. If *dst* is a directory, a
   file with the same basename as *src* is created (or overwritten) in the
   directory specified. Permission bits are copied. *src* and *dst* are path
   names given as strings.


.. function:: copy2(src, dst)

   Similar to :func:`copy`, but metadata is copied as well -- in fact, this is just
   :func:`copy` followed by :func:`copystat`. This is similar to the
   Unix command :program:`cp -p`.

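A small sketch of the practical difference between :func:`copy` and
:func:`copy2`: both copy the data, but only :func:`copy2` carries the
modification time over. The file names and the timestamp are arbitrary
values chosen for this example.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "report.txt")    # hypothetical file name
with open(src, "w") as f:
    f.write("data")

old_time = 1000000000    # an arbitrary timestamp in the past
os.utime(src, (old_time, old_time))

backup_dir = os.path.join(workdir, "backup")
os.mkdir(backup_dir)

# dst is a directory, so copy() creates backup/report.txt inside it.
shutil.copy(src, backup_dir)
plain_copy = os.path.join(backup_dir, "report.txt")
plain_exists = os.path.exists(plain_copy)
plain_mtime = int(os.stat(plain_copy).st_mtime)

# copy2() also copies the access and modification times via copystat().
kept_copy = os.path.join(workdir, "report-kept.txt")
shutil.copy2(src, kept_copy)
kept_mtime = int(os.stat(kept_copy).st_mtime)

shutil.rmtree(workdir)    # clean up the temporary directory
```

Here ``kept_mtime`` matches ``old_time``, while ``plain_mtime`` reflects the
moment the copy was made.
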

.. function:: ignore_patterns(\*patterns)

   This factory function creates a function that can be used as a callable for
   :func:`copytree`\'s *ignore* argument, ignoring files and directories that
   match one of the glob-style *patterns* provided. See the example below.

   .. versionadded:: 2.6


.. function:: copytree(src, dst[, symlinks=False[, ignore=None]])

   Recursively copy an entire directory tree rooted at *src*. The destination
   directory, named by *dst*, must not already exist; it will be created as well
   as missing parent directories. Permissions and times of directories are
   copied with :func:`copystat`; individual files are copied using
   :func:`copy2`.

   If *symlinks* is true, symbolic links in the source tree are represented as
   symbolic links in the new tree; if false or omitted, the contents of the
   linked files are copied to the new tree.

   If *ignore* is given, it must be a callable that will receive as its
   arguments the directory being visited by :func:`copytree`, and a list of its
   contents, as returned by :func:`os.listdir`. Since :func:`copytree` is
   called recursively, the *ignore* callable will be called once for each
   directory that is copied. The callable must return a sequence of directory
   and file names relative to the current directory (i.e. a subset of the items
   in its second argument); these names will then be ignored in the copy
   process. :func:`ignore_patterns` can be used to create such a callable that
   ignores names based on glob-style patterns.

   If one or more exceptions occur, an :exc:`Error` is raised with a list of
   reasons.

   The source code for this should be considered an example rather than the
   ultimate tool.

   .. versionchanged:: 2.3
      :exc:`Error` is raised if any exceptions occur during copying, rather than
      printing a message.

   .. versionchanged:: 2.5
      Create intermediate directories needed to create *dst*, rather than raising an
      error. Copy permissions and times of directories using :func:`copystat`.

   .. versionchanged:: 2.6
      Added the *ignore* argument to be able to influence what is being copied.


.. function:: rmtree(path[, ignore_errors[, onerror]])

   .. index:: single: directory; deleting

   Delete an entire directory tree; *path* must point to a directory (but not a
   symbolic link to a directory). If *ignore_errors* is true, errors resulting
   from failed removals will be ignored; if false or omitted, such errors are
   handled by calling a handler specified by *onerror* or, if that is omitted,
   they raise an exception.

   If *onerror* is provided, it must be a callable that accepts three
   parameters: *function*, *path*, and *excinfo*. The first parameter,
   *function*, is the function which raised the exception; it will be
   :func:`os.path.islink`, :func:`os.listdir`, :func:`os.remove` or
   :func:`os.rmdir`. The second parameter, *path*, will be the path name passed
   to *function*. The third parameter, *excinfo*, will be the exception
   information returned by :func:`sys.exc_info`. Exceptions raised by *onerror*
   will not be caught.

   .. versionchanged:: 2.6
      Explicitly check for *path* being a symbolic link and raise :exc:`OSError`
      in that case.

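The following sketch shows one possible *onerror* handler: instead of letting
the error propagate, it records what failed. The handler name
``record_failure`` and the ``failures`` list are inventions of this example.
Removing a path that does not exist is used only because it fails
predictably; which low-level function gets reported, and how often the
handler runs, may differ between Python versions.

```python
import os
import shutil
import tempfile

failures = []

def record_failure(function, path, excinfo):
    # excinfo is the (type, value, traceback) triple from sys.exc_info().
    failures.append((function.__name__, path, excinfo[1]))

workdir = tempfile.mkdtemp()
missing = os.path.join(workdir, "does-not-exist")    # hypothetical path

shutil.rmtree(missing, onerror=record_failure)    # reported to the handler
shutil.rmtree(missing, ignore_errors=True)        # silently ignored
shutil.rmtree(workdir)                            # an ordinary removal
```

After the first call, ``failures`` holds at least one entry whose second item
is the path that could not be processed.
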

.. function:: move(src, dst)

   Recursively move a file or directory to another location.

   If the destination is on the current filesystem, then :func:`os.rename` is
   used. Otherwise, *src* is copied (using :func:`copy2`) to *dst* and then
   removed.

   .. versionadded:: 2.3

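A minimal sketch of moving a whole directory with :func:`move`. The directory
and file names are made up, and since both locations sit in the same
temporary directory, this particular call would normally take the rename
path.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
inbox = os.path.join(workdir, "inbox")    # hypothetical directory names
os.mkdir(inbox)
with open(os.path.join(inbox, "note.txt"), "w") as f:
    f.write("hello")

archived = os.path.join(workdir, "archived")    # does not exist yet
shutil.move(inbox, archived)    # the directory and its contents move together

source_gone = not os.path.exists(inbox)
with open(os.path.join(archived, "note.txt")) as f:
    moved_text = f.read()

shutil.rmtree(workdir)    # clean up the temporary directory
```
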

.. exception:: Error

   This exception collects exceptions that are raised during a multi-file
   operation. For :func:`copytree`, the exception argument is a list of
   3-tuples (*srcname*, *dstname*, *exception*).

   .. versionadded:: 2.3

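As a sketch of how the collected 3-tuples might be inspected, the example
below provokes a failure in :func:`copytree` by placing a symbolic link with
a missing target in the source tree. It assumes a platform where
:func:`os.symlink` is available to the current user (typically POSIX); all
names are made up for the example.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "src")    # hypothetical directory names
dst = os.path.join(workdir, "dst")
os.mkdir(src)
with open(os.path.join(src, "good.txt"), "w") as f:
    f.write("ok")

# A link whose target does not exist cannot be copied as a regular file.
os.symlink(os.path.join(workdir, "missing"), os.path.join(src, "broken"))

problems = []
try:
    shutil.copytree(src, dst)    # symlinks is false: links are followed
except shutil.Error as err:
    # The first argument is the list of (srcname, dstname, reason) tuples.
    for srcname, dstname, reason in err.args[0]:
        problems.append((os.path.basename(srcname), str(reason)))

# Files that could be copied are still copied before Error is raised.
good_copied = os.path.exists(os.path.join(dst, "good.txt"))

shutil.rmtree(workdir)    # clean up the temporary directory
```
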

.. _shutil-example:

copytree example
::::::::::::::::

This example is the implementation of the :func:`copytree` function, described
above, with the docstring omitted. It demonstrates many of the other functions
provided by this module. ::

   def copytree(src, dst, symlinks=False, ignore=None):
       names = os.listdir(src)
       if ignore is not None:
           ignored_names = ignore(src, names)
       else:
           ignored_names = set()

       os.makedirs(dst)
       errors = []
       for name in names:
           if name in ignored_names:
               continue
           srcname = os.path.join(src, name)
           dstname = os.path.join(dst, name)
           try:
               if symlinks and os.path.islink(srcname):
                   linkto = os.readlink(srcname)
                   os.symlink(linkto, dstname)
               elif os.path.isdir(srcname):
                   copytree(srcname, dstname, symlinks, ignore)
               else:
                   copy2(srcname, dstname)
               # XXX What about devices, sockets etc.?
           except (IOError, os.error) as why:
               errors.append((srcname, dstname, str(why)))
           # catch the Error from the recursive copytree so that we can
           # continue with other files
           except Error as err:
               errors.extend(err.args[0])
       try:
           copystat(src, dst)
       except WindowsError:
           # can't copy file access times on Windows
           pass
       except OSError as why:
           # append, not extend: each entry must stay a single 3-tuple
           errors.append((src, dst, str(why)))
       if errors:
           raise Error(errors)

Another example that uses the :func:`ignore_patterns` helper::

   from shutil import copytree, ignore_patterns

   copytree(source, destination, ignore=ignore_patterns('*.pyc', 'tmp*'))

This will copy everything except ``.pyc`` files and files or directories whose
name starts with ``tmp``.

Another example that uses the *ignore* argument to add a logging call::

   from shutil import copytree
   import logging

   def _logpath(path, names):
       logging.info('Working in %s' % path)
       return []   # nothing will be ignored

   copytree(source, destination, ignore=_logpath)


Archive operations
------------------

.. function:: make_archive(base_name, format[, root_dir[, base_dir[, verbose[, dry_run[, owner[, group[, logger]]]]]]])

   Create an archive file (e.g. zip or tar) and return its name.

   *base_name* is the name of the file to create, including the path, minus
   any format-specific extension. *format* is the archive format: one of
   "zip", "tar", "bztar" or "gztar".

   *root_dir* is a directory that will be the root directory of the
   archive; i.e. we typically chdir into *root_dir* before creating the
   archive.

   *base_dir* is the directory where we start archiving from;
   i.e. *base_dir* will be the common prefix of all files and
   directories in the archive.

   *root_dir* and *base_dir* both default to the current directory.

   *owner* and *group* are used when creating a tar archive. By default,
   the current owner and group are used.

   .. versionadded:: 2.7

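The interplay of *root_dir* and *base_dir* is easiest to see on a small tree.
In this sketch (all directory and file names are hypothetical), only the
``pkg`` subdirectory of ``project`` is archived, and the member names in the
archive are relative to ``project``.

```python
import os
import shutil
import tarfile
import tempfile

workdir = tempfile.mkdtemp()
project = os.path.join(workdir, "project")
os.makedirs(os.path.join(project, "pkg"))
with open(os.path.join(project, "pkg", "module.py"), "w") as f:
    f.write("x = 1\n")
with open(os.path.join(project, "notes.txt"), "w") as f:
    f.write("left out of the archive\n")

# base_name carries no extension; the format adds ".tar.gz" itself.
base_name = os.path.join(workdir, "release")
archive = shutil.make_archive(base_name, "gztar",
                              root_dir=project, base_dir="pkg")

tar = tarfile.open(archive)
names = sorted(tar.getnames())    # member names start with "pkg"
tar.close()

shutil.rmtree(workdir)    # clean up the temporary directory
```

Because *base_dir* is ``pkg``, the file ``notes.txt`` next to it is not part
of the archive.
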

.. function:: get_archive_formats()

   Return a list of supported formats for archiving.
   Each element of the returned sequence is a tuple ``(name, description)``.

   By default :mod:`shutil` provides these formats:

   - *gztar*: gzip'ed tar-file
   - *bztar*: bzip2'ed tar-file
   - *tar*: uncompressed tar file
   - *zip*: ZIP file

   You can register new formats or provide your own archiver for any existing
   formats, by using :func:`register_archive_format`.

   .. versionadded:: 2.7


.. function:: register_archive_format(name, function[, extra_args[, description]])

   Register an archiver for the format *name*. *function* is a callable that
   will be used to invoke the archiver.

   If given, *extra_args* is a sequence of ``(name, value)`` pairs that will be
   used as extra keyword arguments when the archiver callable is used.

   *description* is used by :func:`get_archive_formats` which returns the
   list of archivers. Defaults to an empty string.

   .. versionadded:: 2.7


.. function:: unregister_archive_format(name)

   Remove the archive format *name* from the list of supported formats.

   .. versionadded:: 2.7

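A sketch of registering and later removing a custom format. The archiver
``make_listing``, the format name ``listing`` and the ``suffix`` keyword are
all hypothetical. The sketch assumes the archiver is called with *base_name*
and *base_dir* as positional arguments, with any further options (including
the registered *extra_args*) passed as keywords.

```python
import os
import shutil
import tempfile

def make_listing(base_name, base_dir, suffix=".lst", **kwargs):
    # Hypothetical archiver: instead of building a real archive, write the
    # names of the files found under base_dir to a text file.
    names = []
    for dirpath, dirnames, filenames in os.walk(base_dir):
        for filename in filenames:
            names.append(os.path.join(dirpath, filename))
    listing = base_name + suffix
    with open(listing, "w") as out:
        out.write("\n".join(sorted(names)))
    return listing

shutil.register_archive_format("listing", make_listing,
                               [("suffix", ".txt")],
                               "plain-text file listing")
description = dict(shutil.get_archive_formats())["listing"]

workdir = tempfile.mkdtemp()
root = os.path.join(workdir, "data")
os.mkdir(root)
with open(os.path.join(root, "a.txt"), "w") as f:
    f.write("a")

# The registered extra_args make the archiver use the ".txt" suffix.
result = shutil.make_archive(os.path.join(workdir, "index"), "listing",
                             root_dir=root)
with open(result) as f:
    listed = f.read().splitlines()

shutil.unregister_archive_format("listing")
still_registered = "listing" in dict(shutil.get_archive_formats())

shutil.rmtree(workdir)    # clean up the temporary directory
```
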

Archiving example
:::::::::::::::::

In this example, we create a gzip'ed tar-file archive containing all files
found in the :file:`.ssh` directory of the user::

    >>> from shutil import make_archive
    >>> import os
    >>> archive_name = os.path.expanduser(os.path.join('~', 'myarchive'))
    >>> root_dir = os.path.expanduser(os.path.join('~', '.ssh'))
    >>> make_archive(archive_name, 'gztar', root_dir)
    '/Users/tarek/myarchive.tar.gz'

The resulting archive contains::

    $ tar -tzvf /Users/tarek/myarchive.tar.gz
    drwx------ tarek/staff       0 2010-02-01 16:23:40 ./
    -rw-r--r-- tarek/staff     609 2008-06-09 13:26:54 ./authorized_keys
    -rwxr-xr-x tarek/staff      65 2008-06-09 13:26:54 ./config
    -rwx------ tarek/staff     668 2008-06-09 13:26:54 ./id_dsa
    -rwxr-xr-x tarek/staff     609 2008-06-09 13:26:54 ./id_dsa.pub
    -rw------- tarek/staff    1675 2008-06-09 13:26:54 ./id_rsa
    -rw-r--r-- tarek/staff     397 2008-06-09 13:26:54 ./id_rsa.pub
    -rw-r--r-- tarek/staff   37192 2010-02-06 18:23:10 ./known_hosts
