.. _tarfile-mod:

:mod:`tarfile` --- Read and write tar archive files
===================================================

.. module:: tarfile
   :synopsis: Read and write tar-format archive files.


.. versionadded:: 2.3

.. moduleauthor:: Lars Gustäbel <lars@gustaebel.de>
.. sectionauthor:: Lars Gustäbel <lars@gustaebel.de>


The :mod:`tarfile` module makes it possible to read and create tar archives.
Some facts and figures:

* reads and writes :mod:`gzip` and :mod:`bz2` compressed archives.

* read/write support for the POSIX.1-1988 (ustar) format.

* read/write support for the GNU tar format including *longname* and *longlink*
  extensions, read-only support for the *sparse* extension.

* read/write support for the POSIX.1-2001 (pax) format.

  .. versionadded:: 2.6

* handles directories, regular files, hardlinks, symbolic links, fifos,
  character devices and block devices and is able to acquire and restore file
  information like timestamp, access permissions and owner.

* can handle tape devices.


.. function:: open(name[, mode[, fileobj[, bufsize]]], **kwargs)

   Return a :class:`TarFile` object for the pathname *name*. For detailed
   information on :class:`TarFile` objects and the keyword arguments that are
   allowed, see :ref:`tarfile-objects`.

   *mode* has to be a string of the form ``'filemode[:compression]'``; it defaults
   to ``'r'``. Here is a full list of mode combinations:

   +------------------+---------------------------------------------+
   | mode             | action                                      |
   +==================+=============================================+
   | ``'r' or 'r:*'`` | Open for reading with transparent           |
   |                  | compression (recommended).                  |
   +------------------+---------------------------------------------+
   | ``'r:'``         | Open for reading exclusively without        |
   |                  | compression.                                |
   +------------------+---------------------------------------------+
   | ``'r:gz'``       | Open for reading with gzip compression.     |
   +------------------+---------------------------------------------+
   | ``'r:bz2'``      | Open for reading with bzip2 compression.    |
   +------------------+---------------------------------------------+
   | ``'a' or 'a:'``  | Open for appending with no compression. The |
   |                  | file is created if it does not exist.       |
   +------------------+---------------------------------------------+
   | ``'w' or 'w:'``  | Open for uncompressed writing.              |
   +------------------+---------------------------------------------+
   | ``'w:gz'``       | Open for gzip compressed writing.           |
   +------------------+---------------------------------------------+
   | ``'w:bz2'``      | Open for bzip2 compressed writing.          |
   +------------------+---------------------------------------------+

   Note that ``'a:gz'`` or ``'a:bz2'`` is not possible. If *mode* is not suitable
   to open a certain (compressed) file for reading, :exc:`ReadError` is raised. Use
   *mode* ``'r'`` to avoid this. If a compression method is not supported,
   :exc:`CompressionError` is raised.
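
   As a rough illustration of the difference between ``'r'`` and ``'r:'``, the
   sketch below writes a small gzip compressed archive into a temporary
   directory and then opens it with both modes. The file and variable names are
   invented for the example::

      import os
      import tarfile
      import tempfile

      # Hypothetical scratch files, used only for this illustration.
      workdir = tempfile.mkdtemp()
      payload = os.path.join(workdir, "hello.txt")
      with open(payload, "w") as handle:
          handle.write("hello\n")

      archive = os.path.join(workdir, "sample.tar.gz")
      tar = tarfile.open(archive, "w:gz")
      tar.add(payload, arcname="hello.txt")
      tar.close()

      # 'r' detects the gzip compression on its own.
      tar = tarfile.open(archive, "r")
      names = tar.getnames()
      tar.close()

      # 'r:' insists on an uncompressed archive, so the gzip data is rejected.
      try:
          tarfile.open(archive, "r:")
          rejected = False
      except tarfile.ReadError:
          rejected = True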

   If *fileobj* is specified, it is used as an alternative to a file object opened
   for *name*. It is supposed to be at position 0.

   For special purposes, there is a second format for *mode*:
   ``'filemode|[compression]'``. :func:`open` will return a :class:`TarFile`
   object that processes its data as a stream of blocks. No random seeking will
   be done on the file. If given, *fileobj* may be any object that has a
   :meth:`read` or :meth:`write` method (depending on the *mode*). *bufsize*
   specifies the blocksize and defaults to ``20 * 512`` bytes. Use this variant
   in combination with e.g. ``sys.stdin``, a socket file object or a tape
   device. However, such a :class:`TarFile` object is limited in that it does
   not allow random access, see :ref:`tar-examples`. The currently
   possible modes:

   +-------------+--------------------------------------------+
   | Mode        | Action                                     |
   +=============+============================================+
   | ``'r|*'``   | Open a *stream* of tar blocks for reading  |
   |             | with transparent compression.              |
   +-------------+--------------------------------------------+
   | ``'r|'``    | Open a *stream* of uncompressed tar blocks |
   |             | for reading.                               |
   +-------------+--------------------------------------------+
   | ``'r|gz'``  | Open a gzip compressed *stream* for        |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'r|bz2'`` | Open a bzip2 compressed *stream* for       |
   |             | reading.                                   |
   +-------------+--------------------------------------------+
   | ``'w|'``    | Open an uncompressed *stream* for writing. |
   +-------------+--------------------------------------------+
   | ``'w|gz'``  | Open a gzip compressed *stream* for        |
   |             | writing.                                   |
   +-------------+--------------------------------------------+
   | ``'w|bz2'`` | Open a bzip2 compressed *stream* for       |
   |             | writing.                                   |
   +-------------+--------------------------------------------+
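
   The stream modes can be tried without a pipe or a tape device. The following
   sketch uses an in-memory :class:`io.BytesIO` buffer as a stand-in for such a
   sequential source; the member name and payload are made up. Members are read
   strictly in archive order, which is all a stream permits::

      import io
      import tarfile

      # Write one member to the buffer as a stream of blocks.
      buf = io.BytesIO()
      tar = tarfile.open(mode="w|", fileobj=buf)
      data = b"streamed payload"
      info = tarfile.TarInfo("payload.txt")
      info.size = len(data)
      tar.addfile(info, io.BytesIO(data))
      tar.close()

      # Read the stream back, handling each member as it arrives.
      buf.seek(0)
      tar = tarfile.open(mode="r|", fileobj=buf)
      seen = []
      for member in tar:
          seen.append((member.name, tar.extractfile(member).read()))
      tar.close()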


.. class:: TarFile

   Class for reading and writing tar archives. Do not use this class directly,
   better use :func:`open` instead. See :ref:`tarfile-objects`.


.. function:: is_tarfile(name)

   Return :const:`True` if *name* is a tar archive file that the :mod:`tarfile`
   module can read.
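
   A minimal check, assuming two scratch files created just for the purpose
   (one real archive and one plain text file), might look like this::

      import os
      import tarfile
      import tempfile

      workdir = tempfile.mkdtemp()

      # A small but valid archive with a single empty member.
      archive = os.path.join(workdir, "sample.tar")
      tar = tarfile.open(archive, "w")
      tar.addfile(tarfile.TarInfo("empty.txt"))
      tar.close()

      # An ordinary text file that is not a tar archive.
      plain = os.path.join(workdir, "notes.txt")
      with open(plain, "w") as handle:
          handle.write("just some text\n")

      archive_ok = tarfile.is_tarfile(archive)
      plain_ok = tarfile.is_tarfile(plain)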


.. class:: TarFileCompat(filename[, mode[, compression]])

   Class for limited access to tar archives with a :mod:`zipfile`\ -like interface.
   Please consult the documentation of the :mod:`zipfile` module for more details.
   *compression* must be one of the following constants:


   .. data:: TAR_PLAIN

      Constant for an uncompressed tar archive.


   .. data:: TAR_GZIPPED

      Constant for a :mod:`gzip` compressed tar archive.


.. exception:: TarError

   Base class for all :mod:`tarfile` exceptions.


.. exception:: ReadError

   Is raised when a tar archive is opened that either cannot be handled by the
   :mod:`tarfile` module or is somehow invalid.


.. exception:: CompressionError

   Is raised when a compression method is not supported or when the data cannot be
   decoded properly.


.. exception:: StreamError

   Is raised for the limitations that are typical for stream-like :class:`TarFile`
   objects.


.. exception:: ExtractError

   Is raised for *non-fatal* errors when using :meth:`extract`, but only if
   :attr:`TarFile.errorlevel`\ ``== 2``.


.. exception:: HeaderError

   Is raised by :meth:`frombuf` if the buffer it gets is invalid.

   .. versionadded:: 2.6

Each of the following constants defines a tar archive format that the
:mod:`tarfile` module is able to create. See section :ref:`tar-formats` for
details.


.. data:: USTAR_FORMAT

   POSIX.1-1988 (ustar) format.


.. data:: GNU_FORMAT

   GNU tar format.


.. data:: PAX_FORMAT

   POSIX.1-2001 (pax) format.


.. data:: DEFAULT_FORMAT

   The default format for creating archives. This is currently :const:`GNU_FORMAT`.


.. seealso::

   Module :mod:`zipfile`
      Documentation of the :mod:`zipfile` standard module.

   `GNU tar manual, Basic Tar Format <http://www.gnu.org/software/tar/manual/html_node/tar_134.html#SEC134>`_
      Documentation for tar archive files, including GNU tar extensions.

.. % -----------------
.. % TarFile Objects
.. % -----------------


.. _tarfile-objects:

TarFile Objects
---------------

The :class:`TarFile` object provides an interface to a tar archive. A tar
archive is a sequence of blocks. An archive member (a stored file) is made up of
a header block followed by data blocks. It is possible to store a file in a tar
archive several times. Each archive member is represented by a :class:`TarInfo`
object, see :ref:`tarinfo-objects` for details.


.. class:: TarFile(name=None, mode='r', fileobj=None, format=DEFAULT_FORMAT, tarinfo=TarInfo, dereference=False, ignore_zeros=False, encoding=None, errors=None, pax_headers=None, debug=0, errorlevel=0)

   All following arguments are optional and can be accessed as instance attributes
   as well.

   *name* is the pathname of the archive. It can be omitted if *fileobj* is given.
   In this case, the file object's :attr:`name` attribute is used if it exists.

   *mode* is either ``'r'`` to read from an existing archive, ``'a'`` to append
   data to an existing file or ``'w'`` to create a new file overwriting an existing
   one.

   If *fileobj* is given, it is used for reading or writing data. If it can be
   determined, *mode* is overridden by *fileobj*'s mode. *fileobj* will be used
   from position 0.

   .. note::

      *fileobj* is not closed when :class:`TarFile` is closed.

   *format* controls the archive format. It must be one of the constants
   :const:`USTAR_FORMAT`, :const:`GNU_FORMAT` or :const:`PAX_FORMAT` that are
   defined at module level.

   .. versionadded:: 2.6

   The *tarinfo* argument can be used to replace the default :class:`TarInfo` class
   with a different one.

   .. versionadded:: 2.6

   If *dereference* is ``False``, add symbolic and hard links to the archive. If it
   is ``True``, add the content of the target files to the archive. This has no
   effect on systems that do not support symbolic links.

   If *ignore_zeros* is ``False``, treat an empty block as the end of the archive.
   If it is ``True``, skip empty (and invalid) blocks and try to get as many members
   as possible. This is only useful for reading concatenated or damaged archives.
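
   The effect of *ignore_zeros* on concatenated archives can be sketched as
   follows. The helper ``single_member_archive()`` and the member names are
   hypothetical and exist only for this illustration::

      import io
      import tarfile

      def single_member_archive(name):
          """Return the bytes of an archive holding one empty member."""
          buf = io.BytesIO()
          tar = tarfile.open(mode="w", fileobj=buf)
          tar.addfile(tarfile.TarInfo(name))
          tar.close()
          return buf.getvalue()

      # Two complete archives glued together, end-of-archive blocks included.
      joined = single_member_archive("a.txt") + single_member_archive("b.txt")

      # By default, reading stops at the first empty block.
      tar = tarfile.open(fileobj=io.BytesIO(joined), mode="r:")
      default_names = tar.getnames()
      tar.close()

      # With ignore_zeros, the empty blocks are skipped and reading continues.
      tar = tarfile.open(fileobj=io.BytesIO(joined), mode="r:", ignore_zeros=True)
      all_names = tar.getnames()
      tar.close()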

   *debug* can be set from ``0`` (no debug messages) up to ``3`` (all debug
   messages). The messages are written to ``sys.stderr``.

   If *errorlevel* is ``0``, all errors are ignored when using :meth:`extract`.
   Nevertheless, they appear as error messages in the debug output when debugging
   is enabled. If ``1``, all *fatal* errors are raised as :exc:`OSError` or
   :exc:`IOError` exceptions. If ``2``, all *non-fatal* errors are raised as
   :exc:`TarError` exceptions as well.

   The *encoding* and *errors* arguments define the character encoding to be
   used for reading or writing the archive and how conversion errors are going
   to be handled. The default settings will work for most users.
   See section :ref:`tar-unicode` for in-depth information.

   .. versionadded:: 2.6

   The *pax_headers* argument is an optional dictionary of strings which
   will be added as a pax global header if *format* is :const:`PAX_FORMAT`.

   .. versionadded:: 2.6
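
   As a small sketch of the *pax_headers* argument, the following writes a pax
   archive with one global header entry into a memory buffer and reads the
   entry back through the :attr:`pax_headers` attribute. The key ``comment``
   and its value are arbitrary choices made for the example::

      import io
      import tarfile

      buf = io.BytesIO()
      tar = tarfile.open(mode="w", fileobj=buf, format=tarfile.PAX_FORMAT,
                         pax_headers={"comment": "built by example"})
      tar.addfile(tarfile.TarInfo("empty.txt"))
      tar.close()

      # The global header is picked up while the archive is being read.
      buf.seek(0)
      tar = tarfile.open(fileobj=buf, mode="r:")
      global_headers = dict(tar.pax_headers)
      tar.close()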


.. method:: TarFile.open(...)

   Alternative constructor. The :func:`open` function on module level is actually a
   shortcut to this classmethod. See section :ref:`tarfile-mod` for details.


.. method:: TarFile.getmember(name)

   Return a :class:`TarInfo` object for member *name*. If *name* cannot be found
   in the archive, :exc:`KeyError` is raised.

   .. note::

      If a member occurs more than once in the archive, its last occurrence is assumed
      to be the most up-to-date version.
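
   To make the note concrete, this sketch stores the same (made-up) member name
   twice with different contents and then looks it up by name::

      import io
      import tarfile

      buf = io.BytesIO()
      tar = tarfile.open(mode="w", fileobj=buf)
      for content in (b"first", b"second"):
          info = tarfile.TarInfo("notes.txt")
          info.size = len(content)
          tar.addfile(info, io.BytesIO(content))
      tar.close()

      buf.seek(0)
      tar = tarfile.open(fileobj=buf, mode="r:")
      copies = len(tar.getmembers())        # both occurrences are listed
      latest = tar.extractfile(tar.getmember("notes.txt")).read()
      tar.close()

   Here :meth:`getmember` resolves the name to the second occurrence, while
   :meth:`getmembers` still reports both.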


.. method:: TarFile.getmembers()

   Return the members of the archive as a list of :class:`TarInfo` objects. The
   list has the same order as the members in the archive.


.. method:: TarFile.getnames()

   Return the members as a list of their names. It has the same order as the list
   returned by :meth:`getmembers`.


.. method:: TarFile.list(verbose=True)

   Print a table of contents to ``sys.stdout``. If *verbose* is :const:`False`,
   only the names of the members are printed. If it is :const:`True`, output
   similar to that of :program:`ls -l` is produced.


.. method:: TarFile.next()

   Return the next member of the archive as a :class:`TarInfo` object, when
   :class:`TarFile` is opened for reading. Return ``None`` if no more members
   are available.


.. method:: TarFile.extractall([path[, members]])

   Extract all members from the archive to the current working directory or
   directory *path*. If optional *members* is given, it must be a subset of the
   list returned by :meth:`getmembers`. Directory information like owner,
   modification time and permissions are set after all members have been extracted.
   This is done to work around two problems: A directory's modification time is
   reset each time a file is created in it. And, if a directory's permissions do
   not allow writing, extracting files to it will fail.

   .. versionadded:: 2.5
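
   The *members* argument can be used to extract only part of an archive. In
   the sketch below, the archive contents and the ``.txt`` selection rule are
   assumptions made for the example; the subset is taken from
   :meth:`getmembers` as required::

      import io
      import os
      import tarfile
      import tempfile

      # Build a small archive with three members in memory.
      buf = io.BytesIO()
      tar = tarfile.open(mode="w", fileobj=buf)
      for name in ("readme.txt", "data.bin", "notes.txt"):
          content = name.encode("ascii")
          info = tarfile.TarInfo(name)
          info.size = len(content)
          tar.addfile(info, io.BytesIO(content))
      tar.close()

      # Extract only the text files into a fresh directory.
      dest = tempfile.mkdtemp()
      buf.seek(0)
      tar = tarfile.open(fileobj=buf, mode="r:")
      wanted = [m for m in tar.getmembers() if m.name.endswith(".txt")]
      tar.extractall(dest, members=wanted)
      tar.close()

      extracted = sorted(os.listdir(dest))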


.. method:: TarFile.extract(member[, path])

   Extract a member from the archive to the current working directory, using its
   full name. Its file information is extracted as accurately as possible. *member*
   may be a filename or a :class:`TarInfo` object. You can specify a different
   directory using *path*.

   .. note::

      Because the :meth:`extract` method allows random access to a tar archive there
      are some issues you must take care of yourself. See the description for
      :meth:`extractall` above.


.. method:: TarFile.extractfile(member)

   Extract a member from the archive as a file object. *member* may be a filename
   or a :class:`TarInfo` object. If *member* is a regular file, a file-like object
   is returned. If *member* is a link, a file-like object is constructed from the
   link's target. If *member* is none of the above, ``None`` is returned.

   .. note::

      The file-like object is read-only and provides the following methods:
      :meth:`read`, :meth:`readline`, :meth:`readlines`, :meth:`seek`, :meth:`tell`.


.. method:: TarFile.add(name[, arcname[, recursive[, exclude]]])

   Add the file *name* to the archive. *name* may be any type of file (directory,
   fifo, symbolic link, etc.). If given, *arcname* specifies an alternative name
   for the file in the archive. Directories are added recursively by default. This
   can be avoided by setting *recursive* to :const:`False`. If *exclude* is given
   it must be a function that takes one filename argument and returns a boolean
   value. Depending on this value the respective file is either excluded
   (:const:`True`) or added (:const:`False`).

   .. versionchanged:: 2.6
      Added the *exclude* parameter.
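
   The interplay of *arcname* and *recursive* is sketched below for a
   hypothetical directory ``project`` holding a single file. The helper
   ``names_after_add()`` is invented for the example, and *exclude* is left out
   on purpose::

      import io
      import os
      import tarfile
      import tempfile

      # Hypothetical source tree: one directory holding one file.
      root = tempfile.mkdtemp()
      project = os.path.join(root, "project")
      os.mkdir(project)
      with open(os.path.join(project, "a.txt"), "w") as handle:
          handle.write("payload\n")

      def names_after_add(recursive):
          """Add the directory under a new name and list what was stored."""
          buf = io.BytesIO()
          tar = tarfile.open(mode="w", fileobj=buf)
          tar.add(project, arcname="release-1.0", recursive=recursive)
          names = tar.getnames()
          tar.close()
          return names

      deep = names_after_add(True)       # directory entry plus its contents
      shallow = names_after_add(False)   # directory entry only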


.. method:: TarFile.addfile(tarinfo[, fileobj])

   Add the :class:`TarInfo` object *tarinfo* to the archive. If *fileobj* is given,
   ``tarinfo.size`` bytes are read from it and added to the archive. You can
   create :class:`TarInfo` objects using :meth:`gettarinfo`.

   .. note::

      On Windows platforms, *fileobj* should always be opened with mode ``'rb'`` to
      avoid confusion about the file size.


.. method:: TarFile.gettarinfo([name[, arcname[, fileobj]]])

   Create a :class:`TarInfo` object for either the file *name* or the file object
   *fileobj* (using :func:`os.fstat` on its file descriptor). You can modify some
   of the :class:`TarInfo`'s attributes before you add it using :meth:`addfile`.
   If given, *arcname* specifies an alternative name for the file in the archive.


.. method:: TarFile.close()

   Close the :class:`TarFile`. In write mode, two finishing zero blocks are
   appended to the archive.


.. attribute:: TarFile.posix

   Setting this to :const:`True` is equivalent to setting the :attr:`format`
   attribute to :const:`USTAR_FORMAT`, :const:`False` is equivalent to
   :const:`GNU_FORMAT`.

   .. versionchanged:: 2.4
      *posix* defaults to :const:`False`.

   .. deprecated:: 2.6
      Use the :attr:`format` attribute instead.


.. attribute:: TarFile.pax_headers

   A dictionary containing key-value pairs of pax global headers.

   .. versionadded:: 2.6

.. % -----------------
.. % TarInfo Objects
.. % -----------------


.. _tarinfo-objects:

TarInfo Objects
---------------

A :class:`TarInfo` object represents one member in a :class:`TarFile`. Aside
from storing all required attributes of a file (like file type, size, time,
permissions, owner etc.), it provides some useful methods to determine its type.
It does *not* contain the file's data itself.

:class:`TarInfo` objects are returned by :class:`TarFile`'s methods
:meth:`getmember`, :meth:`getmembers` and :meth:`gettarinfo`.


.. class:: TarInfo([name])

   Create a :class:`TarInfo` object.


.. method:: TarInfo.frombuf(buf)

   Create and return a :class:`TarInfo` object from string buffer *buf*.

   .. versionadded:: 2.6
      Raises :exc:`HeaderError` if the buffer is invalid.


.. method:: TarInfo.fromtarfile(tarfile)

   Read the next member from the :class:`TarFile` object *tarfile* and return it as
   a :class:`TarInfo` object.

   .. versionadded:: 2.6


.. method:: TarInfo.tobuf([format[, encoding [, errors]]])

   Create a string buffer from a :class:`TarInfo` object. For information on the
   arguments see the constructor of the :class:`TarFile` class.

   .. versionchanged:: 2.6
      The arguments were added.

A ``TarInfo`` object has the following public data attributes:


.. attribute:: TarInfo.name

   Name of the archive member.


.. attribute:: TarInfo.size

   Size in bytes.


.. attribute:: TarInfo.mtime

   Time of last modification.


.. attribute:: TarInfo.mode

   Permission bits.


.. attribute:: TarInfo.type

   File type. *type* is usually one of these constants: :const:`REGTYPE`,
   :const:`AREGTYPE`, :const:`LNKTYPE`, :const:`SYMTYPE`, :const:`DIRTYPE`,
   :const:`FIFOTYPE`, :const:`CONTTYPE`, :const:`CHRTYPE`, :const:`BLKTYPE`,
   :const:`GNUTYPE_SPARSE`. To determine the type of a :class:`TarInfo` object
   more conveniently, use the ``is_*()`` methods below.


.. attribute:: TarInfo.linkname

   Name of the target file, which is only present in :class:`TarInfo` objects
   of type :const:`LNKTYPE` and :const:`SYMTYPE`.


.. attribute:: TarInfo.uid

   User ID of the user who originally stored this member.


.. attribute:: TarInfo.gid

   Group ID of the user who originally stored this member.


.. attribute:: TarInfo.uname

   User name.


.. attribute:: TarInfo.gname

   Group name.


.. attribute:: TarInfo.pax_headers

   A dictionary containing key-value pairs of an associated pax extended header.

   .. versionadded:: 2.6

A :class:`TarInfo` object also provides some convenient query methods:


.. method:: TarInfo.isfile()

   Return :const:`True` if the :class:`TarInfo` object is a regular file.


.. method:: TarInfo.isreg()

   Same as :meth:`isfile`.


.. method:: TarInfo.isdir()

   Return :const:`True` if it is a directory.


.. method:: TarInfo.issym()

   Return :const:`True` if it is a symbolic link.


.. method:: TarInfo.islnk()

   Return :const:`True` if it is a hard link.


.. method:: TarInfo.ischr()

   Return :const:`True` if it is a character device.


.. method:: TarInfo.isblk()

   Return :const:`True` if it is a block device.


.. method:: TarInfo.isfifo()

   Return :const:`True` if it is a FIFO.


.. method:: TarInfo.isdev()

   Return :const:`True` if it is one of character device, block device or FIFO.

.. % ------------------------
.. % Examples
.. % ------------------------


.. _tar-examples:

Examples
--------

How to extract an entire tar archive to the current working directory::

   import tarfile
   tar = tarfile.open("sample.tar.gz")
   tar.extractall()
   tar.close()

How to create an uncompressed tar archive from a list of filenames::

   import tarfile
   tar = tarfile.open("sample.tar", "w")
   for name in ["foo", "bar", "quux"]:
       tar.add(name)
   tar.close()

How to read a gzip compressed tar archive and display some member information::

   import tarfile
   tar = tarfile.open("sample.tar.gz", "r:gz")
   for tarinfo in tar:
       # Describe the member type with the TarInfo query methods.
       if tarinfo.isreg():
           kind = "a regular file."
       elif tarinfo.isdir():
           kind = "a directory."
       else:
           kind = "something else."
       print("%s is %d bytes in size and is %s" % (tarinfo.name, tarinfo.size, kind))
   tar.close()

How to create a tar archive with faked information::

   import tarfile
   namelist = ["foo", "bar", "quux"]   # placeholder names of existing files
   tar = tarfile.open("sample.tar.gz", "w:gz")
   for name in namelist:
       tarinfo = tar.gettarinfo(name, "fakeproj-1.0/" + name)
       tarinfo.uid = 123
       tarinfo.gid = 456
       tarinfo.uname = "johndoe"
       tarinfo.gname = "fake"
       # Binary mode makes sure exactly tarinfo.size bytes can be read.
       fileobj = open(name, "rb")
       tar.addfile(tarinfo, fileobj)
       fileobj.close()
   tar.close()

The *only* way to extract an uncompressed tar stream from ``sys.stdin``::

   import sys
   import tarfile
   # On Python 3, pass sys.stdin.buffer so that the stream yields bytes.
   tar = tarfile.open(mode="r|", fileobj=sys.stdin)
   for tarinfo in tar:
       tar.extract(tarinfo)
   tar.close()

.. % ------------
.. % Tar format
.. % ------------


.. _tar-formats:

Supported tar formats
---------------------

There are three tar formats that can be created with the :mod:`tarfile` module:

* The POSIX.1-1988 ustar format (:const:`USTAR_FORMAT`). It supports filenames
  up to a length of at best 256 characters and linknames up to 100 characters. The
  maximum file size is 8 gigabytes. This is an old and limited but widely
  supported format.

* The GNU tar format (:const:`GNU_FORMAT`). It supports long filenames and
  linknames, files bigger than 8 gigabytes and sparse files. It is the de facto
  standard on GNU/Linux systems. :mod:`tarfile` fully supports the GNU tar
  extensions for long names, sparse file support is read-only.

* The POSIX.1-2001 pax format (:const:`PAX_FORMAT`). It is the most flexible
  format with virtually no limits. It supports long filenames and linknames, large
  files and stores pathnames in a portable way. However, not all tar
  implementations today are able to handle pax archives properly.

  The *pax* format is an extension to the existing *ustar* format. It uses extra
  headers for information that cannot be stored otherwise. There are two flavours
  of pax headers: Extended headers only affect the subsequent file header, global
  headers are valid for the complete archive and affect all following files. All
  the data in a pax header is encoded in *UTF-8* for portability reasons.

There are some more variants of the tar format which can be read, but not
created:

* The ancient V7 format. This is the first tar format from Unix Seventh Edition,
  storing only regular files and directories. Names must not be longer than 100
  characters, there is no user/group name information. Some archives have
  miscalculated header checksums in case of fields with non-ASCII characters.

* The SunOS tar extended format. This format is a variant of the POSIX.1-2001
  pax format, but is not compatible.
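
The filename limit of the ustar format can be observed with a short experiment.
The sketch below tries to store an empty member with a 300 character name (an
arbitrary length chosen to exceed the limit) in each of the three writable
formats; the ``outcome`` dictionary is only a bookkeeping aid for the example::

   import io
   import tarfile

   long_name = "x" * 300   # too long for a ustar header
   outcome = {}
   formats = [("ustar", tarfile.USTAR_FORMAT),
              ("gnu", tarfile.GNU_FORMAT),
              ("pax", tarfile.PAX_FORMAT)]
   for label, fmt in formats:
       buf = io.BytesIO()
       tar = tarfile.open(mode="w", fileobj=buf, format=fmt)
       try:
           tar.addfile(tarfile.TarInfo(long_name))
           outcome[label] = "stored"
       except ValueError:
           # The name does not fit into this format's header.
           outcome[label] = "rejected"
       tar.close()

Only the ustar attempt is rejected; the GNU and pax formats store the name by
means of their respective extensions.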

.. % ----------------
.. % Unicode issues
.. % ----------------


.. _tar-unicode:

Unicode issues
--------------

The tar format was originally conceived to make backups on tape drives with the
main focus on preserving file system information. Nowadays tar archives are
commonly used for file distribution and exchanging archives over networks. One
problem of the original format (which is the basis of all other formats) is
that there is no concept of supporting different character encodings. For
example, an ordinary tar archive created on a *UTF-8* system cannot be read
correctly on a *Latin-1* system if it contains non-*ASCII* characters. Textual
metadata (like filenames, linknames, user/group names) will appear damaged.
Unfortunately, there is no way to autodetect the encoding of an archive. The
pax format was designed to solve this problem. It stores non-ASCII metadata
using the universal character encoding *UTF-8*.

The details of character conversion in :mod:`tarfile` are controlled by the
*encoding* and *errors* keyword arguments of the :class:`TarFile` class.

*encoding* defines the character encoding to use for the metadata in the
archive. The default value is :func:`sys.getfilesystemencoding` or ``'ascii'``
as a fallback. Depending on whether the archive is read or written, the
metadata must be either decoded or encoded. If *encoding* is not set
appropriately, this conversion may fail.

The *errors* argument defines how characters are treated that cannot be
converted. Possible values are listed in section :ref:`codec-base-classes`. In
read mode the default scheme is ``'replace'``. This avoids unexpected
:exc:`UnicodeError` exceptions and guarantees that an archive can always be
read. In write mode the default value for *errors* is ``'strict'``. This
ensures that name information is not altered unnoticed.

In case of writing :const:`PAX_FORMAT` archives, *encoding* is ignored because
non-ASCII metadata is stored using *UTF-8*.
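
The following sketch illustrates these points. It assumes a Python 3
interpreter, where member names are text strings, and passes
``errors='replace'`` explicitly when reading. The helper ``roundtrip()`` and
the member name are invented for the example::

   import io
   import tarfile

   name = "\u00e4rger.txt"   # a name with one non-ASCII character

   def roundtrip(write_format, write_encoding, read_encoding):
       """Write one member, read the archive back, return the stored name."""
       buf = io.BytesIO()
       tar = tarfile.open(mode="w", fileobj=buf, format=write_format,
                          encoding=write_encoding)
       tar.addfile(tarfile.TarInfo(name))
       tar.close()
       buf.seek(0)
       tar = tarfile.open(fileobj=buf, mode="r:", encoding=read_encoding,
                          errors="replace")
       stored = tar.getnames()[0]
       tar.close()
       return stored

   # Matching encodings preserve the name in a GNU archive.
   same_encoding = roundtrip(tarfile.GNU_FORMAT, "iso8859-1", "iso8859-1")
   # A mismatch damages the name instead of raising an exception.
   wrong_encoding = roundtrip(tarfile.GNU_FORMAT, "iso8859-1", "utf-8")
   # A pax archive carries the name in UTF-8, whatever *encoding* says.
   pax_any_encoding = roundtrip(tarfile.PAX_FORMAT, "ascii", "ascii")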