Issue #8633: Support for POSIX.1-2008 binary pax headers.

tarfile is now able to read and write pax headers with a
"hdrcharset=BINARY" record. This record was introduced in
POSIX.1-2008 as a way to store unencoded binary strings that
cannot be translated to UTF-8. In practice, it is a workaround
that lets a tar implementation store filenames that are not valid
in the current filesystem encoding and thus cannot be decoded
correctly.

Additionally, tarfile works around a bug in current versions of GNU
tar: undecodable filenames are stored as-is in a pax header without a
"hdrcharset" record being added. Technically, these headers are
invalid, but tarfile reads them correctly anyway.
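The round trip described above can be sketched as follows. This is a minimal illustration rather than part of the patch: the filename `caf\xe9.txt` is a hypothetical example, and the sketch assumes the archive is opened with `encoding="utf-8"` and `errors="surrogateescape"` so that the undecodable byte survives as a surrogate in the Python string.

```python
import io
import tarfile

# Hypothetical example name: the byte 0xE9 is not valid UTF-8, so it is
# decoded with surrogateescape, the way Python represents undecodable names.
raw_name = b"caf\xe9.txt"
name = raw_name.decode("utf-8", "surrogateescape")

payload = b"hello"
buf = io.BytesIO()

# Write a pax archive. The name cannot be encoded as strict UTF-8, so the
# pax header is expected to carry a "hdrcharset=BINARY" record.
with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT,
                  encoding="utf-8", errors="surrogateescape") as tar:
    info = tarfile.TarInfo(name)
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

archive = buf.getvalue()

# Read the archive back with the same encoding settings.
with tarfile.open(fileobj=io.BytesIO(archive), mode="r",
                  encoding="utf-8", errors="surrogateescape") as tar:
    read_name = tar.getmembers()[0].name

# The raw archive contains the binary marker, and the original bytes of
# the filename can be recovered from the name that was read.
has_binary_record = b"hdrcharset=BINARY" in archive
restored = read_name.encode("utf-8", "surrogateescape")
```

Reading with a different `encoding` than the one used for writing would decode the raw bytes differently, so the settings should match on both sides.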
diff --git a/Misc/NEWS b/Misc/NEWS
index 1df122b..ee19f36 100644
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -366,6 +366,9 @@
 Library
 -------
 
+- Issue #8633: tarfile is now able to read and write archives with "raw" binary
+  pax headers as described in POSIX.1-2008.
+
 - Issue #1285086: Speed up urllib.parse functions: quote, quote_from_bytes,
   unquote, unquote_to_bytes.
 