SF bug #1582282; decode_header() incorrectly splits not-conformant RFC
2047-like headers where there is no whitespace between encoded words. This
fix changes the matching regexp to include a trailing lookahead assertion that
the closing ?= must be followed by whitespace, newline, or end-of-string.
This also changes the regexp to add the MULTILINE flag.
diff --git a/Lib/email/header.py b/Lib/email/header.py
index 183c337..e139ccf 100644
--- a/Lib/email/header.py
+++ b/Lib/email/header.py
@@ -39,7 +39,8 @@
\? # literal ?
(?P<encoded>.*?) # non-greedy up to the next ?= is the encoded string
\?= # literal ?=
- ''', re.VERBOSE | re.IGNORECASE)
+ (?=[ \t]|$) # whitespace or the end of the string
+ ''', re.VERBOSE | re.IGNORECASE | re.MULTILINE)
# Field name regexp, including trailing colon, but not separating whitespace,
# according to RFC 2822. Character range is from tilde to exclamation mark.