bpo-34454: Clean up datetime.fromisoformat surrogate handling (GH-8959)
* Use _PyUnicode_Copy in _sanitize_isoformat_str
* Use repr in fromisoformat error message
This reverses commit 67b74a98b2 per Serhiy Storchaka's suggestion:
I suggested to use %R in the error message because including the raw
string can be confusing in the case of empty string, or string
containing trailing whitespaces, invisible or unprintable characters.
We agree that it is better to change both the C and pure Python versions
to use repr.
* Retain non-sanitized dtstr for error printing
This does not create an extra string; it only holds a reference to
the original input string for use in the error message.
* PEP 7 fixes to from_isoformat
* Separate handling of Unicode and other errors
In the initial implementation, errors other than encoding errors would
also raise an error indicating an invalid format, which is misleading
for errors such as MemoryError.
* Drop needs_decref from _sanitize_isoformat_str
Instead, _sanitize_isoformat_str always returns a new reference, even
when it returns the original string unchanged.
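
For illustration only, the points above can be sketched in pure Python.
This is not the C code touched by this change: sanitize_separator and
parse_iso_sketch are hypothetical names, and the surrogate check is an
assumption about what sanitizing involves. The sketch parses a sanitized
copy, reports the original input with repr, and wraps only ValueError so
that unrelated errors propagate unchanged.

```python
from datetime import datetime


def sanitize_separator(dtstr):
    # Hypothetical helper: if the date/time separator (index 10) is a
    # lone surrogate, swap it for 'T'; otherwise return the input as-is.
    if len(dtstr) > 10 and 0xD800 <= ord(dtstr[10]) <= 0xDFFF:
        return dtstr[:10] + "T" + dtstr[11:]
    return dtstr


def parse_iso_sketch(dtstr):
    # Parse the sanitized copy, but keep the original string around so
    # the error message shows exactly what the caller passed in.
    sanitized = sanitize_separator(dtstr)
    try:
        return datetime.fromisoformat(sanitized)
    except ValueError:
        # Only format errors are rewrapped; anything else (for example
        # MemoryError) is not caught here and propagates as raised.
        # repr makes empty strings and trailing whitespace visible.
        raise ValueError(f"Invalid isoformat string: {dtstr!r}") from None


if __name__ == "__main__":
    print(parse_iso_sketch("2018-08-29\ud80012:30:00"))
    for bad in ("", "not a date "):
        try:
            parse_iso_sketch(bad)
        except ValueError as exc:
            print(exc)
```

With %s, the empty string and the trailing space in the two failing
inputs would be invisible in the message; with repr they appear inside
quotes.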
(cherry picked from commit 3df85404d4bf420db3362eeae1345f2cad948a71)
Co-authored-by: Paul Ganssle <pganssle@users.noreply.github.com>
diff --git a/Lib/datetime.py b/Lib/datetime.py
index 12a0f14..8bffbef 100644
--- a/Lib/datetime.py
+++ b/Lib/datetime.py
@@ -857,7 +857,7 @@
assert len(date_string) == 10
return cls(*_parse_isoformat_date(date_string))
except Exception:
- raise ValueError('Invalid isoformat string: %s' % date_string)
+ raise ValueError(f'Invalid isoformat string: {date_string!r}')
# Conversions to string
@@ -1369,7 +1369,7 @@
try:
return cls(*_parse_isoformat_time(time_string))
except Exception:
- raise ValueError('Invalid isoformat string: %s' % time_string)
+ raise ValueError(f'Invalid isoformat string: {time_string!r}')
def strftime(self, fmt):
@@ -1646,13 +1646,13 @@
try:
date_components = _parse_isoformat_date(dstr)
except ValueError:
- raise ValueError('Invalid isoformat string: %s' % date_string)
+ raise ValueError(f'Invalid isoformat string: {date_string!r}')
if tstr:
try:
time_components = _parse_isoformat_time(tstr)
except ValueError:
- raise ValueError('Invalid isoformat string: %s' % date_string)
+ raise ValueError(f'Invalid isoformat string: {date_string!r}')
else:
time_components = [0, 0, 0, 0, None]