bpo-40612: Fix SyntaxError edge cases in traceback formatting (GH-20072)



This fixes both the traceback.py module and the C code for formatting syntax errors (in Python/pythonrun.c). They now both consistently do the following:

- Suppress caret if it points left of text
- Allow caret pointing just past end of line
- If caret points past end of line, clip to *just* past end of line

The syntax error formatting code in traceback.py was mostly rewritten; small, subtle changes were applied to the C code in pythonrun.c.

There's still a difference when the text contains embedded newlines. Neither handles these very well, and I don't think the case occurs in practice.

Automerge-Triggered-By: @gvanrossum
diff --git a/Lib/traceback.py b/Lib/traceback.py
index bf34bba..a19e387 100644
--- a/Lib/traceback.py
+++ b/Lib/traceback.py
@@ -569,23 +569,30 @@
 
         if not issubclass(self.exc_type, SyntaxError):
             yield _format_final_exc_line(stype, self._str)
-            return
+        else:
+            yield from self._format_syntax_error(stype)
 
-        # It was a syntax error; show exactly where the problem was found.
+    def _format_syntax_error(self, stype):
+        """Format SyntaxError exceptions (internal helper)."""
+        # Show exactly where the problem was found.
         filename = self.filename or "<string>"
         lineno = str(self.lineno) or '?'
         yield '  File "{}", line {}\n'.format(filename, lineno)
 
-        badline = self.text
-        offset = self.offset
-        if badline is not None:
-            yield '    {}\n'.format(badline.strip())
-            if offset is not None:
-                caretspace = badline.rstrip('\n')
-                offset = min(len(caretspace), offset) - 1
-                caretspace = caretspace[:offset].lstrip()
+        text = self.text
+        if text is not None:
+            # text  = "   foo\n"
+            # rtext = "   foo"
+            # ltext =    "foo"
+            rtext = text.rstrip('\n')
+            ltext = rtext.lstrip(' \n\f')
+            spaces = len(rtext) - len(ltext)
+            yield '    {}\n'.format(ltext)
+            # Convert 1-based column offset to 0-based index into stripped text
+            caret = (self.offset or 0) - 1 - spaces
+            if caret >= 0:
                 # non-space whitespace (likes tabs) must be kept for alignment
-                caretspace = ((c.isspace() and c or ' ') for c in caretspace)
+                caretspace = ((c if c.isspace() else ' ') for c in ltext[:caret])
                 yield '    {}^\n'.format(''.join(caretspace))
         msg = self.msg or "<no detail available>"
         yield "{}: {}\n".format(stype, msg)