r"""Utilities to compile possibly incomplete Python source code.

This module provides two interfaces, broadly similar to the builtin
function compile(), which take program text, a filename and a 'mode'
and:

- Return a code object if the command is complete and valid
- Return None if the command is incomplete
- Raise SyntaxError, ValueError or OverflowError if the command is a
  syntax error (OverflowError and ValueError can be produced by
  malformed literals).

Approach:

First, check if the source consists entirely of blank lines and
comments; if so, replace it with 'pass', because the built-in
parser doesn't always do the right thing for these.

Compile three times: as is, with \n, and with \n\n appended. If it
compiles as is, it's complete. If it compiles with one \n appended,
we expect more. If it doesn't compile either way, we compare the
error we get when compiling with \n or \n\n appended. If the errors
are the same, the code is broken. But if the errors are different, we
expect more. Not intuitive; not even guaranteed to hold in future
releases; but this matches the compiler's behavior from Python 1.4
through 2.2, at least.

Caveat:

It is possible (but not likely) that the parser stops parsing with a
successful outcome before reaching the end of the source; in this
case, trailing symbols may be ignored instead of causing an error.
For example, a backslash followed by two newlines may be followed by
arbitrary garbage. This will be fixed once the API for the parser is
better.

The two interfaces are:

compile_command(source, filename, symbol):

    Compiles a single command in the manner described above.

CommandCompiler():

    Instances of this class have __call__ methods identical in
    signature to compile_command; the difference is that if the
    instance compiles program text containing a __future__ statement,
    the instance 'remembers' and compiles all subsequent program texts
    with the statement in force.

The module also provides another class:

Compile():

    Instances of this class act like the built-in function compile,
    but with 'memory' in the sense described above.
"""

import __future__
import warnings

_features = [getattr(__future__, fname)
             for fname in __future__.all_feature_names]

__all__ = ["compile_command", "Compile", "CommandCompiler"]

PyCF_DONT_IMPLY_DEDENT = 0x200          # Matches pythonrun.h

def _maybe_compile(compiler, source, filename, symbol):
    # Check for source consisting of only blank lines and comments
    for line in source.split("\n"):
        line = line.strip()
        if line and line[0] != '#':
            break               # Leave it alone
    else:
        if symbol != "eval":
            source = "pass"     # Replace it with a 'pass' statement

    err = err1 = err2 = None
    code = code1 = code2 = None

    try:
        code = compiler(source, filename, symbol)
    except SyntaxError:
        pass

    # Catch syntax warnings after the first compile
    # to emit warnings (SyntaxWarning, DeprecationWarning) at most once.
    with warnings.catch_warnings():
        warnings.simplefilter("error")

        try:
            code1 = compiler(source + "\n", filename, symbol)
        except SyntaxError as e:
            err1 = e

        try:
            code2 = compiler(source + "\n\n", filename, symbol)
        except SyntaxError as e:
            err2 = e

    try:
        if code:
            return code
        if not code1 and _is_syntax_error(err1, err2):
            raise err1
        # Otherwise fall through and return None: the command is incomplete.
    finally:
        # Drop the exception references so they do not outlive this call.
        err1 = err2 = None

def _is_syntax_error(err1, err2):
    rep1 = repr(err1)
    rep2 = repr(err2)
    # An unclosed bracket is treated as incomplete input, not as an error.
    if "was never closed" in rep1 and "was never closed" in rep2:
        return False
    # The same error with one and with two newlines appended: broken code.
    return rep1 == rep2

def _compile(source, filename, symbol):
    return compile(source, filename, symbol, PyCF_DONT_IMPLY_DEDENT)

def compile_command(source, filename="<input>", symbol="single"):
    r"""Compile a command and determine whether it is incomplete.

    Arguments:

    source -- the source string; may contain \n characters
    filename -- optional filename from which source was read; default
                "<input>"
    symbol -- optional grammar start symbol; "single" (default), "exec"
              or "eval"

    Return value / exceptions raised:

    - Return a code object if the command is complete and valid
    - Return None if the command is incomplete
    - Raise SyntaxError, ValueError or OverflowError if the command is a
      syntax error (OverflowError and ValueError can be produced by
      malformed literals).
    """
    return _maybe_compile(_compile, source, filename, symbol)
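
# Illustrative sketch, not part of the original module: the three outcomes
# documented for compile_command, seen from a caller's side. The helper name
# _classify_command is hypothetical. It goes through the installed
# standard-library codeop module (assumed to expose the same
# compile_command API as this file) so that it also runs on its own.
def _classify_command(source):
    """Return "complete", "incomplete" or "invalid" for a source string."""
    import codeop  # assumption: stdlib codeop mirrors this file's API

    try:
        code = codeop.compile_command(source)
    except (SyntaxError, ValueError, OverflowError):
        # The command can never become valid by appending more lines.
        return "invalid"
    # None signals that more input is expected, e.g. after "if x:".
    return "incomplete" if code is None else "complete"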

class Compile:
    """Instances of this class behave much like the built-in compile
    function, but if one is used to compile text containing a future
    statement, it "remembers" and compiles all subsequent program texts
    with the statement in force."""
    def __init__(self):
        self.flags = PyCF_DONT_IMPLY_DEDENT

    def __call__(self, source, filename, symbol):
        codeob = compile(source, filename, symbol, self.flags, True)
        for feature in _features:
            if codeob.co_flags & feature.compiler_flag:
                self.flags |= feature.compiler_flag
        return codeob

class CommandCompiler:
    """Instances of this class have __call__ methods identical in
    signature to compile_command; the difference is that if the
    instance compiles program text containing a __future__ statement,
    the instance 'remembers' and compiles all subsequent program texts
    with the statement in force."""

    def __init__(self):
        self.compiler = Compile()

    def __call__(self, source, filename="<input>", symbol="single"):
        r"""Compile a command and determine whether it is incomplete.

        Arguments:

        source -- the source string; may contain \n characters
        filename -- optional filename from which source was read;
                    default "<input>"
        symbol -- optional grammar start symbol; "single" (default),
                  "exec" or "eval"

        Return value / exceptions raised:

        - Return a code object if the command is complete and valid
        - Return None if the command is incomplete
        - Raise SyntaxError, ValueError or OverflowError if the command is a
          syntax error (OverflowError and ValueError can be produced by
          malformed literals).
        """
        return _maybe_compile(self.compiler, source, filename, symbol)
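
# Illustrative sketch, not part of the original module: how a CommandCompiler
# "remembers" a __future__ statement. The helper name _remembers_future is
# hypothetical. It inspects the flags attribute of the wrapped Compile
# instance (the compiler attribute set in __init__) and imports the installed
# stdlib codeop, assumed to match this file, so that it also runs on its own.
def _remembers_future():
    """Return True if a future statement sticks to a CommandCompiler."""
    import __future__
    import codeop  # assumption: stdlib codeop mirrors this file's API

    flag = __future__.annotations.compiler_flag
    compiler = codeop.CommandCompiler()
    # A fresh instance has not seen any future statement yet.
    before = bool(compiler.compiler.flags & flag)
    compiler("from __future__ import annotations")
    # The feature's compiler flag is now applied to all later compiles.
    after = bool(compiler.compiler.flags & flag)
    return (not before) and after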