closes bpo-31650: PEP 552 (Deterministic pycs) implementation (#4575)
Python now supports checking bytecode cache up-to-dateness with a hash of the
source contents rather than volatile source metadata. See the PEP for details.
While a fairly straightforward idea, quite a lot of code had to be modified due
to the pervasiveness of pyc implementation details in the codebase. Changes in
this commit include:
- The core changes to importlib to understand how to read, validate, and
regenerate hash-based pycs.
- Support for generating hash-based pycs in py_compile and compileall.
- Modifications to our siphash implementation to support passing a custom
key. We then expose it to importlib through _imp.
- Updates to all places in the interpreter, standard library, and tests that
manually generate or parse pyc files to grok the new format.
- Support in the interpreter command line code for long options like
--check-hash-based-pycs.
- Tests and documentation for all of the above.
diff --git a/Lib/importlib/util.py b/Lib/importlib/util.py
index 41c74d4..9d0a90d 100644
--- a/Lib/importlib/util.py
+++ b/Lib/importlib/util.py
@@ -5,18 +5,25 @@
from ._bootstrap import spec_from_loader
from ._bootstrap import _find_spec
from ._bootstrap_external import MAGIC_NUMBER
+from ._bootstrap_external import _RAW_MAGIC_NUMBER
from ._bootstrap_external import cache_from_source
from ._bootstrap_external import decode_source
from ._bootstrap_external import source_from_cache
from ._bootstrap_external import spec_from_file_location
from contextlib import contextmanager
+import _imp
import functools
import sys
import types
import warnings
+def source_hash(source_bytes):
+ "Return the hash of *source_bytes* as used in hash-based pyc files."
+ return _imp.source_hash(_RAW_MAGIC_NUMBER, source_bytes)
+
+
def resolve_name(name, package):
"""Resolve a relative module name to an absolute one."""
if not name.startswith('.'):