.. _importsystem:

*****************
The import system
*****************

.. index:: single: import machinery

Python code in one :term:`module` gains access to the code in another module
by the process of :term:`importing` it. The :keyword:`import` statement is
the most common way of invoking the import machinery, but it is not the only
way. Functions such as :func:`importlib.import_module` and built-in
:func:`__import__` can also be used to invoke the import machinery.

The :keyword:`import` statement combines two operations; it searches for the
named module, then it binds the results of that search to a name in the local
scope. The search operation of the :keyword:`import` statement is defined as
a call to the :func:`__import__` function, with the appropriate arguments.
The return value of :func:`__import__` is used to perform the name
binding operation of the :keyword:`import` statement. See the
:keyword:`import` statement for the exact details of that name binding
operation.

A direct call to :func:`__import__` performs only the module search and, if
found, the module creation operation. While certain side-effects may occur,
such as the importing of parent packages, and the updating of various caches
(including :data:`sys.modules`), only the :keyword:`import` statement performs
a name binding operation.
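
The difference between the two function-level entry points can be seen in a
short sketch. It uses the standard library's ``email.mime.text`` module purely
as a convenient dotted name; nothing here binds a name the way the
:keyword:`import` statement would.

```python
import importlib

# importlib.import_module returns the module named by the full dotted path.
leaf = importlib.import_module("email.mime.text")

# A bare __import__ call with a dotted name returns the top-level package;
# the submodule is imported as a side effect and cached in sys.modules.
top = __import__("email.mime.text")

print(leaf.__name__)  # email.mime.text
print(top.__name__)   # email
```

In both cases the caller has to assign the returned object to a name itself.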

When calling :func:`__import__` as part of an import statement, the
import system first checks the module global namespace for a function by
that name. If it is not found, then the standard builtin :func:`__import__`
is called. Other mechanisms for invoking the import system (such as
:func:`importlib.import_module`) do not perform this check and will always
use the standard import system.

When a module is first imported, Python searches for the module and if found,
it creates a module object [#fnmo]_, initializing it. If the named module
cannot be found, an :exc:`ImportError` is raised. Python implements various
strategies to search for the named module when the import machinery is
invoked. These strategies can be modified and extended by using various hooks
described in the sections below.


:mod:`importlib`
================

The :mod:`importlib` module provides a rich API for interacting with the
import system. For example, :func:`importlib.import_module` provides a
recommended, simpler API than built-in :func:`__import__` for invoking the
import machinery. Refer to the :mod:`importlib` library documentation for
additional detail.


Packages
========

.. index::
    single: package

Python has only one type of module object, and all modules are of this type,
regardless of whether the module is implemented in Python, C, or something
else. To help organize modules and provide a naming hierarchy, Python has a
concept of :term:`packages <package>`.

You can think of packages as the directories on a file system and modules as
files within directories, but don't take this analogy too literally since
packages and modules need not originate from the file system. For the
purposes of this documentation, we'll use this convenient analogy of
directories and files. Like file system directories, packages are organized
hierarchically, and packages may themselves contain subpackages, as well as
regular modules.

It's important to keep in mind that all packages are modules, but not all
modules are packages. Or put another way, packages are just a special kind of
module. Specifically, any module that contains a ``__path__`` attribute is
considered a package.

All modules have a name. Subpackage names are separated from their parent
package name by dots, akin to Python's standard attribute access syntax. Thus
you might have a module called :mod:`sys` and a package called :mod:`email`,
which in turn has a subpackage called :mod:`email.mime` and a module within
that subpackage called :mod:`email.mime.text`.
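
The ``__path__`` rule can be checked directly on the modules named above. The
helper ``is_package`` below is a name made up for this sketch, not part of any
library.

```python
import email
import email.mime
import email.mime.text
import sys


def is_package(module):
    """Return True if the module carries a __path__ attribute."""
    return hasattr(module, "__path__")


print(is_package(sys))              # False: a plain module
print(is_package(email))            # True: a package
print(is_package(email.mime))       # True: a subpackage
print(is_package(email.mime.text))  # False: a module inside the subpackage
```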


Regular packages
----------------

.. index::
    pair: package; regular

Python defines two types of packages, :term:`regular packages <regular
package>` and :term:`namespace packages <namespace package>`. Regular
packages are traditional packages as they existed in Python 3.2 and earlier.
A regular package is typically implemented as a directory containing an
``__init__.py`` file. When a regular package is imported, this
``__init__.py`` file is implicitly executed, and the objects it defines are
bound to names in the package's namespace. The ``__init__.py`` file can
contain the same Python code that any other module can contain, and Python
will add some additional attributes to the module when it is imported.

For example, the following file system layout defines a top level ``parent``
package with three subpackages::

    parent/
        __init__.py
        one/
            __init__.py
        two/
            __init__.py
        three/
            __init__.py

Importing ``parent.one`` will implicitly execute ``parent/__init__.py`` and
``parent/one/__init__.py``. Subsequent imports of ``parent.two`` or
``parent.three`` will execute ``parent/two/__init__.py`` and
``parent/three/__init__.py`` respectively.
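
The behaviour described above can be reproduced with a throwaway copy of that
layout. This is only a sketch: the package is called ``demo_parent`` (a
made-up name, chosen to avoid clashing with anything installed), it lives in a
temporary directory, and each ``__init__.py`` merely records its own name.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build the layout in a temporary directory. "demo_parent" is a made-up name.
root = Path(tempfile.mkdtemp())
package_dir = root / "demo_parent"
for sub in ("one", "two", "three"):
    (package_dir / sub).mkdir(parents=True)
    (package_dir / sub / "__init__.py").write_text("MARKER = __name__\n")
(package_dir / "__init__.py").write_text("MARKER = __name__\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()


def loaded():
    """Names of the demo package's modules currently in sys.modules."""
    return sorted(name for name in sys.modules if name.startswith("demo_parent"))


importlib.import_module("demo_parent.one")
after_one = loaded()   # the parent and demo_parent.one have been executed

importlib.import_module("demo_parent.two")
after_two = loaded()   # demo_parent.two is added; three is still untouched

print(after_one)
print(after_two)
```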


Namespace packages
------------------

.. index::
    pair: package; namespace
    pair: package; portion

A namespace package is a composite of various :term:`portions <portion>`,
where each portion contributes a subpackage to the parent package. Portions
may reside in different locations on the file system. Portions may also be
found in zip files, on the network, or anywhere else that Python searches
during import. Namespace packages may or may not correspond directly to
objects on the file system; they may be virtual modules that have no concrete
representation.

Namespace packages do not use an ordinary list for their ``__path__``
attribute. They instead use a custom iterable type which will automatically
perform a new search for package portions on the next import attempt within
that package if the path of their parent package (or :data:`sys.path` for a
top level package) changes.

With namespace packages, there is no ``parent/__init__.py`` file. In fact,
there may be multiple ``parent`` directories found during import search, where
each one is provided by a different portion. Thus ``parent/one`` may not be
physically located next to ``parent/two``. In this case, Python will create a
namespace package for the top-level ``parent`` package whenever it or one of
its subpackages is imported.

See also :pep:`420` for the namespace package specification.
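
A minimal sketch of two portions living in unrelated directories follows. The
package name ``demo_ns`` and the temporary directories are assumptions made
for the example; for brevity each portion contributes a plain module rather
than a subpackage.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Two unrelated directories each hold one portion of "demo_ns" (a made-up
# name). Neither portion contains an __init__.py file.
for module_name in ("one", "two"):
    location = Path(tempfile.mkdtemp())
    (location / "demo_ns").mkdir()
    (location / "demo_ns" / f"{module_name}.py").write_text("VALUE = __name__\n")
    sys.path.append(str(location))

importlib.invalidate_caches()
one = importlib.import_module("demo_ns.one")
two = importlib.import_module("demo_ns.two")
namespace = sys.modules["demo_ns"]

# __path__ is a custom iterable, so convert it to a list before inspecting it.
search_path = list(namespace.__path__)
print(len(search_path))                  # 2: one entry per portion
print(type(namespace.__path__) is list)  # False
```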


Searching
=========

To begin the search, Python needs the :term:`fully qualified <qualified name>`
name of the module (or package, but for the purposes of this discussion, the
difference is immaterial) being imported. This name may come from various
arguments to the :keyword:`import` statement, or from the parameters to the
:func:`importlib.import_module` or :func:`__import__` functions.

This name will be used in various phases of the import search, and it may be
the dotted path to a submodule, e.g. ``foo.bar.baz``. In this case, Python
first tries to import ``foo``, then ``foo.bar``, and finally ``foo.bar.baz``.
If any of the intermediate imports fail, an :exc:`ImportError` is raised.
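
The step-by-step nature of a dotted import shows up when an intermediate name
is missing. In this sketch ``email`` is the real standard library package,
while ``no_such_subpackage`` and ``leaf`` are made-up names that do not exist.

```python
import importlib
import sys

# "email" exists, but "email.no_such_subpackage" does not, so the search
# stops at the second step and never looks for the leaf module.
try:
    importlib.import_module("email.no_such_subpackage.leaf")
except ImportError as exc:
    failed_name = exc.name
else:
    failed_name = None

print(failed_name)             # email.no_such_subpackage
print("email" in sys.modules)  # True: the first step succeeded
```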


The module cache
----------------

.. index::
    single: sys.modules

The first place checked during import search is :data:`sys.modules`. This
mapping serves as a cache of all modules that have been previously imported,
including the intermediate paths. So if ``foo.bar.baz`` was previously
imported, :data:`sys.modules` will contain entries for ``foo``, ``foo.bar``,
and ``foo.bar.baz``. Each key will have as its value the corresponding module
object.

During import, the module name is looked up in :data:`sys.modules` and if
present, the associated value is the module satisfying the import, and the
process completes. However, if the value is ``None``, then an
:exc:`ImportError` is raised. If the module name is missing, Python will
continue searching for the module.

:data:`sys.modules` is writable. Deleting a key may not destroy the
associated module (as other modules may hold references to it),
but it will invalidate the cache entry for the named module, causing
Python to search anew for the named module upon its next
import. The key can also be assigned to ``None``, forcing the next import
of the module to result in an :exc:`ImportError`.

Beware, though: if you keep a reference to the module object,
invalidate its cache entry in :data:`sys.modules`, and then re-import the
named module, the two module objects will *not* be the same. By contrast,
:func:`imp.reload` will reuse the *same* module object, and simply
reinitialise the module contents by rerunning the module's code.
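
These cache manipulations can be tried on a small standard library module.
The sketch below uses ``colorsys`` only because it is cheap to re-import, and
it calls ``importlib.reload``, which is where the reload function lives on
current Python versions; treat that substitution for :func:`imp.reload` as an
assumption about the interpreter you are running.

```python
import importlib
import sys

first = importlib.import_module("colorsys")

# Dropping the cache entry forces a fresh search and a brand new module object.
del sys.modules["colorsys"]
second = importlib.import_module("colorsys")
print(first is second)      # False

# Reloading re-executes the code but keeps the same module object.
reloaded = importlib.reload(second)
print(reloaded is second)   # True

# A None entry makes the next import fail immediately.
sys.modules["colorsys"] = None
try:
    importlib.import_module("colorsys")
except ImportError:
    blocked = True
else:
    blocked = False
finally:
    sys.modules["colorsys"] = second   # restore the cache for later imports
print(blocked)              # True
```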


Finders and loaders
-------------------

.. index::
    single: finder
    single: loader

If the named module is not found in :data:`sys.modules`, then Python's import
protocol is invoked to find and load the module. This protocol consists of
two conceptual objects, :term:`finders <finder>` and :term:`loaders <loader>`.
A finder's job is to determine whether it can find the named module using
whatever strategy it knows about. Objects that implement both of these
interfaces are referred to as :term:`importers <importer>` - they return
themselves when they find that they can load the requested module.

Python comes with several default finders and importers. One
knows how to locate frozen modules, and another knows how to locate
built-in modules. A third default finder searches an :term:`import path`
for modules. The :term:`import path` is a list of locations that may
name file system paths or zip files. It can also be extended to search
for any locatable resource, such as those identified by URLs.

The import machinery is extensible, so new finders can be added to extend the
range and scope of module searching.

Finders do not actually load modules. If they can find the named module, they
return a :term:`loader`, which the import machinery then invokes to load the
module and create the corresponding module object.

The following sections describe the protocol for finders and loaders in more
detail, including how you can create and register new ones to extend the
import machinery.


Import hooks
------------

.. index::
    single: import hooks
    single: meta hooks
    single: path hooks
    pair: hooks; import
    pair: hooks; meta
    pair: hooks; path

The import machinery is designed to be extensible; the primary mechanism for
this is a set of *import hooks*. There are two types of import hooks: *meta
hooks* and *import path hooks*.

Meta hooks are called at the start of import processing, before any other
import processing has occurred, other than :data:`sys.modules` cache look up.
This allows meta hooks to override :data:`sys.path` processing, frozen
modules, or even built-in modules. Meta hooks are registered by adding new
finder objects to :data:`sys.meta_path`, as described below.

Import path hooks are called as part of :data:`sys.path` (or
``package.__path__``) processing, at the point where their associated path
item is encountered. Import path hooks are registered by adding new callables
to :data:`sys.path_hooks` as described below.


The meta path
-------------

.. index::
    single: sys.meta_path
    pair: finder; find_module
    pair: finder; find_loader

When the named module is not found in :data:`sys.modules`, Python next
searches :data:`sys.meta_path`, which contains a list of meta path finder
objects. These finders are queried in order to see if they know how to handle
the named module. Meta path finders must implement a method called
:meth:`find_module()` which takes two arguments, a name and an import path.
The meta path finder can use any strategy it wants to determine whether it can
handle the named module or not.

If the meta path finder knows how to handle the named module, it returns a
loader object. If it cannot handle the named module, it returns ``None``. If
:data:`sys.meta_path` processing reaches the end of its list without returning
a loader, then an :exc:`ImportError` is raised. Any other exceptions raised
are simply propagated up, aborting the import process.

The :meth:`find_module()` method of meta path finders is called with two
arguments. The first is the fully qualified name of the module being
imported, for example ``foo.bar.baz``. The second argument is the path
entries to use for the module search. For top-level modules, the second
argument is ``None``, but for submodules or subpackages, the second
argument is the value of the parent package's ``__path__`` attribute. If
the appropriate ``__path__`` attribute cannot be accessed, an
:exc:`ImportError` is raised.

The meta path may be traversed multiple times for a single import request.
For example, assuming none of the modules involved has already been cached,
importing ``foo.bar.baz`` will first perform a top level import, calling
``mpf.find_module("foo", None)`` on each meta path finder (``mpf``). After
``foo`` has been imported, ``foo.bar`` will be imported by traversing the
meta path a second time, calling
``mpf.find_module("foo.bar", foo.__path__)``. Once ``foo.bar`` has been
imported, the final traversal will call
``mpf.find_module("foo.bar.baz", foo.bar.__path__)``.
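
The repeated traversal can be observed with a finder that only records what it
is asked. This is a sketch under stated assumptions: ``RecordingFinder`` and
the ``demo_mp`` package are made up for the example, and the finder implements
``find_spec(fullname, path, target=None)``, the method that current Python
versions call on meta path finders in place of the ``find_module()`` method
described here. The name and path arguments have the same meaning in both.

```python
import importlib
import sys
import tempfile
from pathlib import Path


class RecordingFinder:
    """Meta path finder that records each query and never claims a module."""

    def __init__(self, prefix):
        self.prefix = prefix
        self.calls = []

    def find_spec(self, fullname, path, target=None):
        if fullname.startswith(self.prefix):
            self.calls.append((fullname, path))
        return None  # "cannot handle it": let the next finder try


# Hypothetical package demo_mp/sub/leaf.py in a temporary directory.
root = Path(tempfile.mkdtemp())
(root / "demo_mp" / "sub").mkdir(parents=True)
(root / "demo_mp" / "__init__.py").write_text("")
(root / "demo_mp" / "sub" / "__init__.py").write_text("")
(root / "demo_mp" / "sub" / "leaf.py").write_text("")
sys.path.insert(0, str(root))

recorder = RecordingFinder("demo_mp")
sys.meta_path.insert(0, recorder)
try:
    importlib.invalidate_caches()
    importlib.import_module("demo_mp.sub.leaf")
finally:
    sys.meta_path.remove(recorder)

# One traversal per component: None for the top level import, then the
# parent package's __path__ for each nested name.
for name, path in recorder.calls:
    print(name, path)
```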

Some meta path finders only support top level imports. These importers will
always return ``None`` when anything other than ``None`` is passed as the
second argument.

Python's default :data:`sys.meta_path` has three meta path finders, one that
knows how to import built-in modules, one that knows how to import frozen
modules, and one that knows how to import modules from an :term:`import path`
(i.e. the :term:`path importer`).
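
The contents of the meta path can be listed from a running interpreter. On
CPython the three default finders appear as classes named ``BuiltinImporter``,
``FrozenImporter`` and ``PathFinder``; those names are implementation details
rather than part of the protocol, and tools or platforms may register
additional finders. ``finder_name`` is a helper invented for this sketch.

```python
import sys


def finder_name(finder):
    """Best-effort display name for a class or instance on sys.meta_path."""
    return getattr(finder, "__name__", type(finder).__name__)


names = [finder_name(finder) for finder in sys.meta_path]
print(names)
```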


Loaders
=======

If and when a module loader is found, its
:meth:`~importlib.abc.Loader.load_module` method is called, with a single
argument, the fully qualified name of the module being imported. This method
has several responsibilities, and should return the module object it has
loaded [#fnlo]_. If it cannot load the module, it should raise an
:exc:`ImportError`, although any other exception raised during
:meth:`load_module()` will be propagated.

In many cases, the finder and loader can be the same object; in such cases the
:meth:`finder.find_module()` would just return ``self``.

Loaders must satisfy the following requirements:

 * If there is an existing module object with the given name in
   :data:`sys.modules`, the loader must use that existing module. (Otherwise,
   :func:`imp.reload` will not work correctly.) If the named module does
   not exist in :data:`sys.modules`, the loader must create a new module
   object and add it to :data:`sys.modules`.

   Note that the module *must* exist in :data:`sys.modules` before the loader
   executes the module code. This is crucial because the module code may
   (directly or indirectly) import itself; adding it to :data:`sys.modules`
   beforehand prevents unbounded recursion in the worst case and multiple
   loading in the best.

   If loading fails, the loader must remove any modules it has inserted into
   :data:`sys.modules`, but it must remove **only** the failing module, and
   only if the loader itself has loaded it explicitly. Any module already in
   the :data:`sys.modules` cache, and any module that was successfully loaded
   as a side-effect, must remain in the cache.

 * The loader may set the ``__file__`` attribute of the module. If set, this
   attribute's value must be a string. The loader may opt to leave
   ``__file__`` unset if it has no semantic meaning (e.g. a module loaded from
   a database).

 * The loader may set the ``__name__`` attribute of the module. While not
   required, setting this attribute is highly recommended so that the
   :func:`repr` of the module is more informative.

 * If the module is a package (either regular or namespace), the loader must
   set the module object's ``__path__`` attribute. The value must be
   iterable, but may be empty if ``__path__`` has no further significance
   to the importer. If ``__path__`` is not empty, it must produce strings
   when iterated over. More details on the semantics of ``__path__`` are
   given :ref:`below <package-path-rules>`.

 * The ``__loader__`` attribute must be set to the loader object that loaded
   the module. This is mostly for introspection and reloading, but can be
   used for additional importer-specific functionality, for example getting
   data associated with an importer.

 * The module's ``__package__`` attribute should be set. Its value must be a
   string, but it can be the same value as its ``__name__``. If the attribute
   is set to ``None`` or is missing, the import system will fill it in with a
   more appropriate value. When the module is a package, its ``__package__``
   value should be set to its ``__name__``. When the module is not a package,
   ``__package__`` should be set to the empty string for top-level modules, or
   for submodules, to the parent package's name. See :pep:`366` for further
   details.

   This attribute is used instead of ``__name__`` to calculate explicit
   relative imports for main modules, as defined in :pep:`366`.

 * If the module is a Python module (as opposed to a built-in module or a
   dynamically loaded extension), the loader should execute the module's code
   in the module's global name space (``module.__dict__``).
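
The attribute rules in the list above can be observed on modules loaded by the
default loaders. This sketch only inspects three standard library modules,
chosen for illustration: a top-level module, a package, and a submodule of
that package.

```python
import colorsys
import json
import json.decoder

# For each module, show its name, its __package__ and whether it has __path__.
for module in (colorsys, json, json.decoder):
    print(
        module.__name__,
        repr(module.__package__),
        hasattr(module, "__path__"),
    )
```

The top-level module reports an empty ``__package__``, the package reports its
own name, and the submodule reports the name of its parent package.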


Module reprs
------------

By default, all modules have a usable repr; however, depending on the
attributes set above, and hooks in the loader, you can more explicitly control
the repr of module objects.

Loaders may implement a :meth:`module_repr()` method which takes a single
argument, the module object. When ``repr(module)`` is called for a module
with a loader supporting this protocol, whatever is returned from
``module.__loader__.module_repr(module)`` is returned as the module's repr
without further processing. This return value must be a string.

If the module has no ``__loader__`` attribute, or the loader has no
:meth:`module_repr()` method, then the module object implementation itself
will craft a default repr using whatever information is available. It will
try to use the ``module.__name__``, ``module.__file__``, and
``module.__loader__`` as input into the repr, with defaults for whatever
information is missing.

Here are the exact rules used:

 * If the module has a ``__loader__`` and that loader has a
   :meth:`module_repr()` method, call it with a single argument, which is the
   module object. The value returned is used as the module's repr.

 * If an exception occurs in :meth:`module_repr()`, the exception is caught
   and discarded, and the calculation of the module's repr continues as if
   :meth:`module_repr()` did not exist.

 * If the module has a ``__file__`` attribute, this is used as part of the
   module's repr.

 * If the module has no ``__file__`` but does have a ``__loader__``, then the
   loader's repr is used as part of the module's repr.

 * Otherwise, just use the module's ``__name__`` in the repr.

This example, from :pep:`420`, shows how a loader can craft its own module
repr::

    class NamespaceLoader:
        @classmethod
        def module_repr(cls, module):
            return "<module '{}' (namespace)>".format(module.__name__)
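
The fallback rules that do not involve a loader can be seen on hand-made
module objects. In this sketch the module names and the file path are
invented, and the file is never opened. The loader-level ``module_repr()``
hook is deliberately not exercised, since its availability varies between
Python versions.

```python
import types

# A bare module object: no __file__ and no loader, so only the name is used.
bare = types.ModuleType("demo_bare")
print(repr(bare))        # <module 'demo_bare'>

# With __file__ set, the file location becomes part of the repr.
with_file = types.ModuleType("demo_file")
with_file.__file__ = "/tmp/demo_file.py"   # illustrative path only
print(repr(with_file))   # <module 'demo_file' from '/tmp/demo_file.py'>
```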


.. _package-path-rules:

module.__path__
---------------

By definition, if a module has a ``__path__`` attribute, it is a package,
regardless of its value.

A package's ``__path__`` attribute is used during imports of its subpackages.
Within the import machinery, it functions much the same as :data:`sys.path`,
i.e. providing a list of locations to search for modules during import.
However, ``__path__`` is typically much more constrained than
:data:`sys.path`.

``__path__`` must be an iterable of strings, but it may be empty.
The same rules used for :data:`sys.path` also apply to a package's
``__path__``, and :data:`sys.path_hooks` (described below) are
consulted when traversing a package's ``__path__``.

A package's ``__init__.py`` file may set or alter the package's ``__path__``
attribute, and this was typically the way namespace packages were implemented
prior to :pep:`420`. With the adoption of :pep:`420`, namespace packages no
longer need to supply ``__init__.py`` files containing only ``__path__``
manipulation code; the namespace loader automatically sets ``__path__``
correctly for the namespace package.


The Path Importer
=================

.. index::
    single: path importer

As mentioned previously, Python comes with several default meta path finders.
One of these, called the :term:`path importer`, searches an :term:`import
path`, which contains a list of :term:`path entries <path entry>`. Each path
entry names a location to search for modules.

The path importer itself doesn't know how to import anything. Instead, it
traverses the individual path entries, associating each of them with a
path entry finder that knows how to handle that particular kind of path.

The default set of path entry finders implement all the semantics for finding
modules on the file system, handling special file types such as Python source
code (``.py`` files), Python byte code (``.pyc`` and ``.pyo`` files) and
shared libraries (e.g. ``.so`` files). When supported by the :mod:`zipimport`
module in the standard library, the default path entry finders also handle
loading all of these file types (other than shared libraries) from zipfiles.

Path entries need not be limited to file system locations. They can refer to
URLs, database queries, or any other location that can be specified as a
string.

The :term:`path importer` provides additional hooks and protocols so that you
can extend and customize the types of searchable path entries. For example,
if you wanted to support path entries as network URLs, you could write a hook
that implements HTTP semantics to find modules on the web. This hook (a
callable) would return a :term:`path entry finder` supporting the protocol
described below, which would then be used to get a loader for the module from
the web.

A word of warning: this section and the previous both use the term *finder*,
distinguishing between them by using the terms :term:`meta path finder` and
:term:`path entry finder`. These two types of finders are very similar,
support similar protocols, and function in similar ways during the import
process, but it's important to keep in mind that they are subtly different.
In particular, meta path finders operate at the beginning of the import
process, as keyed off the :data:`sys.meta_path` traversal.

On the other hand, path entry finders are in a sense an implementation detail
of the :term:`path importer`, and in fact, if the path importer were to be
removed from :data:`sys.meta_path`, none of the path entry finder semantics
would be invoked.


Path entry finders
508------------------
Barry Warsawd7d21942012-07-29 16:36:17 -0400509
510.. index::
511 single: sys.path
512 single: sys.path_hooks
513 single: sys.path_importer_cache
514 single: PYTHONPATH
515
Barry Warsawdadebab2012-07-31 16:03:09 -0400516The :term:`path importer` is responsible for finding and loading Python
517modules and packages whose location is specified with a string :term:`path
518entry`. Most path entries name locations in the file system, but they need
519not be limited to this.
520
521As a meta path finder, the :term:`path importer` implements the
Barry Warsawd7d21942012-07-29 16:36:17 -0400522:meth:`find_module()` protocol previously described, however it exposes
523additional hooks that can be used to customize how modules are found and
Barry Warsawdadebab2012-07-31 16:03:09 -0400524loaded from the :term:`import path`.
Barry Warsawd7d21942012-07-29 16:36:17 -0400525
Barry Warsawdadebab2012-07-31 16:03:09 -0400526Three variables are used by the :term:`path importer`, :data:`sys.path`,
527:data:`sys.path_hooks` and :data:`sys.path_importer_cache`. The ``__path__``
Nick Coghlan49417742012-08-02 23:03:58 +1000528attributes on package objects are also used. These provide additional ways
529that the import machinery can be customized.
Barry Warsawd7d21942012-07-29 16:36:17 -0400530
531:data:`sys.path` contains a list of strings providing search locations for
532modules and packages. It is initialized from the :data:`PYTHONPATH`
533environment variable and various other installation- and
534implementation-specific defaults. Entries in :data:`sys.path` can name
535directories on the file system, zip files, and potentially other "locations"
Barry Warsawdadebab2012-07-31 16:03:09 -0400536(see the :mod:`site` module) that should be searched for modules, such as
537URLs, or database queries.
Barry Warsawd7d21942012-07-29 16:36:17 -0400538
Barry Warsawdadebab2012-07-31 16:03:09 -0400539The :term:`path importer` is a :term:`meta path finder`, so the import
Nick Coghlan49417742012-08-02 23:03:58 +1000540machinery begins the :term:`import path` search by calling the path
541importer's :meth:`find_module()` method as described previously. When
542the ``path`` argument to :meth:`find_module()` is given, it will be a
543list of string paths to traverse - typically a package's ``__path__``
544attribute for an import within that package. If the ``path`` argument
545is ``None``, this indicates a top level import and :data:`sys.path` is used.
Barry Warsawd7d21942012-07-29 16:36:17 -0400546
Barry Warsawdadebab2012-07-31 16:03:09 -0400547The :term:`path importer` iterates over every entry in the search path, and
548for each of these, looks for an appropriate :term:`path entry finder` for the
549path entry. Because this can be an expensive operation (e.g. there may be
550`stat()` call overheads for this search), the :term:`path importer` maintains
551a cache mapping path entries to path entry finders. This cache is maintained
552in :data:`sys.path_importer_cache`. In this way, the expensive search for a
553particular :term:`path entry` location's :term:`path entry finder` need only
554be done once. User code is free to remove cache entries from
555:data:`sys.path_importer_cache` forcing the :term:`path importer` to perform
556the path entry search again [#fnpic]_.
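
Removing a cache entry might look like the sketch below.  The function name
``forget_finder`` is hypothetical; only :data:`sys.path_importer_cache` itself
comes from the import system.

.. code-block:: python

   import sys


   def forget_finder(path_entry):
       """Drop the cached finder for *path_entry* and return it (or None)."""
       # pop() with a default avoids KeyError when the entry was never cached.
       return sys.path_importer_cache.pop(path_entry, None)

The next import that consults the same path entry would then run the path
entry search again.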

If the path entry is not present in the cache, the path importer iterates over
every callable in :data:`sys.path_hooks`.  Each of the :term:`path entry hooks
<path entry hook>` in this list is called with a single argument, the path
entry being searched.  This callable may either return a :term:`path entry
finder` that can handle the path entry, or it may raise :exc:`ImportError`.
A hook raises :exc:`ImportError` to signal that it cannot provide a
:term:`path entry finder` for that :term:`path entry`.  The path importer
ignores the exception and moves on to the next hook.
Barry Warsawd7d21942012-07-29 16:36:17 -0400566
Barry Warsawdadebab2012-07-31 16:03:09 -0400567If :data:`sys.path_hooks` iteration ends with no :term:`path entry finder`
568being returned, then the path importer's :meth:`find_module()` method will
Nick Coghlan49417742012-08-02 23:03:58 +1000569store ``None`` in :data:`sys.path_importer_cache` (to indicate that there
570is no finder for this path entry) and return ``None``, indicating that
571this :term:`meta path finder` could not find the module.
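
The cache lookup and hook iteration described above can be sketched roughly as
follows.  This is a simplified model, not the actual implementation: the
function name ``lookup_finder`` is hypothetical, and ``hooks`` and ``cache``
stand in for :data:`sys.path_hooks` and :data:`sys.path_importer_cache`.

.. code-block:: python

   def lookup_finder(path_entry, hooks, cache):
       """Return the finder for *path_entry*, or None if no hook accepts it."""
       if path_entry in cache:
           return cache[path_entry]         # may be a cached None
       for hook in hooks:
           try:
               finder = hook(path_entry)
           except ImportError:
               continue                     # this hook declined; try the next
           break                            # first hook that succeeds wins
       else:
           finder = None                    # every hook raised ImportError
       cache[path_entry] = finder           # None records "no finder here"
       return finder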

If a :term:`path entry finder` *is* returned by one of the :term:`path entry
hook` callables on :data:`sys.path_hooks`, then the following protocol is used
to ask the finder for a module loader, which is then used to load the module.


Path entry finder protocol
--------------------------

In order to support imports of modules and initialized packages and also to
contribute portions to namespace packages, path entry finders must implement
the :meth:`find_loader()` method.

:meth:`find_loader()` takes one argument, the fully qualified name of the
module being imported.  :meth:`find_loader()` returns a 2-tuple where the
first item is the loader and the second item is a namespace :term:`portion`.
When the first item (i.e. the loader) is ``None``, this means that while the
path entry finder does not have a loader for the named module, it knows that the
path entry contributes to a namespace portion for the named module.  This will
almost always be the case where Python is asked to import a namespace package
that has no physical presence on the file system.  When a path entry finder
returns ``None`` for the loader, the second item of the 2-tuple return value
must be a sequence, although it can be empty.

If :meth:`find_loader()` returns a non-``None`` loader value, the portion is
ignored and the loader is returned from the path importer, terminating the
search through the path entries.
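
A minimal model of this protocol is sketched below.  The ``DictFinder`` class
and the ``search`` function are hypothetical and are called directly rather
than by the import system.  Accumulating portions across finders is an
assumption based on the namespace package design in :pep:`420`, not something
spelled out in this section.

.. code-block:: python

   class DictFinder:
       """Hypothetical path entry finder backed by in-memory tables."""

       def __init__(self, loaders, portions):
           self.loaders = loaders      # module name -> loader object
           self.portions = portions    # package name -> list of path strings

       def find_loader(self, fullname):
           # Always a 2-tuple: (loader or None, sequence of portions).
           return (self.loaders.get(fullname),
                   self.portions.get(fullname, []))


   def search(fullname, finders):
       """Consult each finder in turn; return (loader, namespace_path)."""
       namespace_path = []
       for finder in finders:
           loader, portions = finder.find_loader(fullname)
           if loader is not None:
               return loader, []       # a loader ends the search; portions ignored
           namespace_path.extend(portions)
       return None, namespace_path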

For backwards compatibility with other implementations of the import
protocol, many path entry finders also support the same,
traditional :meth:`find_module()` method that meta path finders support.
However, path entry finder :meth:`find_module()` methods are never called
with a ``path`` argument (they are expected to record the appropriate
path information from the initial call to the path hook).

The :meth:`find_module()` method on path entry finders is deprecated,
as it does not allow the path entry finder to contribute portions to
namespace packages.  Instead, path entry finders should implement the
:meth:`find_loader()` method as described above.  If it exists on the path
entry finder, the import system will always call :meth:`find_loader()`
in preference to :meth:`find_module()`.


Replacing the standard import system
====================================

The most reliable mechanism for replacing the entire import system is to
delete the default contents of :data:`sys.meta_path`, replacing them
entirely with a custom meta path hook.

If it is acceptable to only alter the behaviour of import statements
without affecting other APIs that access the import system, then replacing
the builtin :func:`__import__` function may be sufficient.  This technique
may also be employed at the module level to only alter the behaviour of
import statements within that module.
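
One possible way to replace the builtin temporarily is sketched below.  The
context manager name ``recording_imports`` is hypothetical, and the wrapper
only records the requested names before delegating to the original function.

.. code-block:: python

   import builtins
   from contextlib import contextmanager


   @contextmanager
   def recording_imports():
       """Temporarily wrap builtins.__import__ and record requested names."""
       requested = []
       original = builtins.__import__

       def wrapper(name, globals=None, locals=None, fromlist=(), level=0):
           requested.append(name)
           return original(name, globals, locals, fromlist, level)

       builtins.__import__ = wrapper
       try:
           yield requested
       finally:
           builtins.__import__ = original   # always restore the real function

Import statements executed inside the ``with`` block pass through the wrapper,
whereas a call such as :func:`importlib.import_module` does not go through
``builtins.__import__`` and is therefore not recorded.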

To selectively prevent import of some modules from a hook early on the
meta path (rather than disabling the standard import system entirely),
it is sufficient to raise :exc:`ImportError` directly from
:meth:`find_module` instead of returning ``None``.  The latter indicates
that the meta path search should continue, while raising an exception
terminates it immediately.
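
A sketch of such a hook follows.  The class name ``BlockingFinder`` is
hypothetical.  The ``find_spec()`` method is an addition for Python versions
that call it on meta path finders instead of :meth:`find_module`; it simply
delegates, so the behaviour is the same either way.

.. code-block:: python

   import sys


   class BlockingFinder:
       """Hypothetical meta path hook that refuses selected modules."""

       def __init__(self, blocked):
           self.blocked = frozenset(blocked)

       def find_module(self, fullname, path=None):
           if fullname in self.blocked:
               # Raising stops the meta path search immediately.
               raise ImportError("import of %r is blocked" % fullname)
           return None     # not ours: let the meta path search continue

       def find_spec(self, fullname, path=None, target=None):
           # Assumption: newer interpreters call find_spec() rather than
           # find_module(); delegating keeps the sketch usable on both.
           return self.find_module(fullname, path)

To take effect, an instance has to be placed early on the meta path, for
example with ``sys.meta_path.insert(0, BlockingFinder({"some_module"}))``,
where ``some_module`` is a placeholder name.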


Open issues
===========

XXX It would be really nice to have a diagram.

XXX * (import_machinery.rst) how about a section devoted just to the
attributes of modules and packages, perhaps expanding upon or supplanting the
related entries in the data model reference page?

XXX runpy, pkgutil, et al in the library manual should all get "See Also"
links at the top pointing to the new import system section.

XXX The :term:`path importer` is not, in fact, an :term:`importer`.  That's
why the corresponding implementation class is
:class:`importlib.machinery.PathFinder`.


References
==========

The import machinery has evolved considerably since Python's early days.  The
original `specification for packages
<http://www.python.org/doc/essays/packages.html>`_ is still available to read,
although some details have changed since the writing of that document.

The original specification for :data:`sys.meta_path` was :pep:`302`, with
subsequent extension in :pep:`420`.

:pep:`420` introduced :term:`namespace packages <namespace package>` for
Python 3.3.  :pep:`420` also introduced the :meth:`find_loader` protocol as an
alternative to :meth:`find_module`.

:pep:`366` describes the addition of the ``__package__`` attribute for
explicit relative imports in main modules.

:pep:`328` introduced absolute and relative imports and initially proposed
``__name__`` for the semantics that :pep:`366` would eventually specify for
``__package__``.

:pep:`338` defines executing modules as scripts.


Footnotes
=========

.. [#fnmo] See :class:`types.ModuleType`.

.. [#fnlo] The importlib implementation appears not to use the return value
   directly.  Instead, it gets the module object by looking the module name up
   in :data:`sys.modules`.  The indirect effect of this is that an imported
   module may replace itself in :data:`sys.modules`.  This is
   implementation-specific behavior that is not guaranteed to work in other
   Python implementations.

.. [#fnpic] In legacy code, it is possible to find instances of
   :class:`imp.NullImporter` in the :data:`sys.path_importer_cache`.  It is
   recommended that code be changed to use ``None`` instead.  See
   :ref:`portingpythoncode` for more details.