:mod:`packaging.pypi.simple` --- Crawler using the PyPI "simple" interface
==========================================================================

.. module:: packaging.pypi.simple
   :synopsis: Crawler using the screen-scraping "simple" interface to fetch info
              and distributions.


`packaging.pypi.simple` can process Python package indexes that use the
"simple" interface, including local ones, and provides useful information
about their distributions.

You should use `packaging.pypi.simple` for:

* Searching for distributions by name and version.
* Processing external pages linked from the index.
* Downloading distributions by name and version.

It should not be used for:

* Operations that would require processing a large part of the index (like
  "finding all distributions with a specific version, no matter the name").


API
---

.. class:: Crawler
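
   As a quick, illustrative sketch only: the examples in this document use
   the `index_url`, `follow_externals` and `mirrors` keyword arguments, and
   the values shown here are the defaults described later in this page::

      >>> from packaging.pypi.simple import Crawler
      >>> client = Crawler(index_url="http://pypi.python.org/simple",
      ...                  follow_externals=False)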


Usage Examples
--------------

To help you understand how to use the `Crawler` class, here are some basic
usage examples.

Request the simple index to get a specific distribution
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Suppose you want to scan an index to get a list of distributions for
the "foobar" project. You can use the `get_releases` method for that.
It will browse the project page and return a :class:`ReleaseInfo`
object for each link to a download it finds::

   >>> from packaging.pypi.simple import Crawler
   >>> client = Crawler()
   >>> client.get_releases("FooBar")
   [<ReleaseInfo "FooBar 1.1">, <ReleaseInfo "FooBar 1.2">]


Note that you can also ask the client for specific versions, using version
specifiers (described in `PEP 345
<http://www.python.org/dev/peps/pep-0345/#version-specifiers>`_)::

   >>> client.get_releases("FooBar < 1.2")
   [<ReleaseInfo "FooBar 1.1">]


`get_releases` returns a list of :class:`ReleaseInfo` objects, but you can
also get the single best distribution that fulfills your requirements, using
`get_release`::

   >>> client.get_release("FooBar < 1.2")
   <ReleaseInfo "FooBar 1.1">


Download distributions
^^^^^^^^^^^^^^^^^^^^^^

As it can get the URLs of the distributions provided by PyPI, the `Crawler`
client can also download those distributions and save them for you in a
temporary destination::

   >>> client.download("foobar")
   /tmp/temp_dir/foobar-1.2.tar.gz


You can also specify the directory you want to download to::

   >>> client.download("foobar", "/path/to/my/dir")
   /path/to/my/dir/foobar-1.2.tar.gz


While downloading, the MD5 hash of the archive is checked; if it does not
match, the download is tried a second time, and if it fails again an
`MD5HashDoesNotMatchError` exception is raised.
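
The check itself amounts to hashing the downloaded archive and comparing the
result with the digest advertised by the index. As a minimal sketch (this is
not the library's actual code; `md5_matches` and `expected_md5` are
hypothetical names)::

   import hashlib

   def md5_matches(archive_path, expected_md5):
       """Return True if the archive's MD5 digest matches the expected one."""
       md5 = hashlib.md5()
       # Hash the archive in chunks to avoid loading it fully in memory.
       with open(archive_path, "rb") as archive:
           for chunk in iter(lambda: archive.read(8192), b""):
               md5.update(chunk)
       return md5.hexdigest() == expected_md5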

Internally, it is not the `Crawler` that downloads the distributions, but the
`DistributionInfo` class. Please refer to its documentation for more details.


Following PyPI external links
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The default behavior of packaging is to *not* follow the links provided
by HTML pages in the "simple index" when looking for distribution
downloads.

It's possible to tell the crawler to follow external links by setting the
`follow_externals` attribute, either at instantiation time or afterwards::

   >>> client = Crawler(follow_externals=True)

or ::

   >>> client = Crawler()
   >>> client.follow_externals = True


Working with external indexes and mirrors
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The default `Crawler` behavior is to rely on the Python Package Index hosted
at PyPI (http://pypi.python.org/simple).

As you may need to work with a local index or with private indexes, you can
specify which one to use with the `index_url` parameter::

   >>> client = Crawler(index_url="file://filesystem/path/")

or ::

   >>> client = Crawler(index_url="http://some.specific.url/")
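
For a local index, the `file://` URL is expected to point to a tree that
mirrors the layout of the remote "simple" interface. As an illustrative
sketch only (the exact layout depends on how your index was generated)::

   filesystem/path/
       index.html          # links to the per-project pages
       foobar/
           index.html      # links to the downloadable foobar archives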


You can also specify mirrors to fall back on in case the first `index_url`
you provided does not respond, or does not respond correctly. The default
behavior of `Crawler` is to use the mirror list provided by the python.org
DNS records, as described in :PEP:`381` about the mirroring infrastructure.
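
As a simplified sketch of that discovery scheme (this is not the library's
actual code; `get_pep381_mirrors` is a hypothetical helper, and mirrors
beyond "z", named "aa", "ab" and so on, are ignored here for brevity)::

   import socket
   import string

   def get_pep381_mirrors():
       """Enumerate PEP 381 mirror URLs from the python.org DNS records."""
       # last.pypi.python.org is a CNAME for the last mirror in the
       # series (for instance "g.pypi.python.org"); resolving it tells
       # us how many mirrors currently exist.
       last_host = socket.gethostbyname_ex("last.pypi.python.org")[0]
       last_letter = last_host.split(".")[0]
       letters = string.ascii_lowercase
       count = letters.index(last_letter) + 1
       return ["http://%s.pypi.python.org/simple/" % letter
               for letter in letters[:count]]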

If you don't want to rely on these, you can specify the list of mirrors you
want to try with the `mirrors` parameter. It accepts any iterable::

   >>> mirrors = ["http://first.mirror", "http://second.mirror"]
   >>> client = Crawler(mirrors=mirrors)


Searching in the simple index
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It's possible to search for projects with specific names in the package
index. Suppose you want to find all projects containing the "distutils"
keyword::

   >>> client.search_projects("distutils")
   [<Project "collective.recipe.distutils">, <Project "Distutils">, <Project
   "Packaging">, <Project "distutilscross">, <Project "lpdistutils">, <Project
   "taras.recipe.distutils">, <Project "zerokspot.recipe.distutils">]


You can also search for projects whose names start or end with specific
text, using a wildcard::

   >>> client.search_projects("distutils*")
   [<Project "Distutils">, <Project "Packaging">, <Project "distutilscross">]

   >>> client.search_projects("*distutils")
   [<Project "collective.recipe.distutils">, <Project "Distutils">, <Project
   "lpdistutils">, <Project "taras.recipe.distutils">, <Project
   "zerokspot.recipe.distutils">]