:mod:`packaging.pypi.simple` --- Crawler using the PyPI "simple" interface
==========================================================================

.. module:: packaging.pypi.simple
   :synopsis: Crawler using the screen-scraping "simple" interface to fetch info
              and distributions.


The class provided by :mod:`packaging.pypi.simple` can access project indexes
and provide useful information about distributions. PyPI, other indexes and
local indexes are supported.

You should use this module to search for distributions by name and version, to
process external index pages, and to download distributions. It is not suited
for queries that would require long index processing (like "finding all
distributions with a specific version, no matter the name"); use
:mod:`packaging.pypi.xmlrpc` for that.


API
---

.. class:: Crawler(index_url=DEFAULT_SIMPLE_INDEX_URL, \
                   prefer_final=False, prefer_source=True, \
                   hosts=('*',), follow_externals=False, \
                   mirrors_url=None, mirrors=None, timeout=15, \
                   mirrors_max_tries=0)

   *index_url* is the address of the index to use for requests.

   The first two parameters control the query results. *prefer_final*
   indicates whether a final version (not alpha, beta or candidate) is to be
   preferred over a newer but non-final version (for example, whether to pick
   up 1.0 over 2.0a3). It is used only for queries that don't give a version
   argument. Likewise, *prefer_source* tells whether to prefer a source
   distribution over a binary one, if no distribution argument was provided.

   Other parameters are related to external links (that is, links that go
   outside the simple index): *hosts* is a list of hosts allowed to be
   processed if *follow_externals* is true (default behavior is to follow all
   hosts), and *follow_externals* enables or disables following external links
   (default is false, meaning disabled).

   The remaining parameters are related to the mirroring infrastructure
   defined in :PEP:`381`. *mirrors_url* gives a URL to look on for DNS
   records giving mirror addresses; *mirrors* is a list of mirror URLs (see
   the PEP). If both *mirrors* and *mirrors_url* are given, *mirrors_url*
   will only be used if *mirrors* is set to ``None``. *timeout* is the time
   (in seconds) to wait before considering a URL has timed out;
   *mirrors_max_tries* is the number of times to try requesting information
   on mirrors before switching.
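
   As a quick illustration, here is a crawler that prefers final source
   releases and follows external links, but only to a whitelisted host (the
   host name below is a made-up example)::

      >>> from packaging.pypi.simple import Crawler
      >>> client = Crawler(prefer_final=True, prefer_source=True,
      ...                  follow_externals=True, hosts=['example.org'])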

   The following methods are defined:

   .. method:: get_distributions(project_name, version)

      Return the distributions found in the index for the given release.
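
      A minimal sketch, using the crawler created above ("FooBar" is a
      hypothetical project used for illustration)::

         >>> # fetch the distributions (sdist, bdists...) for release 1.1
         >>> dists = client.get_distributions('FooBar', '1.1')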

   .. method:: get_metadata(project_name, version)

      Return the metadata found on the index for this project name and
      version. Currently downloads and unpacks a distribution to read the
      PKG-INFO file.
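
      A minimal sketch (again with a hypothetical "FooBar" project; note
      that this call downloads and unpacks an archive, so it may be slow)::

         >>> # metadata parsed from the release's PKG-INFO file
         >>> metadata = client.get_metadata('FooBar', '1.1')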

   .. method:: get_release(requirements, prefer_final=None)

      Return one release that fulfills the given requirements.

   .. method:: get_releases(requirements, prefer_final=None, force_update=False)

      Search for releases and return a
      :class:`~packaging.pypi.dist.ReleasesList` object containing the
      results.

   .. method:: search_projects(name=None)

      Search the index for projects containing the given name and return a
      list of matching names.

   See also the base class :class:`packaging.pypi.base.BaseClient` for
   inherited methods.


.. data:: DEFAULT_SIMPLE_INDEX_URL

   The address used by default by the crawler class. It is currently
   ``'http://a.pypi.python.org/simple/'``, the main PyPI installation.


Usage Examples
--------------

To help you understand how to use the `Crawler` class, here are some basic
usage examples.

Request the simple index to get a specific distribution
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Suppose you want to scan an index to get a list of distributions for the
"foobar" project. You can use the `get_releases` method for that. This
method will browse the project page and return :class:`ReleaseInfo` objects
for each download link found::

   >>> from packaging.pypi.simple import Crawler
   >>> client = Crawler()
   >>> client.get_releases("FooBar")
   [<ReleaseInfo "FooBar 1.1">, <ReleaseInfo "FooBar 1.2">]


Note that you can also query the client for specific versions, using version
specifiers (described in `PEP 345
<http://www.python.org/dev/peps/pep-0345/#version-specifiers>`_)::

   >>> client.get_releases("FooBar < 1.2")
   [<ReleaseInfo "FooBar 1.1">]


`get_releases` returns a list of :class:`ReleaseInfo` objects, but you can
also get the single best release that fulfills your requirements, using
`get_release`::

   >>> client.get_release("FooBar < 1.2")
   <ReleaseInfo "FooBar 1.1">


Download distributions
^^^^^^^^^^^^^^^^^^^^^^

As it can get the URLs of the distributions provided by PyPI, the `Crawler`
client can also download those distributions and put them in a temporary
destination for you::

   >>> client.download("foobar")
   '/tmp/temp_dir/foobar-1.2.tar.gz'


You can also specify the directory you want to download to::

   >>> client.download("foobar", "/path/to/my/dir")
   '/path/to/my/dir/foobar-1.2.tar.gz'


While downloading, the MD5 hash of the archive is checked; if it does not
match, the download is retried once, and if it fails again,
`MD5HashDoesNotMatchError` is raised.

Internally, it is not the `Crawler` that downloads the distributions, but the
`DistributionInfo` class; please refer to its documentation for more details.
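
If you want to handle a persistent checksum failure yourself, you can catch
the exception; a minimal sketch, assuming `MD5HashDoesNotMatchError` can be
imported from :mod:`packaging.pypi.errors` (an assumption; check where it is
defined in your version)::

   >>> from packaging.pypi.errors import MD5HashDoesNotMatchError  # assumed location
   >>> try:
   ...     path = client.download("foobar")
   ... except MD5HashDoesNotMatchError:
   ...     print("archive corrupted on both download attempts")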


Following PyPI external links
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The default behavior for packaging is to *not* follow the links provided
by HTML pages in the "simple index" when looking for distribution downloads.

It's possible to tell the crawler to follow external links by setting the
`follow_externals` attribute, at instantiation time or afterwards::

   >>> client = Crawler(follow_externals=True)

or ::

   >>> client = Crawler()
   >>> client.follow_externals = True
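
If you only want to allow specific external hosts, you can combine this with
the *hosts* parameter; since its default value is ``('*',)``, wildcard
patterns should be accepted (the host names below are made-up examples)::

   >>> client = Crawler(follow_externals=True,
   ...                  hosts=["example.org", "*.example.net"])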


Working with external indexes, and mirrors
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The default `Crawler` behavior is to rely on the main Python Package Index
stored on PyPI (http://pypi.python.org/simple).

If you need to work with a local index or with private indexes, you can
specify it using the `index_url` parameter::

   >>> client = Crawler(index_url="file://filesystem/path/")

or ::

   >>> client = Crawler(index_url="http://some.specific.url/")


You can also specify mirrors to fall back on in case the first *index_url*
you provided doesn't respond, or doesn't respond correctly. The default
behavior for `Crawler` is to use the list provided by the Python.org DNS
records, as described in :PEP:`381` about the mirroring infrastructure.

If you don't want to rely on these, you can give the list of mirrors you want
to try through the `mirrors` parameter; any iterable of URLs will do::

   >>> mirrors = ["http://first.mirror", "http://second.mirror"]
   >>> client = Crawler(mirrors=mirrors)
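
You can also tune how failures are detected, using the *timeout* and
*mirrors_max_tries* parameters described in the API section above (the
values below are arbitrary examples)::

   >>> client = Crawler(mirrors=mirrors, timeout=5, mirrors_max_tries=2)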


Searching in the simple index
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It's possible to search for projects with specific names in the package index.
Assuming you want to find all projects containing the "distutils" keyword::

   >>> client.search_projects("distutils")
   [<Project "collective.recipe.distutils">, <Project "Distutils">, <Project
   "Packaging">, <Project "distutilscross">, <Project "lpdistutils">, <Project
   "taras.recipe.distutils">, <Project "zerokspot.recipe.distutils">]


You can also search for projects whose names start or end with specific
text, using a wildcard::

   >>> client.search_projects("distutils*")
   [<Project "Distutils">, <Project "Packaging">, <Project "distutilscross">]

   >>> client.search_projects("*distutils")
   [<Project "collective.recipe.distutils">, <Project "Distutils">, <Project
   "lpdistutils">, <Project "taras.recipe.distutils">, <Project
   "zerokspot.recipe.distutils">]