Senthil Kumaran	aca8fd7	2008-06-23 04:41:59 +0000	[diff] [blame]	1	:mod:`urllib.parse` --- Parse URLs into components
				2	==================================================
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	3
Senthil Kumaran	aca8fd7	2008-06-23 04:41:59 +0000	[diff] [blame]	4	.. module:: urllib.parse
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	5	:synopsis: Parse URLs into or assemble them from components.
				6
				7
				8	.. index::
				9	single: WWW
				10	single: World Wide Web
				11	single: URL
				12	pair: URL; parsing
				13	pair: relative; URL
				14
Éric Araujo	19f9b71	2011-08-19 00:49:18 +0200	[diff] [blame]	15	Source code: :source:`Lib/urllib/parse.py`
				16
				17	--------------
				18
This module defines a standard interface to break Uniform Resource Locator (URL)
strings up into components (addressing scheme, network location, path, etc.), to
combine the components back into a URL string, and to convert a "relative URL"
to an absolute URL given a "base URL".
				23
				24	The module has been designed to match the Internet RFC on Relative Uniform
Senthil Kumaran	4a27d9f	2012-06-28 21:07:58 -0700	[diff] [blame]	25	Resource Locators. It supports the following URL schemes: ``file``, ``ftp``,
				26	``gopher``, ``hdl``, ``http``, ``https``, ``imap``, ``mailto``, ``mms``,
				27	``news``, ``nntp``, ``prospero``, ``rsync``, ``rtsp``, ``rtspu``, ``sftp``,
				28	``shttp``, ``sip``, ``sips``, ``snews``, ``svn``, ``svn+ssh``, ``telnet``,
				29	``wais``.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	30
Nick Coghlan	9fc443c	2010-11-30 15:48:08 +0000	[diff] [blame]	31	The :mod:`urllib.parse` module defines functions that fall into two broad
				32	categories: URL parsing and URL quoting. These are covered in detail in
				33	the following sections.
				34
				35	URL Parsing
				36	-----------
				37
				38	The URL parsing functions focus on splitting a URL string into its components,
				39	or on combining URL components into a URL string.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	40
R. David Murray	f5077aa	2010-05-25 15:36:46 +0000	[diff] [blame]	41	.. function:: urlparse(urlstring, scheme='', allow_fragments=True)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	42
Parse a URL into six components, returning a 6-tuple. This corresponds to the
general structure of a URL: ``scheme://netloc/path;parameters?query#fragment``.
Each tuple item is a string, possibly empty. The components are not broken up into
smaller parts (for example, the network location is a single string), and %
escapes are not expanded. The delimiters as shown above are not part of the
result, except for a leading slash in the path component, which is retained if
present. For example:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	50
Senthil Kumaran	aca8fd7	2008-06-23 04:41:59 +0000	[diff] [blame]	51	>>> from urllib.parse import urlparse
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	52	>>> o = urlparse('http://www.cwi.nl:80/%7Eguido/Python.html')
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	53	>>> o # doctest: +NORMALIZE_WHITESPACE
				54	ParseResult(scheme='http', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',
				55	params='', query='', fragment='')
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	56	>>> o.scheme
				57	'http'
				58	>>> o.port
				59	80
				60	>>> o.geturl()
				61	'http://www.cwi.nl:80/%7Eguido/Python.html'
				62
Senthil Kumaran	7089a4e	2010-11-07 12:57:04 +0000	[diff] [blame]	63	Following the syntax specifications in :rfc:`1808`, urlparse recognizes
				64	a netloc only if it is properly introduced by '//'. Otherwise the
				65	input is presumed to be a relative URL and thus to start with
				66	a path component.
Senthil Kumaran	84c7d9f	2010-08-04 04:50:44 +0000	[diff] [blame]	67
Senthil Kumaran	fe9230a	2011-06-19 13:52:49 -0700	[diff] [blame]	68	>>> from urllib.parse import urlparse
Senthil Kumaran	84c7d9f	2010-08-04 04:50:44 +0000	[diff] [blame]	69	>>> urlparse('//www.cwi.nl:80/%7Eguido/Python.html')
				70	ParseResult(scheme='', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',
				71	params='', query='', fragment='')
Senthil Kumaran	8fd3669	2013-02-26 01:02:58 -0800	[diff] [blame]	72	>>> urlparse('www.cwi.nl/%7Eguido/Python.html')
Senthil Kumaran	21b2933	2013-09-30 22:12:16 -0700	[diff] [blame]	73	ParseResult(scheme='', netloc='', path='www.cwi.nl/%7Eguido/Python.html',
Senthil Kumaran	84c7d9f	2010-08-04 04:50:44 +0000	[diff] [blame]	74	params='', query='', fragment='')
				75	>>> urlparse('help/Python.html')
				76	ParseResult(scheme='', netloc='', path='help/Python.html', params='',
				77	query='', fragment='')
				78
Berker Peksag	89584c9	2015-06-25 23:38:48 +0300	[diff] [blame]	79	The scheme argument gives the default addressing scheme, to be
				80	used only if the URL does not specify one. It should be the same type
				81	(text or bytes) as urlstring, except that the default value ``''`` is
				82	always allowed, and is automatically converted to ``b''`` if appropriate.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	83
				84	If the allow_fragments argument is false, fragment identifiers are not
Berker Peksag	89584c9	2015-06-25 23:38:48 +0300	[diff] [blame]	85	recognized. Instead, they are parsed as part of the path, parameters
				86	or query component, and :attr:`fragment` is set to the empty string in
				87	the return value.
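
As a minimal sketch of the two optional arguments described above (the host
name and paths are placeholders chosen for illustration):

.. code-block:: python

   from urllib.parse import urlparse

   # The default scheme is used only because the URL itself has none.
   with_default = urlparse('//www.cwi.nl/index.html', scheme='https')
   default_scheme = with_default.scheme          # 'https'

   # A scheme present in the URL takes precedence over the default.
   explicit = urlparse('http://www.cwi.nl/index.html', scheme='https')
   explicit_scheme = explicit.scheme             # 'http'

   # With allow_fragments false, '#intro' is left in the path.
   no_fragment = urlparse('http://www.cwi.nl/doc#intro', allow_fragments=False)
   kept_path = no_fragment.path                  # '/doc#intro'
   empty_fragment = no_fragment.fragment         # ''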
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	88
				89	The return value is actually an instance of a subclass of :class:`tuple`. This
				90	class has the following additional read-only convenience attributes:
				91
+------------------+-------+--------------------------+----------------------+
| Attribute        | Index | Value                    | Value if not present |
+==================+=======+==========================+======================+
| :attr:`scheme`   | 0     | URL scheme specifier     | scheme parameter     |
+------------------+-------+--------------------------+----------------------+
| :attr:`netloc`   | 1     | Network location part    | empty string         |
+------------------+-------+--------------------------+----------------------+
| :attr:`path`     | 2     | Hierarchical path        | empty string         |
+------------------+-------+--------------------------+----------------------+
| :attr:`params`   | 3     | Parameters for last path | empty string         |
|                  |       | element                  |                      |
+------------------+-------+--------------------------+----------------------+
| :attr:`query`    | 4     | Query component          | empty string         |
+------------------+-------+--------------------------+----------------------+
| :attr:`fragment` | 5     | Fragment identifier      | empty string         |
+------------------+-------+--------------------------+----------------------+
| :attr:`username` |       | User name                | :const:`None`        |
+------------------+-------+--------------------------+----------------------+
| :attr:`password` |       | Password                 | :const:`None`        |
+------------------+-------+--------------------------+----------------------+
| :attr:`hostname` |       | Host name (lower case)   | :const:`None`        |
+------------------+-------+--------------------------+----------------------+
| :attr:`port`     |       | Port number as integer,  | :const:`None`        |
|                  |       | if present               |                      |
+------------------+-------+--------------------------+----------------------+
				117
				118	See section :ref:`urlparse-result-object` for more information on the result
				119	object.
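
The convenience attributes can be illustrated with a short sketch. The URL,
user name and password below are invented for the example:

.. code-block:: python

   from urllib.parse import urlparse

   parts = urlparse('http://User:secret@WWW.Example.com:8080/index.html')

   # Indexed access and attribute access refer to the same items.
   same_item = parts[1] == parts.netloc

   username = parts.username    # 'User'
   password = parts.password    # 'secret'
   hostname = parts.hostname    # 'www.example.com' (lower-cased)
   port = parts.port            # 8080, as an integer

   # Without a network location, the derived attributes are None.
   bare = urlparse('/index.html')
   missing = (bare.username, bare.hostname, bare.port)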
				120
Senthil Kumaran	7a1e09f	2010-04-22 12:19:46 +0000	[diff] [blame]	121	.. versionchanged:: 3.2
				122	Added IPv6 URL parsing capabilities.
				123
.. versionchanged:: 3.3
   The fragment is now parsed for all URL schemes (unless allow_fragments is
   false), in accordance with :rfc:`3986`. Previously, a whitelist of
   schemes that support fragments existed.
				128
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	129
Victor Stinner	ac71c54	2011-01-14 12:52:12 +0000	[diff] [blame]	130	.. function:: parse_qs(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace')
Facundo Batista	c469d4c	2008-09-03 22:49:01 +0000	[diff] [blame]	131
				132	Parse a query string given as a string argument (data of type
				133	:mimetype:`application/x-www-form-urlencoded`). Data are returned as a
				134	dictionary. The dictionary keys are the unique query variable names and the
				135	values are lists of values for each name.
				136
				137	The optional argument keep_blank_values is a flag indicating whether blank
Senthil Kumaran	f0769e8	2010-08-09 19:53:52 +0000	[diff] [blame]	138	values in percent-encoded queries should be treated as blank strings. A true value
Facundo Batista	c469d4c	2008-09-03 22:49:01 +0000	[diff] [blame]	139	indicates that blanks should be retained as blank strings. The default false
				140	value indicates that blank values are to be ignored and treated as if they were
				141	not included.
				142
				143	The optional argument strict_parsing is a flag indicating what to do with
				144	parsing errors. If false (the default), errors are silently ignored. If true,
				145	errors raise a :exc:`ValueError` exception.
				146
Victor Stinner	ac71c54	2011-01-14 12:52:12 +0000	[diff] [blame]	147	The optional encoding and errors parameters specify how to decode
				148	percent-encoded sequences into Unicode characters, as accepted by the
				149	:meth:`bytes.decode` method.
				150
Michael Foord	207d229	2012-09-28 14:40:44 +0100	[diff] [blame]	151	Use the :func:`urllib.parse.urlencode` function (with the ``doseq``
				152	parameter set to ``True``) to convert such dictionaries into query
				153	strings.
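
A small sketch of the behaviour described above, using an invented query
string, including the round trip back through :func:`urlencode`:

.. code-block:: python

   from urllib.parse import parse_qs, urlencode

   query = 'name=Guido&lang=py&lang=c&empty='

   # Blank values are dropped by default.
   default = parse_qs(query)
   # {'name': ['Guido'], 'lang': ['py', 'c']}

   # With keep_blank_values true, 'empty' is kept with a blank string.
   kept = parse_qs(query, keep_blank_values=True)
   # {'name': ['Guido'], 'lang': ['py', 'c'], 'empty': ['']}

   # doseq=True expands each list back into repeated key=value pairs.
   rebuilt = urlencode(kept, doseq=True)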
Facundo Batista	c469d4c	2008-09-03 22:49:01 +0000	[diff] [blame]	154
Senthil Kumaran	2933312	2011-02-11 11:25:47 +0000	[diff] [blame]	155
.. versionchanged:: 3.2
   Added the encoding and errors parameters.
				158
Facundo Batista	c469d4c	2008-09-03 22:49:01 +0000	[diff] [blame]	159
Victor Stinner	ac71c54	2011-01-14 12:52:12 +0000	[diff] [blame]	160	.. function:: parse_qsl(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace')
Facundo Batista	c469d4c	2008-09-03 22:49:01 +0000	[diff] [blame]	161
				162	Parse a query string given as a string argument (data of type
				163	:mimetype:`application/x-www-form-urlencoded`). Data are returned as a list of
				164	name, value pairs.
				165
				166	The optional argument keep_blank_values is a flag indicating whether blank
Senthil Kumaran	f0769e8	2010-08-09 19:53:52 +0000	[diff] [blame]	167	values in percent-encoded queries should be treated as blank strings. A true value
Facundo Batista	c469d4c	2008-09-03 22:49:01 +0000	[diff] [blame]	168	indicates that blanks should be retained as blank strings. The default false
				169	value indicates that blank values are to be ignored and treated as if they were
				170	not included.
				171
				172	The optional argument strict_parsing is a flag indicating what to do with
				173	parsing errors. If false (the default), errors are silently ignored. If true,
				174	errors raise a :exc:`ValueError` exception.
				175
Victor Stinner	ac71c54	2011-01-14 12:52:12 +0000	[diff] [blame]	176	The optional encoding and errors parameters specify how to decode
				177	percent-encoded sequences into Unicode characters, as accepted by the
				178	:meth:`bytes.decode` method.
				179
Facundo Batista	c469d4c	2008-09-03 22:49:01 +0000	[diff] [blame]	180	Use the :func:`urllib.parse.urlencode` function to convert such lists of pairs into
				181	query strings.
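
For illustration, a sketch with an invented query string that shows the
ordered pairs, the round trip through :func:`urlencode`, and the effect of
strict_parsing on a field that has no ``=``:

.. code-block:: python

   from urllib.parse import parse_qsl, urlencode

   pairs = parse_qsl('lang=py&lang=c&q=El+Ni%C3%B1o')
   # [('lang', 'py'), ('lang', 'c'), ('q', 'El Niño')]

   # A list of pairs can be passed straight back to urlencode().
   rebuilt = urlencode(pairs)

   # 'broken' has no '=', which counts as a parsing error.
   lenient = parse_qsl('lang=py&broken')
   try:
       parse_qsl('lang=py&broken', strict_parsing=True)
       strict_failed = False
   except ValueError:
       strict_failed = True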
				182
.. versionchanged:: 3.2
   Added the encoding and errors parameters.
				185
Facundo Batista	c469d4c	2008-09-03 22:49:01 +0000	[diff] [blame]	186
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	187	.. function:: urlunparse(parts)
				188
Georg Brandl	0f7ede4	2008-06-23 11:23:31 +0000	[diff] [blame]	189	Construct a URL from a tuple as returned by ``urlparse()``. The parts
				190	argument can be any six-item iterable. This may result in a slightly
				191	different, but equivalent URL, if the URL that was parsed originally had
				192	unnecessary delimiters (for example, a ``?`` with an empty query; the RFC
				193	states that these are equivalent).
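
As a sketch (the URL is a placeholder), a URL can be built from a plain tuple,
or from a list obtained by editing the fields of an earlier parse:

.. code-block:: python

   from urllib.parse import urlparse, urlunparse

   # Order: scheme, netloc, path, params, query, fragment.
   built = urlunparse(('http', 'www.example.com', '/a/b', '', 'q=1', 'top'))
   # 'http://www.example.com/a/b?q=1#top'

   # Any six-item iterable is accepted, so a modified list works too.
   fields = list(urlparse(built))
   fields[4] = 'q=2'
   changed = urlunparse(fields)
   # 'http://www.example.com/a/b?q=2#top'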
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	194
				195
R. David Murray	f5077aa	2010-05-25 15:36:46 +0000	[diff] [blame]	196	.. function:: urlsplit(urlstring, scheme='', allow_fragments=True)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	197
				198	This is similar to :func:`urlparse`, but does not split the params from the URL.
				199	This should generally be used instead of :func:`urlparse` if the more recent URL
				200	syntax allowing parameters to be applied to each segment of the path portion
				201	of the URL (see :rfc:`2396`) is wanted. A separate function is needed to
				202	separate the path segments and parameters. This function returns a 5-tuple:
				203	(addressing scheme, network location, path, query, fragment identifier).
				204
				205	The return value is actually an instance of a subclass of :class:`tuple`. This
				206	class has the following additional read-only convenience attributes:
				207
+------------------+-------+-------------------------+----------------------+
| Attribute        | Index | Value                   | Value if not present |
+==================+=======+=========================+======================+
| :attr:`scheme`   | 0     | URL scheme specifier    | scheme parameter     |
+------------------+-------+-------------------------+----------------------+
| :attr:`netloc`   | 1     | Network location part   | empty string         |
+------------------+-------+-------------------------+----------------------+
| :attr:`path`     | 2     | Hierarchical path       | empty string         |
+------------------+-------+-------------------------+----------------------+
| :attr:`query`    | 3     | Query component         | empty string         |
+------------------+-------+-------------------------+----------------------+
| :attr:`fragment` | 4     | Fragment identifier     | empty string         |
+------------------+-------+-------------------------+----------------------+
| :attr:`username` |       | User name               | :const:`None`        |
+------------------+-------+-------------------------+----------------------+
| :attr:`password` |       | Password                | :const:`None`        |
+------------------+-------+-------------------------+----------------------+
| :attr:`hostname` |       | Host name (lower case)  | :const:`None`        |
+------------------+-------+-------------------------+----------------------+
| :attr:`port`     |       | Port number as integer, | :const:`None`        |
|                  |       | if present              |                      |
+------------------+-------+-------------------------+----------------------+
				230
				231	See section :ref:`urlparse-result-object` for more information on the result
				232	object.
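
The difference from :func:`urlparse` shows up when path segments carry
parameters. A sketch with an invented URL:

.. code-block:: python

   from urllib.parse import urlparse, urlsplit

   url = 'http://www.example.com/a;x=1/b;y=2?q=3'

   # urlparse() splits off the parameters of the last path segment only.
   parsed = urlparse(url)
   parsed_path = parsed.path      # '/a;x=1/b'
   parsed_params = parsed.params  # 'y=2'

   # urlsplit() leaves every segment's parameters inside the path.
   split = urlsplit(url)
   split_path = split.path        # '/a;x=1/b;y=2'

   sizes = (len(parsed), len(split))  # 6-tuple versus 5-tuple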
				233
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	234
				235	.. function:: urlunsplit(parts)
				236
Combine the elements of a tuple as returned by :func:`urlsplit` into a
complete URL as a string. The parts argument can be any five-item
iterable. This may result in a slightly different, but equivalent URL, if the
URL that was parsed originally had unnecessary delimiters (for example, a ``?``
with an empty query; the RFC states that these are equivalent).
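
A brief sketch with a placeholder URL. It relies on the result object being a
named tuple, so the standard ``_replace`` method is available for swapping a
single field before recombining:

.. code-block:: python

   from urllib.parse import urlsplit, urlunsplit

   # Order: scheme, netloc, path, query, fragment.
   built = urlunsplit(('https', 'www.example.com', '/docs/', 'page=2', ''))
   # 'https://www.example.com/docs/?page=2'

   # Drop the fragment from an existing URL and put it back together.
   parts = urlsplit('https://www.example.com/docs/?page=2#top')
   without_fragment = urlunsplit(parts._replace(fragment=''))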
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	242
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	243
Georg Brandl	7f01a13	2009-09-16 15:58:14 +0000	[diff] [blame]	244	.. function:: urljoin(base, url, allow_fragments=True)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	245
				246	Construct a full ("absolute") URL by combining a "base URL" (base) with
				247	another URL (url). Informally, this uses components of the base URL, in
Georg Brandl	0f7ede4	2008-06-23 11:23:31 +0000	[diff] [blame]	248	particular the addressing scheme, the network location and (part of) the
				249	path, to provide missing components in the relative URL. For example:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	250
Senthil Kumaran	aca8fd7	2008-06-23 04:41:59 +0000	[diff] [blame]	251	>>> from urllib.parse import urljoin
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	252	>>> urljoin('http://www.cwi.nl/%7Eguido/Python.html', 'FAQ.html')
				253	'http://www.cwi.nl/%7Eguido/FAQ.html'
				254
				255	The allow_fragments argument has the same meaning and default as for
				256	:func:`urlparse`.
				257
				258	.. note::
				259
				260	If url is an absolute URL (that is, starting with ``//`` or ``scheme://``),
				261	the url's host name and/or scheme will be present in the result. For example:
				262
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	263	.. doctest::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	264
				265	>>> urljoin('http://www.cwi.nl/%7Eguido/Python.html',
				266	... '//www.python.org/%7Eguido')
				267	'http://www.python.org/%7Eguido'
				268
				269	If you do not want that behavior, preprocess the url with :func:`urlsplit` and
				270	:func:`urlunsplit`, removing possible scheme and netloc parts.
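
One possible way to do that preprocessing is sketched below. The helper name
``join_keeping_base_host`` is hypothetical and not part of this module, and
the sketch is not meant as a complete safeguard against hostile input:

.. code-block:: python

   from urllib.parse import urljoin, urlsplit, urlunsplit

   def join_keeping_base_host(base, url):
       """Join url to base, ignoring any scheme or netloc given in url."""
       parts = urlsplit(url)
       # Rebuild the URL with empty scheme and netloc so it is relative.
       relative = urlunsplit(('', '', parts.path, parts.query, parts.fragment))
       return urljoin(base, relative)

   base = 'http://www.cwi.nl/%7Eguido/Python.html'
   plain = urljoin(base, '//www.python.org/%7Eguido')
   # 'http://www.python.org/%7Eguido'
   pinned = join_keeping_base_host(base, '//www.python.org/%7Eguido')
   # 'http://www.cwi.nl/%7Eguido'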
				271
				272
				273	.. function:: urldefrag(url)
				274
Georg Brandl	0f7ede4	2008-06-23 11:23:31 +0000	[diff] [blame]	275	If url contains a fragment identifier, return a modified version of url
				276	with no fragment identifier, and the fragment identifier as a separate
				277	string. If there is no fragment identifier in url, return url unmodified
				278	and an empty string.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	279
Nick Coghlan	9fc443c	2010-11-30 15:48:08 +0000	[diff] [blame]	280	The return value is actually an instance of a subclass of :class:`tuple`. This
				281	class has the following additional read-only convenience attributes:
				282
+------------------+-------+-------------------------+----------------------+
| Attribute        | Index | Value                   | Value if not present |
+==================+=======+=========================+======================+
| :attr:`url`      | 0     | URL with no fragment    | empty string         |
+------------------+-------+-------------------------+----------------------+
| :attr:`fragment` | 1     | Fragment identifier     | empty string         |
+------------------+-------+-------------------------+----------------------+
				290
				291	See section :ref:`urlparse-result-object` for more information on the result
				292	object.
				293
				294	.. versionchanged:: 3.2
Raymond Hettinger	9a236b0	2011-01-24 09:01:27 +0000	[diff] [blame]	295	Result is a structured object rather than a simple 2-tuple.
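
A short sketch of :func:`urldefrag` with a placeholder URL. The result can be
unpacked like a 2-tuple or read through its attributes:

.. code-block:: python

   from urllib.parse import urldefrag

   result = urldefrag('http://www.example.com/doc.html#section-2')

   # Tuple-style unpacking still works.
   url, fragment = result
   # Attribute access gives the same values.
   by_attribute = (result.url, result.fragment)

   # Without a fragment, the URL is unchanged and the fragment is empty.
   plain = urldefrag('http://www.example.com/doc.html')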
Nick Coghlan	9fc443c	2010-11-30 15:48:08 +0000	[diff] [blame]	296
Georg Brandl	009a6bd	2011-01-24 19:59:08 +0000	[diff] [blame]	297	.. _parsing-ascii-encoded-bytes:
Nick Coghlan	9fc443c	2010-11-30 15:48:08 +0000	[diff] [blame]	298
				299	Parsing ASCII Encoded Bytes
				300	---------------------------
				301
				302	The URL parsing functions were originally designed to operate on character
				303	strings only. In practice, it is useful to be able to manipulate properly
				304	quoted and encoded URLs as sequences of ASCII bytes. Accordingly, the
				305	URL parsing functions in this module all operate on :class:`bytes` and
				306	:class:`bytearray` objects in addition to :class:`str` objects.
				307
				308	If :class:`str` data is passed in, the result will also contain only
				309	:class:`str` data. If :class:`bytes` or :class:`bytearray` data is
				310	passed in, the result will contain only :class:`bytes` data.
				311
				312	Attempting to mix :class:`str` data with :class:`bytes` or
				313	:class:`bytearray` in a single function call will result in a
Éric Araujo	ff2a4ba	2010-11-30 17:20:31 +0000	[diff] [blame]	314	:exc:`TypeError` being raised, while attempting to pass in non-ASCII
Nick Coghlan	9fc443c	2010-11-30 15:48:08 +0000	[diff] [blame]	315	byte values will trigger :exc:`UnicodeDecodeError`.
				316
				317	To support easier conversion of result objects between :class:`str` and
				318	:class:`bytes`, all return values from URL parsing functions provide
				319	either an :meth:`encode` method (when the result contains :class:`str`
				320	data) or a :meth:`decode` method (when the result contains :class:`bytes`
				321	data). The signatures of these methods match those of the corresponding
				322	:class:`str` and :class:`bytes` methods (except that the default encoding
				323	is ``'ascii'`` rather than ``'utf-8'``). Each produces a value of a
				324	corresponding type that contains either :class:`bytes` data (for
				325	:meth:`encode` methods) or :class:`str` data (for
				326	:meth:`decode` methods).
				327
				328	Applications that need to operate on potentially improperly quoted URLs
				329	that may contain non-ASCII data will need to do their own decoding from
				330	bytes to characters before invoking the URL parsing methods.
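
The rules above can be sketched as follows, using a placeholder URL. The last
two calls are expected to fail, so they are wrapped in ``try`` blocks:

.. code-block:: python

   from urllib.parse import urlsplit

   text_result = urlsplit('http://www.example.com/a%20b')
   bytes_result = urlsplit(b'http://www.example.com/a%20b')
   bytes_path = bytes_result.path              # b'/a%20b'

   # decode() and encode() convert between the two result types.
   round_trip_ok = (bytes_result.decode() == text_result
                    and text_result.encode() == bytes_result)

   # Mixing bytes with a non-empty str argument is rejected.
   try:
       urlsplit(b'http://www.example.com/', scheme='http')
       mixed_error = None
   except TypeError as exc:
       mixed_error = exc

   # Non-ASCII byte values cannot be decoded.
   try:
       urlsplit(b'http://www.example.com/caf\xe9')
       non_ascii_error = None
   except UnicodeDecodeError as exc:
       non_ascii_error = exc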
				331
				332	The behaviour described in this section applies only to the URL parsing
				333	functions. The URL quoting functions use their own rules when producing
				334	or consuming byte sequences as detailed in the documentation of the
				335	individual URL quoting functions.
				336
.. versionchanged:: 3.2
   URL parsing functions now accept ASCII encoded byte sequences.
				339
				340
				341	.. _urlparse-result-object:
				342
				343	Structured Parse Results
				344	------------------------
				345
				346	The result objects from the :func:`urlparse`, :func:`urlsplit` and
Georg Brandl	4640237	2010-12-04 19:06:18 +0000	[diff] [blame]	347	:func:`urldefrag` functions are subclasses of the :class:`tuple` type.
Nick Coghlan	9fc443c	2010-11-30 15:48:08 +0000	[diff] [blame]	348	These subclasses add the attributes listed in the documentation for
				349	those functions, the encoding and decoding support described in the
				350	previous section, as well as an additional method:
				351
				352	.. method:: urllib.parse.SplitResult.geturl()
				353
				354	Return the re-combined version of the original URL as a string. This may
				355	differ from the original URL in that the scheme may be normalized to lower
				356	case and empty components may be dropped. Specifically, empty parameters,
				357	queries, and fragment identifiers will be removed.
				358
				359	For :func:`urldefrag` results, only empty fragment identifiers will be removed.
				360	For :func:`urlsplit` and :func:`urlparse` results, all noted changes will be
				361	made to the URL returned by this method.
				362
				363	The result of this method remains unchanged if passed back through the original
				364	parsing function:
				365
				366	>>> from urllib.parse import urlsplit
				367	>>> url = 'HTTP://www.Python.org/doc/#'
				368	>>> r1 = urlsplit(url)
				369	>>> r1.geturl()
				370	'http://www.Python.org/doc/'
				371	>>> r2 = urlsplit(r1.geturl())
				372	>>> r2.geturl()
				373	'http://www.Python.org/doc/'
				374
				375
				376	The following classes provide the implementations of the structured parse
				377	results when operating on :class:`str` objects:
				378
				379	.. class:: DefragResult(url, fragment)
				380
				381	Concrete class for :func:`urldefrag` results containing :class:`str`
				382	data. The :meth:`encode` method returns a :class:`DefragResultBytes`
				383	instance.
				384
				385	.. versionadded:: 3.2
				386
				387	.. class:: ParseResult(scheme, netloc, path, params, query, fragment)
				388
				389	Concrete class for :func:`urlparse` results containing :class:`str`
				390	data. The :meth:`encode` method returns a :class:`ParseResultBytes`
				391	instance.
				392
				393	.. class:: SplitResult(scheme, netloc, path, query, fragment)
				394
				395	Concrete class for :func:`urlsplit` results containing :class:`str`
				396	data. The :meth:`encode` method returns a :class:`SplitResultBytes`
				397	instance.
				398
				399
				400	The following classes provide the implementations of the parse results when
				401	operating on :class:`bytes` or :class:`bytearray` objects:
				402
				403	.. class:: DefragResultBytes(url, fragment)
				404
				405	Concrete class for :func:`urldefrag` results containing :class:`bytes`
				406	data. The :meth:`decode` method returns a :class:`DefragResult`
				407	instance.
				408
				409	.. versionadded:: 3.2
				410
				411	.. class:: ParseResultBytes(scheme, netloc, path, params, query, fragment)
				412
				413	Concrete class for :func:`urlparse` results containing :class:`bytes`
				414	data. The :meth:`decode` method returns a :class:`ParseResult`
				415	instance.
				416
				417	.. versionadded:: 3.2
				418
				419	.. class:: SplitResultBytes(scheme, netloc, path, query, fragment)
				420
				421	Concrete class for :func:`urlsplit` results containing :class:`bytes`
				422	data. The :meth:`decode` method returns a :class:`SplitResult`
				423	instance.
				424
				425	.. versionadded:: 3.2
				426
				427
				428	URL Quoting
				429	-----------
				430
				431	The URL quoting functions focus on taking program data and making it safe
				432	for use as URL components by quoting special characters and appropriately
				433	encoding non-ASCII text. They also support reversing these operations to
				434	recreate the original data from the contents of a URL component if that
				435	task isn't already covered by the URL parsing functions above.
Georg Brandl	7f01a13	2009-09-16 15:58:14 +0000	[diff] [blame]	436
				437	.. function:: quote(string, safe='/', encoding=None, errors=None)
Senthil Kumaran	aca8fd7	2008-06-23 04:41:59 +0000	[diff] [blame]	438
Replace special characters in string using the ``%xx`` escape. Letters,
digits, and the characters ``'_.-'`` are never quoted. By default, this
function is intended for quoting the path section of a URL. The optional safe
parameter specifies additional ASCII characters that should not be quoted;
its default value is ``'/'``.
Senthil Kumaran	aca8fd7	2008-06-23 04:41:59 +0000	[diff] [blame]	444
Guido van Rossum	52dbbb9	2008-08-18 21:44:30 +0000	[diff] [blame]	445	string may be either a :class:`str` or a :class:`bytes`.
				446
				447	The optional encoding and errors parameters specify how to deal with
				448	non-ASCII characters, as accepted by the :meth:`str.encode` method.
				449	encoding defaults to ``'utf-8'``.
				450	errors defaults to ``'strict'``, meaning unsupported characters raise a
				451	:class:`UnicodeEncodeError`.
				452	encoding and errors must not be supplied if string is a
				453	:class:`bytes`, or a :class:`TypeError` is raised.
				454
				455	Note that ``quote(string, safe, encoding, errors)`` is equivalent to
				456	``quote_from_bytes(string.encode(encoding, errors), safe)``.
				457
				458	Example: ``quote('/El Niño/')`` yields ``'/El%20Ni%C3%B1o/'``.
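
A few more cases, sketched with invented inputs, show the effect of the safe
and encoding parameters and the stated equivalence with
:func:`quote_from_bytes`:

.. code-block:: python

   from urllib.parse import quote, quote_from_bytes

   # '/' is safe by default; the space and '?' are escaped.
   path = quote('/docs/a b?.txt')             # '/docs/a%20b%3F.txt'

   # With an empty safe string, '/' is escaped as well.
   segment = quote('/docs/a b', safe='')      # '%2Fdocs%2Fa%20b'

   # A different encoding changes the bytes that get escaped.
   latin = quote('/El Niño/', encoding='latin-1')   # '/El%20Ni%F1o/'

   # quote() on a str matches quote_from_bytes() on the encoded bytes.
   same = quote('/El Niño/') == quote_from_bytes(
       '/El Niño/'.encode('utf-8', 'strict'), '/')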
Senthil Kumaran	aca8fd7	2008-06-23 04:41:59 +0000	[diff] [blame]	459
				460
Georg Brandl	7f01a13	2009-09-16 15:58:14 +0000	[diff] [blame]	461	.. function:: quote_plus(string, safe='', encoding=None, errors=None)
Senthil Kumaran	aca8fd7	2008-06-23 04:41:59 +0000	[diff] [blame]	462
Like :func:`quote`, but also replace spaces by plus signs, as required for
quoting HTML form values when building up a query string to go into a URL.
Plus signs in the original string are escaped unless they are included in
safe. Unlike :func:`quote`, the safe parameter does not default to ``'/'``.
Senthil Kumaran	aca8fd7	2008-06-23 04:41:59 +0000	[diff] [blame]	467
Guido van Rossum	52dbbb9	2008-08-18 21:44:30 +0000	[diff] [blame]	468	Example: ``quote_plus('/El Niño/')`` yields ``'%2FEl+Ni%C3%B1o%2F'``.
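
The handling of literal plus signs can be sketched like this (the input
strings are invented):

.. code-block:: python

   from urllib.parse import quote_plus

   # Spaces become '+', while a literal '+' is escaped as %2B.
   escaped = quote_plus('a b+c')               # 'a+b%2Bc'

   # Listing '+' in safe leaves literal plus signs untouched.
   kept = quote_plus('a b+c', safe='+')        # 'a+b+c'

   # '/' is not safe by default here, unlike in quote().
   slash = quote_plus('a/b')                   # 'a%2Fb'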
Senthil Kumaran	aca8fd7	2008-06-23 04:41:59 +0000	[diff] [blame]	469
Georg Brandl	7f01a13	2009-09-16 15:58:14 +0000	[diff] [blame]	470
				471	.. function:: quote_from_bytes(bytes, safe='/')
Guido van Rossum	52dbbb9	2008-08-18 21:44:30 +0000	[diff] [blame]	472
				473	Like :func:`quote`, but accepts a :class:`bytes` object rather than a
				474	:class:`str`, and does not perform string-to-bytes encoding.
				475
				476	Example: ``quote_from_bytes(b'a&\xef')`` yields
				477	``'a%26%EF'``.
				478
Georg Brandl	7f01a13	2009-09-16 15:58:14 +0000	[diff] [blame]	479
				480	.. function:: unquote(string, encoding='utf-8', errors='replace')
Senthil Kumaran	aca8fd7	2008-06-23 04:41:59 +0000	[diff] [blame]	481
				482	Replace ``%xx`` escapes by their single-character equivalent.
Guido van Rossum	52dbbb9	2008-08-18 21:44:30 +0000	[diff] [blame]	483	The optional encoding and errors parameters specify how to decode
				484	percent-encoded sequences into Unicode characters, as accepted by the
				485	:meth:`bytes.decode` method.
Senthil Kumaran	aca8fd7	2008-06-23 04:41:59 +0000	[diff] [blame]	486
Guido van Rossum	52dbbb9	2008-08-18 21:44:30 +0000	[diff] [blame]	487	string must be a :class:`str`.
				488
				489	encoding defaults to ``'utf-8'``.
				490	errors defaults to ``'replace'``, meaning invalid sequences are replaced
				491	by a placeholder character.
				492
				493	Example: ``unquote('/El%20Ni%C3%B1o/')`` yields ``'/El Niño/'``.
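
As a sketch of how encoding and errors interact, consider an escape that is
not valid UTF-8 on its own (``%F1`` is ``ñ`` in Latin-1):

.. code-block:: python

   from urllib.parse import unquote

   # The default errors='replace' substitutes U+FFFD for the bad byte.
   replaced = unquote('Ni%F1o')                      # 'Ni\ufffdo'

   # Decoding with the matching encoding recovers the character.
   latin = unquote('Ni%F1o', encoding='latin-1')     # 'Niño'

   # errors='strict' raises instead of substituting.
   try:
       unquote('Ni%F1o', errors='strict')
       strict_failed = False
   except UnicodeDecodeError:
       strict_failed = True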
Senthil Kumaran	aca8fd7	2008-06-23 04:41:59 +0000	[diff] [blame]	494
				495
Georg Brandl	7f01a13	2009-09-16 15:58:14 +0000	[diff] [blame]	496	.. function:: unquote_plus(string, encoding='utf-8', errors='replace')
Senthil Kumaran	aca8fd7	2008-06-23 04:41:59 +0000	[diff] [blame]	497
Georg Brandl	0f7ede4	2008-06-23 11:23:31 +0000	[diff] [blame]	498	Like :func:`unquote`, but also replace plus signs by spaces, as required for
Senthil Kumaran	aca8fd7	2008-06-23 04:41:59 +0000	[diff] [blame]	499	unquoting HTML form values.
				500
Guido van Rossum	52dbbb9	2008-08-18 21:44:30 +0000	[diff] [blame]	501	string must be a :class:`str`.
				502
				503	Example: ``unquote_plus('/El+Ni%C3%B1o/')`` yields ``'/El Niño/'``.
				504
Georg Brandl	7f01a13	2009-09-16 15:58:14 +0000	[diff] [blame]	505
Guido van Rossum	52dbbb9	2008-08-18 21:44:30 +0000	[diff] [blame]	506	.. function:: unquote_to_bytes(string)
				507
				508	Replace ``%xx`` escapes by their single-octet equivalent, and return a
				509	:class:`bytes` object.
				510
				511	string may be either a :class:`str` or a :class:`bytes`.
				512
				513	If it is a :class:`str`, unescaped non-ASCII characters in string
				514	are encoded into UTF-8 bytes.
				515
Nick Coghlan	9fc443c	2010-11-30 15:48:08 +0000	[diff] [blame]	516	Example: ``unquote_to_bytes('a%26%EF')`` yields ``b'a&\xef'``.
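
A further sketch covering both input types, including an unescaped non-ASCII
character in a :class:`str` argument:

.. code-block:: python

   from urllib.parse import unquote_to_bytes

   # The unescaped 'ñ' is encoded as UTF-8; '%26' becomes b'&'.
   from_text = unquote_to_bytes('ñ%26')          # b'\xc3\xb1&'

   # A bytes argument is accepted as well.
   from_bytes = unquote_to_bytes(b'a%26%EF')     # b'a&\xef'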
Guido van Rossum	52dbbb9	2008-08-18 21:44:30 +0000	[diff] [blame]	517
Senthil Kumaran	aca8fd7	2008-06-23 04:41:59 +0000	[diff] [blame]	518
Senthil Kumaran	df022da	2010-07-03 17:48:22 +0000	[diff] [blame]	519	.. function:: urlencode(query, doseq=False, safe='', encoding=None, errors=None)
Senthil Kumaran	aca8fd7	2008-06-23 04:41:59 +0000	[diff] [blame]	520
Senthil Kumaran	df022da	2010-07-03 17:48:22 +0000	[diff] [blame]	521	Convert a mapping object or a sequence of two-element tuples, which may
Martin Panter	cda85a0	2015-11-24 22:33:18 +0000	[diff] [blame^]	522	contain :class:`str` or :class:`bytes` objects, to a percent-encoded ASCII
				523	text string. If the resultant string is to be used as data for a POST
				524	operation with the :func:`~urllib.request.urlopen` function, then
				525	it should be encoded to bytes; otherwise it would result in a
				526	:exc:`TypeError`.
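A minimal sketch of preparing POST data, assuming a made-up set of parameters. No request is actually sent here; the point is only that the :class:`str` returned by :func:`urlencode` is converted to :class:`bytes` before being handed to :func:`~urllib.request.urlopen`.

```python
from urllib.parse import urlencode

# Hypothetical form fields, used only for illustration.
params = {'name': 'Ni\u00f1o', 'page': 2}

query = urlencode(params)      # a percent-encoded ASCII str
data = query.encode('ascii')   # bytes, suitable as the data argument

print(query)
print(data)
```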
Senthil Kumaran	6b3434a	2012-03-15 18:11:16 -0700	[diff] [blame]	527
Senthil Kumaran	df022da	2010-07-03 17:48:22 +0000	[diff] [blame]	528	The resulting string is a series of ``key=value`` pairs separated by ``'&'``
				529	characters, where both key and value are quoted using :func:`quote_plus`
				530	above. When a sequence of two-element tuples is used as the query
				531	argument, the first element of each tuple is a key and the second is a
				532	value. The value element can itself be a sequence; in that case, if
				533	the optional parameter doseq evaluates to ``True``, individual
				534	``key=value`` pairs separated by ``'&'`` are generated for each element of
				535	the value sequence for the key. The order of parameters in the encoded
Nick Coghlan	9fc443c	2010-11-30 15:48:08 +0000	[diff] [blame]	536	string will match the order of parameter tuples in the sequence.
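The effect of doseq can be sketched as follows, using an illustrative list of tuples. Without doseq, a sequence value is converted with :func:`str` and quoted as a single value; with ``doseq=True`` each element gets its own ``key=value`` pair.

```python
from urllib.parse import urlencode

# Illustrative input: the first value is itself a sequence.
pairs = [('tag', ['a b', 'c&d']), ('page', 1)]

# The list is stringified and quoted as one value.
without = urlencode(pairs)

# One key=value pair is generated per element of the list.
with_doseq = urlencode(pairs, doseq=True)

print(without)
print(with_doseq)
```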
Senthil Kumaran	df022da	2010-07-03 17:48:22 +0000	[diff] [blame]	537
R David Murray	8c4e112	2014-12-24 21:23:18 -0500	[diff] [blame]	538	The safe, encoding, and errors parameters are passed down to
				539	:func:`quote_plus` (the encoding and errors parameters are only passed
				540	when a query element is a :class:`str`).
Nick Coghlan	9fc443c	2010-11-30 15:48:08 +0000	[diff] [blame]	541
				542	To reverse this encoding process, :func:`parse_qs` and :func:`parse_qsl` are
				543	provided in this module to parse query strings into Python data structures.
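As a rough round-trip sketch (the sample data is made up), a sequence of pairs encoded with :func:`urlencode` can be recovered with :func:`parse_qsl`, while :func:`parse_qs` groups repeated keys into lists.

```python
from urllib.parse import parse_qs, parse_qsl, urlencode

original = [('q', 'El Ni\u00f1o'), ('tag', 'a'), ('tag', 'b')]

encoded = urlencode(original)

# parse_qsl() returns a list of (key, value) pairs in order.
as_list = parse_qsl(encoded)

# parse_qs() returns a dict mapping each key to a list of values.
as_dict = parse_qs(encoded)

print(encoded)
print(as_list)
print(as_dict)
```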
Senthil Kumaran	df022da	2010-07-03 17:48:22 +0000	[diff] [blame]	544
Senthil Kumaran	2933312	2011-02-11 11:25:47 +0000	[diff] [blame]	545	Refer to :ref:`urllib examples <urllib-examples>` to find out how the
				546	:func:`urlencode` function can be used to generate a query string for a URL or data for a POST request.
				547
Senthil Kumaran	df022da	2010-07-03 17:48:22 +0000	[diff] [blame]	548	.. versionchanged:: 3.2
Georg Brandl	67b21b7	2010-08-17 15:07:14 +0000	[diff] [blame]	549	The query parameter supports bytes and string objects.
Senthil Kumaran	aca8fd7	2008-06-23 04:41:59 +0000	[diff] [blame]	550
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	551
				552	.. seealso::
				553
Senthil Kumaran	6257bdd	2010-04-22 05:53:18 +0000	[diff] [blame]	554	:rfc:`3986` - Uniform Resource Identifiers
Senthil Kumaran	fe9230a	2011-06-19 13:52:49 -0700	[diff] [blame]	555	This is the current standard (STD66). Any changes to the urllib.parse module
Senthil Kumaran	6257bdd	2010-04-22 05:53:18 +0000	[diff] [blame]	556	should conform to this. Certain deviations could be observed, which are
Georg Brandl	6faee4e	2010-09-21 14:48:28 +0000	[diff] [blame]	557	mostly for backward compatibility purposes and for certain de-facto
Senthil Kumaran	6257bdd	2010-04-22 05:53:18 +0000	[diff] [blame]	558	parsing requirements as commonly observed in major browsers.
				559
				560	:rfc:`2732` - Format for Literal IPv6 Addresses in URL's.
				561	This specifies the parsing requirements of IPv6 URLs.
				562
				563	:rfc:`2396` - Uniform Resource Identifiers (URI): Generic Syntax
				564	Document describing the generic syntactic requirements for both Uniform Resource
				565	Names (URNs) and Uniform Resource Locators (URLs).
				566
				567	:rfc:`2368` - The mailto URL scheme.
				568	Parsing requirements for mailto URL schemes.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	569
				570	:rfc:`1808` - Relative Uniform Resource Locators
				571	This Request For Comments includes the rules for joining an absolute and a
				572	relative URL, including a fair number of "Abnormal Examples" which govern the
				573	treatment of border cases.
				574
Senthil Kumaran	6257bdd	2010-04-22 05:53:18 +0000	[diff] [blame]	575	:rfc:`1738` - Uniform Resource Locators (URL)
				576	This specifies the formal syntax and semantics of absolute URLs.