Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 1 | |
| 2 | :mod:`cookielib` --- Cookie handling for HTTP clients |
| 3 | ===================================================== |
| 4 | |
| 5 | .. module:: cookielib |
| 6 | :synopsis: Classes for automatic handling of HTTP cookies. |
| 7 | .. moduleauthor:: John J. Lee <jjl@pobox.com> |
| 8 | .. sectionauthor:: John J. Lee <jjl@pobox.com> |
| 9 | |
| 10 | |
| 11 | .. versionadded:: 2.4 |
| 12 | |
| 13 | |
| 14 | |
| 15 | The :mod:`cookielib` module defines classes for automatic handling of HTTP |
| 16 | cookies. It is useful for accessing web sites that require small pieces of data |
| 17 | -- :dfn:`cookies` -- to be set on the client machine by an HTTP response from a |
| 18 | web server, and then returned to the server in later HTTP requests. |
| 19 | |
| 20 | Both the regular Netscape cookie protocol and the protocol defined by |
| 21 | :rfc:`2965` are handled. RFC 2965 handling is switched off by default. |
| 22 | :rfc:`2109` cookies are parsed as Netscape cookies and subsequently treated |
| 23 | either as Netscape or RFC 2965 cookies according to the 'policy' in effect. |
| 24 | Note that the great majority of cookies on the Internet are Netscape cookies. |
| 25 | :mod:`cookielib` attempts to follow the de-facto Netscape cookie protocol (which |
| 26 | differs substantially from that set out in the original Netscape specification), |
| 27 | including taking note of the ``max-age`` and ``port`` cookie-attributes |
| 28 | introduced with RFC 2965. |
| 29 | |
| 30 | .. note:: |
| 31 | |
| 32 | The various named parameters found in :mailheader:`Set-Cookie` and |
| 33 | :mailheader:`Set-Cookie2` headers (eg. ``domain`` and ``expires``) are |
| 34 | conventionally referred to as :dfn:`attributes`. To distinguish them from |
| 35 | Python attributes, the documentation for this module uses the term |
| 36 | :dfn:`cookie-attribute` instead. |
| 37 | |
| 38 | |
| 39 | The module defines the following exception: |
| 40 | |
| 41 | |
| 42 | .. exception:: LoadError |
| 43 | |
| 44 | Instances of :class:`FileCookieJar` raise this exception on failure to load |
| 45 | cookies from a file. |
| 46 | |
| 47 | .. note:: |
| 48 | |
| 49 | For backwards-compatibility with Python 2.4 (which raised an :exc:`IOError`), |
| 50 | :exc:`LoadError` is a subclass of :exc:`IOError`. |
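Because of this subclass relationship, code that catches :exc:`IOError` also catches :exc:`LoadError`. The following is a minimal sketch, assuming a throwaway temporary file that is deliberately not in ``cookies.txt`` format. The ``http.cookiejar`` fallback import is an assumption added only so the snippet also runs where the module is available under its Python 3 name.

```python
import os
import tempfile

try:
    import cookielib                      # Python 2 module name
except ImportError:
    import http.cookiejar as cookielib    # assumption: Python 3 name of the module

# Write a file that is deliberately NOT in the Mozilla cookies.txt format.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("this is not a cookie file\n")

jar = cookielib.MozillaCookieJar()
caught = None
try:
    jar.load(path)
except IOError as exc:                    # a LoadError is caught here as well
    caught = exc
finally:
    os.remove(path)

print(type(caught).__name__)
```

Catching :exc:`LoadError` first, and :exc:`IOError` second, lets a caller tell a malformed file apart from a missing one.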
| 51 | |
| 52 | |
| 53 | The following classes are provided: |
| 54 | |
| 55 | |
| 56 | .. class:: CookieJar(policy=None) |
| 57 | |
| 58 | *policy* is an object implementing the :class:`CookiePolicy` interface. |
| 59 | |
   The :class:`CookieJar` class stores HTTP cookies. It extracts cookies from HTTP
   responses, and returns them in HTTP requests. :class:`CookieJar` instances
   automatically expire contained cookies when necessary. Subclasses are also
   responsible for storing and retrieving cookies from a file or database.
| 64 | |
| 65 | |
| 66 | .. class:: FileCookieJar(filename, delayload=None, policy=None) |
| 67 | |
| 68 | *policy* is an object implementing the :class:`CookiePolicy` interface. For the |
| 69 | other arguments, see the documentation for the corresponding attributes. |
| 70 | |
| 71 | A :class:`CookieJar` which can load cookies from, and perhaps save cookies to, a |
| 72 | file on disk. Cookies are **NOT** loaded from the named file until either the |
| 73 | :meth:`load` or :meth:`revert` method is called. Subclasses of this class are |
| 74 | documented in section :ref:`file-cookie-jar-classes`. |
| 75 | |
| 76 | |
| 77 | .. class:: CookiePolicy() |
| 78 | |
| 79 | This class is responsible for deciding whether each cookie should be accepted |
| 80 | from / returned to the server. |
| 81 | |
| 82 | |
| 83 | .. class:: DefaultCookiePolicy( blocked_domains=None, allowed_domains=None, netscape=True, rfc2965=False, rfc2109_as_netscape=None, hide_cookie2=False, strict_domain=False, strict_rfc2965_unverifiable=True, strict_ns_unverifiable=False, strict_ns_domain=DefaultCookiePolicy.DomainLiberal, strict_ns_set_initial_dollar=False, strict_ns_set_path=False ) |
| 84 | |
| 85 | Constructor arguments should be passed as keyword arguments only. |
   *blocked_domains* is a sequence of domain names that we never accept cookies
   from, nor return cookies to. *allowed_domains*, if not :const:`None`, is a
   sequence of the only domains for which we accept and return cookies. For all
   other arguments, see the documentation for :class:`CookiePolicy` and
   :class:`DefaultCookiePolicy` objects.
| 91 | |
| 92 | :class:`DefaultCookiePolicy` implements the standard accept / reject rules for |
| 93 | Netscape and RFC 2965 cookies. By default, RFC 2109 cookies (ie. cookies |
| 94 | received in a :mailheader:`Set-Cookie` header with a version cookie-attribute of |
| 95 | 1) are treated according to the RFC 2965 rules. However, if RFC 2965 handling |
| 96 | is turned off or :attr:`rfc2109_as_netscape` is True, RFC 2109 cookies are |
| 97 | 'downgraded' by the :class:`CookieJar` instance to Netscape cookies, by |
| 98 | setting the :attr:`version` attribute of the :class:`Cookie` instance to 0. |
| 99 | :class:`DefaultCookiePolicy` also provides some parameters to allow some |
| 100 | fine-tuning of policy. |
| 101 | |
| 102 | |
| 103 | .. class:: Cookie() |
| 104 | |
| 105 | This class represents Netscape, RFC 2109 and RFC 2965 cookies. It is not |
| 106 | expected that users of :mod:`cookielib` construct their own :class:`Cookie` |
| 107 | instances. Instead, if necessary, call :meth:`make_cookies` on a |
| 108 | :class:`CookieJar` instance. |
| 109 | |
| 110 | |
| 111 | .. seealso:: |
| 112 | |
| 113 | Module :mod:`urllib2` |
| 114 | URL opening with automatic cookie handling. |
| 115 | |
| 116 | Module :mod:`Cookie` |
| 117 | HTTP cookie classes, principally useful for server-side code. The |
| 118 | :mod:`cookielib` and :mod:`Cookie` modules do not depend on each other. |
| 119 | |
| 120 | http://wwwsearch.sf.net/ClientCookie/ |
| 121 | Extensions to this module, including a class for reading Microsoft Internet |
| 122 | Explorer cookies on Windows. |
| 123 | |
| 124 | http://www.netscape.com/newsref/std/cookie_spec.html |
| 125 | The specification of the original Netscape cookie protocol. Though this is |
| 126 | still the dominant protocol, the 'Netscape cookie protocol' implemented by all |
| 127 | the major browsers (and :mod:`cookielib`) only bears a passing resemblance to |
| 128 | the one sketched out in ``cookie_spec.html``. |
| 129 | |
| 130 | :rfc:`2109` - HTTP State Management Mechanism |
| 131 | Obsoleted by RFC 2965. Uses :mailheader:`Set-Cookie` with version=1. |
| 132 | |
| 133 | :rfc:`2965` - HTTP State Management Mechanism |
| 134 | The Netscape protocol with the bugs fixed. Uses :mailheader:`Set-Cookie2` in |
| 135 | place of :mailheader:`Set-Cookie`. Not widely used. |
| 136 | |
| 137 | http://kristol.org/cookie/errata.html |
| 138 | Unfinished errata to RFC 2965. |
| 139 | |
| 140 | :rfc:`2964` - Use of HTTP State Management |
| 141 | |
| 142 | .. _cookie-jar-objects: |
| 143 | |
| 144 | CookieJar and FileCookieJar Objects |
| 145 | ----------------------------------- |
| 146 | |
:class:`CookieJar` objects support the :term:`iterator` protocol for iterating over
contained :class:`Cookie` objects.
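As a rough illustration of iteration, the sketch below fills a jar by hand and loops over it. ``make_cookie`` is a hypothetical helper, not part of the module; building :class:`Cookie` instances directly is done here only to keep the example offline, and the ``http.cookiejar`` fallback import is an assumption for running under the module's Python 3 name.

```python
try:
    import cookielib                      # Python 2 module name
except ImportError:
    import http.cookiejar as cookielib    # assumption: Python 3 name of the module


def make_cookie(name, value, domain, path="/"):
    """Hypothetical helper: build a version 0 session cookie by hand."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path=path, path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={})


jar = cookielib.CookieJar()
jar.set_cookie(make_cookie("sid", "abc123", "www.example.com"))
jar.set_cookie(make_cookie("lang", "en", "www.example.com"))

# Iterating over the jar yields Cookie objects; sort for a stable order.
names = sorted(cookie.name for cookie in jar)
print(names, len(jar))
```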
| 149 | |
| 150 | :class:`CookieJar` has the following methods: |
| 151 | |
| 152 | |
| 153 | .. method:: CookieJar.add_cookie_header(request) |
| 154 | |
| 155 | Add correct :mailheader:`Cookie` header to *request*. |
| 156 | |
| 157 | If policy allows (ie. the :attr:`rfc2965` and :attr:`hide_cookie2` attributes of |
| 158 | the :class:`CookieJar`'s :class:`CookiePolicy` instance are true and false |
| 159 | respectively), the :mailheader:`Cookie2` header is also added when appropriate. |
| 160 | |
   The *request* object (usually a :class:`urllib2.Request` instance) must support
   the methods :meth:`get_full_url`, :meth:`get_host`, :meth:`get_type`,
   :meth:`unverifiable`, :meth:`get_origin_req_host`, :meth:`has_header`,
   :meth:`get_header`, :meth:`header_items`, and :meth:`add_unredirected_header`, as
   documented by :mod:`urllib2`.
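A small offline sketch of this method follows. It assumes a hand-built cookie (``make_cookie`` is a hypothetical helper) and a request object that is never actually sent; the Python 3 fallback imports are assumptions for running outside Python 2.

```python
try:
    import cookielib                      # Python 2 module names
    from urllib2 import Request
except ImportError:
    import http.cookiejar as cookielib    # assumption: Python 3 equivalents
    from urllib.request import Request


def make_cookie(name, value, domain, path="/"):
    """Hypothetical helper: build a version 0 session cookie by hand."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path=path, path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={})


jar = cookielib.CookieJar()
jar.set_cookie(make_cookie("sid", "abc123", "www.example.com"))

# The request is only constructed, never opened, so no network is needed.
request = Request("http://www.example.com/")
jar.add_cookie_header(request)

cookie_header = request.get_header("Cookie")
print(cookie_header)
```

In normal use :class:`urllib2.HTTPCookieProcessor` makes this call for you.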
| 166 | |
| 167 | |
| 168 | .. method:: CookieJar.extract_cookies(response, request) |
| 169 | |
| 170 | Extract cookies from HTTP *response* and store them in the :class:`CookieJar`, |
| 171 | where allowed by policy. |
| 172 | |
| 173 | The :class:`CookieJar` will look for allowable :mailheader:`Set-Cookie` and |
| 174 | :mailheader:`Set-Cookie2` headers in the *response* argument, and store cookies |
| 175 | as appropriate (subject to the :meth:`CookiePolicy.set_ok` method's approval). |
| 176 | |
| 177 | The *response* object (usually the result of a call to :meth:`urllib2.urlopen`, |
| 178 | or similar) should support an :meth:`info` method, which returns an object with |
| 179 | a :meth:`getallmatchingheaders` method (usually a :class:`mimetools.Message` |
| 180 | instance). |
| 181 | |
| 182 | The *request* object (usually a :class:`urllib2.Request` instance) must support |
| 183 | the methods :meth:`get_full_url`, :meth:`get_host`, :meth:`unverifiable`, and |
| 184 | :meth:`get_origin_req_host`, as documented by :mod:`urllib2`. The request is |
| 185 | used to set default values for cookie-attributes as well as for checking that |
| 186 | the cookie is allowed to be set. |
| 187 | |
| 188 | |
| 189 | .. method:: CookieJar.set_policy(policy) |
| 190 | |
| 191 | Set the :class:`CookiePolicy` instance to be used. |
| 192 | |
| 193 | |
| 194 | .. method:: CookieJar.make_cookies(response, request) |
| 195 | |
| 196 | Return sequence of :class:`Cookie` objects extracted from *response* object. |
| 197 | |
| 198 | See the documentation for :meth:`extract_cookies` for the interfaces required of |
| 199 | the *response* and *request* arguments. |
| 200 | |
| 201 | |
| 202 | .. method:: CookieJar.set_cookie_if_ok(cookie, request) |
| 203 | |
| 204 | Set a :class:`Cookie` if policy says it's OK to do so. |
| 205 | |
| 206 | |
| 207 | .. method:: CookieJar.set_cookie(cookie) |
| 208 | |
| 209 | Set a :class:`Cookie`, without checking with policy to see whether or not it |
| 210 | should be set. |
| 211 | |
| 212 | |
| 213 | .. method:: CookieJar.clear([domain[, path[, name]]]) |
| 214 | |
| 215 | Clear some cookies. |
| 216 | |
| 217 | If invoked without arguments, clear all cookies. If given a single argument, |
| 218 | only cookies belonging to that *domain* will be removed. If given two arguments, |
| 219 | cookies belonging to the specified *domain* and URL *path* are removed. If |
| 220 | given three arguments, then the cookie with the specified *domain*, *path* and |
| 221 | *name* is removed. |
| 222 | |
| 223 | Raises :exc:`KeyError` if no matching cookie exists. |
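The different call forms can be sketched as follows, assuming a jar filled by hand through the hypothetical ``make_cookie`` helper (the ``http.cookiejar`` fallback import is likewise an assumption for Python 3).

```python
try:
    import cookielib                      # Python 2 module name
except ImportError:
    import http.cookiejar as cookielib    # assumption: Python 3 name of the module


def make_cookie(name, value, domain, path="/"):
    """Hypothetical helper: build a version 0 session cookie by hand."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path=path, path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={})


jar = cookielib.CookieJar()
jar.set_cookie(make_cookie("a", "1", "example.com", "/"))
jar.set_cookie(make_cookie("b", "2", "example.com", "/docs"))
jar.set_cookie(make_cookie("c", "3", "other.org", "/"))

jar.clear("example.com", "/docs", "b")    # one specific cookie
after_single = len(jar)

jar.clear("other.org")                    # every cookie for one domain
after_domain = len(jar)

try:
    jar.clear("missing.net")              # nothing stored for this domain
    missing_raised = False
except KeyError:
    missing_raised = True

jar.clear()                               # everything
after_all = len(jar)
print(after_single, after_domain, missing_raised, after_all)
```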
| 224 | |
| 225 | |
| 226 | .. method:: CookieJar.clear_session_cookies() |
| 227 | |
| 228 | Discard all session cookies. |
| 229 | |
| 230 | Discards all contained cookies that have a true :attr:`discard` attribute |
| 231 | (usually because they had either no ``max-age`` or ``expires`` cookie-attribute, |
| 232 | or an explicit ``discard`` cookie-attribute). For interactive browsers, the end |
| 233 | of a session usually corresponds to closing the browser window. |
| 234 | |
| 235 | Note that the :meth:`save` method won't save session cookies anyway, unless you |
| 236 | ask otherwise by passing a true *ignore_discard* argument. |
| 237 | |
| 238 | :class:`FileCookieJar` implements the following additional methods: |
| 239 | |
| 240 | |
| 241 | .. method:: FileCookieJar.save(filename=None, ignore_discard=False, ignore_expires=False) |
| 242 | |
| 243 | Save cookies to a file. |
| 244 | |
| 245 | This base class raises :exc:`NotImplementedError`. Subclasses may leave this |
| 246 | method unimplemented. |
| 247 | |
| 248 | *filename* is the name of file in which to save cookies. If *filename* is not |
| 249 | specified, :attr:`self.filename` is used (whose default is the value passed to |
| 250 | the constructor, if any); if :attr:`self.filename` is :const:`None`, |
| 251 | :exc:`ValueError` is raised. |
| 252 | |
   *ignore_discard*: save even cookies set to be discarded. *ignore_expires*: save
   even cookies that have expired.
| 255 | |
| 256 | The file is overwritten if it already exists, thus wiping all the cookies it |
| 257 | contains. Saved cookies can be restored later using the :meth:`load` or |
| 258 | :meth:`revert` methods. |
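The effect of *ignore_discard* can be sketched with a :class:`MozillaCookieJar` and a temporary directory. The cookies are built by hand through a hypothetical ``make_cookie`` helper, the file names are arbitrary, and the ``http.cookiejar`` fallback import is an assumption for Python 3.

```python
import os
import shutil
import tempfile
import time

try:
    import cookielib                      # Python 2 module name
except ImportError:
    import http.cookiejar as cookielib    # assumption: Python 3 name of the module


def make_cookie(name, value, expires=None):
    """Hypothetical helper: a cookie is a session cookie when expires is None."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain="example.com", domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=expires, discard=(expires is None),
        comment=None, comment_url=None, rest={})


jar = cookielib.MozillaCookieJar()
jar.set_cookie(make_cookie("persistent", "1", expires=int(time.time()) + 3600))
jar.set_cookie(make_cookie("session", "2"))

tmpdir = tempfile.mkdtemp()
try:
    default_path = os.path.join(tmpdir, "default.txt")
    full_path = os.path.join(tmpdir, "full.txt")
    jar.save(default_path)                    # session cookie is skipped
    jar.save(full_path, ignore_discard=True)  # session cookie is written

    reloaded_default = cookielib.MozillaCookieJar()
    reloaded_default.load(default_path)
    reloaded_full = cookielib.MozillaCookieJar()
    reloaded_full.load(full_path, ignore_discard=True)
finally:
    shutil.rmtree(tmpdir)

default_names = sorted(c.name for c in reloaded_default)
full_names = sorted(c.name for c in reloaded_full)
print(default_names, full_names)
```

Note that *ignore_discard* has to be passed to :meth:`load` as well, otherwise session cookies present in the file are skipped on the way back in.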
| 259 | |
| 260 | |
| 261 | .. method:: FileCookieJar.load(filename=None, ignore_discard=False, ignore_expires=False) |
| 262 | |
| 263 | Load cookies from a file. |
| 264 | |
| 265 | Old cookies are kept unless overwritten by newly loaded ones. |
| 266 | |
| 267 | Arguments are as for :meth:`save`. |
| 268 | |
| 269 | The named file must be in the format understood by the class, or |
| 270 | :exc:`LoadError` will be raised. Also, :exc:`IOError` may be raised, for |
| 271 | example if the file does not exist. |
| 272 | |
| 273 | .. note:: |
| 274 | |
| 275 | For backwards-compatibility with Python 2.4 (which raised an :exc:`IOError`), |
| 276 | :exc:`LoadError` is a subclass of :exc:`IOError`. |
| 277 | |
| 278 | |
| 279 | .. method:: FileCookieJar.revert(filename=None, ignore_discard=False, ignore_expires=False) |
| 280 | |
| 281 | Clear all cookies and reload cookies from a saved file. |
| 282 | |
| 283 | :meth:`revert` can raise the same exceptions as :meth:`load`. If there is a |
| 284 | failure, the object's state will not be altered. |
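The guarantee about failures can be illustrated with a file that does not exist. This is only a sketch: the cookie is built by hand through a hypothetical ``make_cookie`` helper, the missing file name is arbitrary, and the ``http.cookiejar`` fallback import is an assumption for Python 3.

```python
import os
import tempfile

try:
    import cookielib                      # Python 2 module name
except ImportError:
    import http.cookiejar as cookielib    # assumption: Python 3 name of the module


def make_cookie(name, value, domain):
    """Hypothetical helper: build a version 0 session cookie by hand."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={})


jar = cookielib.MozillaCookieJar()
jar.set_cookie(make_cookie("sid", "abc123", "example.com"))

tmpdir = tempfile.mkdtemp()
missing_path = os.path.join(tmpdir, "no-such-file.txt")
try:
    jar.revert(missing_path)              # the file does not exist
    failed = False
except IOError:
    failed = True
finally:
    os.rmdir(tmpdir)

# The cookie set before the failed revert() is still in the jar.
remaining = [cookie.name for cookie in jar]
print(failed, remaining)
```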
| 285 | |
| 286 | :class:`FileCookieJar` instances have the following public attributes: |
| 287 | |
| 288 | |
| 289 | .. attribute:: FileCookieJar.filename |
| 290 | |
| 291 | Filename of default file in which to keep cookies. This attribute may be |
| 292 | assigned to. |
| 293 | |
| 294 | |
| 295 | .. attribute:: FileCookieJar.delayload |
| 296 | |
| 297 | If true, load cookies lazily from disk. This attribute should not be assigned |
| 298 | to. This is only a hint, since this only affects performance, not behaviour |
| 299 | (unless the cookies on disk are changing). A :class:`CookieJar` object may |
| 300 | ignore it. None of the :class:`FileCookieJar` classes included in the standard |
| 301 | library lazily loads cookies. |
| 302 | |
| 303 | |
| 304 | .. _file-cookie-jar-classes: |
| 305 | |
| 306 | FileCookieJar subclasses and co-operation with web browsers |
| 307 | ----------------------------------------------------------- |
| 308 | |
The following :class:`CookieJar` subclasses are provided for reading and writing
cookie files. Further :class:`CookieJar` subclasses, including one that reads
Microsoft Internet Explorer cookies, are available at
http://wwwsearch.sf.net/ClientCookie/.
| 313 | |
| 314 | |
| 315 | .. class:: MozillaCookieJar(filename, delayload=None, policy=None) |
| 316 | |
| 317 | A :class:`FileCookieJar` that can load from and save cookies to disk in the |
| 318 | Mozilla ``cookies.txt`` file format (which is also used by the Lynx and Netscape |
| 319 | browsers). |
| 320 | |
| 321 | .. note:: |
| 322 | |
| 323 | This loses information about RFC 2965 cookies, and also about newer or |
| 324 | non-standard cookie-attributes such as ``port``. |
| 325 | |
| 326 | .. warning:: |
| 327 | |
| 328 | Back up your cookies before saving if you have cookies whose loss / corruption |
| 329 | would be inconvenient (there are some subtleties which may lead to slight |
| 330 | changes in the file over a load / save round-trip). |
| 331 | |
| 332 | Also note that cookies saved while Mozilla is running will get clobbered by |
| 333 | Mozilla. |
| 334 | |
| 335 | |
| 336 | .. class:: LWPCookieJar(filename, delayload=None, policy=None) |
| 337 | |
   A :class:`FileCookieJar` that can load from and save cookies to disk in a format
   compatible with the libwww-perl library's ``Set-Cookie3`` file format. This is
   convenient if you want to store cookies in a human-readable file.
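To get a feel for the human-readable format, the sketch below saves one hand-built cookie and reads the file back as text. ``make_cookie`` is a hypothetical helper, the file name is arbitrary, and the ``http.cookiejar`` fallback import is an assumption for Python 3. The exact layout of each line is an implementation detail and may vary between versions.

```python
import os
import shutil
import tempfile

try:
    import cookielib                      # Python 2 module name
except ImportError:
    import http.cookiejar as cookielib    # assumption: Python 3 name of the module


def make_cookie(name, value, domain):
    """Hypothetical helper: build a version 0 session cookie by hand."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={})


jar = cookielib.LWPCookieJar()
jar.set_cookie(make_cookie("sid", "abc123", "example.com"))

tmpdir = tempfile.mkdtemp()
try:
    path = os.path.join(tmpdir, "cookies.lwp")
    jar.save(path, ignore_discard=True)   # keep the session cookie too
    with open(path) as f:
        text = f.read()
finally:
    shutil.rmtree(tmpdir)

print(text)
```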
| 341 | |
| 342 | |
| 343 | .. _cookie-policy-objects: |
| 344 | |
| 345 | CookiePolicy Objects |
| 346 | -------------------- |
| 347 | |
| 348 | Objects implementing the :class:`CookiePolicy` interface have the following |
| 349 | methods: |
| 350 | |
| 351 | |
| 352 | .. method:: CookiePolicy.set_ok(cookie, request) |
| 353 | |
| 354 | Return boolean value indicating whether cookie should be accepted from server. |
| 355 | |
| 356 | *cookie* is a :class:`cookielib.Cookie` instance. *request* is an object |
| 357 | implementing the interface defined by the documentation for |
| 358 | :meth:`CookieJar.extract_cookies`. |
| 359 | |
| 360 | |
| 361 | .. method:: CookiePolicy.return_ok(cookie, request) |
| 362 | |
| 363 | Return boolean value indicating whether cookie should be returned to server. |
| 364 | |
| 365 | *cookie* is a :class:`cookielib.Cookie` instance. *request* is an object |
| 366 | implementing the interface defined by the documentation for |
| 367 | :meth:`CookieJar.add_cookie_header`. |
| 368 | |
| 369 | |
| 370 | .. method:: CookiePolicy.domain_return_ok(domain, request) |
| 371 | |
| 372 | Return false if cookies should not be returned, given cookie domain. |
| 373 | |
| 374 | This method is an optimization. It removes the need for checking every cookie |
| 375 | with a particular domain (which might involve reading many files). Returning |
| 376 | true from :meth:`domain_return_ok` and :meth:`path_return_ok` leaves all the |
| 377 | work to :meth:`return_ok`. |
| 378 | |
| 379 | If :meth:`domain_return_ok` returns true for the cookie domain, |
| 380 | :meth:`path_return_ok` is called for the cookie path. Otherwise, |
| 381 | :meth:`path_return_ok` and :meth:`return_ok` are never called for that cookie |
| 382 | domain. If :meth:`path_return_ok` returns true, :meth:`return_ok` is called |
| 383 | with the :class:`Cookie` object itself for a full check. Otherwise, |
| 384 | :meth:`return_ok` is never called for that cookie path. |
| 385 | |
| 386 | Note that :meth:`domain_return_ok` is called for every *cookie* domain, not just |
| 387 | for the *request* domain. For example, the function might be called with both |
| 388 | ``".example.com"`` and ``"www.example.com"`` if the request domain is |
| 389 | ``"www.example.com"``. The same goes for :meth:`path_return_ok`. |
| 390 | |
| 391 | The *request* argument is as documented for :meth:`return_ok`. |
| 392 | |
| 393 | |
| 394 | .. method:: CookiePolicy.path_return_ok(path, request) |
| 395 | |
| 396 | Return false if cookies should not be returned, given cookie path. |
| 397 | |
| 398 | See the documentation for :meth:`domain_return_ok`. |
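The calling sequence described above can be observed with a policy that records each call. ``RecordingPolicy`` and ``make_cookie`` are hypothetical names introduced for this sketch, and the Python 3 fallback imports are assumptions; no request is actually sent.

```python
try:
    import cookielib                      # Python 2 module names
    from urllib2 import Request
except ImportError:
    import http.cookiejar as cookielib    # assumption: Python 3 equivalents
    from urllib.request import Request

Base = cookielib.DefaultCookiePolicy


class RecordingPolicy(Base):
    """Hypothetical policy that logs which checks are made, and their result."""

    def __init__(self, *args, **kwargs):
        Base.__init__(self, *args, **kwargs)
        self.calls = []

    def domain_return_ok(self, domain, request):
        ok = Base.domain_return_ok(self, domain, request)
        self.calls.append(("domain_return_ok", domain, ok))
        return ok

    def path_return_ok(self, path, request):
        ok = Base.path_return_ok(self, path, request)
        self.calls.append(("path_return_ok", path, ok))
        return ok

    def return_ok(self, cookie, request):
        ok = Base.return_ok(self, cookie, request)
        self.calls.append(("return_ok", cookie.name, ok))
        return ok


def make_cookie(name, value, domain, path):
    """Hypothetical helper: build a version 0 session cookie by hand."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path=path, path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={})


policy = RecordingPolicy()
jar = cookielib.CookieJar(policy)
jar.set_cookie(make_cookie("sid", "abc123", "www.example.com", "/"))
jar.set_cookie(make_cookie("track", "xyz", "other.org", "/other"))

request = Request("http://www.example.com/")
jar.add_cookie_header(request)

for call in policy.calls:
    print(call)
```

The domain check for ``other.org`` fails, so neither the path check nor the full check is made for the cookie stored under that domain.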
| 399 | |
| 400 | In addition to implementing the methods above, implementations of the |
| 401 | :class:`CookiePolicy` interface must also supply the following attributes, |
| 402 | indicating which protocols should be used, and how. All of these attributes may |
| 403 | be assigned to. |
| 404 | |
| 405 | |
| 406 | .. attribute:: CookiePolicy.netscape |
| 407 | |
| 408 | Implement Netscape protocol. |
| 409 | |
| 410 | |
| 411 | .. attribute:: CookiePolicy.rfc2965 |
| 412 | |
| 413 | Implement RFC 2965 protocol. |
| 414 | |
| 415 | |
| 416 | .. attribute:: CookiePolicy.hide_cookie2 |
| 417 | |
| 418 | Don't add :mailheader:`Cookie2` header to requests (the presence of this header |
| 419 | indicates to the server that we understand RFC 2965 cookies). |
| 420 | |
| 421 | The most useful way to define a :class:`CookiePolicy` class is by subclassing |
| 422 | from :class:`DefaultCookiePolicy` and overriding some or all of the methods |
| 423 | above. :class:`CookiePolicy` itself may be used as a 'null policy' to allow |
| 424 | setting and receiving any and all cookies (this is unlikely to be useful). |
| 425 | |
| 426 | |
| 427 | .. _default-cookie-policy-objects: |
| 428 | |
| 429 | DefaultCookiePolicy Objects |
| 430 | --------------------------- |
| 431 | |
| 432 | Implements the standard rules for accepting and returning cookies. |
| 433 | |
| 434 | Both RFC 2965 and Netscape cookies are covered. RFC 2965 handling is switched |
| 435 | off by default. |
| 436 | |
The easiest way to provide your own policy is to subclass this class and call
its methods in your overriding implementations before adding your own additional
checks::
| 440 | |
   import cookielib

   class MyCookiePolicy(cookielib.DefaultCookiePolicy):
       def set_ok(self, cookie, request):
           # Apply the standard rules first, then any extra checks.
           if not cookielib.DefaultCookiePolicy.set_ok(self, cookie, request):
               return False
           # i_dont_want_to_store_this_cookie is a placeholder for your own test.
           if i_dont_want_to_store_this_cookie(cookie):
               return False
           return True
| 449 | |
| 450 | In addition to the features required to implement the :class:`CookiePolicy` |
| 451 | interface, this class allows you to block and allow domains from setting and |
| 452 | receiving cookies. There are also some strictness switches that allow you to |
| 453 | tighten up the rather loose Netscape protocol rules a little bit (at the cost of |
| 454 | blocking some benign cookies). |
| 455 | |
A domain blacklist and whitelist are provided (both off by default). Only domains
not in the blacklist and present in the whitelist (if the whitelist is active)
participate in cookie setting and returning. Use the *blocked_domains*
constructor argument, and the :meth:`blocked_domains` and
:meth:`set_blocked_domains` methods (and the corresponding argument and methods
for *allowed_domains*). If you set a whitelist, you can turn it off again by
setting it to :const:`None`.
| 463 | |
| 464 | Domains in block or allow lists that do not start with a dot must equal the |
| 465 | cookie domain to be matched. For example, ``"example.com"`` matches a blacklist |
| 466 | entry of ``"example.com"``, but ``"www.example.com"`` does not. Domains that do |
| 467 | start with a dot are matched by more specific domains too. For example, both |
| 468 | ``"www.example.com"`` and ``"www.coyote.example.com"`` match ``".example.com"`` |
| 469 | (but ``"example.com"`` itself does not). IP addresses are an exception, and |
| 470 | must match exactly. For example, if blocked_domains contains ``"192.168.1.2"`` |
| 471 | and ``".168.1.2"``, 192.168.1.2 is blocked, but 193.168.1.2 is not. |
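The matching rules above can be checked directly with :meth:`is_blocked` and :meth:`is_not_allowed`. This is a small sketch; the domain names are arbitrary and the ``http.cookiejar`` fallback import is an assumption for Python 3.

```python
try:
    import cookielib                      # Python 2 module name
except ImportError:
    import http.cookiejar as cookielib    # assumption: Python 3 name of the module

policy = cookielib.DefaultCookiePolicy(
    blocked_domains=["example.com", ".ads.net", "192.168.1.2", ".168.1.2"])

blocked = {
    "example.com": policy.is_blocked("example.com"),          # exact match
    "www.example.com": policy.is_blocked("www.example.com"),  # no leading dot in the entry
    "www.ads.net": policy.is_blocked("www.ads.net"),          # matched by ".ads.net"
    "ads.net": policy.is_blocked("ads.net"),                  # ".ads.net" does not match itself
    "192.168.1.2": policy.is_blocked("192.168.1.2"),          # IP addresses match exactly
    "193.168.1.2": policy.is_blocked("193.168.1.2"),          # ".168.1.2" is not a suffix rule for IPs
}

# The whitelist is off until it is set, and can be switched off again.
policy.set_allowed_domains([".example.com"])
not_allowed_other = policy.is_not_allowed("other.org")
not_allowed_www = policy.is_not_allowed("www.example.com")
policy.set_allowed_domains(None)
not_allowed_after_reset = policy.is_not_allowed("other.org")

for domain in sorted(blocked):
    print(domain, blocked[domain])
```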
| 472 | |
| 473 | :class:`DefaultCookiePolicy` implements the following additional methods: |
| 474 | |
| 475 | |
| 476 | .. method:: DefaultCookiePolicy.blocked_domains() |
| 477 | |
| 478 | Return the sequence of blocked domains (as a tuple). |
| 479 | |
| 480 | |
| 481 | .. method:: DefaultCookiePolicy.set_blocked_domains(blocked_domains) |
| 482 | |
| 483 | Set the sequence of blocked domains. |
| 484 | |
| 485 | |
| 486 | .. method:: DefaultCookiePolicy.is_blocked(domain) |
| 487 | |
| 488 | Return whether *domain* is on the blacklist for setting or receiving cookies. |
| 489 | |
| 490 | |
| 491 | .. method:: DefaultCookiePolicy.allowed_domains() |
| 492 | |
| 493 | Return :const:`None`, or the sequence of allowed domains (as a tuple). |
| 494 | |
| 495 | |
| 496 | .. method:: DefaultCookiePolicy.set_allowed_domains(allowed_domains) |
| 497 | |
| 498 | Set the sequence of allowed domains, or :const:`None`. |
| 499 | |
| 500 | |
| 501 | .. method:: DefaultCookiePolicy.is_not_allowed(domain) |
| 502 | |
| 503 | Return whether *domain* is not on the whitelist for setting or receiving |
| 504 | cookies. |
| 505 | |
| 506 | :class:`DefaultCookiePolicy` instances have the following attributes, which are |
| 507 | all initialised from the constructor arguments of the same name, and which may |
| 508 | all be assigned to. |
| 509 | |
| 510 | |
| 511 | .. attribute:: DefaultCookiePolicy.rfc2109_as_netscape |
| 512 | |
| 513 | If true, request that the :class:`CookieJar` instance downgrade RFC 2109 cookies |
| 514 | (ie. cookies received in a :mailheader:`Set-Cookie` header with a version |
| 515 | cookie-attribute of 1) to Netscape cookies by setting the version attribute of |
| 516 | the :class:`Cookie` instance to 0. The default value is :const:`None`, in which |
| 517 | case RFC 2109 cookies are downgraded if and only if RFC 2965 handling is turned |
| 518 | off. Therefore, RFC 2109 cookies are downgraded by default. |
| 519 | |
| 520 | .. versionadded:: 2.5 |
| 521 | |
| 522 | General strictness switches: |
| 523 | |
| 524 | |
| 525 | .. attribute:: DefaultCookiePolicy.strict_domain |
| 526 | |
   Don't allow sites to set two-component domains with country-code top-level
   domains like ``.co.uk``, ``.gov.uk``, ``.co.nz``, etc. This is far from perfect
   and isn't guaranteed to work!
| 530 | |
| 531 | RFC 2965 protocol strictness switches: |
| 532 | |
| 533 | |
| 534 | .. attribute:: DefaultCookiePolicy.strict_rfc2965_unverifiable |
| 535 | |
   Follow RFC 2965 rules on unverifiable transactions (usually, an unverifiable
   transaction is one resulting from a redirect or a request for an image hosted on
   another site). If this is false, cookies are *never* blocked on the basis of
   verifiability.
| 540 | |
| 541 | Netscape protocol strictness switches: |
| 542 | |
| 543 | |
| 544 | .. attribute:: DefaultCookiePolicy.strict_ns_unverifiable |
| 545 | |
   Apply RFC 2965 rules on unverifiable transactions even to Netscape cookies.
| 547 | |
| 548 | |
| 549 | .. attribute:: DefaultCookiePolicy.strict_ns_domain |
| 550 | |
| 551 | Flags indicating how strict to be with domain-matching rules for Netscape |
| 552 | cookies. See below for acceptable values. |
| 553 | |
| 554 | |
| 555 | .. attribute:: DefaultCookiePolicy.strict_ns_set_initial_dollar |
| 556 | |
   Ignore cookies in :mailheader:`Set-Cookie` headers that have names starting
   with ``'$'``.
| 558 | |
| 559 | |
| 560 | .. attribute:: DefaultCookiePolicy.strict_ns_set_path |
| 561 | |
| 562 | Don't allow setting cookies whose path doesn't path-match request URI. |
| 563 | |
:attr:`strict_ns_domain` is a collection of flags. Its value is constructed by
or-ing flags together (for example, ``DomainStrictNoDots|DomainStrictNonDomain``
means both flags are set).
| 567 | |
| 568 | |
| 569 | .. attribute:: DefaultCookiePolicy.DomainStrictNoDots |
| 570 | |
| 571 | When setting cookies, the 'host prefix' must not contain a dot (eg. |
| 572 | ``www.foo.bar.com`` can't set a cookie for ``.bar.com``, because ``www.foo`` |
| 573 | contains a dot). |
| 574 | |
| 575 | |
| 576 | .. attribute:: DefaultCookiePolicy.DomainStrictNonDomain |
| 577 | |
| 578 | Cookies that did not explicitly specify a ``domain`` cookie-attribute can only |
| 579 | be returned to a domain equal to the domain that set the cookie (eg. |
| 580 | ``spam.example.com`` won't be returned cookies from ``example.com`` that had no |
| 581 | ``domain`` cookie-attribute). |
| 582 | |
| 583 | |
| 584 | .. attribute:: DefaultCookiePolicy.DomainRFC2965Match |
| 585 | |
| 586 | When setting cookies, require a full RFC 2965 domain-match. |
| 587 | |
| 588 | The following attributes are provided for convenience, and are the most useful |
| 589 | combinations of the above flags: |
| 590 | |
| 591 | |
| 592 | .. attribute:: DefaultCookiePolicy.DomainLiberal |
| 593 | |
| 594 | Equivalent to 0 (ie. all of the above Netscape domain strictness flags switched |
| 595 | off). |
| 596 | |
| 597 | |
| 598 | .. attribute:: DefaultCookiePolicy.DomainStrict |
| 599 | |
| 600 | Equivalent to ``DomainStrictNoDots|DomainStrictNonDomain``. |
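A short sketch of combining and testing these flags follows. The particular combination is arbitrary and the ``http.cookiejar`` fallback import is an assumption for Python 3.

```python
try:
    import cookielib                      # Python 2 module name
except ImportError:
    import http.cookiejar as cookielib    # assumption: Python 3 name of the module

Policy = cookielib.DefaultCookiePolicy

# Combine individual flags with bitwise or.
flags = Policy.DomainStrictNoDots | Policy.DomainRFC2965Match
policy = Policy(strict_ns_domain=flags)

# Test for a single flag with bitwise and.
has_no_dots = bool(policy.strict_ns_domain & Policy.DomainStrictNoDots)
has_non_domain = bool(policy.strict_ns_domain & Policy.DomainStrictNonDomain)
print(has_no_dots, has_non_domain)

# The attribute may be reassigned later, for example to a convenience value.
policy.strict_ns_domain = Policy.DomainStrict
```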
| 601 | |
| 602 | |
| 603 | .. _cookielib-cookie-objects: |
| 604 | |
| 605 | Cookie Objects |
| 606 | -------------- |
| 607 | |
| 608 | :class:`Cookie` instances have Python attributes roughly corresponding to the |
| 609 | standard cookie-attributes specified in the various cookie standards. The |
| 610 | correspondence is not one-to-one, because there are complicated rules for |
| 611 | assigning default values, because the ``max-age`` and ``expires`` |
| 612 | cookie-attributes contain equivalent information, and because RFC 2109 cookies |
| 613 | may be 'downgraded' by :mod:`cookielib` from version 1 to version 0 (Netscape) |
| 614 | cookies. |
| 615 | |
| 616 | Assignment to these attributes should not be necessary other than in rare |
| 617 | circumstances in a :class:`CookiePolicy` method. The class does not enforce |
| 618 | internal consistency, so you should know what you're doing if you do that. |
| 619 | |
| 620 | |
| 621 | .. attribute:: Cookie.version |
| 622 | |
| 623 | Integer or :const:`None`. Netscape cookies have :attr:`version` 0. RFC 2965 and |
| 624 | RFC 2109 cookies have a ``version`` cookie-attribute of 1. However, note that |
| 625 | :mod:`cookielib` may 'downgrade' RFC 2109 cookies to Netscape cookies, in which |
| 626 | case :attr:`version` is 0. |
| 627 | |
| 628 | |
| 629 | .. attribute:: Cookie.name |
| 630 | |
| 631 | Cookie name (a string). |
| 632 | |
| 633 | |
| 634 | .. attribute:: Cookie.value |
| 635 | |
| 636 | Cookie value (a string), or :const:`None`. |
| 637 | |
| 638 | |
| 639 | .. attribute:: Cookie.port |
| 640 | |
| 641 | String representing a port or a set of ports (eg. '80', or '80,8080'), or |
| 642 | :const:`None`. |
| 643 | |
| 644 | |
| 645 | .. attribute:: Cookie.path |
| 646 | |
| 647 | Cookie path (a string, eg. ``'/acme/rocket_launchers'``). |
| 648 | |
| 649 | |
| 650 | .. attribute:: Cookie.secure |
| 651 | |
| 652 | True if cookie should only be returned over a secure connection. |
| 653 | |
| 654 | |
| 655 | .. attribute:: Cookie.expires |
| 656 | |
| 657 | Integer expiry date in seconds since epoch, or :const:`None`. See also the |
| 658 | :meth:`is_expired` method. |
| 659 | |
| 660 | |
| 661 | .. attribute:: Cookie.discard |
| 662 | |
| 663 | True if this is a session cookie. |
| 664 | |
| 665 | |
| 666 | .. attribute:: Cookie.comment |
| 667 | |
| 668 | String comment from the server explaining the function of this cookie, or |
| 669 | :const:`None`. |
| 670 | |
| 671 | |
| 672 | .. attribute:: Cookie.comment_url |
| 673 | |
| 674 | URL linking to a comment from the server explaining the function of this cookie, |
| 675 | or :const:`None`. |
| 676 | |
| 677 | |
| 678 | .. attribute:: Cookie.rfc2109 |
| 679 | |
| 680 | True if this cookie was received as an RFC 2109 cookie (ie. the cookie |
| 681 | arrived in a :mailheader:`Set-Cookie` header, and the value of the Version |
| 682 | cookie-attribute in that header was 1). This attribute is provided because |
| 683 | :mod:`cookielib` may 'downgrade' RFC 2109 cookies to Netscape cookies, in |
| 684 | which case :attr:`version` is 0. |
| 685 | |
| 686 | .. versionadded:: 2.5 |
| 687 | |
| 688 | |
| 689 | .. attribute:: Cookie.port_specified |
| 690 | |
| 691 | True if a port or set of ports was explicitly specified by the server (in the |
| 692 | :mailheader:`Set-Cookie` / :mailheader:`Set-Cookie2` header). |
| 693 | |
| 694 | |
| 695 | .. attribute:: Cookie.domain_specified |
| 696 | |
| 697 | True if a domain was explicitly specified by the server. |
| 698 | |
| 699 | |
| 700 | .. attribute:: Cookie.domain_initial_dot |
| 701 | |
| 702 | True if the domain explicitly specified by the server began with a dot |
| 703 | (``'.'``). |
| 704 | |
| 705 | Cookies may have additional non-standard cookie-attributes. These may be |
| 706 | accessed using the following methods: |
| 707 | |
| 708 | |
| 709 | .. method:: Cookie.has_nonstandard_attr(name) |
| 710 | |
| 711 | Return true if cookie has the named cookie-attribute. |
| 712 | |
| 713 | |
| 714 | .. method:: Cookie.get_nonstandard_attr(name, default=None) |
| 715 | |
| 716 | If cookie has the named cookie-attribute, return its value. Otherwise, return |
| 717 | *default*. |
| 718 | |
| 719 | |
| 720 | .. method:: Cookie.set_nonstandard_attr(name, value) |
| 721 | |
| 722 | Set the value of the named cookie-attribute. |
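These three methods might be used as in the following sketch. The cookie is built by hand purely for illustration, the attribute names ``HttpOnly`` and ``SameSite`` are arbitrary examples of non-standard cookie-attributes, and the ``http.cookiejar`` fallback import is an assumption for Python 3.

```python
try:
    import cookielib                      # Python 2 module name
except ImportError:
    import http.cookiejar as cookielib    # assumption: Python 3 name of the module

# The final "rest" mapping holds the non-standard cookie-attributes.
cookie = cookielib.Cookie(
    version=0, name="sid", value="abc123",
    port=None, port_specified=False,
    domain="example.com", domain_specified=False, domain_initial_dot=False,
    path="/", path_specified=True,
    secure=False, expires=None, discard=True,
    comment=None, comment_url=None, rest={"HttpOnly": None})

has_httponly = cookie.has_nonstandard_attr("HttpOnly")
samesite_before = cookie.get_nonstandard_attr("SameSite", "unset")

cookie.set_nonstandard_attr("SameSite", "Lax")
samesite_after = cookie.get_nonstandard_attr("SameSite")
print(has_httponly, samesite_before, samesite_after)
```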
| 723 | |
| 724 | The :class:`Cookie` class also defines the following method: |
| 725 | |
| 726 | |
.. method:: Cookie.is_expired([now=None])
| 728 | |
| 729 | True if cookie has passed the time at which the server requested it should |
| 730 | expire. If *now* is given (in seconds since the epoch), return whether the |
| 731 | cookie has expired at the specified time. |
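The *now* argument makes the check easy to demonstrate with fixed timestamps. In this sketch ``make_cookie`` is a hypothetical helper, the timestamps are arbitrary, and the ``http.cookiejar`` fallback import is an assumption for Python 3.

```python
try:
    import cookielib                      # Python 2 module name
except ImportError:
    import http.cookiejar as cookielib    # assumption: Python 3 name of the module


def make_cookie(expires):
    """Hypothetical helper: a cookie is a session cookie when expires is None."""
    return cookielib.Cookie(
        version=0, name="sid", value="abc123",
        port=None, port_specified=False,
        domain="example.com", domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=expires, discard=(expires is None),
        comment=None, comment_url=None, rest={})


timed = make_cookie(expires=1000)         # expiry in seconds since the epoch
before = timed.is_expired(now=999)
after = timed.is_expired(now=1001)

session = make_cookie(expires=None)       # no expiry time at all
session_expired = session.is_expired()
print(before, after, session_expired)
```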
| 732 | |
| 733 | |
| 734 | .. _cookielib-examples: |
| 735 | |
| 736 | Examples |
| 737 | -------- |
| 738 | |
| 739 | The first example shows the most common usage of :mod:`cookielib`:: |
| 740 | |
   import cookielib, urllib2
   cj = cookielib.CookieJar()
   opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
   r = opener.open("http://example.com/")
| 745 | |
| 746 | This example illustrates how to open a URL using your Netscape, Mozilla, or Lynx |
| 747 | cookies (assumes Unix/Netscape convention for location of the cookies file):: |
| 748 | |
   import os, cookielib, urllib2
   cj = cookielib.MozillaCookieJar()
   cj.load(os.path.join(os.environ["HOME"], ".netscape/cookies.txt"))
   opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
   r = opener.open("http://example.com/")
| 754 | |
| 755 | The next example illustrates the use of :class:`DefaultCookiePolicy`. Turn on |
| 756 | RFC 2965 cookies, be more strict about domains when setting and returning |
| 757 | Netscape cookies, and block some domains from setting cookies or having them |
| 758 | returned:: |
| 759 | |
   import urllib2
   from cookielib import CookieJar, DefaultCookiePolicy
   policy = DefaultCookiePolicy(
       rfc2965=True, strict_ns_domain=DefaultCookiePolicy.DomainStrict,
       blocked_domains=["ads.net", ".ads.net"])
   cj = CookieJar(policy)
   opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
   r = opener.open("http://example.com/")
| 768 | |