:mod:`cgi` --- Common Gateway Interface support
===============================================

.. module:: cgi
   :synopsis: Helpers for running Python scripts via the Common Gateway Interface.

**Source code:** :source:`Lib/cgi.py`

.. index::
   pair: WWW; server
   pair: CGI; protocol
   pair: HTTP; protocol
   pair: MIME; headers
   single: URL
   single: Common Gateway Interface

--------------

Support module for Common Gateway Interface (CGI) scripts.

This module defines a number of utilities for use by CGI scripts written in
Python.


Introduction
------------

.. _cgi-intro:

A CGI script is invoked by an HTTP server, usually to process user input
submitted through an HTML ``<FORM>`` or ``<ISINDEX>`` element.

Most often, CGI scripts live in the server's special :file:`cgi-bin` directory.
The HTTP server places all sorts of information about the request (such as the
client's hostname, the requested URL, the query string, and lots of other
goodies) in the script's shell environment, executes the script, and sends the
script's output back to the client.

The script's input is connected to the client too, and sometimes the form data
is read this way; at other times the form data is passed via the "query string"
part of the URL. This module is intended to take care of the different cases
and provide a simpler interface to the Python script. It also provides a number
of utilities that help in debugging scripts, and the latest addition is support
for file uploads from a form (if your browser supports it).

The output of a CGI script should consist of two sections, separated by a blank
line. The first section contains a number of headers, telling the client what
kind of data is following. Python code to generate a minimal header section
looks like this::

   print("Content-Type: text/html")    # HTML is following
   print()                             # blank line, end of headers

The second section is usually HTML, which allows the client software to display
nicely formatted text with header, in-line images, etc. Here's Python code that
prints a simple piece of HTML::

   print("<TITLE>CGI script output</TITLE>")
   print("<H1>This is my first CGI script</H1>")
   print("Hello, world!")

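The two sections described above can be combined into one small helper. The
following is only a sketch: the function name ``build_response`` and the
``charset`` parameter on the header are assumptions made for illustration, not
part of this module.

```python
import html


def build_response(title: str, message: str) -> str:
    """Return a full CGI response: headers, a blank line, then the HTML body."""
    headers = ["Content-Type: text/html; charset=utf-8"]
    body = [
        "<TITLE>{}</TITLE>".format(html.escape(title)),
        "<H1>{}</H1>".format(html.escape(title)),
        html.escape(message),
    ]
    # The empty string between the two lists becomes the blank line
    # that separates the header section from the body.
    return "\n".join(headers + [""] + body) + "\n"


if __name__ == "__main__":
    print(build_response("CGI script output", "Hello, world!"), end="")
```

Building the whole response as one string makes it easy to check that the
blank separator line is present before anything is written to standard output.
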

.. _using-the-cgi-module:

Using the cgi module
--------------------

Begin by writing ``import cgi``.

When you write a new script, consider adding these lines::

   import cgitb
   cgitb.enable()

This activates a special exception handler that will display detailed reports in
the Web browser if any errors occur. If you'd rather not show the guts of your
program to users of your script, you can have the reports saved to files
instead, with code like this::

   import cgitb
   cgitb.enable(display=0, logdir="/path/to/logdir")

It's very helpful to use this feature during script development. The reports
produced by :mod:`cgitb` provide information that can save you a lot of time in
tracking down bugs. You can always remove the ``cgitb`` line later when you
have tested your script and are confident that it works correctly.

To get at submitted form data, use the :class:`FieldStorage` class. If the form
contains non-ASCII characters, use the *encoding* keyword parameter set to the
value of the encoding defined for the document. (The encoding is usually given
in the META tag in the HEAD section of the HTML document or by the
:mailheader:`Content-Type` header.) :class:`FieldStorage` reads the form
contents from the standard input or the environment (depending on the value of
various environment variables set according to the CGI standard). Since it may
consume standard input, it should be instantiated only once.

The :class:`FieldStorage` instance can be indexed like a Python dictionary.
It allows membership testing with the :keyword:`in` operator, and also supports
the standard dictionary method :meth:`~dict.keys` and the built-in function
:func:`len`. Form fields containing empty strings are ignored and do not appear
in the dictionary; to keep such values, provide a true value for the optional
*keep_blank_values* keyword parameter when creating the :class:`FieldStorage`
instance.

For instance, the following code (which assumes that the
:mailheader:`Content-Type` header and blank line have already been printed)
checks that the fields ``name`` and ``addr`` are both set to a non-empty
string::

   form = cgi.FieldStorage()
   if "name" not in form or "addr" not in form:
       print("<H1>Error</H1>")
       print("Please fill in the name and addr fields.")
   else:
       print("<p>name:", form["name"].value)
       print("<p>addr:", form["addr"].value)
       # ...further form processing here...

Here the fields, accessed through ``form[key]``, are themselves instances of
:class:`FieldStorage` (or :class:`MiniFieldStorage`, depending on the form
encoding). The :attr:`~FieldStorage.value` attribute of the instance yields
the string value of the field. The :meth:`~FieldStorage.getvalue` method
returns this string value directly; it also accepts an optional second argument
as a default to return if the requested key is not present.

If the submitted form data contains more than one field with the same name, the
object retrieved by ``form[key]`` is not a :class:`FieldStorage` or
:class:`MiniFieldStorage` instance but a list of such instances. Similarly, in
this situation, ``form.getvalue(key)`` would return a list of strings. If you
expect this possibility (when your HTML form contains multiple fields with the
same name), use the :meth:`~FieldStorage.getlist` method, which always returns
a list of values (so that you do not need to special-case the single item
case). For example, this code concatenates any number of username fields,
separated by commas::

   value = form.getlist("username")
   usernames = ",".join(value)

If a field represents an uploaded file, accessing the value via the
:attr:`~FieldStorage.value` attribute or the :meth:`~FieldStorage.getvalue`
method reads the entire file into memory as bytes. This may not be what you
want. You can test for an uploaded file by testing either the
:attr:`~FieldStorage.filename` attribute or the :attr:`~FieldStorage.file`
attribute. You can then read the data from the :attr:`!file`
attribute before it is automatically closed as part of the garbage collection of
the :class:`FieldStorage` instance
(the :meth:`~io.RawIOBase.read` and :meth:`~io.IOBase.readline` methods will
return bytes)::

   fileitem = form["userfile"]
   if fileitem.file:
       # It's an uploaded file; count lines
       linecount = 0
       while True:
           line = fileitem.file.readline()
           if not line:
               break
           linecount += 1

:class:`FieldStorage` objects also support being used in a :keyword:`with`
statement, which will automatically close them when done.

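The line-counting loop above can also be written by iterating over the file
object. The sketch below does not use this module at all: an
:class:`io.BytesIO` object stands in for the :attr:`!file` attribute of an
uploaded field, and the helper name ``count_lines`` is hypothetical.

```python
import io


def count_lines(fileobj) -> int:
    """Count lines in a binary file object without reading it all at once."""
    # Iterating over a file object yields one line at a time, so memory use
    # stays proportional to the longest line rather than to the whole upload.
    return sum(1 for _ in fileobj)


# io.BytesIO is a stand-in (an assumption of this sketch) for the ``file``
# attribute of an uploaded field; the with block closes it when done.
with io.BytesIO(b"first\nsecond\nthird\n") as upload:
    linecount = count_lines(upload)
```

A final line without a trailing newline still counts as a line, which matches
the behaviour of the ``readline()`` loop.
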
If an error is encountered when obtaining the contents of an uploaded file
(for example, when the user interrupts the form submission by clicking on
a Back or Cancel button) the :attr:`~FieldStorage.done` attribute of the
object for the field will be set to the value -1.

The file upload draft standard entertains the possibility of uploading multiple
files from one field (using a recursive :mimetype:`multipart/\*` encoding).
When this occurs, the item will be a dictionary-like :class:`FieldStorage` item.
This can be determined by testing its :attr:`!type` attribute, which should be
:mimetype:`multipart/form-data` (or perhaps another MIME type matching
:mimetype:`multipart/\*`). In this case, it can be iterated over recursively
just like the top-level form object.

When a form is submitted in the "old" format (as the query string or as a single
data part of type :mimetype:`application/x-www-form-urlencoded`), the items will
actually be instances of the class :class:`MiniFieldStorage`. In this case, the
:attr:`!list`, :attr:`!file`, and :attr:`filename` attributes are always ``None``.

A form submitted via POST that also has a query string will contain both
:class:`FieldStorage` and :class:`MiniFieldStorage` items.

.. versionchanged:: 3.4
   The :attr:`~FieldStorage.file` attribute is automatically closed upon the
   garbage collection of the creating :class:`FieldStorage` instance.

.. versionchanged:: 3.5
   Added support for the context management protocol to the
   :class:`FieldStorage` class.


Higher Level Interface
----------------------

The previous section explains how to read CGI form data using the
:class:`FieldStorage` class. This section describes a higher level interface
which was added to this class to allow one to do it in a more readable and
intuitive way. The interface doesn't make the techniques described in previous
sections obsolete --- they are still useful to process file uploads efficiently,
for example.

.. XXX: Is this true ?

The interface consists of two simple methods. Using the methods you can process
form data in a generic way, without the need to worry whether only one or more
values were posted under one name.

In the previous section, you learned to write the following code anytime you
expected a user to post more than one value under one name::

   item = form.getvalue("item")
   if isinstance(item, list):
       # The user is requesting more than one item.
       pass
   else:
       # The user is requesting only one item.
       pass

This situation is common, for example, when a form contains a group of multiple
checkboxes with the same name::

   <input type="checkbox" name="item" value="1" />
   <input type="checkbox" name="item" value="2" />

In most situations, however, there's only one form control with a particular
name in a form and then you expect and need only one value associated with this
name. So you write a script containing, for example, this code::

   user = form.getvalue("user").upper()

The problem with the code is that you should never expect that a client will
provide valid input to your scripts. For example, if a curious user appends
another ``user=foo`` pair to the query string, then the script would crash,
because in this situation the ``getvalue("user")`` method call returns a list
instead of a string. Calling the :meth:`~str.upper` method on a list is not valid
(since lists do not have a method of this name) and results in an
:exc:`AttributeError` exception.

Therefore, the appropriate way to read form data values was to always use the
code which checks whether the obtained value is a single value or a list of
values. That's annoying and leads to less readable scripts.

A more convenient approach is to use the methods :meth:`~FieldStorage.getfirst`
and :meth:`~FieldStorage.getlist` provided by this higher level interface.


.. method:: FieldStorage.getfirst(name, default=None)

   This method always returns only one value associated with form field *name*.
   The method returns only the first value if more than one value was posted
   under that name. Please note that the order in which the values are received
   may vary from browser to browser and should not be counted on. [#]_ If no such
   form field or value exists then the method returns the value specified by the
   optional parameter *default*. This parameter defaults to ``None`` if not
   specified.


.. method:: FieldStorage.getlist(name)

   This method always returns a list of values associated with form field *name*.
   The method returns an empty list if no such form field or value exists for
   *name*. It returns a list consisting of one item if only one such value exists.

Using these methods you can write nice compact code::

   import cgi
   form = cgi.FieldStorage()
   user = form.getfirst("user", "").upper()    # This way it's safe.
   for item in form.getlist("item"):
       do_something(item)

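The behaviour of the two methods can be illustrated without a running web
server. The sketch below is not how :class:`FieldStorage` is implemented: it
applies the same "first value" and "always a list" rules to the dictionary
returned by :func:`urllib.parse.parse_qs`, and the helper names ``get_first``
and ``get_list`` are hypothetical.

```python
from urllib.parse import parse_qs


def get_list(fields: dict, name: str) -> list:
    """Always return a list, empty when the field is absent."""
    return list(fields.get(name, []))


def get_first(fields: dict, name: str, default=None):
    """Return the first value posted under ``name``, or ``default``."""
    values = fields.get(name)
    return values[0] if values else default


# A query string in which a curious user has repeated the ``user`` field.
fields = parse_qs("user=alice&user=foo&item=1&item=2")

user = get_first(fields, "user", "").upper()
items = get_list(fields, "item")
missing = get_first(fields, "nickname", "anonymous")
```

Because ``get_first`` always yields a single value, the call to
:meth:`~str.upper` cannot fail on a list, whatever the client sends.
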

.. _functions-in-cgi-module:

Functions
---------

These are useful if you want more control, or if you want to employ some of the
algorithms implemented in this module in other circumstances.


.. function:: parse(fp=None, environ=os.environ, keep_blank_values=False, strict_parsing=False)

   Parse a query in the environment or from a file (the file defaults to
   ``sys.stdin``). The *keep_blank_values* and *strict_parsing* parameters are
   passed to :func:`urllib.parse.parse_qs` unchanged.


.. function:: parse_qs(qs, keep_blank_values=False, strict_parsing=False)

   This function is deprecated in this module. Use :func:`urllib.parse.parse_qs`
   instead. It is maintained here only for backward compatibility.

.. function:: parse_qsl(qs, keep_blank_values=False, strict_parsing=False)

   This function is deprecated in this module. Use :func:`urllib.parse.parse_qsl`
   instead. It is maintained here only for backward compatibility.

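As a brief illustration of the recommended replacements, the following sketch
calls :func:`urllib.parse.parse_qs` and :func:`urllib.parse.parse_qsl`
directly. The query string is made up for the example.

```python
from urllib.parse import parse_qs, parse_qsl

query = "name=Joe+Blow&addr=At+Home&note="

# parse_qs groups values by field name; blank values are dropped by default.
by_name = parse_qs(query)

# keep_blank_values=True retains fields that were submitted empty.
with_blanks = parse_qs(query, keep_blank_values=True)

# parse_qsl preserves the order of the pairs as they appear in the query.
pairs = parse_qsl(query)
```

Note that ``+`` in the query string is decoded to a space in both results.
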
.. function:: parse_multipart(fp, pdict)

   Parse input of type :mimetype:`multipart/form-data` (for file uploads).
   Arguments are *fp* for the input file and *pdict* for a dictionary containing
   other parameters in the :mailheader:`Content-Type` header.

   Returns a dictionary just like :func:`urllib.parse.parse_qs`: keys are the
   field names, each value is a list of values for that field. This is easy to
   use but not much good if you are expecting megabytes to be uploaded --- in
   that case, use the :class:`FieldStorage` class instead, which is much more
   flexible.

   Note that this does not parse nested multipart parts --- use
   :class:`FieldStorage` for that.


.. function:: parse_header(string)

   Parse a MIME header (such as :mailheader:`Content-Type`) into a main value and a
   dictionary of parameters.

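The "main value plus dictionary of parameters" shape can be demonstrated with
the :mod:`email` package. This is a sketch of an alternative route, not the
implementation of :func:`parse_header`; the helper name ``split_header`` is
hypothetical, and the approach shown only applies to a
:mailheader:`Content-Type` value.

```python
from email.message import EmailMessage


def split_header(value: str):
    """Split a Content-Type style header into (main value, parameters)."""
    msg = EmailMessage()
    msg["Content-Type"] = value
    # get_content_type() gives the lowercased main value; the parsed header
    # object exposes the remaining parameters as a read-only mapping.
    return msg.get_content_type(), dict(msg["Content-Type"].params)


main, params = split_header('text/html; charset="utf-8"')
```

Here ``main`` holds the media type and ``params`` maps each parameter name to
its unquoted value.
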

.. function:: test()

   Robust test CGI script, usable as main program. Writes minimal HTTP headers and
   formats all information provided to the script in HTML form.


.. function:: print_environ()

   Format the shell environment in HTML.


.. function:: print_form(form)

   Format a form in HTML.


.. function:: print_directory()

   Format the current directory in HTML.


.. function:: print_environ_usage()

   Print a list of useful (used by CGI) environment variables in HTML.


.. function:: escape(s, quote=False)

   Convert the characters ``'&'``, ``'<'`` and ``'>'`` in string *s* to HTML-safe
   sequences. Use this if you need to display text that might contain such
   characters in HTML. If the optional flag *quote* is true, the quotation mark
   character (``"``) is also translated; this helps for inclusion in an HTML
   attribute value delimited by double quotes, as in ``<a href="...">``. Note
   that single quotes are never translated.

   .. deprecated:: 3.2
      This function is unsafe because *quote* is false by default, and therefore
      deprecated. Use :func:`html.escape` instead.

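The recommended replacement, :func:`html.escape`, escapes quotation marks by
default. The short sketch below compares its two modes on a made-up string.

```python
import html

comment = '<b>"Tom" & Jerry\'s</b>'

# With the default quote=True, both kinds of quotation mark are escaped,
# so the result is safe inside a quoted attribute value.
in_attribute = html.escape(comment)

# quote=False leaves quotation marks alone; only &, < and > are translated.
in_text = html.escape(comment, quote=False)
```

Unlike the deprecated function described above, :func:`html.escape` also
translates single quotes when *quote* is true.
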

.. _cgi-security:

Caring about security
---------------------

.. index:: pair: CGI; security

There's one important rule: if you invoke an external program (via the
:func:`os.system` or :func:`os.popen` functions, or others with similar
functionality), make very sure you don't pass arbitrary strings received from
the client to the shell. This is a well-known security hole whereby clever
hackers anywhere on the Web can exploit a gullible CGI script to invoke
arbitrary shell commands. Even parts of the URL or field names cannot be
trusted, since the request doesn't have to come from your form!

To be on the safe side, if you must pass a string obtained from a form to a
shell command, you should make sure the string contains only alphanumeric
characters, dashes, underscores, and periods.

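The character rule above can be enforced with a regular expression. The sketch
below makes two assumptions: "alphanumeric" is read as ASCII letters and
digits, and the external program is started through :func:`subprocess.run`
with an argument list so that no shell is involved. The helper names and the
child command (the Python interpreter echoing its argument) are illustrative
only.

```python
import re
import subprocess
import sys

# Only ASCII letters, digits, dashes, underscores and periods are accepted.
SAFE_VALUE = re.compile(r"[A-Za-z0-9._-]+")


def is_safe(value: str) -> bool:
    """Return True if value consists solely of the allowed characters."""
    return SAFE_VALUE.fullmatch(value) is not None


def run_with_argument(value: str) -> str:
    """Run a child process with value as one argument, never via a shell."""
    if not is_safe(value):
        raise ValueError("unsafe characters in form value")
    # The argument list is handed to the program directly, so no shell
    # ever interprets the string. The child here is just the Python
    # interpreter echoing its argument back, as a stand-in command.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

Validating first and avoiding the shell second gives two independent layers:
a value that slips past one check is still not interpreted as a command.
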

Installing your CGI script on a Unix system
-------------------------------------------

Read the documentation for your HTTP server and check with your local system
administrator to find the directory where CGI scripts should be installed;
usually this is in a directory :file:`cgi-bin` in the server tree.

Make sure that your script is readable and executable by "others"; the Unix file
mode should be ``0o755`` octal (use ``chmod 0755 filename``). Make sure that the
first line of the script contains ``#!`` starting in column 1 followed by the
pathname of the Python interpreter, for instance::

   #!/usr/local/bin/python

Make sure the Python interpreter exists and is executable by "others".

Make sure that any files your script needs to read or write are readable or
writable, respectively, by "others" --- their mode should be ``0o644`` for
readable and ``0o666`` for writable. This is because, for security reasons, the
HTTP server executes your script as user "nobody", without any special
privileges. It can only read (write, execute) files that everybody can read
(write, execute). The current directory at execution time is also different (it
is usually the server's cgi-bin directory) and the set of environment variables
is also different from what you get when you log in. In particular, don't count
on the shell's search path for executables (:envvar:`PATH`) or the Python module
search path (:envvar:`PYTHONPATH`) to be set to anything interesting.

If you need to load modules from a directory which is not on Python's default
module search path, you can change the path in your script, before importing
other modules. For example::

   import sys
   sys.path.insert(0, "/usr/home/joe/lib/python")
   sys.path.insert(0, "/usr/local/lib/python")

(This way, the directory inserted last will be searched first!)

Instructions for non-Unix systems will vary; check your HTTP server's
documentation (it will usually have a section on CGI scripts).

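The permission advice in this section can be checked from Python. The sketch
below assumes Unix-style permission bits; the helper name
``usable_by_others`` and the temporary file name are hypothetical.

```python
import os
import stat
import tempfile


def usable_by_others(path: str) -> bool:
    """Return True if "others" may both read and execute the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    wanted = stat.S_IROTH | stat.S_IXOTH
    return mode & wanted == wanted


# Create a throwaway script, give it mode 0o755, and check the result.
with tempfile.TemporaryDirectory() as tmpdir:
    script = os.path.join(tmpdir, "hello.py")  # hypothetical script name
    with open(script, "w") as f:
        f.write("#!/usr/local/bin/python\nprint('Content-Type: text/plain')\n")
    os.chmod(script, 0o755)
    script_ok = usable_by_others(script)
    os.chmod(script, 0o700)
    private_ok = usable_by_others(script)
```

With mode ``0o755`` the check passes; with ``0o700`` it fails, which is the
situation in which a server running as "nobody" could not execute the script.
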

Testing your CGI script
-----------------------

Unfortunately, a CGI script will generally not run when you try it from the
command line, and a script that works perfectly from the command line may fail
mysteriously when run from the server. There's one reason why you should still
test your script from the command line: if it contains a syntax error, the
Python interpreter won't execute it at all, and the HTTP server will most likely
send a cryptic error to the client.

Assuming your script has no syntax errors, yet it does not work, you have no
choice but to read the next section.


Debugging CGI scripts
---------------------

.. index:: pair: CGI; debugging

First of all, check for trivial installation errors --- reading the section
above on installing your CGI script carefully can save you a lot of time. If
you wonder whether you have understood the installation procedure correctly, try
installing a copy of this module file (:file:`cgi.py`) as a CGI script. When
invoked as a script, the file will dump its environment and the contents of the
form in HTML form. Give it the right mode etc., and send it a request. If it's
installed in the standard :file:`cgi-bin` directory, it should be possible to
send it a request by entering a URL into your browser of the form::

   http://yourhostname/cgi-bin/cgi.py?name=Joe+Blow&addr=At+Home

If this gives an error of type 404, the server cannot find the script ---
perhaps you need to install it in a different directory. If it gives another
error, there's an installation problem that you should fix before trying to go
any further. If you get a nicely formatted listing of the environment and form
content (in this example, the fields should be listed as "addr" with value "At
Home" and "name" with value "Joe Blow"), the :file:`cgi.py` script has been
installed correctly. If you follow the same procedure for your own script, you
should now be able to debug it.

The next step could be to call the :mod:`cgi` module's :func:`test` function
from your script: replace its main code with the single statement ::

   cgi.test()

This should produce the same results as those obtained by installing the
:file:`cgi.py` file itself.

When an ordinary Python script raises an unhandled exception (for whatever
reason: a typo in a module name, a file that can't be opened, etc.), the
Python interpreter prints a nice traceback and exits. While the Python
interpreter will still do this when your CGI script raises an exception, most
likely the traceback will end up in one of the HTTP server's log files, or be
discarded altogether.

Fortunately, once you have managed to get your script to execute *some* code,
you can easily send tracebacks to the Web browser using the :mod:`cgitb` module.
If you haven't done so already, just add the lines::

   import cgitb
   cgitb.enable()

to the top of your script. Then try running it again; when a problem occurs,
you should see a detailed report that will likely make apparent the cause of the
crash.

If you suspect that there may be a problem in importing the :mod:`cgitb` module,
you can use an even more robust approach (which only uses built-in modules)::

   import sys
   sys.stderr = sys.stdout
   print("Content-Type: text/plain")
   print()
   # ...your code here...

This relies on the Python interpreter to print the traceback. The content type
of the output is set to plain text, which disables all HTML processing. If your
script works, the raw HTML will be displayed by your client. If it raises an
exception, most likely after the first two lines have been printed, a traceback
will be displayed. Because no HTML interpretation is going on, the traceback
will be readable.

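A variation on the plain-text approach is to catch the exception yourself and
write the traceback explicitly. The following is only a sketch, using the
standard :mod:`traceback` module; the names ``run_as_plain_text`` and
``broken_main`` are hypothetical, and an :class:`io.StringIO` buffer stands in
for the script's standard output.

```python
import io
import sys
import traceback


def run_as_plain_text(main, out=sys.stdout):
    """Call main(); on failure, write the traceback to out as plain text."""
    out.write("Content-Type: text/plain\n\n")
    try:
        main()
    except Exception:
        # format_exc() returns the same text the interpreter would print.
        out.write(traceback.format_exc())


def broken_main():
    """Hypothetical script body that fails, used only for demonstration."""
    raise RuntimeError("something went wrong")


buffer = io.StringIO()
run_as_plain_text(broken_main, out=buffer)
report = buffer.getvalue()
```

Passing the output stream as a parameter makes the wrapper easy to try from
the command line before installing the script on the server.
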

Common problems and solutions
-----------------------------

* Most HTTP servers buffer the output from CGI scripts until the script is
  completed. This means that it is not possible to display a progress report on
  the client's display while the script is running.

* Check the installation instructions above.

* Check the HTTP server's log files. (``tail -f logfile`` in a separate window
  may be useful!)

* Always check a script for syntax errors first, by doing something like
  ``python script.py``.

* If your script does not have any syntax errors, try adding ``import cgitb;
  cgitb.enable()`` to the top of the script.

* When invoking external programs, make sure they can be found. Usually, this
  means using absolute path names --- :envvar:`PATH` is usually not set to a very
  useful value in a CGI script.

* When reading or writing external files, make sure they can be read or written
  by the userid under which your CGI script will be running: this is typically the
  userid under which the web server is running, or some explicitly specified
  userid for a web server's ``suexec`` feature.

* Don't try to give a CGI script a set-uid mode. This doesn't work on most
  systems, and is a security liability as well.

.. rubric:: Footnotes

.. [#] Note that some recent versions of the HTML specification do state what
   order the field values should be supplied in, but knowing whether a request
   was received from a conforming browser, or even from a browser at all, is
   tedious and error-prone.