:mod:`cgi` --- Common Gateway Interface support.
================================================

.. module:: cgi
   :synopsis: Helpers for running Python scripts via the Common Gateway Interface.


.. index::
   pair: WWW; server
   pair: CGI; protocol
   pair: HTTP; protocol
   pair: MIME; headers
   single: URL
   single: Common Gateway Interface

Support module for Common Gateway Interface (CGI) scripts.

This module defines a number of utilities for use by CGI scripts written in
Python.


Introduction
------------

.. _cgi-intro:

A CGI script is invoked by an HTTP server, usually to process user input
submitted through an HTML ``<FORM>`` or ``<ISINDEX>`` element.

Most often, CGI scripts live in the server's special :file:`cgi-bin` directory.
The HTTP server places all sorts of information about the request (such as the
client's hostname, the requested URL, the query string, and lots of other
goodies) in the script's shell environment, executes the script, and sends the
script's output back to the client.

The script's input is connected to the client too, and sometimes the form data
is read this way; at other times the form data is passed via the "query string"
part of the URL.  This module is intended to take care of the different cases
and provide a simpler interface to the Python script.  It also provides a number
of utilities that help in debugging scripts, and the latest addition is support
for file uploads from a form (if your browser supports it).
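
The two delivery paths can be sketched roughly as follows.  This is only an
illustration, not the module's implementation: ``read_form_data`` is a
hypothetical helper, it handles URL-encoded data only, and it assumes the
standard CGI variables ``REQUEST_METHOD``, ``QUERY_STRING`` and
``CONTENT_LENGTH`` together with ``urllib.parse.parse_qs`` from current Python
versions::

   import io
   from urllib.parse import parse_qs

   def read_form_data(environ, stdin):
       """Return URL-encoded form fields as a dict mapping names to lists."""
       if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
           # POST: the server passes the form data on standard input.
           length = int(environ.get("CONTENT_LENGTH") or 0)
           raw = stdin.read(length).decode("ascii")
       else:
           # GET: the form data arrives in the query string.
           raw = environ.get("QUERY_STRING", "")
       return parse_qs(raw)

   get_fields = read_form_data(
       {"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Joe+Blow"},
       io.BytesIO(b""))
   post_fields = read_form_data(
       {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "12"},
       io.BytesIO(b"addr=At+Home"))

In a real script the arguments would be ``os.environ`` and
``sys.stdin.buffer``.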

The output of a CGI script should consist of two sections, separated by a blank
line.  The first section contains a number of headers, telling the client what
kind of data is following.  Python code to generate a minimal header section
looks like this::

   print("Content-Type: text/html")    # HTML is following
   print()                             # blank line, end of headers

The second section is usually HTML, which allows the client software to display
nicely formatted text with headings, in-line images, etc.  Here's Python code
that prints a simple piece of HTML::

   print("<TITLE>CGI script output</TITLE>")
   print("<H1>This is my first CGI script</H1>")
   print("Hello, world!")


.. _using-the-cgi-module:

Using the cgi module
--------------------

Begin by writing ``import cgi``.  Do not use ``from cgi import *`` --- the
module defines all sorts of names for its own use or for backward compatibility
that you don't want in your namespace.

When you write a new script, consider adding the line::

   import cgitb; cgitb.enable()

This activates a special exception handler that will display detailed reports in
the Web browser if any errors occur.  If you'd rather not show the guts of your
program to users of your script, you can have the reports saved to files
instead, with a line like this::

   import cgitb; cgitb.enable(display=0, logdir="/tmp")

It's very helpful to use this feature during script development.  The reports
produced by :mod:`cgitb` provide information that can save you a lot of time in
tracking down bugs.  You can always remove the ``cgitb`` line later when you
have tested your script and are confident that it works correctly.

To get at submitted form data, it's best to use the :class:`FieldStorage` class.
The other classes defined in this module are provided mostly for backward
compatibility.  Instantiate :class:`FieldStorage` exactly once, without
arguments.  This reads the form contents from standard input or the environment
(depending on the value of various environment variables set according to the
CGI standard).  Since it may consume standard input, it should be instantiated
only once.

The :class:`FieldStorage` instance can be indexed like a Python dictionary, and
also supports the standard dictionary methods :meth:`__contains__` and
:meth:`keys`.  The built-in :func:`len` is also supported.  Form fields
containing empty strings are ignored and do not appear in the dictionary; to
keep such values, provide a true value for the optional *keep_blank_values*
keyword parameter when creating the :class:`FieldStorage` instance.

For instance, the following code (which assumes that the
:mailheader:`Content-Type` header and blank line have already been printed)
checks that the fields ``name`` and ``addr`` are both set to a non-empty
string::

   form = cgi.FieldStorage()
   if "name" not in form or "addr" not in form:
       print("<H1>Error</H1>")
       print("Please fill in the name and addr fields.")
   else:
       print("<p>name:", form["name"].value)
       print("<p>addr:", form["addr"].value)
       # ...further form processing here...

Here the fields, accessed through ``form[key]``, are themselves instances of
:class:`FieldStorage` (or :class:`MiniFieldStorage`, depending on the form
encoding).  The :attr:`value` attribute of the instance yields the string value
of the field.  The :meth:`getvalue` method returns this string value directly;
it also accepts an optional second argument as a default to return if the
requested key is not present.

If the submitted form data contains more than one field with the same name, the
object retrieved by ``form[key]`` is not a :class:`FieldStorage` or
:class:`MiniFieldStorage` instance but a list of such instances.  Similarly, in
this situation, ``form.getvalue(key)`` would return a list of strings.  If you
expect this possibility (when your HTML form contains multiple fields with the
same name), use the :meth:`getlist` method, which always returns a list of
values (so that you do not need to special-case the single item case).  For
example, this code concatenates any number of username fields, separated by
commas::

   value = form.getlist("username")
   usernames = ",".join(value)

If a field represents an uploaded file, accessing the value via the
:attr:`value` attribute or the :meth:`getvalue` method reads the entire file
into memory as a string.  This may not be what you want.  You can test for an
uploaded file by testing either the :attr:`filename` attribute or the
:attr:`file` attribute.  You can then read the data at leisure from the
:attr:`file` attribute::

   fileitem = form["userfile"]
   if fileitem.file:
       # It's an uploaded file; count lines
       linecount = 0
       for line in fileitem.file:
           linecount += 1

If an error is encountered when obtaining the contents of an uploaded file
(for example, when the user interrupts the form submission by clicking on
a Back or Cancel button), the :attr:`done` attribute of the object for the
field will be set to the value ``-1``.

The file upload draft standard entertains the possibility of uploading multiple
files from one field (using a recursive :mimetype:`multipart/\*` encoding).
When this occurs, the item will be a dictionary-like :class:`FieldStorage` item.
This can be determined by testing its :attr:`type` attribute, which should be
:mimetype:`multipart/form-data` (or perhaps another MIME type matching
:mimetype:`multipart/\*`).  In this case, it can be iterated over recursively
just like the top-level form object.

When a form is submitted in the "old" format (as the query string or as a single
data part of type :mimetype:`application/x-www-form-urlencoded`), the items will
actually be instances of the class :class:`MiniFieldStorage`.  In this case, the
:attr:`list`, :attr:`file`, and :attr:`filename` attributes are always ``None``.


Higher Level Interface
----------------------

The previous section explains how to read CGI form data using the
:class:`FieldStorage` class.  This section describes a higher level interface
which was added to this class to allow one to do it in a more readable and
intuitive way.  The interface doesn't make the techniques described in previous
sections obsolete --- they are still useful to process file uploads efficiently,
for example.

.. XXX: Is this true ?

The interface consists of two simple methods.  Using the methods you can process
form data in a generic way, without the need to worry whether only one or more
values were posted under one name.

In the previous section, you learned to write the following code any time you
expected a user to post more than one value under one name::

   item = form.getvalue("item")
   if isinstance(item, list):
       # The user is requesting more than one item.
       items = item
   else:
       # The user is requesting only one item.
       items = [item]

This situation is common, for example, when a form contains a group of multiple
checkboxes with the same name::

   <input type="checkbox" name="item" value="1" />
   <input type="checkbox" name="item" value="2" />

In most situations, however, there's only one form control with a particular
name in a form, and then you expect and need only one value associated with this
name.  So you write a script containing, for example, this code::

   user = form.getvalue("user").upper()

The problem with the code is that you should never expect that a client will
provide valid input to your scripts.  For example, if a curious user appends
another ``user=foo`` pair to the query string, then the script would crash,
because in this situation the ``getvalue("user")`` method call returns a list
instead of a string.  Calling the :meth:`upper` method on a list is not valid
(since lists do not have a method of this name) and results in an
:exc:`AttributeError` exception.

Therefore, the appropriate way to read form data values was to always use the
code which checks whether the obtained value is a single value or a list of
values.  That's annoying and leads to less readable scripts.

A more convenient approach is to use the methods :meth:`getfirst` and
:meth:`getlist` provided by this higher level interface.


.. method:: FieldStorage.getfirst(name[, default])

   This method always returns only one value associated with form field *name*.
   If more than one value was posted under that name, the method returns only
   the first one.  Please note that the order in which the values are received
   may vary from browser to browser and should not be counted on. [#]_  If no such
   form field or value exists then the method returns the value specified by the
   optional parameter *default*.  This parameter defaults to ``None`` if not
   specified.


.. method:: FieldStorage.getlist(name)

   This method always returns a list of values associated with form field *name*.
   The method returns an empty list if no such form field or value exists for
   *name*.  It returns a list consisting of one item if only one such value exists.
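
The return values of both methods can be imitated with plain functions over the
dictionary returned by ``urllib.parse.parse_qs``.  The sketch below is not how
:class:`FieldStorage` is implemented; ``get_first`` and ``get_list`` are
hypothetical stand-ins used only to illustrate the behaviour::

   from urllib.parse import parse_qs

   def get_first(fields, name, default=None):
       """Return the first value posted under *name*, or *default*."""
       values = fields.get(name)
       return values[0] if values else default

   def get_list(fields, name):
       """Return every value posted under *name* as a (possibly empty) list."""
       return list(fields.get(name, []))

   # "user" is posted twice, "item" once, "color" not at all.
   fields = parse_qs("user=joe&user=foo&item=1")
   first_user = get_first(fields, "user")
   all_items = get_list(fields, "item")
   no_colors = get_list(fields, "color")

Whatever the client sends, ``get_first`` yields a single value and ``get_list``
yields a list, so the calling code needs no type check.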

Using these methods you can write nice compact code::

   import cgi
   form = cgi.FieldStorage()
   user = form.getfirst("user", "").upper()    # This way it's safe.
   for item in form.getlist("item"):
       do_something(item)


Old classes
-----------

These classes, present in earlier versions of the :mod:`cgi` module, are still
supported for backward compatibility.  New applications should use the
:class:`FieldStorage` class.

:class:`SvFormContentDict` stores single value form content as a dictionary; it
assumes each field name occurs in the form only once.

:class:`FormContentDict` stores multiple value form content as a dictionary (the
form items are lists of values).  Useful if your form contains multiple fields
with the same name.

Other classes (:class:`FormContent`, :class:`InterpFormContentDict`) are present
for backwards compatibility with really old applications only.  If you still use
these and would be inconvenienced if they disappeared from a future version of
this module, drop me a note.


.. _functions-in-cgi-module:

Functions
---------

These are useful if you want more control, or if you want to employ some of the
algorithms implemented in this module in other circumstances.


.. function:: parse(fp[, keep_blank_values[, strict_parsing]])

   Parse a query in the environment or from a file (the file defaults to
   ``sys.stdin``).  The *keep_blank_values* and *strict_parsing* parameters are
   passed to :func:`parse_qs` unchanged.


.. function:: parse_qs(qs[, keep_blank_values[, strict_parsing]])

   Parse a query string given as a string argument (data of type
   :mimetype:`application/x-www-form-urlencoded`).  Data are returned as a
   dictionary.  The dictionary keys are the unique query variable names and the
   values are lists of values for each name.

   The optional argument *keep_blank_values* is a flag indicating whether blank
   values in URL encoded queries should be treated as blank strings.  A true value
   indicates that blanks should be retained as blank strings.  The default false
   value indicates that blank values are to be ignored and treated as if they were
   not included.

   The optional argument *strict_parsing* is a flag indicating what to do with
   parsing errors.  If false (the default), errors are silently ignored.  If true,
   errors raise a :exc:`ValueError` exception.

   Use the :func:`urllib.urlencode` function to convert such dictionaries into
   query strings.
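
   A short sketch of these options.  It assumes the ``urllib.parse`` module of
   current Python 3 releases, where functions with the same names and
   parameters (``parse_qs`` and ``urlencode``) are available::

      from urllib.parse import parse_qs, urlencode

      # Every value is a list, even when a name occurs only once.
      fields = parse_qs("name=Joe+Blow&addr=At+Home&addr=Work")

      dropped = parse_qs("a=1&b=")                       # blank value ignored
      kept = parse_qs("a=1&b=", keep_blank_values=True)  # blank value retained

      try:
          parse_qs("a=1&b", strict_parsing=True)   # "b" has no "=" sign
          strict_error = False
      except ValueError:
          strict_error = True

      # doseq=True expands each list into repeated name=value pairs.
      query = urlencode(fields, doseq=True)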


.. function:: parse_qsl(qs[, keep_blank_values[, strict_parsing]])

   Parse a query string given as a string argument (data of type
   :mimetype:`application/x-www-form-urlencoded`).  Data are returned as a list of
   name, value pairs.

   The optional argument *keep_blank_values* is a flag indicating whether blank
   values in URL encoded queries should be treated as blank strings.  A true value
   indicates that blanks should be retained as blank strings.  The default false
   value indicates that blank values are to be ignored and treated as if they were
   not included.

   The optional argument *strict_parsing* is a flag indicating what to do with
   parsing errors.  If false (the default), errors are silently ignored.  If true,
   errors raise a :exc:`ValueError` exception.

   Use the :func:`urllib.urlencode` function to convert such lists of pairs into
   query strings.
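
   Because the result is a list, both the order of the pairs and any repeated
   names are preserved.  A brief sketch, assuming the ``urllib.parse`` versions
   of these functions found in current Python 3 releases::

      from urllib.parse import parse_qsl, urlencode

      # "item" occurs twice; both pairs are kept, in their original order.
      pairs = parse_qsl("item=1&user=joe&item=2")

      # A list of pairs converts back to a query string.
      query = urlencode(pairs)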


.. function:: parse_multipart(fp, pdict)

   Parse input of type :mimetype:`multipart/form-data` (for file uploads).
   Arguments are *fp* for the input file and *pdict* for a dictionary containing
   other parameters in the :mailheader:`Content-Type` header.

   Returns a dictionary just like :func:`parse_qs`: keys are the field names, and
   each value is a list of values for that field.  This is easy to use but not
   much good if you are expecting megabytes to be uploaded --- in that case, use
   the :class:`FieldStorage` class instead, which is much more flexible.

   Note that this does not parse nested multipart parts --- use
   :class:`FieldStorage` for that.


.. function:: parse_header(string)

   Parse a MIME header (such as :mailheader:`Content-Type`) into a main value and a
   dictionary of parameters.
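
   The shape of such a result can be illustrated as follows.  This sketch does
   not call :func:`parse_header` itself; as an assumption, it uses
   ``email.message.EmailMessage`` from the standard library, which performs a
   comparable split into a main value and parameters::

      from email.message import EmailMessage

      msg = EmailMessage()
      msg["Content-Type"] = 'text/html; charset="utf-8"'

      main_value = msg.get_content_type()         # the main value
      params = dict(msg["Content-Type"].params)   # the remaining parameters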


.. function:: test()

   Robust test CGI script, usable as main program.  Writes minimal HTTP headers and
   formats all information provided to the script in HTML form.


.. function:: print_environ()

   Format the shell environment in HTML.


.. function:: print_form(form)

   Format a form in HTML.


.. function:: print_directory()

   Format the current directory in HTML.


.. function:: print_environ_usage()

   Print a list of useful (used by CGI) environment variables in HTML.


.. function:: escape(s[, quote])

   Convert the characters ``'&'``, ``'<'`` and ``'>'`` in string *s* to HTML-safe
   sequences.  Use this if you need to display text that might contain such
   characters in HTML.  If the optional flag *quote* is true, the quotation mark
   character (``'"'``) is also translated; this helps for inclusion in an HTML
   attribute value, as in ``<A HREF="...">``.  If the value to be quoted might
   include single- or double-quote characters, or both, consider using the
   :func:`quoteattr` function in the :mod:`xml.sax.saxutils` module instead.
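
   As an illustration, the standard library's ``html.escape`` performs the same
   kind of translation; the sketch below uses it on the assumption that it can
   stand in for :func:`escape`.  Note that its *quote* flag defaults to true and
   then also translates the apostrophe::

      import html
      from xml.sax.saxutils import quoteattr

      text = 'Tom & "Jerry" <b>'
      body_safe = html.escape(text, quote=False)   # only &, < and >
      attr_safe = html.escape(text, quote=True)    # also the quotation mark

      # quoteattr() escapes the value and adds the surrounding quotes itself.
      link = "<A HREF=%s>home</A>" % quoteattr("/cgi-bin/x.py?a=1&b=2")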


.. _cgi-security:

Caring about security
---------------------

.. index:: pair: CGI; security

There's one important rule: if you invoke an external program (via the
:func:`os.system` or :func:`os.popen` functions, or others with similar
functionality), make very sure you don't pass arbitrary strings received from
the client to the shell.  This is a well-known security hole whereby clever
hackers anywhere on the Web can exploit a gullible CGI script to invoke
arbitrary shell commands.  Even parts of the URL or field names cannot be
trusted, since the request doesn't have to come from your form!

To be on the safe side, if you must pass a string gotten from a form to a shell
command, you should make sure the string contains only alphanumeric characters,
dashes, underscores, and periods.
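
One way to apply this rule is a whitelist check made before the value goes
anywhere near a command.  The sketch below is illustrative only;
``is_safe_argument`` is a hypothetical helper name, and the character set is
restricted to ASCII as an assumption::

   import re

   # Letters, digits, dashes, underscores and periods only (ASCII).
   _SAFE = re.compile(r"[A-Za-z0-9._-]+")

   def is_safe_argument(value):
       """Return True if *value* consists solely of whitelisted characters."""
       return _SAFE.fullmatch(value) is not None

   accepted = is_safe_argument("report-2007_v1.txt")
   rejected = is_safe_argument("x.txt; rm -rf /")
   empty = is_safe_argument("")

A value that starts with a dash passes this check and may still be read as an
option by the program that receives it.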


Installing your CGI script on a Unix system
-------------------------------------------

Read the documentation for your HTTP server and check with your local system
administrator to find the directory where CGI scripts should be installed;
usually this is in a directory :file:`cgi-bin` in the server tree.

Make sure that your script is readable and executable by "others"; the Unix file
mode should be ``0755`` octal (use ``chmod 0755 filename``).  Make sure that the
first line of the script contains ``#!`` starting in column 1 followed by the
pathname of the Python interpreter, for instance::

   #!/usr/local/bin/python

Make sure the Python interpreter exists and is executable by "others".
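
The first two checks can be automated.  The following sketch assumes a Unix
system; ``check_cgi_script`` is a hypothetical helper, not part of this
module::

   import os
   import stat

   def check_cgi_script(path):
       """Return a list of problems that could stop a server running *path*."""
       problems = []
       mode = stat.S_IMODE(os.stat(path).st_mode)
       # 0o005 is the read and execute permission for "others".
       if mode & 0o005 != 0o005:
           problems.append("not readable and executable by others")
       with open(path, "rb") as f:
           if not f.readline().startswith(b"#!"):
               problems.append("first line does not start with #!")
       return problems

An empty list means both checks passed; it does not verify that the interpreter
named on the ``#!`` line exists.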

Make sure that any files your script needs to read or write are readable or
writable, respectively, by "others" --- their mode should be ``0644`` for
readable and ``0666`` for writable.  This is because, for security reasons, the
HTTP server executes your script as user "nobody", without any special
privileges.  It can only read (write, execute) files that everybody can read
(write, execute).  The current directory at execution time is also different (it
is usually the server's cgi-bin directory) and the set of environment variables
is also different from what you get when you log in.  In particular, don't count
on the shell's search path for executables (:envvar:`PATH`) or the Python module
search path (:envvar:`PYTHONPATH`) to be set to anything interesting.

If you need to load modules from a directory which is not on Python's default
module search path, you can change the path in your script, before importing
other modules.  For example::

   import sys
   sys.path.insert(0, "/usr/home/joe/lib/python")
   sys.path.insert(0, "/usr/local/lib/python")

(This way, the directory inserted last will be searched first!)

Instructions for non-Unix systems will vary; check your HTTP server's
documentation (it will usually have a section on CGI scripts).


Testing your CGI script
-----------------------

Unfortunately, a CGI script will generally not run when you try it from the
command line, and a script that works perfectly from the command line may fail
mysteriously when run from the server.  There's one reason why you should still
test your script from the command line: if it contains a syntax error, the
Python interpreter won't execute it at all, and the HTTP server will most likely
send a cryptic error to the client.
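
The syntax check itself does not require running the script.  A minimal sketch,
using the built-in :func:`compile` (``has_valid_syntax`` is a hypothetical
helper name)::

   def has_valid_syntax(source, filename="<script>"):
       """Return True if *source* compiles; the code is never executed."""
       try:
           compile(source, filename, "exec")
       except SyntaxError:
           return False
       return True

   good = has_valid_syntax('print("Content-Type: text/html")\nprint()\n')
   bad = has_valid_syntax('print("Content-Type: text/html"\n')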

Assuming your script has no syntax errors, yet it does not work, you have no
choice but to read the next section.


Debugging CGI scripts
---------------------

.. index:: pair: CGI; debugging

First of all, check for trivial installation errors --- reading the section
above on installing your CGI script carefully can save you a lot of time.  If
you wonder whether you have understood the installation procedure correctly, try
installing a copy of this module file (:file:`cgi.py`) as a CGI script.  When
invoked as a script, the file will dump its environment and the contents of the
form in HTML form.  Give it the right mode etc., and send it a request.  If it's
installed in the standard :file:`cgi-bin` directory, it should be possible to
send it a request by entering a URL into your browser of the form::

   http://yourhostname/cgi-bin/cgi.py?name=Joe+Blow&addr=At+Home

If this gives an error of type 404, the server cannot find the script ---
perhaps you need to install it in a different directory.  If it gives another
error, there's an installation problem that you should fix before trying to go
any further.  If you get a nicely formatted listing of the environment and form
content (in this example, the fields should be listed as "addr" with value "At
Home" and "name" with value "Joe Blow"), the :file:`cgi.py` script has been
installed correctly.  If you follow the same procedure for your own script, you
should now be able to debug it.

The next step could be to call the :mod:`cgi` module's :func:`test` function
from your script: replace its main code with the single statement ::

   cgi.test()

This should produce the same results as those gotten from installing the
:file:`cgi.py` file itself.

When an ordinary Python script raises an unhandled exception (for whatever
reason: a typo in a module name, a file that can't be opened, etc.), the
Python interpreter prints a nice traceback and exits.  While the Python
interpreter will still do this when your CGI script raises an exception, most
likely the traceback will end up in one of the HTTP server's log files, or be
discarded altogether.

Fortunately, once you have managed to get your script to execute *some* code,
you can easily send tracebacks to the Web browser using the :mod:`cgitb` module.
If you haven't done so already, just add the line::

   import cgitb; cgitb.enable()

to the top of your script.  Then try running it again; when a problem occurs,
you should see a detailed report that will likely make apparent the cause of the
crash.

If you suspect that there may be a problem in importing the :mod:`cgitb` module,
you can use an even more robust approach (which only uses built-in modules)::

   import sys
   sys.stderr = sys.stdout
   print("Content-Type: text/plain")
   print()
   # ...your code here...

This relies on the Python interpreter to print the traceback.  The content type
of the output is set to plain text, which disables all HTML processing.  If your
script works, the raw HTML will be displayed by your client.  If it raises an
exception, most likely after the first two lines have been printed, a traceback
will be displayed.  Because no HTML interpretation is going on, the traceback
will be readable.
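
If redirecting ``sys.stderr`` is not desirable, a similar effect can be had by
catching the exception yourself.  The sketch below is one possible variant, not
something this module provides; ``run_as_plain_text`` is a hypothetical wrapper
name::

   import sys
   import traceback

   def run_as_plain_text(main):
       """Call *main*; print its traceback to standard output if it fails."""
       print("Content-Type: text/plain")
       print()
       try:
           main()
       except Exception:
           # Send the traceback to the client instead of the server log.
           traceback.print_exc(file=sys.stdout)

The script's main code would be placed in a function and passed in as *main*.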


Common problems and solutions
-----------------------------

* Most HTTP servers buffer the output from CGI scripts until the script is
  completed.  This means that it is not possible to display a progress report on
  the client's display while the script is running.

* Check the installation instructions above.

* Check the HTTP server's log files.  (``tail -f logfile`` in a separate window
  may be useful!)

* Always check a script for syntax errors first, by doing something like
  ``python script.py``.

* If your script does not have any syntax errors, try adding ``import cgitb;
  cgitb.enable()`` to the top of the script.

* When invoking external programs, make sure they can be found.  Usually, this
  means using absolute path names --- :envvar:`PATH` is usually not set to a very
  useful value in a CGI script.

* When reading or writing external files, make sure they can be read or written
  by the userid under which your CGI script will be running: this is typically the
  userid under which the web server is running, or some explicitly specified
  userid for a web server's ``suexec`` feature.

* Don't try to give a CGI script a set-uid mode.  This doesn't work on most
  systems, and is a security liability as well.

.. rubric:: Footnotes

.. [#] Note that some recent versions of the HTML specification do state what order the
   field values should be supplied in, but knowing whether a request was
   received from a conforming browser, or even from a browser at all, is tedious
   and error-prone.