\section{\module{cgi} ---
         Common Gateway Interface support.}
\declaremodule{standard}{cgi}

\modulesynopsis{Common Gateway Interface support, used to interpret
forms in server-side scripts.}

\indexii{WWW}{server}
\indexii{CGI}{protocol}
\indexii{HTTP}{protocol}
\indexii{MIME}{headers}
\index{URL}


Support module for Common Gateway Interface (CGI) scripts.%
\index{Common Gateway Interface}

This module defines a number of utilities for use by CGI scripts
written in Python.

\subsection{Introduction}
\nodename{cgi-intro}

A CGI script is invoked by an HTTP server, usually to process user
input submitted through an HTML \code{<FORM>} or \code{<ISINDEX>} element.

Most often, CGI scripts live in the server's special \file{cgi-bin}
directory.  The HTTP server places information about the request
(such as the client's hostname, the requested URL, and the query
string) in the script's shell environment, executes the script, and
sends the script's output back to the client.

The script's input is connected to the client too, and sometimes the
form data is read this way; at other times the form data is passed via
the ``query string'' part of the URL.  This module is intended
to take care of the different cases and provide a simpler interface to
the Python script.  It also provides a number of utilities that help
in debugging scripts, and the latest addition is support for file
uploads from a form (if your browser supports it).

The output of a CGI script should consist of two sections, separated
by a blank line.  The first section contains a number of headers,
telling the client what kind of data is following.  Python code to
generate a minimal header section looks like this:

\begin{verbatim}
print "Content-Type: text/html"     # HTML is following
print                               # blank line, end of headers
\end{verbatim}

The second section is usually HTML, which allows the client software
to display nicely formatted text with headings, in-line images, etc.
Here's Python code that prints a simple piece of HTML:

\begin{verbatim}
print "<TITLE>CGI script output</TITLE>"
print "<H1>This is my first CGI script</H1>"
print "Hello, world!"
\end{verbatim}
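
The two output sections can also be assembled in one place, which makes
the header/blank-line/body layout easy to check.  The following is only
a sketch: it assumes Python 3 syntax (the examples above use the older
\code{print} statement), and the helper name \code{build_response} is
hypothetical, not part of the \module{cgi} module.

```python
import io


def build_response(body_html, content_type="text/html"):
    """Return the header section, a blank line, and the body as one string."""
    out = io.StringIO()
    out.write("Content-Type: %s\n" % content_type)  # first section: headers
    out.write("\n")                                  # blank line ends the headers
    out.write(body_html)                             # second section: the document
    return out.getvalue()


response = build_response(
    "<TITLE>CGI script output</TITLE>\n"
    "<H1>This is my first CGI script</H1>\n"
    "Hello, world!\n"
)

# Splitting at the first blank line recovers the two sections.
headers, body = response.split("\n\n", 1)
```

A real script would write \code{response} to standard output; building it
as a string first is merely a convenience for testing.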

\subsection{Using the cgi module}
\nodename{Using the cgi module}

Begin by writing \samp{import cgi}.  Do not use \samp{from cgi import
*} --- the module defines all sorts of names for its own use or for
backward compatibility that you don't want in your namespace.

When you write a new script, consider adding the line:

\begin{verbatim}
import cgitb; cgitb.enable()
\end{verbatim}

This activates a special exception handler that will display detailed
reports in the Web browser if any errors occur.  If you'd rather not
show the guts of your program to users of your script, you can have
the reports saved to files instead, with a line like this:

\begin{verbatim}
import cgitb; cgitb.enable(display=0, logdir="/tmp")
\end{verbatim}

It's very helpful to use this feature during script development.
The reports produced by \refmodule{cgitb} provide information that
can save you a lot of time in tracking down bugs.  You can always
remove the \code{cgitb} line later when you have tested your script
and are confident that it works correctly.

To get at submitted form data,
it's best to use the \class{FieldStorage} class.  The other classes
defined in this module are provided mostly for backward compatibility.
Instantiate it exactly once, without arguments.  This reads the form
contents from standard input or the environment (depending on the
value of various environment variables set according to the CGI
standard).  Because this may consume standard input, a second
instance could find nothing left to read.

The \class{FieldStorage} instance can be indexed like a Python
dictionary, and also supports the standard dictionary methods
\method{has_key()} and \method{keys()}.  The built-in \function{len()}
is also supported.  Form fields containing empty strings are ignored
and do not appear in the dictionary; to keep such values, provide
a true value for the optional \var{keep_blank_values} keyword
parameter when creating the \class{FieldStorage} instance.

For instance, the following code (which assumes that the
\mailheader{Content-Type} header and blank line have already been
printed) checks that the fields \code{name} and \code{addr} are both
set to a non-empty string:

\begin{verbatim}
form = cgi.FieldStorage()
if not (form.has_key("name") and form.has_key("addr")):
    print "<H1>Error</H1>"
    print "Please fill in the name and addr fields."
else:
    print "<p>name:", form["name"].value
    print "<p>addr:", form["addr"].value
    ...further form processing here...
\end{verbatim}

Here the fields, accessed through \samp{form[\var{key}]}, are
themselves instances of \class{FieldStorage} (or
\class{MiniFieldStorage}, depending on the form encoding).
The \member{value} attribute of the instance yields the string value
of the field.  The \method{getvalue()} method returns this string value
directly; it also accepts an optional second argument as a default to
return if the requested key is not present.

If the submitted form data contains more than one field with the same
name, the object retrieved by \samp{form[\var{key}]} is not a
\class{FieldStorage} or \class{MiniFieldStorage}
instance but a list of such instances.  Similarly, in this situation,
\samp{form.getvalue(\var{key})} would return a list of strings.
If you expect this possibility
(when your HTML form contains multiple fields with the same name), use
the \method{getlist()} method, which always returns a list of values
(so that you do not need to special-case the single item case).  For
example, this code concatenates any number of username fields,
separated by commas:

\begin{verbatim}
value = form.getlist("username")
usernames = ",".join(value)
\end{verbatim}

If a field represents an uploaded file, accessing the value via the
\member{value} attribute or the \method{getvalue()} method reads the
entire file into memory as a string.  This may not be what you want.
You can test for an uploaded file by testing either the \member{filename}
attribute or the \member{file} attribute.  You can then read the data at
leisure from the \member{file} attribute:

\begin{verbatim}
fileitem = form["userfile"]
if fileitem.file:
    # It's an uploaded file; count lines
    linecount = 0
    while 1:
        line = fileitem.file.readline()
        if not line: break
        linecount = linecount + 1
\end{verbatim}
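
The same line-at-a-time pattern can be tried without a server.  In this
sketch, which assumes Python 3, an in-memory \code{io.BytesIO} object
stands in for the \member{file} attribute of an uploaded field; the
function name \code{count_lines} is hypothetical.

```python
import io


def count_lines(fileobj):
    """Count lines by reading one line at a time, never the whole file."""
    linecount = 0
    # iter(callable, sentinel) keeps calling readline() until it returns b"".
    for _line in iter(fileobj.readline, b""):
        linecount += 1
    return linecount


# Stand-in for fileitem.file; a real upload would come from the form.
upload = io.BytesIO(b"first\nsecond\nthird\n")
total = count_lines(upload)
```

Only one line is held in memory at a time, which is the point of reading
from the file object instead of asking for the whole value.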

If an error is encountered when obtaining the contents of an uploaded file
(for example, when the user interrupts the form submission by clicking on
a Back or Cancel button) the \member{done} attribute of the object for the
field will be set to the value \code{-1}.

The file upload draft standard entertains the possibility of uploading
multiple files from one field (using a recursive
\mimetype{multipart/*} encoding).  When this occurs, the item will be
a dictionary-like \class{FieldStorage} item.  This can be determined
by testing its \member{type} attribute, which should be
\mimetype{multipart/form-data} (or perhaps another MIME type matching
\mimetype{multipart/*}).  In this case, it can be iterated over
recursively just like the top-level form object.

When a form is submitted in the ``old'' format (as the query string or
as a single data part of type
\mimetype{application/x-www-form-urlencoded}), the items will actually
be instances of the class \class{MiniFieldStorage}.  In this case, the
\member{list}, \member{file}, and \member{filename} attributes are
always \code{None}.


\subsection{Higher Level Interface}

\versionadded{2.2}  % XXX: Is this true ?

The previous section explains how to read CGI form data using the
\class{FieldStorage} class.  This section describes a higher level
interface which was added to this class to allow one to do it in a
more readable and intuitive way.  The interface doesn't make the
techniques described in previous sections obsolete --- they are still
useful to process file uploads efficiently, for example.

The interface consists of two simple methods.  Using the methods
you can process form data in a generic way, without the need to worry
whether only one or more values were posted under one name.

In the previous section, you learned to write the following code anytime
you expected a user to post more than one value under one name:

\begin{verbatim}
item = form.getvalue("item")
if isinstance(item, list):
    # The user is requesting more than one item.
    pass
else:
    # The user is requesting only one item.
    pass
\end{verbatim}

This situation is common, for example, when a form contains a group of
multiple checkboxes with the same name:

\begin{verbatim}
<input type="checkbox" name="item" value="1" />
<input type="checkbox" name="item" value="2" />
\end{verbatim}

In most situations, however, there's only one form control with a
particular name in a form, and then you expect and need only one value
associated with this name.  So you write a script containing, for
example, this code:

\begin{verbatim}
user = form.getvalue("user").upper()
\end{verbatim}

The problem with the code is that you should never expect that a
client will provide valid input to your scripts.  For example, if a
curious user appends another \samp{user=foo} pair to the query string,
then the script would crash, because in this situation the
\code{getvalue("user")} method call returns a list instead of a
string.  Calling the \method{upper()} method on a list is not valid
(since lists do not have a method of this name) and results in an
\exception{AttributeError} exception.

Therefore, the appropriate way to read form data values was to always
use the code which checks whether the obtained value is a single value
or a list of values.  That's annoying and leads to less readable
scripts.

A more convenient approach is to use the methods \method{getfirst()}
and \method{getlist()} provided by this higher level interface.

\begin{methoddesc}[FieldStorage]{getfirst}{name\optional{, default}}
  This method always returns only one value associated with form field
  \var{name}.  The method returns only the first value if more than
  one value was posted under that name.  Please note that the order
  in which the values are received may vary from browser to browser
  and should not be counted on.\footnote{Note that some recent
      versions of the HTML specification do state what order the
      field values should be supplied in, but knowing whether a
      request was received from a conforming browser, or even from a
      browser at all, is tedious and error-prone.}  If no such form
  field or value exists then the method returns the value specified by
  the optional parameter \var{default}.  This parameter defaults to
  \code{None} if not specified.
\end{methoddesc}

\begin{methoddesc}[FieldStorage]{getlist}{name}
  This method always returns a list of values associated with form
  field \var{name}.  The method returns an empty list if no such form
  field or value exists for \var{name}.  It returns a list consisting
  of one item if only one such value exists.
\end{methoddesc}

Using these methods you can write nice compact code:

\begin{verbatim}
import cgi
form = cgi.FieldStorage()
user = form.getfirst("user", "").upper()    # This way it's safe.
for item in form.getlist("item"):
    do_something(item)
\end{verbatim}
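
To see why these two methods remove the special cases, it can help to
model their documented behaviour on a plain query string.  The class
below is a hypothetical stand-in, not the real \class{FieldStorage}; it
assumes Python 3 and uses \function{urllib.parse.parse_qs()} to do the
parsing.

```python
from urllib.parse import parse_qs


class FormModel:
    """Hypothetical stand-in mimicking getfirst()/getlist() semantics."""

    def __init__(self, query_string):
        # parse_qs maps each field name to a list of its values.
        self._fields = parse_qs(query_string)

    def getfirst(self, name, default=None):
        values = self._fields.get(name)
        return values[0] if values else default

    def getlist(self, name):
        return list(self._fields.get(name, []))


# A curious user has appended a second "user" pair to the query string.
form = FormModel("user=joe&user=foo&item=1&item=2")

user = form.getfirst("user", "").upper()   # always a string, never a list
items = form.getlist("item")               # always a list
missing = form.getlist("nothing")          # empty list when the field is absent
```

Because \code{getfirst()} never returns a list and \code{getlist()}
never returns a bare string, the calling code needs no type checks.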


\subsection{Old classes}

These classes, present in earlier versions of the \module{cgi} module,
are still supported for backward compatibility.  New applications
should use the \class{FieldStorage} class.

\class{SvFormContentDict} stores single value form content as a
dictionary; it assumes each field name occurs in the form only once.

\class{FormContentDict} stores multiple value form content as a
dictionary (the form items are lists of values).  Useful if your form
contains multiple fields with the same name.

Other classes (\class{FormContent}, \class{InterpFormContentDict}) are
present for backwards compatibility with really old applications only.
If you still use these and would be inconvenienced if they
disappeared from a future version of this module, drop me a note.


\subsection{Functions}
\nodename{Functions in cgi module}

These are useful if you want more control, or if you want to employ
some of the algorithms implemented in this module in other
circumstances.

\begin{funcdesc}{parse}{fp\optional{, keep_blank_values\optional{,
                        strict_parsing}}}
  Parse a query in the environment or from a file (the file defaults
  to \code{sys.stdin}).  The \var{keep_blank_values} and
  \var{strict_parsing} parameters are passed to \function{parse_qs()}
  unchanged.
\end{funcdesc}

\begin{funcdesc}{parse_qs}{qs\optional{, keep_blank_values\optional{,
                           strict_parsing}}}
Parse a query string given as a string argument (data of type
\mimetype{application/x-www-form-urlencoded}).  Data are
returned as a dictionary.  The dictionary keys are the unique query
variable names and the values are lists of values for each name.

The optional argument \var{keep_blank_values} is
a flag indicating whether blank values in
URL encoded queries should be treated as blank strings.
A true value indicates that blanks should be retained as
blank strings.  The default false value indicates that
blank values are to be ignored and treated as if they were
not included.

The optional argument \var{strict_parsing} is a flag indicating what
to do with parsing errors.  If false (the default), errors
are silently ignored.  If true, errors raise a \exception{ValueError}
exception.

Use the \function{\refmodule{urllib}.urlencode()} function to convert
such dictionaries into query strings.

\end{funcdesc}
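
The effect of the two flags is easiest to see on a small query string.
This sketch assumes Python 3, where functions of the same name and
parameters are available as \function{urllib.parse.parse_qs()} and
\function{urllib.parse.urlencode()}; the sample query string is made up
for illustration.

```python
from urllib.parse import parse_qs, urlencode

query = "name=Joe+Blow&addr=&item=1&item=2"

# Default: the blank "addr" value is dropped entirely.
default = parse_qs(query)

# keep_blank_values: the blank value is retained as an empty string.
with_blanks = parse_qs(query, keep_blank_values=True)

# urlencode() turns the dictionary back into a query string;
# doseq=True emits one name=value pair per list element.
round_trip = urlencode(with_blanks, doseq=True)

# strict_parsing: a field without "=" raises ValueError instead of
# being silently ignored.
try:
    parse_qs("novalue", strict_parsing=True)
    strict_error = None
except ValueError as exc:
    strict_error = exc
```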

\begin{funcdesc}{parse_qsl}{qs\optional{, keep_blank_values\optional{,
                            strict_parsing}}}
Parse a query string given as a string argument (data of type
\mimetype{application/x-www-form-urlencoded}).  Data are
returned as a list of name, value pairs.

The optional argument \var{keep_blank_values} is
a flag indicating whether blank values in
URL encoded queries should be treated as blank strings.
A true value indicates that blanks should be retained as
blank strings.  The default false value indicates that
blank values are to be ignored and treated as if they were
not included.

The optional argument \var{strict_parsing} is a flag indicating what
to do with parsing errors.  If false (the default), errors
are silently ignored.  If true, errors raise a \exception{ValueError}
exception.

Use the \function{\refmodule{urllib}.urlencode()} function to convert
such lists of pairs into query strings.
\end{funcdesc}
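
The list form keeps repeated names and their original order, which a
dictionary cannot.  As before, this is a sketch that assumes Python 3 and
its \function{urllib.parse.parse_qsl()} function; the query string is
invented for the example.

```python
from urllib.parse import parse_qsl, urlencode

# "item" appears twice, with "user" in between.
pairs = parse_qsl("item=1&user=joe&item=2")

# urlencode() also accepts a sequence of (name, value) pairs.
encoded = urlencode(pairs)
```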

\begin{funcdesc}{parse_multipart}{fp, pdict}
Parse input of type \mimetype{multipart/form-data} (for
file uploads).  Arguments are \var{fp} for the input file and
\var{pdict} for a dictionary containing other parameters in
the \mailheader{Content-Type} header.

Returns a dictionary just like \function{parse_qs()}: keys are the
field names, each value is a list of values for that field.  This is
easy to use but not much good if you are expecting megabytes to be
uploaded --- in that case, use the \class{FieldStorage} class instead,
which is much more flexible.

Note that this does not parse nested multipart parts --- use
\class{FieldStorage} for that.
\end{funcdesc}

\begin{funcdesc}{parse_header}{string}
Parse a MIME header (such as \mailheader{Content-Type}) into a main
value and a dictionary of parameters.
\end{funcdesc}
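
The split into a main value and a parameter dictionary can be
illustrated without the \module{cgi} module.  This sketch assumes
Python 3 and uses the standard \module{email} package as a stand-in;
the helper name \code{parse_content_type} is hypothetical, and its
result may differ from \function{parse_header()} in details such as
case handling.

```python
from email.message import Message


def parse_content_type(value):
    """Split a Content-Type value into (main value, parameter dict)."""
    msg = Message()
    msg["Content-Type"] = value
    main_value = msg.get_content_type()
    # get_params() lists the main value first; the rest are parameters.
    params = dict(msg.get_params()[1:])
    return main_value, params


main_value, params = parse_content_type(
    'multipart/form-data; boundary="xyz"; charset=utf-8'
)
```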

\begin{funcdesc}{test}{}
Robust test CGI script, usable as main program.
Writes minimal HTTP headers and formats all information provided to
the script in HTML form.
\end{funcdesc}

\begin{funcdesc}{print_environ}{}
Format the shell environment in HTML.
\end{funcdesc}

\begin{funcdesc}{print_form}{form}
Format a form in HTML.
\end{funcdesc}

\begin{funcdesc}{print_directory}{}
Format the current directory in HTML.
\end{funcdesc}

\begin{funcdesc}{print_environ_usage}{}
Print a list of useful (used by CGI) environment variables in
HTML.
\end{funcdesc}

\begin{funcdesc}{escape}{s\optional{, quote}}
Convert the characters
\character{\&}, \character{<} and \character{>} in string \var{s} to
HTML-safe sequences.  Use this if you need to display text that might
contain such characters in HTML.  If the optional flag \var{quote} is
true, the quotation mark character (\character{"}) is also translated;
this helps for inclusion in an HTML attribute value, as in \code{<A
HREF="...">}.  If the value to be quoted might include single- or
double-quote characters, or both, consider using the
\function{quoteattr()} function in the \refmodule{xml.sax.saxutils}
module instead.
\end{funcdesc}
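
The difference between escaping for body text and for an attribute value
shows up as soon as the text contains quotes.  This sketch assumes
Python 3, where \function{html.escape()} plays the role described above
(note that its \var{quote} flag defaults to true there, so it is passed
explicitly); the sample strings are made up.

```python
from html import escape
from xml.sax.saxutils import quoteattr

text = 'Tom & "Jerry" <cartoon>'

# For display in the document body: only &, < and > are translated.
body_safe = escape(text, quote=False)

# For use inside a double-quoted attribute: the quotation mark as well.
attr_safe = escape(text, quote=True)

# quoteattr() escapes the value and adds the surrounding quotes itself,
# coping with values that mix single and double quotes.
link = "<A HREF=%s>label</A>" % quoteattr("it's \"quoted\"")
```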


\subsection{Caring about security \label{cgi-security}}

\indexii{CGI}{security}

There's one important rule: if you invoke an external program (via the
\function{os.system()} or \function{os.popen()} functions, or others
with similar functionality), make very sure you don't pass arbitrary
strings received from the client to the shell.  This is a well-known
security hole whereby clever hackers anywhere on the Web can exploit a
gullible CGI script to invoke arbitrary shell commands.  Even parts of
the URL or field names cannot be trusted, since the request doesn't
have to come from your form!

To be on the safe side, if you must pass a string obtained from a form
to a shell command, you should make sure the string contains only
alphanumeric characters, dashes, underscores, and periods.
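
The character rule above can be enforced with a single whitelist check.
This is a minimal sketch assuming Python 3 and ASCII-only input; the
names \code{SAFE_TOKEN} and \code{is_safe_token} are hypothetical.

```python
import re

# Only ASCII letters, digits, periods, underscores and dashes; at least one.
SAFE_TOKEN = re.compile(r"[A-Za-z0-9._-]+")


def is_safe_token(value):
    """Return True only if every character of value is on the whitelist."""
    return SAFE_TOKEN.fullmatch(value) is not None


ok = is_safe_token("report-2007_v1.txt")
rejected = is_safe_token("x; rm -rf /")
empty = is_safe_token("")
```

Rejecting everything that is not explicitly allowed is safer than trying
to list the characters a shell treats specially.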


\subsection{Installing your CGI script on a \UNIX\ system}

Read the documentation for your HTTP server and check with your local
system administrator to find the directory where CGI scripts should be
installed; usually this is in a directory \file{cgi-bin} in the server tree.

Make sure that your script is readable and executable by ``others''; the
\UNIX{} file mode should be \code{0755} octal (use \samp{chmod 0755
\var{filename}}).  Make sure that the first line of the script contains
\code{\#!} starting in column 1 followed by the pathname of the Python
interpreter, for instance:

\begin{verbatim}
#!/usr/local/bin/python
\end{verbatim}

Make sure the Python interpreter exists and is executable by ``others''.

Make sure that any files your script needs to read or write are
readable or writable, respectively, by ``others'' --- their mode
should be \code{0644} for readable and \code{0666} for writable.  This
is because, for security reasons, the HTTP server executes your script
as user ``nobody'', without any special privileges.  It can only read
(write, execute) files that everybody can read (write, execute).  The
current directory at execution time is also different (it is usually
the server's cgi-bin directory) and the set of environment variables
is also different from what you get when you log in.  In particular, don't
count on the shell's search path for executables (\envvar{PATH}) or
the Python module search path (\envvar{PYTHONPATH}) to be set to
anything interesting.

If you need to load modules from a directory which is not on Python's
default module search path, you can change the path in your script,
before importing other modules.  For example:

\begin{verbatim}
import sys
sys.path.insert(0, "/usr/home/joe/lib/python")
sys.path.insert(0, "/usr/local/lib/python")
\end{verbatim}

(This way, the directory inserted last will be searched first!)
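
The search order can be confirmed by inspecting \code{sys.path} after the
two insertions.  The sketch below assumes Python 3 and reuses the example
directories from above, which need not exist; it restores the original
path afterwards so that it has no lasting effect.

```python
import sys

saved = list(sys.path)          # remember the original search path

sys.path.insert(0, "/usr/home/joe/lib/python")
sys.path.insert(0, "/usr/local/lib/python")

# Index 0 is searched first, so the directory inserted last wins.
first_two = sys.path[:2]

sys.path[:] = saved             # restore the original search path
```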

Instructions for non-\UNIX{} systems will vary; check your HTTP server's
documentation (it will usually have a section on CGI scripts).


\subsection{Testing your CGI script}

Unfortunately, a CGI script will generally not run when you try it
from the command line, and a script that works perfectly from the
command line may fail mysteriously when run from the server.  There's
one reason why you should still test your script from the command
line: if it contains a syntax error, the Python interpreter won't
execute it at all, and the HTTP server will most likely send a cryptic
error to the client.
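
A syntax check does not require running the script at all, because the
built-in \function{compile()} function raises \exception{SyntaxError}
without executing anything.  The helper below is a hypothetical sketch
assuming Python 3; the two source strings are invented test inputs.

```python
def has_syntax_error(source, filename="<cgi-script>"):
    """Return True if source fails to compile; the code is never run."""
    try:
        compile(source, filename, "exec")
    except SyntaxError:
        return True
    return False


good = has_syntax_error("print('Content-Type: text/html')\nprint()\n")
broken = has_syntax_error("print('unterminated\n")
```

In practice the source would be read from the script file before it is
installed on the server.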

Assuming your script has no syntax errors, yet it does not work, you
have no choice but to read the next section.


\subsection{Debugging CGI scripts} \indexii{CGI}{debugging}

First of all, check for trivial installation errors --- reading the
section above on installing your CGI script carefully can save you a
lot of time.  If you wonder whether you have understood the
installation procedure correctly, try installing a copy of this module
file (\file{cgi.py}) as a CGI script.  When invoked as a script, the file
will dump its environment and the contents of the form in HTML form.
Give it the right mode etc., and send it a request.  If it's installed
in the standard \file{cgi-bin} directory, it should be possible to send it a
request by entering a URL into your browser of the form:

\begin{verbatim}
http://yourhostname/cgi-bin/cgi.py?name=Joe+Blow&addr=At+Home
\end{verbatim}

If this gives an error of type 404, the server cannot find the script
--- perhaps you need to install it in a different directory.  If it
gives another error, there's an installation problem that
you should fix before trying to go any further.  If you get a nicely
formatted listing of the environment and form content (in this
example, the fields should be listed as ``addr'' with value ``At Home''
and ``name'' with value ``Joe Blow''), the \file{cgi.py} script has been
installed correctly.  If you follow the same procedure for your own
script, you should now be able to debug it.
524
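To see why the test URL should yield exactly those two fields, it can
help to decode its query string by hand. The following is a small
sketch that assumes Python 3 and uses \module{urllib.parse} rather than
the \module{cgi} module itself; it only illustrates the decoding and is
not what \file{cgi.py} runs internally.

```python
from urllib.parse import parse_qs, urlsplit

# The test URL from the text above.
url = "http://yourhostname/cgi-bin/cgi.py?name=Joe+Blow&addr=At+Home"

# Everything after the "?" is the query string handed to the script.
query = urlsplit(url).query

# "+" decodes to a space and "&" separates fields; each value is a list
# because a field name may occur more than once.
fields = parse_qs(query)

for name in sorted(fields):
    print(name, "=", fields[name])
# addr = ['At Home']
# name = ['Joe Blow']
```
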
The next step could be to call the \module{cgi} module's
\function{test()} function from your script: replace its main code
with the single statement

\begin{verbatim}
cgi.test()
\end{verbatim}

This should produce the same results as installing
the \file{cgi.py} file itself.

When an ordinary Python script raises an unhandled exception (for
whatever reason: a typo in a module name, a file that can't be
opened, etc.), the Python interpreter prints a nice traceback and
exits. While the Python interpreter will still do this when your CGI
script raises an exception, most likely the traceback will end up in
one of the HTTP server's log files, or be discarded altogether.

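The reason the traceback goes missing is that the interpreter writes it
to standard error, while only standard output is sent to the client.
The sketch below makes this visible without a web server by running a
failing script in a child process and capturing the two streams
separately. It assumes Python 3; \code{FAILING_SCRIPT} and the module
name \code{no_such_module_xyz} are hypothetical stand-ins for a real
CGI script and its typo.

```python
import subprocess
import sys

# Hypothetical stand-in for a CGI script that fails after its header.
FAILING_SCRIPT = (
    'print("Content-Type: text/html")\n'
    'print()\n'
    'import no_such_module_xyz\n'  # typo in a module name
)

# Run the script the way a server would, keeping the streams apart.
result = subprocess.run(
    [sys.executable, "-c", FAILING_SCRIPT],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,
)

# Standard output is what a server would pass on to the client.
print("stdout:", repr(result.stdout))
# Standard error is what may land in a log file or be discarded.
print("traceback on stdout:", "Traceback" in result.stdout)
print("traceback on stderr:", "Traceback" in result.stderr)
```
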
Fortunately, once you have managed to get your script to execute
\emph{some} code, you can easily send tracebacks to the Web browser
using the \refmodule{cgitb} module. If you haven't done so already,
just add the line:

\begin{verbatim}
import cgitb; cgitb.enable()
\end{verbatim}

to the top of your script. Then try running it again; when a
problem occurs, you should see a detailed report that will
likely make the cause of the crash apparent.

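The idea of showing the traceback in the browser can also be
approximated by hand. The following is a minimal sketch under stated
assumptions, not how \refmodule{cgitb} is implemented: the helper name
\function{run_with_html_traceback()} is hypothetical, the code assumes
Python 3, and it assumes that \function{main()} has not written any
output before it fails.

```python
import html
import sys
import traceback


def run_with_html_traceback(main, out=None):
    """Call main(); if it raises, write the traceback to out as HTML."""
    if out is None:
        out = sys.stdout
    try:
        main()
    except Exception:
        # Escape "<", ">" and "&" so the browser shows the text verbatim.
        report = html.escape(traceback.format_exc())
        out.write("Content-Type: text/html\n\n")
        out.write("<pre>%s</pre>\n" % report)
```

A script would wrap its entry point, for example
\code{run_with_html_traceback(main)}. The report is far less detailed
than what \refmodule{cgitb} produces.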
If you suspect that there may be a problem in importing the
\refmodule{cgitb} module, you can use an even more robust approach
(which only uses built-in modules):

\begin{verbatim}
import sys
# Send error output (including tracebacks) to the client.
sys.stderr = sys.stdout
# Emit the header and the blank line that ends the header section.
print "Content-Type: text/plain"
print
# ...your code here...
\end{verbatim}

This relies on the Python interpreter to print the traceback. The
content type of the output is set to plain text, which disables all
HTML processing. If your script works, the raw HTML will be displayed
by your client. If it raises an exception, most likely after the
first two lines have been printed, a traceback will be displayed.
Because no HTML interpretation is going on, the traceback will be
readable.


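The effect of rebinding \code{sys.stderr} can be checked without a web
server. The sketch below assumes Python 3, so it uses the
\function{print()} function rather than the \keyword{print} statement
shown above; \code{ROBUST_SCRIPT} is a hypothetical stand-in for a real
script that fails after its first two lines.

```python
import subprocess
import sys

# Hypothetical script using the robust prologue, then failing.
ROBUST_SCRIPT = (
    'import sys\n'
    'sys.stderr = sys.stdout\n'
    'print("Content-Type: text/plain")\n'
    'print()\n'
    'raise RuntimeError("simulated failure")\n'
)

result = subprocess.run(
    [sys.executable, "-c", ROBUST_SCRIPT],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,
)

# The header and the traceback now travel on the same stream, in order,
# so a client would receive both.
print(result.stdout)
```
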
\subsection{Common problems and solutions}

\begin{itemize}
\item Most HTTP servers buffer the output from CGI scripts until the
script is completed. This means that it is not possible to display a
progress report on the client's display while the script is running.

\item Check the installation instructions above.

\item Check the HTTP server's log files. (\samp{tail -f logfile} in a
separate window may be useful!)

\item Always check a script for syntax errors first, by doing something
like \samp{python script.py}.

\item If your script does not have any syntax errors, try adding
\samp{import cgitb; cgitb.enable()} to the top of the script.

\item When invoking external programs, make sure they can be found.
This usually means using absolute path names, because \envvar{PATH} is
rarely set to a useful value in a CGI script.

\item When reading or writing external files, make sure they can be read
or written by the userid under which your CGI script will be running:
this is typically the userid under which the web server is running, or some
explicitly specified userid for a web server's \samp{suexec} feature.

\item Don't try to give a CGI script a set-uid mode. This doesn't work on
most systems, and is a security liability as well.
\end{itemize}

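Several of the checks in the list above can be automated. The following
is a rough sketch under stated assumptions, not part of the
\module{cgi} module: the function \function{preflight()} and its
parameters are hypothetical, the code assumes Python 3, and the
permission checks are only meaningful when it is run as the same userid
that the web server uses for CGI scripts.

```python
import os


def preflight(script_path, programs=(), readable=(), writable=()):
    """Return a list of problems found; an empty list means none."""
    problems = []

    # Syntax check: compile the script without executing it.
    with open(script_path, "rb") as f:
        source = f.read()
    try:
        compile(source, script_path, "exec")
    except SyntaxError as exc:
        problems.append("syntax error in %s, line %s: %s"
                        % (script_path, exc.lineno, exc.msg))

    # External programs: insist on absolute, executable paths, since
    # PATH cannot be relied on in a CGI environment.
    for program in programs:
        if not os.path.isabs(program):
            problems.append("%s is not an absolute path" % program)
        elif not os.access(program, os.X_OK):
            problems.append("%s is missing or not executable" % program)

    # External files: check access for the userid running this code.
    for path in readable:
        if not os.access(path, os.R_OK):
            problems.append("%s is not readable" % path)
    for path in writable:
        if not os.access(path, os.W_OK):
            problems.append("%s is not writable" % path)

    return problems
```

Printing each entry of the returned list before deploying a script
gives a quick overview of what to fix; the file and program names to
pass in depend entirely on the script being checked.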