\section{\module{cgi} ---
         Common Gateway Interface support.}
\declaremodule{standard}{cgi}

\modulesynopsis{Common Gateway Interface support, used to interpret
forms in server-side scripts.}

\indexii{WWW}{server}
\indexii{CGI}{protocol}
\indexii{HTTP}{protocol}
\indexii{MIME}{headers}
\index{URL}


Support module for Common Gateway Interface (CGI) scripts.%
\index{Common Gateway Interface}

This module defines a number of utilities for use by CGI scripts
written in Python.

\subsection{Introduction}
\nodename{cgi-intro}

A CGI script is invoked by an HTTP server, usually to process user
input submitted through an HTML \code{<FORM>} or \code{<ISINDEX>} element.

Most often, CGI scripts live in the server's special \file{cgi-bin}
directory. The HTTP server places all sorts of information about the
request (such as the client's hostname, the requested URL, the query
string, and lots of other goodies) in the script's shell environment,
executes the script, and sends the script's output back to the client.

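As a rough illustration of the request information described above,
the following sketch reads a few values from a mapping shaped like the
script's environment. It is written for a current Python 3
interpreter rather than in the older style of the examples in this
section. The variable names \code{REQUEST_METHOD},
\code{QUERY_STRING} and \code{REMOTE_HOST} come from the CGI standard;
the helper \code{describe_request()} and the fake environment are
hypothetical and not part of the \module{cgi} module.

```python
import os

def describe_request(environ=None):
    """Return a few CGI request details from an environment mapping."""
    if environ is None:
        environ = os.environ          # the real script environment
    return {
        "method": environ.get("REQUEST_METHOD", "GET"),
        "query": environ.get("QUERY_STRING", ""),
        "client": environ.get("REMOTE_HOST", ""),
    }

# A fake environment stands in for what an HTTP server would provide.
fake_environ = {"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Joe+Blow"}
info = describe_request(fake_environ)
print(info["method"], info["query"])
```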
The script's input is connected to the client too, and sometimes the
form data is read this way; at other times the form data is passed via
the ``query string'' part of the URL. This module is intended
to take care of the different cases and provide a simpler interface to
the Python script. It also provides a number of utilities that help
in debugging scripts, and the latest addition is support for file
uploads from a form (if your browser supports it --- Grail 0.3 and
Netscape 2.0 do).

The output of a CGI script should consist of two sections, separated
by a blank line. The first section contains a number of headers,
telling the client what kind of data is following. Python code to
generate a minimal header section looks like this:

\begin{verbatim}
print "Content-Type: text/html"     # HTML is following
print                               # blank line, end of headers
\end{verbatim}

The second section is usually HTML, which allows the client software
to display nicely formatted text with headers, in-line images, etc.
Here's Python code that prints a simple piece of HTML:

\begin{verbatim}
print "<TITLE>CGI script output</TITLE>"
print "<H1>This is my first CGI script</H1>"
print "Hello, world!"
\end{verbatim}

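The two-section layout can also be assembled in one place. The sketch
below, written for a current Python 3 interpreter, joins a header
section, the separating blank line, and a body into a single string.
The function name \code{build_response()} is hypothetical; it is not
provided by the \module{cgi} module.

```python
def build_response(body, content_type="text/html"):
    """Return header section, blank line, and body as one string."""
    headers = "Content-Type: %s\n" % content_type
    # The empty line between headers and body ends the header section.
    return headers + "\n" + body

page = build_response("<H1>This is my first CGI script</H1>\n")
print(page, end="")
```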
\subsection{Using the cgi module}
\nodename{Using the cgi module}

Begin by writing \samp{import cgi}. Do not use \samp{from cgi import
*} --- the module defines all sorts of names for its own use or for
backward compatibility that you don't want in your namespace.

When you write a new script, consider adding the line:

\begin{verbatim}
import cgitb; cgitb.enable()
\end{verbatim}

This activates a special exception handler that will display detailed
reports in the Web browser if any errors occur. If you'd rather not
show the guts of your program to users of your script, you can have
the reports saved to files instead, with a line like this:

\begin{verbatim}
import cgitb; cgitb.enable(display=0, logdir="/tmp")
\end{verbatim}

It's very helpful to use this feature during script development.
The reports produced by \refmodule{cgitb} provide information that
can save you a lot of time in tracking down bugs. You can always
remove the \code{cgitb} line later when you have tested your script
and are confident that it works correctly.

To get at submitted form data,
it's best to use the \class{FieldStorage} class. The other classes
defined in this module are provided mostly for backward compatibility.
Instantiate \class{FieldStorage} exactly once, without arguments. This
reads the form contents from standard input or the environment
(depending on the value of various environment variables set according
to the CGI standard). Since it may consume standard input, it should
be instantiated only once.

The \class{FieldStorage} instance can be indexed like a Python
dictionary, and also supports the standard dictionary methods
\method{has_key()} and \method{keys()}. The built-in \function{len()}
is also supported. Form fields containing empty strings are ignored
and do not appear in the dictionary; to keep such values, provide
a true value for the optional \var{keep_blank_values} keyword
parameter when creating the \class{FieldStorage} instance.

For instance, the following code (which assumes that the
\mailheader{Content-Type} header and blank line have already been
printed) checks that the fields \code{name} and \code{addr} are both
set to a non-empty string:

\begin{verbatim}
form = cgi.FieldStorage()
if not (form.has_key("name") and form.has_key("addr")):
    print "<H1>Error</H1>"
    print "Please fill in the name and addr fields."
else:
    print "<p>name:", form["name"].value
    print "<p>addr:", form["addr"].value
    # ...further form processing here...
\end{verbatim}

Here the fields, accessed through \samp{form[\var{key}]}, are
themselves instances of \class{FieldStorage} (or
\class{MiniFieldStorage}, depending on the form encoding).
The \member{value} attribute of the instance yields the string value
of the field. The \method{getvalue()} method returns this string value
directly; it also accepts an optional second argument as a default to
return if the requested key is not present.

If the submitted form data contains more than one field with the same
name, the object retrieved by \samp{form[\var{key}]} is not a
\class{FieldStorage} or \class{MiniFieldStorage}
instance but a list of such instances. Similarly, in this situation,
\samp{form.getvalue(\var{key})} would return a list of strings.
If you expect this possibility
(when your HTML form contains multiple fields with the same name), use
the \function{isinstance()} built-in function to determine whether you
have a single instance or a list of instances. For example, this
code concatenates any number of username fields, separated by
commas:

\begin{verbatim}
value = form.getvalue("username", "")
if isinstance(value, list):
    # Multiple username fields specified
    usernames = ",".join(value)
else:
    # Single or no username field specified
    usernames = value
\end{verbatim}

If a field represents an uploaded file, accessing the value via the
\member{value} attribute or the \method{getvalue()} method reads the
entire file into memory as a string. This may not be what you want.
You can test for an uploaded file by testing either the \member{filename}
attribute or the \member{file} attribute. You can then read the data at
leisure from the \member{file} attribute:

\begin{verbatim}
fileitem = form["userfile"]
if fileitem.file:
    # It's an uploaded file; count lines
    linecount = 0
    while 1:
        line = fileitem.file.readline()
        if not line: break
        linecount = linecount + 1
\end{verbatim}

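The same line-at-a-time idea can be tried without a running server.
In this sketch, written for a current Python 3 interpreter, an
in-memory \code{io.BytesIO} object stands in for the \member{file}
attribute of an uploaded field; that substitution and the helper name
\code{count_lines()} are assumptions made for illustration only.

```python
import io

def count_lines(fileobj):
    """Count lines by reading one line at a time, never the whole file."""
    linecount = 0
    for _line in fileobj:
        linecount += 1
    return linecount

upload = io.BytesIO(b"first\nsecond\nthird\n")   # stand-in for fileitem.file
print(count_lines(upload))
```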
The file upload draft standard entertains the possibility of uploading
multiple files from one field (using a recursive
\mimetype{multipart/*} encoding). When this occurs, the item will be
a dictionary-like \class{FieldStorage} item. This can be determined
by testing its \member{type} attribute, which should be
\mimetype{multipart/form-data} (or perhaps another MIME type matching
\mimetype{multipart/*}). In this case, it can be iterated over
recursively just like the top-level form object.

When a form is submitted in the ``old'' format (as the query string or
as a single data part of type
\mimetype{application/x-www-form-urlencoded}), the items will actually
be instances of the class \class{MiniFieldStorage}. In this case, the
\member{list}, \member{file}, and \member{filename} attributes are
always \code{None}.


\subsection{Higher Level Interface}

\versionadded{2.2} % XXX: Is this true ?

The previous section explains how to read CGI form data using the
\class{FieldStorage} class. This section describes a higher level
interface which was added to this class to allow one to do it in a
more readable and intuitive way. The interface doesn't make the
techniques described in previous sections obsolete --- they are still
useful to process file uploads efficiently, for example.

The interface consists of two simple methods. Using the methods
you can process form data in a generic way, without the need to worry
whether only one or more values were posted under one name.

In the previous section, you learned to write the following code anytime
you expected a user to post more than one value under one name:

\begin{verbatim}
from types import ListType

item = form.getvalue("item")
if isinstance(item, ListType):
    # The user is requesting more than one item.
    pass
else:
    # The user is requesting only one item.
    pass
\end{verbatim}

This situation is common, for example, when a form contains a group of
multiple checkboxes with the same name:

\begin{verbatim}
<input type="checkbox" name="item" value="1" />
<input type="checkbox" name="item" value="2" />
\end{verbatim}

In most situations, however, there's only one form control with a
particular name in a form, and then you expect and need only one value
associated with this name. So you write a script containing, for
example, this code:

\begin{verbatim}
user = form.getvalue("user").upper()
\end{verbatim}

The problem with the code is that you should never expect that a
client will provide valid input to your scripts. For example, if a
curious user appends another \samp{user=foo} pair to the query string,
then the script would crash, because in this situation the
\code{getvalue("user")} method call returns a list instead of a
string. Calling the \method{upper()} method on a list is not valid
(since lists do not have a method of this name) and results in an
\exception{AttributeError} exception.

Therefore, the appropriate way to read form data values was to always
use the code which checks whether the obtained value is a single value
or a list of values. That's annoying and leads to less readable
scripts.

A more convenient approach is to use the methods \method{getfirst()}
and \method{getlist()} provided by this higher level interface.

\begin{methoddesc}[FieldStorage]{getfirst}{name\optional{, default}}
  This method always returns only one value associated with form field
  \var{name}. The method returns only the first value if more than
  one value was posted under that name. Please note that the order
  in which the values are received may vary from browser to browser
  and should not be counted on. If no such form field or value exists
  then the method returns the value specified by the optional
  parameter \var{default}. This parameter defaults to \code{None} if
  not specified.
\end{methoddesc}

\begin{methoddesc}[FieldStorage]{getlist}{name}
  This method always returns a list of values associated with form
  field \var{name}. The method returns an empty list if no such form
  field or value exists for \var{name}. It returns a list consisting
  of one item if only one such value exists.
\end{methoddesc}

Using these methods you can write nice compact code:

\begin{verbatim}
import cgi
form = cgi.FieldStorage()
user = form.getfirst("user", "").upper()    # This way it's safe.
for item in form.getlist("item"):
    do_something(item)
\end{verbatim}


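The behaviour described for \method{getfirst()} and \method{getlist()}
can be modelled outside a CGI environment. The sketch below is not
the \class{FieldStorage} implementation; it is written for a current
Python 3 interpreter and assumes that \function{parse_qs()} is
imported from \code{urllib.parse}, since the \module{cgi} module may
not be available there. The helpers \code{get_first()} and
\code{get_list()} are hypothetical names.

```python
from urllib.parse import parse_qs

def get_first(fields, name, default=None):
    """Return the first value for name, or default if there is none."""
    values = fields.get(name)
    return values[0] if values else default

def get_list(fields, name):
    """Return all values for name as a list (empty if there are none)."""
    return list(fields.get(name, []))

# A query string in which "user" was deliberately posted twice.
fields = parse_qs("user=joe&user=foo&item=1&item=2")
print(get_first(fields, "user", "").upper())
print(get_list(fields, "item"))
print(get_list(fields, "missing"))
```

Because \code{get_first()} always yields a single value and
\code{get_list()} always yields a list, the calling code needs no
\function{isinstance()} check.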
\subsection{Old classes}

These classes, present in earlier versions of the \module{cgi} module,
are still supported for backward compatibility. New applications
should use the \class{FieldStorage} class.

\class{SvFormContentDict} stores single value form content as a
dictionary; it assumes each field name occurs in the form only once.

\class{FormContentDict} stores multiple value form content as a
dictionary (the form items are lists of values). Useful if your form
contains multiple fields with the same name.

Other classes (\class{FormContent}, \class{InterpFormContentDict}) are
present for backwards compatibility with really old applications only.
If you still use these and would be inconvenienced if they
disappeared from a future version of this module, drop me a note.


\subsection{Functions}
\nodename{Functions in cgi module}

These are useful if you want more control, or if you want to employ
some of the algorithms implemented in this module in other
circumstances.

\begin{funcdesc}{parse}{fp\optional{, keep_blank_values\optional{,
                        strict_parsing}}}
  Parse a query in the environment or from a file (the file defaults
  to \code{sys.stdin}). The \var{keep_blank_values} and
  \var{strict_parsing} parameters are passed to \function{parse_qs()}
  unchanged.
\end{funcdesc}

\begin{funcdesc}{parse_qs}{qs\optional{, keep_blank_values\optional{,
                           strict_parsing}}}
Parse a query string given as a string argument (data of type
\mimetype{application/x-www-form-urlencoded}). Data are
returned as a dictionary. The dictionary keys are the unique query
variable names and the values are lists of values for each name.

The optional argument \var{keep_blank_values} is
a flag indicating whether blank values in
URL encoded queries should be treated as blank strings.
A true value indicates that blanks should be retained as
blank strings. The default false value indicates that
blank values are to be ignored and treated as if they were
not included.

The optional argument \var{strict_parsing} is a flag indicating what
to do with parsing errors. If false (the default), errors
are silently ignored. If true, errors raise a \exception{ValueError}
exception.
\end{funcdesc}

\begin{funcdesc}{parse_qsl}{qs\optional{, keep_blank_values\optional{,
                            strict_parsing}}}
Parse a query string given as a string argument (data of type
\mimetype{application/x-www-form-urlencoded}). Data are
returned as a list of name, value pairs.

The optional argument \var{keep_blank_values} is
a flag indicating whether blank values in
URL encoded queries should be treated as blank strings.
A true value indicates that blanks should be retained as
blank strings. The default false value indicates that
blank values are to be ignored and treated as if they were
not included.

The optional argument \var{strict_parsing} is a flag indicating what
to do with parsing errors. If false (the default), errors
are silently ignored. If true, errors raise a \exception{ValueError}
exception.
\end{funcdesc}

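The effect of the two flags is easiest to see on a small query string.
The sketch below is written for a current Python 3 interpreter and
assumes that \function{parse_qs()} and \function{parse_qsl()} are
imported from \code{urllib.parse}, where functions of these names
accept the same \var{keep_blank_values} and \var{strict_parsing}
keywords; the \module{cgi} module itself may not be available there.

```python
from urllib.parse import parse_qs, parse_qsl

query = "name=Joe+Blow&addr=&item=1&item=2"

# Blank values are dropped by default and kept only on request.
print(parse_qs(query))
print(parse_qs(query, keep_blank_values=True))

# parse_qsl() returns (name, value) pairs in their original order.
print(parse_qsl(query))

# With strict parsing, a field without "=" is reported as an error.
try:
    parse_qs("name", strict_parsing=True)
except ValueError as exc:
    print("rejected:", exc)
```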
\begin{funcdesc}{parse_multipart}{fp, pdict}
Parse input of type \mimetype{multipart/form-data} (for
file uploads). Arguments are \var{fp} for the input file and
\var{pdict} for a dictionary containing other parameters in
the \mailheader{Content-Type} header.

Returns a dictionary just like \function{parse_qs()}: keys are the
field names, each value is a list of values for that field. This is
easy to use but not much good if you are expecting megabytes to be
uploaded --- in that case, use the \class{FieldStorage} class instead,
which is much more flexible.

Note that this does not parse nested multipart parts --- use
\class{FieldStorage} for that.
\end{funcdesc}

\begin{funcdesc}{parse_header}{string}
Parse a MIME header (such as \mailheader{Content-Type}) into a main
value and a dictionary of parameters.
\end{funcdesc}

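To show what ``a main value and a dictionary of parameters'' means for
a \mailheader{Content-Type} header, the sketch below performs a
comparable split with the \code{email.message.Message} class on a
current Python 3 interpreter. This is a substitute technique chosen
because \function{parse_header()} may not be available there; it is
not how the \module{cgi} module implements the function.

```python
from email.message import Message

msg = Message()
msg["Content-Type"] = 'text/html; charset="utf-8"'

main_value = msg.get_content_type()
# get_params() lists the main value first; the rest are the parameters.
params = dict(msg.get_params()[1:])
print(main_value, params)
```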
\begin{funcdesc}{test}{}
Robust test CGI script, usable as main program.
Writes minimal HTTP headers and formats all information provided to
the script in HTML form.
\end{funcdesc}

\begin{funcdesc}{print_environ}{}
Format the shell environment in HTML.
\end{funcdesc}

\begin{funcdesc}{print_form}{form}
Format a form in HTML.
\end{funcdesc}

\begin{funcdesc}{print_directory}{}
Format the current directory in HTML.
\end{funcdesc}

\begin{funcdesc}{print_environ_usage}{}
Print a list of useful (used by CGI) environment variables in
HTML.
\end{funcdesc}

\begin{funcdesc}{escape}{s\optional{, quote}}
Convert the characters
\character{\&}, \character{<} and \character{>} in string \var{s} to
HTML-safe sequences. Use this if you need to display text that might
contain such characters in HTML. If the optional flag \var{quote} is
true, the double-quote character (\character{"}) is also translated;
this helps for inclusion in an HTML attribute value, as in \code{<A
HREF="...">}. If the value to be quoted might include single- or
double-quote characters, or both, consider using the
\function{quoteattr()} function in the \refmodule{xml.sax.saxutils}
module instead.
\end{funcdesc}


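The translations described for \function{escape()} can be observed
with the sketch below, written for a current Python 3 interpreter. It
assumes \code{html.escape()} as a stand-in, since the \module{cgi}
module may not be available there; note that \code{html.escape()}
turns quoting on by default and, when it is on, also translates the
single-quote character. \function{quoteattr()} is the function from
\refmodule{xml.sax.saxutils} mentioned above.

```python
import html
from xml.sax.saxutils import quoteattr

text = '<A HREF="x">Tom & Jerry</A>'

# Only &, < and > are translated when quoting is off.
print(html.escape(text, quote=False))

# With quoting on, the double quotes are translated as well.
print(html.escape(text, quote=True))

# quoteattr() returns the value already wrapped in suitable quotes.
print(quoteattr('say "hi"'))
```

The result of \function{quoteattr()} includes its own surrounding
quotes, so it is placed directly after the \code{=} of an attribute.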
\subsection{Caring about security \label{cgi-security}}

\indexii{CGI}{security}

There's one important rule: if you invoke an external program (via the
\function{os.system()} or \function{os.popen()} functions, or others
with similar functionality), make very sure you don't pass arbitrary
strings received from the client to the shell. This is a well-known
security hole whereby clever hackers anywhere on the Web can exploit a
gullible CGI script to invoke arbitrary shell commands. Even parts of
the URL or field names cannot be trusted, since the request doesn't
have to come from your form!

To be on the safe side, if you must pass a string gotten from a form
to a shell command, you should make sure the string contains only
alphanumeric characters, dashes, underscores, and periods.


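The character rule above can be expressed as a small check. The
sketch below is written for a current Python 3 interpreter; the names
\code{is_safe()} and \code{run_with_argument()} are hypothetical, and
the use of \code{subprocess.run()} with an argument list (so that no
shell is involved at all) is an assumption that goes beyond the
functions named in this section. A value that passes the check may
still begin with a dash and be read as an option by some programs.

```python
import re
import subprocess
import sys

# Alphanumerics, dashes, underscores, and periods only.
SAFE_PATTERN = re.compile(r"[A-Za-z0-9._-]+")

def is_safe(value):
    """True if value consists solely of the permitted characters."""
    return SAFE_PATTERN.fullmatch(value) is not None

def run_with_argument(value):
    """Echo value from a child process, started without a shell."""
    if not is_safe(value):
        raise ValueError("unsafe form value: %r" % value)
    # An argument list is handed to the program as-is, never to a shell.
    command = [sys.executable, "-c", "import sys; print(sys.argv[1])", value]
    return subprocess.run(command, capture_output=True, text=True).stdout

print(is_safe("report-2001.txt"))
print(is_safe("x; rm -rf /"))
```

Rejecting the value outright, rather than trying to strip the unwanted
characters, keeps the check easy to reason about.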
\subsection{Installing your CGI script on a \UNIX\ system}

Read the documentation for your HTTP server and check with your local
system administrator to find the directory where CGI scripts should be
installed; usually this is in a directory \file{cgi-bin} in the server tree.

Make sure that your script is readable and executable by ``others''; the
\UNIX{} file mode should be \code{0755} octal (use \samp{chmod 0755
\var{filename}}). Make sure that the first line of the script contains
\code{\#!} starting in column 1 followed by the pathname of the Python
interpreter, for instance:

\begin{verbatim}
#!/usr/local/bin/python
\end{verbatim}

Make sure the Python interpreter exists and is executable by ``others''.

Make sure that any files your script needs to read or write are
readable or writable, respectively, by ``others'' --- their mode
should be \code{0644} for readable and \code{0666} for writable. This
is because, for security reasons, the HTTP server executes your script
as user ``nobody'', without any special privileges. It can only read
(write, execute) files that everybody can read (write, execute). The
current directory at execution time is also different (it is usually
the server's \file{cgi-bin} directory) and the set of environment variables
is also different from what you get when you log in. In particular, don't
count on the shell's search path for executables (\envvar{PATH}) or
the Python module search path (\envvar{PYTHONPATH}) to be set to
anything interesting.

If you need to load modules from a directory which is not on Python's
default module search path, you can change the path in your script,
before importing other modules. For example:

\begin{verbatim}
import sys
sys.path.insert(0, "/usr/home/joe/lib/python")
sys.path.insert(0, "/usr/local/lib/python")
\end{verbatim}

(This way, the directory inserted last will be searched first!)

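The ordering remark can be checked on an ordinary list, which behaves
the same way as \code{sys.path} for this purpose. In the sketch
below, written for a current Python 3 interpreter, the list
\code{search_path} and its initial entry are made up for illustration
so that the real module search path is left untouched.

```python
search_path = ["/default/lib/python"]          # stand-in for sys.path

search_path.insert(0, "/usr/home/joe/lib/python")
search_path.insert(0, "/usr/local/lib/python")

# The entry inserted last now sits at index 0 and is searched first.
print(search_path)
```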
Instructions for non-\UNIX{} systems will vary; check your HTTP server's
documentation (it will usually have a section on CGI scripts).


\subsection{Testing your CGI script}

Unfortunately, a CGI script will generally not run when you try it
from the command line, and a script that works perfectly from the
command line may fail mysteriously when run from the server. There's
one reason why you should still test your script from the command
line: if it contains a syntax error, the Python interpreter won't
execute it at all, and the HTTP server will most likely send a cryptic
error to the client.

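A syntax check of this kind can also be done without running the
script. The sketch below uses the built-in \function{compile()}
function on a current Python 3 interpreter; the helper name
\code{has_syntax_error()} is hypothetical. The result reflects the
grammar of the interpreter doing the check, so it is only meaningful
when that is the same Python version the server will use.

```python
def has_syntax_error(source, filename="<script>"):
    """Return True if source cannot be compiled as a Python program."""
    try:
        compile(source, filename, "exec")   # compiles only, runs nothing
    except SyntaxError:
        return True
    return False

print(has_syntax_error("print('Hello, world!')"))
print(has_syntax_error("print('Hello, world!'"))   # missing parenthesis
```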
Assuming your script has no syntax errors, yet it does not work, you
have no choice but to read the next section.


\subsection{Debugging CGI scripts} \indexii{CGI}{debugging}

First of all, check for trivial installation errors --- reading the
section above on installing your CGI script carefully can save you a
lot of time. If you wonder whether you have understood the
installation procedure correctly, try installing a copy of this module
file (\file{cgi.py}) as a CGI script. When invoked as a script, the file
will dump its environment and the contents of the form in HTML form.
Give it the right mode, etc., and send it a request. If it's installed
in the standard \file{cgi-bin} directory, it should be possible to send it a
request by entering a URL into your browser of the form:

\begin{verbatim}
http://yourhostname/cgi-bin/cgi.py?name=Joe+Blow&addr=At+Home
\end{verbatim}

If this gives an error of type 404, the server cannot find the script;
perhaps you need to install it in a different directory. If it
gives another error, there's an installation problem that
you should fix before trying to go any further. If you get a nicely
formatted listing of the environment and form content (in this
example, the fields should be listed as ``addr'' with value ``At Home''
and ``name'' with value ``Joe Blow''), the \file{cgi.py} script has been
installed correctly. If you follow the same procedure for your own
script, you should now be able to debug it.

The next step could be to call the \module{cgi} module's
\function{test()} function from your script: replace its main code
with the single statement

\begin{verbatim}
cgi.test()
\end{verbatim}

This should produce the same results as those gotten from installing
the \file{cgi.py} file itself.

Fred Drake91f2f262001-07-06 19:28:48 +0000528When an ordinary Python script raises an unhandled exception (for
529whatever reason: of a typo in a module name, a file that can't be
530opened, etc.), the Python interpreter prints a nice traceback and
531exits. While the Python interpreter will still do this when your CGI
532script raises an exception, most likely the traceback will end up in
Fred Drake34a37b82001-12-20 17:13:09 +0000533one of the HTTP server's log files, or be discarded altogether.

Fortunately, once you have managed to get your script to execute
\emph{some} code, you can easily send tracebacks to the Web browser
using the \refmodule{cgitb} module. If you haven't done so already,
just add the line:

\begin{verbatim}
import cgitb; cgitb.enable()
\end{verbatim}

to the top of your script. Then try running it again; when a
problem occurs, you should see a detailed report that will
likely make the cause of the crash apparent.

If you suspect that there may be a problem in importing the
\refmodule{cgitb} module, you can use an even more robust approach
(which only uses built-in modules):

\begin{verbatim}
import sys
sys.stderr = sys.stdout
print "Content-Type: text/plain"
print
...your code here...
\end{verbatim}

This relies on the Python interpreter to print the traceback. The
content type of the output is set to plain text, which disables all
HTML processing. If your script works, the raw HTML will be displayed
by your client. If it raises an exception, most likely after the
first two lines have been printed, a traceback will be displayed.
Because no HTML interpretation is going on, the traceback will be
readable.
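
A variation on this idea, sketched under stated assumptions: instead
of redirecting \code{sys.stderr}, the script's main code is wrapped in
a function and any exception is formatted with the standard
\module{traceback} module. The names
\function{run\_reporting\_errors()} and \function{broken()} are
hypothetical and exist only for this illustration. Output is written
with \method{write()} so that the sketch does not depend on the
\keyword{print} statement.

\begin{verbatim}
import sys
import traceback


def run_reporting_errors(main, out=None):
    """Call main(); on an exception, write the traceback to out."""
    if out is None:
        out = sys.stdout
    # Header and blank line first, so that the client treats
    # everything that follows as plain text.
    out.write("Content-Type: text/plain\n\n")
    try:
        main()
    except Exception:
        # format_exc() returns the traceback text as a string.
        out.write(traceback.format_exc())
        return False
    return True


def broken():
    # Hypothetical stand-in for a script's main code.
    return 1 / 0


run_reporting_errors(broken)
\end{verbatim}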


\subsection{Common problems and solutions}

\begin{itemize}
\item Most HTTP servers buffer the output from CGI scripts until the
script is completed. This means that it is not possible to display a
progress report on the client's display while the script is running.

\item Check the installation instructions above.

\item Check the HTTP server's log files. (\samp{tail -f logfile} in a
separate window may be useful!)

\item Always check a script for syntax errors first, by doing something
like \samp{python script.py}.

\item If your script does not have any syntax errors, try adding
\samp{import cgitb; cgitb.enable()} to the top of the script.

\item When invoking external programs, make sure they can be found.
This generally means using absolute path names, because \envvar{PATH}
is usually not set to a very useful value in a CGI script.

\item When reading or writing external files, make sure they can be read
or written by every user on the system, since a CGI script usually runs
under the HTTP server's user ID rather than your own.

\item Don't try to give a CGI script a set-uid mode. This doesn't work on
most systems, and is a security liability as well.
\end{itemize}
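
Two of the checks in the list above can be sketched in code. The
helpers \function{syntax\_error\_in()} and
\function{is\_usable\_program()} are hypothetical names chosen for
this illustration. Unlike \samp{python script.py}, the first helper
only parses the script and does not run it. The second tests
permissions for the user running the check, which may differ from the
user under which the HTTP server runs CGI scripts.

\begin{verbatim}
import os


def syntax_error_in(path):
    """Return the SyntaxError raised when compiling path, or None."""
    with open(path) as f:
        source = f.read()
    try:
        # compile() parses the source without executing it.
        compile(source, path, "exec")
    except SyntaxError as err:
        return err
    return None


def is_usable_program(path):
    """True if path is an absolute path to an executable file."""
    # os.access() checks permissions for the current user only.
    return (os.path.isabs(path)
            and os.path.isfile(path)
            and os.access(path, os.X_OK))
\end{verbatim}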