\section{Standard Module \sectcode{cgi}}
\stmodindex{cgi}
\indexii{WWW}{server}
\indexii{CGI}{protocol}
\indexii{HTTP}{protocol}
\indexii{MIME}{headers}
\index{URL}

\renewcommand{\indexsubitem}{(in module cgi)}

This module makes it easy to write Python scripts that run in a WWW
server using the Common Gateway Interface. It was written by Michael
McLay and subsequently modified by Steve Majewski and Guido van
Rossum.

When a WWW server finds that a URL contains a reference to a file in a
particular subdirectory (usually \code{/cgi-bin}), it runs the file as
a subprocess. Information about the request, such as the full URL and
the originating host, is passed to the subprocess in the shell
environment; additional input from the client may be read from
standard input. Standard output from the subprocess is sent back
across the network to the client as the response to the request.
The CGI protocol describes what the environment variables passed to
the subprocess mean and how the output should be formatted. The
official reference documentation for the CGI protocol can be found on
the World-Wide Web at
\code{<URL:http://hoohoo.ncsa.uiuc.edu/cgi/overview.html>}. The
\code{cgi} module was based on version 1.1 of the protocol and should
also work with version 1.0.

The \code{cgi} module defines several classes that make it easy to
access the information passed to the subprocess from a Python script;
in particular, it knows how to parse the input sent by an HTML
``form'' using either a POST or a GET request (these are alternatives
for submitting forms in the HTTP protocol).

The formatting of the output is so trivial that no additional support
is needed. All you need to do is print a minimal set of MIME headers
describing the output format, followed by a blank line and your actual
output. For example, if you want to generate HTML, your script could
start as follows:

\begin{verbatim}
# Header -- one or more lines:
print "Content-type: text/html"
# Blank line separating header from body:
print
# Body, in HTML format:
print "<TITLE>The Amazing SPAM Homepage!</TITLE>"
# etc...
\end{verbatim}

The server will add some header lines of its own, but it won't touch
the output following the header.

The \code{cgi} module defines the following functions:

\begin{funcdesc}{parse}{}
Read and parse the form submitted to the script and return a
dictionary containing the form's fields. This should be called at
most once per script invocation, as it may consume standard input (if
the form was submitted through a POST request). The keys in the
resulting dictionary are the field names used in the submission; the
values are {\em lists} of the field values (since a field name may be
used multiple times in a single form). \samp{\%} escapes in the
values are translated to their single-character equivalent using
\code{urllib.unquote()}. As a side effect, this function sets
\code{environ['QUERY_STRING']} to the raw query string, if it isn't
already set.
\end{funcdesc}
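
The shape of the dictionary returned by \code{parse()} can be
illustrated with a small stand-alone sketch. It is not this module's
implementation: it assumes a current Python~3 interpreter and uses
\code{urllib.parse.parse_qs()} as a stand-in, and the helper name
\code{parse_query} is hypothetical.

```python
# Illustrative sketch only, not the cgi module's own code.
# Assumes Python 3; parse_query is a hypothetical helper name.
from urllib.parse import parse_qs


def parse_query(query_string):
    """Map each field name to a list of its %-unquoted values."""
    return parse_qs(query_string)


fields = parse_query("Name=Eric%20Idle&Topping=spam&Topping=eggs")
print(fields["Name"])     # a one-element list
print(fields["Topping"])  # a list holding both submitted values
```

Every value is a list, even for a field that was submitted only once,
which is why the single-value class described later can be convenient.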

\begin{funcdesc}{print_environ_usage}{}
Print a piece of HTML listing the environment variables that may be
set by the CGI protocol.
This is mainly useful when learning about writing CGI scripts.
\end{funcdesc}

\begin{funcdesc}{print_environ}{}
Print a piece of HTML text showing the entire contents of the shell
environment. This is mainly useful when debugging a CGI script.
\end{funcdesc}

\begin{funcdesc}{print_form}{form}
Print a piece of HTML text showing the contents of the \var{form} (a
dictionary, an instance of the \code{FormContentDict} class defined
below, or a subclass thereof).
This is mainly useful when debugging a CGI script.
\end{funcdesc}

\begin{funcdesc}{escape}{string}
Convert special characters in \var{string} to HTML escapes. In
particular, ``\code{\&}'' is replaced with ``\code{\&amp;}'',
``\code{<}'' is replaced with ``\code{\&lt;}'', and ``\code{>}'' is
replaced with ``\code{\&gt;}''. This is useful when printing (almost)
arbitrary text in an HTML context. Note that for inclusion in quoted
tag attributes (e.g. \code{<A HREF="...">}), some additional
characters would have to be converted, in particular the string
quote. There is currently no function that does this.
\end{funcdesc}
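
The replacements described above, together with the extra conversion
needed inside quoted attributes, might look roughly as follows. This
is a sketch for a current Python~3 interpreter, not this module's
code; the function name \code{escape_html} and its \code{quote} flag
are hypothetical.

```python
# Illustrative sketch only; escape_html and its quote flag are
# hypothetical names, not part of the cgi module described here.
def escape_html(text, quote=False):
    """Replace the characters that are special in HTML text."""
    # "&" must be handled first; otherwise the "&" of the entities
    # produced below would itself be escaped a second time.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    if quote:
        # Only needed inside quoted attributes such as HREF="...".
        text = text.replace('"', "&quot;")
    return text


print(escape_html("SPAM & <eggs>"))
print('<A HREF="%s">' % escape_html('a"b', quote=True))
```

The order of the replacements matters: converting the ampersand last
would corrupt the entities already produced for the angle brackets.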

The module defines the following classes. Since the base class
initializes itself by calling \code{parse()}, at most one instance of
at most one of these classes should be created per script invocation:

\begin{funcdesc}{FormContentDict}{}
This class behaves like a (read-only) dictionary and has the same keys
and values as the dictionary returned by \code{parse()} (i.e. each
field name maps to a list of values). Additionally, it initializes
its data member \code{query_string} to the raw query sent from the
server.
\end{funcdesc}

\begin{funcdesc}{SvFormContentDict}{}
This class, derived from \code{FormContentDict}, is a little more
user-friendly when you are expecting that each field name is only used
once in the form. When you access a particular field (using
\code{form[fieldname]}), it will return the string value of that item
if it is unique, or raise \code{IndexError} if the field was specified
more than once in the form. (If the field wasn't specified at all,
\code{KeyError} is raised.) To access fields that are specified
multiple times, use \code{form.getlist(fieldname)}. The
\code{values()} and \code{items()} methods return mixed lists,
containing strings for singly-defined fields and lists of strings for
multiply-defined fields.
\end{funcdesc}
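
The single-value access rules described above can be sketched with a
small stand-alone class. This is only an illustration for a current
Python~3 interpreter: the class name \code{SingleValueForm} is
hypothetical, it takes an already-parsed dictionary instead of calling
\code{parse()}, and it is not the code of \code{SvFormContentDict}.

```python
# Illustrative sketch only; SingleValueForm is a hypothetical class,
# not the cgi module's SvFormContentDict.
class SingleValueForm:
    """Wrap a {name: [values]} mapping; expect one value per field."""

    def __init__(self, fields):
        self._fields = fields

    def __getitem__(self, name):
        values = self._fields[name]  # KeyError if the field is absent
        if len(values) > 1:
            raise IndexError("field %r given more than once" % name)
        return values[0]

    def getlist(self, name):
        """Return every value submitted for the field."""
        return list(self._fields[name])


form = SingleValueForm({"Name": ["Eric"], "Topping": ["spam", "eggs"]})
print(form["Name"])
print(form.getlist("Topping"))
```

In this sketch, \code{form["Topping"]} raises \code{IndexError} and a
lookup of a field that was never submitted raises \code{KeyError},
matching the behaviour described for the real class.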

(The module currently defines some more classes, but these are
experimental and/or obsolescent and are thus not documented; see the
source for more information.)

The module defines the following variable:

\begin{datadesc}{environ}
The shell environment, exactly as received from the HTTP server. See
the CGI documentation for a description of the various fields.
\end{datadesc}
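
As an illustration of how a script might consult the environment, the
following sketch picks a few variables defined by the CGI 1.1 protocol
out of an environment mapping. It assumes a current Python~3
interpreter; the helper name \code{describe_request} and the choice of
\code{GET} as the fallback method are assumptions made for the
example, not behaviour of this module.

```python
# Illustrative sketch only; describe_request is a hypothetical helper.
# The variable names are taken from the CGI 1.1 protocol.
import os


def describe_request(environ):
    """Pick a few CGI variables out of an environment mapping."""
    return {
        # Falling back to GET is an assumption made for this sketch.
        "method": environ.get("REQUEST_METHOD", "GET"),
        "query": environ.get("QUERY_STRING", ""),
        "host": environ.get("REMOTE_HOST", ""),
    }


# Under a real server one would pass os.environ; a literal
# dictionary is used first so the sketch runs anywhere.
sample = {"REQUEST_METHOD": "GET", "QUERY_STRING": "Name=Eric"}
print(describe_request(sample))
print(describe_request(os.environ)["method"])
```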

\subsection{Example}
\nodename{CGI Example}

This example assumes that you have a WWW server up and running,
e.g.\ NCSA's \code{httpd}.

Place the following file in a convenient spot in the WWW server's
directory tree. E.g., if you place it in the subdirectory \file{test}
of the root directory and call it \file{test.html}, its URL will be
\file{http://\var{yourservername}/test/test.html}.

\begin{verbatim}
<TITLE>Test Form Input</TITLE>
<H1>Test Form Input</H1>
<FORM METHOD="POST" ACTION="/cgi-bin/test.py">
<INPUT NAME=Name> (Name)<br>
<INPUT NAME=Address> (Address)<br>
<INPUT TYPE=SUBMIT>
</FORM>
\end{verbatim}

Selecting this file's URL from a forms-capable browser such as Mosaic
or Netscape will bring up a simple form with two text input fields and
a ``submit'' button.

But wait. Before pressing ``submit'', a script that responds to the
form must also be installed. The test file as shown assumes that the
script is called \file{test.py} and lives in the server's
\code{cgi-bin} directory. Here's the test script:

\begin{verbatim}
#!/usr/local/bin/python

import cgi

print "Content-type: text/html"
print                                   # End of headers!
print "<TITLE>Test Form Output</TITLE>"
print "<H1>Test Form Output</H1>"

form = cgi.SvFormContentDict()          # Load the form

name = addr = None                      # Default: no name and address

# Extract name and address from the form, if given

if form.has_key('Name'):
    name = form['Name']
if form.has_key('Address'):
    addr = form['Address']

# Print an unnumbered list of the name and address, if present

print "<UL>"
if name is not None:
    print "<LI>Name:", cgi.escape(name)
if addr is not None:
    print "<LI>Address:", cgi.escape(addr)
print "</UL>"
\end{verbatim}

The script should be made executable (\samp{chmod +x \var{script}}).
If the Python interpreter is not located at
\file{/usr/local/bin/python} but somewhere else, the first line of the
script should be modified accordingly.

Now that everything is installed correctly, we can try out the form.
Bring up the test form in your WWW browser, fill in a name and address
in the form, and press the ``submit'' button. The script should now
run, and its output should be sent back to your browser. The result
should look roughly as follows:

\strong{Test Form Output}

\begin{itemize}
\item Name: \var{the name you entered}
\item Address: \var{the address you entered}
\end{itemize}

If you didn't enter a name or address, the corresponding line will be
missing (since the browser doesn't send empty form fields to the
server).