| \section{\module{rexec} --- | 
 |          Restricted execution framework} | 
 |  | 
 | \declaremodule{standard}{rexec} | 
 | \modulesynopsis{Basic restricted execution framework.} | 
 | \versionchanged[Disabled module]{2.3} | 
 |  | 
 | \begin{notice}[warning] | 
 |   The documentation has been left in place to help in reading old code | 
 |   that uses the module. | 
 | \end{notice} | 
 |  | 
 | This module contains the \class{RExec} class, which supports | 
 | \method{r_eval()}, \method{r_execfile()}, \method{r_exec()}, and | 
 | \method{r_import()} methods, which are restricted versions of the standard | 
 | Python functions \method{eval()}, \method{execfile()} and | 
 | the \keyword{exec} and \keyword{import} statements. | 
 | Code executed in this restricted environment will | 
 | only have access to modules and functions that are deemed safe; you | 
 | can subclass \class{RExec} to add or remove capabilities as desired. | 
 |  | 
 | \begin{notice}[warning] | 
 |   While the \module{rexec} module is designed to perform as described | 
 |   below, it does have a few known vulnerabilities which could be | 
 |   exploited by carefully written code.  Thus it should not be relied | 
 |   upon in situations requiring ``production ready'' security.  In such | 
 |   situations, execution via sub-processes or very careful | 
 |   ``cleansing'' of both code and data to be processed may be | 
 |   necessary.  Alternatively, help in patching known \module{rexec} | 
 |   vulnerabilities would be welcomed. | 
 | \end{notice} | 
 |  | 
 | \begin{notice} | 
 |   The \class{RExec} class can prevent code from performing unsafe | 
 |   operations like reading or writing disk files, or using TCP/IP | 
 |   sockets.  However, it does not protect against code using extremely | 
 |   large amounts of memory or processor time. | 
 | \end{notice} | 
 |  | 
 | \begin{classdesc}{RExec}{\optional{hooks\optional{, verbose}}} | 
 | Returns an instance of the \class{RExec} class.   | 
 |  | 
 | \var{hooks} is an instance of the \class{RHooks} class or a subclass of it. | 
 | If it is omitted or \code{None}, the default \class{RHooks} class is | 
 | instantiated. | 
 | Whenever the \module{rexec} module searches for a module (even a | 
 | built-in one) or reads a module's code, it doesn't actually go out to | 
 | the file system itself.  Rather, it calls methods of an \class{RHooks} | 
 | instance that was passed to or created by its constructor.  (Actually, | 
 | the \class{RExec} object doesn't make these calls --- they are made by | 
 | a module loader object that's part of the \class{RExec} object.  This | 
 | allows another level of flexibility, which can be useful when changing | 
 | the mechanics of \keyword{import} within the restricted environment.) | 
 |  | 
 | By providing an alternate \class{RHooks} object, we can control the | 
 | file system accesses made to import a module, without changing the | 
 | actual algorithm that controls the order in which those accesses are | 
 | made.  For instance, we could substitute an \class{RHooks} object that | 
 | passes all filesystem requests to a file server elsewhere, via some | 
 | RPC mechanism such as ILU.  Grail's applet loader uses this to support | 
 | importing applets from a URL for a directory. | 
 |  | 
 | If \var{verbose} is true, additional debugging output may be sent to | 
 | standard output. | 
 | \end{classdesc} | 
 |  | 
 | It is important to be aware that code running in a restricted | 
 | environment can still call the \function{sys.exit()} function.  To | 
 | disallow restricted code from exiting the interpreter, always protect | 
 | calls that cause restricted code to run with a | 
 | \keyword{try}/\keyword{except} statement that catches the | 
 | \exception{SystemExit} exception.  Removing the \function{sys.exit()} | 
 | function from the restricted environment is not sufficient --- the | 
 | restricted code could still use \code{raise SystemExit}.  Removing | 
 | \exception{SystemExit} is not a reasonable option; some library code | 
 | makes use of this and would break were it not available. | 
 |  | 
 |  | 
 | \begin{seealso} | 
 |   \seetitle[http://grail.sourceforge.net/]{Grail Home Page}{Grail is a | 
 |             Web browser written entirely in Python.  It uses the | 
 |             \module{rexec} module as a foundation for supporting | 
 |             Python applets, and can be used as an example usage of | 
 |             this module.} | 
 | \end{seealso} | 
 |  | 
 |  | 
 | \subsection{RExec Objects \label{rexec-objects}} | 
 |  | 
 | \class{RExec} instances support the following methods: | 
 |  | 
 | \begin{methoddesc}{r_eval}{code} | 
 | \var{code} must either be a string containing a Python expression, or | 
 | a compiled code object, which will be evaluated in the restricted | 
 | environment's \module{__main__} module.  The value of the expression or | 
 | code object will be returned. | 
 | \end{methoddesc} | 
 |  | 
 | \begin{methoddesc}{r_exec}{code} | 
 | \var{code} must either be a string containing one or more lines of | 
 | Python code, or a compiled code object, which will be executed in the | 
 | restricted environment's \module{__main__} module. | 
 | \end{methoddesc} | 
 |  | 
 | \begin{methoddesc}{r_execfile}{filename} | 
 | Execute the Python code contained in the file \var{filename} in the | 
 | restricted environment's \module{__main__} module. | 
 | \end{methoddesc} | 
 |  | 
 | Methods whose names begin with \samp{s_} are similar to the functions | 
 | beginning with \samp{r_}, but the code will be granted access to | 
 | restricted versions of the standard I/O streams \code{sys.stdin}, | 
 | \code{sys.stderr}, and \code{sys.stdout}. | 
 |  | 
 | \begin{methoddesc}{s_eval}{code} | 
 | \var{code} must be a string containing a Python expression, which will | 
 | be evaluated in the restricted environment.   | 
 | \end{methoddesc} | 
 |  | 
 | \begin{methoddesc}{s_exec}{code} | 
 | \var{code} must be a string containing one or more lines of Python code, | 
 | which will be executed in the restricted environment.   | 
 | \end{methoddesc} | 
 |  | 
 | \begin{methoddesc}{s_execfile}{code} | 
 | Execute the Python code contained in the file \var{filename} in the | 
 | restricted environment. | 
 | \end{methoddesc} | 
 |  | 
 | \class{RExec} objects must also support various methods which will be | 
 | implicitly called by code executing in the restricted environment. | 
 | Overriding these methods in a subclass is used to change the policies | 
 | enforced by a restricted environment. | 
 |  | 
 | \begin{methoddesc}{r_import}{modulename\optional{, globals\optional{, | 
 |                              locals\optional{, fromlist}}}} | 
 | Import the module \var{modulename}, raising an \exception{ImportError} | 
 | exception if the module is considered unsafe. | 
 | \end{methoddesc} | 
 |  | 
 | \begin{methoddesc}{r_open}{filename\optional{, mode\optional{, bufsize}}} | 
 | Method called when \function{open()} is called in the restricted | 
 | environment.  The arguments are identical to those of \function{open()}, | 
 | and a file object (or a class instance compatible with file objects) | 
 | should be returned.  \class{RExec}'s default behaviour is allow opening | 
 | any file for reading, but forbidding any attempt to write a file.  See | 
 | the example below for an implementation of a less restrictive | 
 | \method{r_open()}. | 
 | \end{methoddesc} | 
 |  | 
 | \begin{methoddesc}{r_reload}{module} | 
 | Reload the module object \var{module}, re-parsing and re-initializing it.   | 
 | \end{methoddesc} | 
 |  | 
 | \begin{methoddesc}{r_unload}{module} | 
 | Unload the module object \var{module} (remove it from the | 
 | restricted environment's \code{sys.modules} dictionary). | 
 | \end{methoddesc} | 
 |  | 
 | And their equivalents with access to restricted standard I/O streams: | 
 |  | 
 | \begin{methoddesc}{s_import}{modulename\optional{, globals\optional{, | 
 |                              locals\optional{, fromlist}}}} | 
 | Import the module \var{modulename}, raising an \exception{ImportError} | 
 | exception if the module is considered unsafe. | 
 | \end{methoddesc} | 
 |  | 
 | \begin{methoddesc}{s_reload}{module} | 
 | Reload the module object \var{module}, re-parsing and re-initializing it.   | 
 | \end{methoddesc} | 
 |  | 
 | \begin{methoddesc}{s_unload}{module} | 
 | Unload the module object \var{module}.    | 
 | % XXX what are the semantics of this?   | 
 | \end{methoddesc} | 
 |  | 
 |  | 
 | \subsection{Defining restricted environments \label{rexec-extension}} | 
 |  | 
 | The \class{RExec} class has the following class attributes, which are | 
 | used by the \method{__init__()} method.  Changing them on an existing | 
 | instance won't have any effect; instead, create a subclass of | 
 | \class{RExec} and assign them new values in the class definition. | 
 | Instances of the new class will then use those new values.  All these | 
 | attributes are tuples of strings. | 
 |  | 
 | \begin{memberdesc}{nok_builtin_names} | 
 | Contains the names of built-in functions which will \emph{not} be | 
 | available to programs running in the restricted environment.  The | 
 | value for \class{RExec} is \code{('open', 'reload', '__import__')}. | 
 | (This gives the exceptions, because by far the majority of built-in | 
 | functions are harmless.  A subclass that wants to override this | 
 | variable should probably start with the value from the base class and | 
 | concatenate additional forbidden functions --- when new dangerous | 
 | built-in functions are added to Python, they will also be added to | 
 | this module.) | 
 | \end{memberdesc} | 
 |  | 
 | \begin{memberdesc}{ok_builtin_modules} | 
 | Contains the names of built-in modules which can be safely imported. | 
 | The value for \class{RExec} is \code{('audioop', 'array', 'binascii', | 
 | 'cmath', 'errno', 'imageop', 'marshal', 'math', 'md5', 'operator', | 
 | 'parser', 'regex', 'select', 'sha', '_sre', 'strop', | 
 | 'struct', 'time')}.  A similar remark about overriding this variable | 
 | applies --- use the value from the base class as a starting point. | 
 | \end{memberdesc} | 
 |  | 
 | \begin{memberdesc}{ok_path} | 
 | Contains the directories which will be searched when an \keyword{import} | 
 | is performed in the restricted environment.   | 
 | The value for \class{RExec} is the same as \code{sys.path} (at the time | 
 | the module is loaded) for unrestricted code. | 
 | \end{memberdesc} | 
 |  | 
 | \begin{memberdesc}{ok_posix_names} | 
 | % Should this be called ok_os_names? | 
 | Contains the names of the functions in the \refmodule{os} module which will be | 
 | available to programs running in the restricted environment.  The | 
 | value for \class{RExec} is \code{('error', 'fstat', 'listdir', | 
 | 'lstat', 'readlink', 'stat', 'times', 'uname', 'getpid', 'getppid', | 
 | 'getcwd', 'getuid', 'getgid', 'geteuid', 'getegid')}. | 
 | \end{memberdesc} | 
 |  | 
 | \begin{memberdesc}{ok_sys_names} | 
 | Contains the names of the functions and variables in the \refmodule{sys} | 
 | module which will be available to programs running in the restricted | 
 | environment.  The value for \class{RExec} is \code{('ps1', 'ps2', | 
 | 'copyright', 'version', 'platform', 'exit', 'maxint')}. | 
 | \end{memberdesc} | 
 |  | 
 | \begin{memberdesc}{ok_file_types} | 
 | Contains the file types from which modules are allowed to be loaded. | 
 | Each file type is an integer constant defined in the \refmodule{imp} module. | 
 | The meaningful values are \constant{PY_SOURCE}, \constant{PY_COMPILED}, and | 
 | \constant{C_EXTENSION}.  The value for \class{RExec} is \code{(C_EXTENSION, | 
 | PY_SOURCE)}.  Adding \constant{PY_COMPILED} in subclasses is not recommended; | 
 | an attacker could exit the restricted execution mode by putting a forged | 
 | byte-compiled file (\file{.pyc}) anywhere in your file system, for example | 
 | by writing it to \file{/tmp} or uploading it to the \file{/incoming} | 
 | directory of your public FTP server. | 
 | \end{memberdesc} | 
 |  | 
 |  | 
 | \subsection{An example} | 
 |  | 
 | Let us say that we want a slightly more relaxed policy than the | 
 | standard \class{RExec} class.  For example, if we're willing to allow | 
 | files in \file{/tmp} to be written, we can subclass the \class{RExec} | 
 | class: | 
 |  | 
 | \begin{verbatim} | 
 | class TmpWriterRExec(rexec.RExec): | 
 |     def r_open(self, file, mode='r', buf=-1): | 
 |         if mode in ('r', 'rb'): | 
 |             pass | 
 |         elif mode in ('w', 'wb', 'a', 'ab'): | 
 |             # check filename : must begin with /tmp/ | 
 |             if file[:5]!='/tmp/':  | 
 |                 raise IOError, "can't write outside /tmp" | 
 |             elif (string.find(file, '/../') >= 0 or | 
 |                  file[:3] == '../' or file[-3:] == '/..'): | 
 |                 raise IOError, "'..' in filename forbidden" | 
 |         else: raise IOError, "Illegal open() mode" | 
 |         return open(file, mode, buf) | 
 | \end{verbatim} | 
 | % | 
 | Notice that the above code will occasionally forbid a perfectly valid | 
 | filename; for example, code in the restricted environment won't be | 
 | able to open a file called \file{/tmp/foo/../bar}.  To fix this, the | 
 | \method{r_open()} method would have to simplify the filename to | 
 | \file{/tmp/bar}, which would require splitting apart the filename and | 
 | performing various operations on it.  In cases where security is at | 
 | stake, it may be preferable to write simple code which is sometimes | 
 | overly restrictive, instead of more general code that is also more | 
 | complex and may harbor a subtle security hole. |