
 \section{\module{rexec} ---
          Restricted execution framework}

 \declaremodule{standard}{rexec}
 \modulesynopsis{Basic restricted execution framework.}
 \versionchanged[Disabled module]{2.3}

 \begin{notice}[warning]
   The documentation has been left in place to help in reading old code
   that uses the module.
 \end{notice}

 This module contains the \class{RExec} class, which supports the
 \method{r_eval()}, \method{r_execfile()}, \method{r_exec()}, and
 \method{r_import()} methods.  These are restricted versions of the
 standard Python functions \function{eval()} and \function{execfile()}
 and of the \keyword{exec} and \keyword{import} statements.
 Code executed in this restricted environment will
 only have access to modules and functions that are deemed safe; you
 can subclass \class{RExec} to add or remove capabilities as desired.

 \begin{notice}[warning]
   While the \module{rexec} module is designed to perform as described
   below, it does have a few known vulnerabilities which could be
   exploited by carefully written code.  Thus it should not be relied
   upon in situations requiring ``production ready'' security.  In such
   situations, execution via sub-processes or very careful
   ``cleansing'' of both code and data to be processed may be
   necessary.  Alternatively, help in patching known \module{rexec}
   vulnerabilities would be welcomed.
 \end{notice}

 \begin{notice}
   The \class{RExec} class can prevent code from performing unsafe
   operations like reading or writing disk files, or using TCP/IP
   sockets.  However, it does not protect against code using extremely
   large amounts of memory or processor time.
 \end{notice}

 \begin{classdesc}{RExec}{\optional{hooks\optional{, verbose}}}
 Returns an instance of the \class{RExec} class.

 \var{hooks} is an instance of the \class{RHooks} class or a subclass of it.
 If it is omitted or \code{None}, the default \class{RHooks} class is
 instantiated.
 Whenever the \module{rexec} module searches for a module (even a
 built-in one) or reads a module's code, it doesn't actually go out to
 the file system itself.  Rather, it calls methods of an \class{RHooks}
 instance that was passed to or created by its constructor.  (Actually,
 the \class{RExec} object doesn't make these calls --- they are made by
 a module loader object that's part of the \class{RExec} object.  This
 allows another level of flexibility, which can be useful when changing
 the mechanics of \keyword{import} within the restricted environment.)

 By providing an alternate \class{RHooks} object, we can control the
 file system accesses made to import a module, without changing the
 actual algorithm that controls the order in which those accesses are
 made.  For instance, we could substitute an \class{RHooks} object that
 passes all filesystem requests to a file server elsewhere, via some
 RPC mechanism such as ILU.  Grail's applet loader uses this to support
 importing applets from a URL for a directory.

 If \var{verbose} is true, additional debugging output may be sent to
 standard output.
 \end{classdesc}

 It is important to be aware that code running in a restricted
 environment can still call the \function{sys.exit()} function.  To
 disallow restricted code from exiting the interpreter, always protect
 calls that cause restricted code to run with a
 \keyword{try}/\keyword{except} statement that catches the
 \exception{SystemExit} exception.  Removing the \function{sys.exit()}
 function from the restricted environment is not sufficient --- the
 restricted code could still use \code{raise SystemExit}.  Removing
 \exception{SystemExit} is not a reasonable option; some library code
 makes use of this and would break were it not available.
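
 The guard described above can be sketched as follows.  This is a
 minimal illustration and not part of \module{rexec}: the helper name
 \function{run_guarded()} is hypothetical, and \function{untrusted()} is
 a stand-in for a call such as \method{r_exec()}, used here because the
 module itself is disabled.

```python
import sys

def run_guarded(func, *args):
    """Call func(*args); report SystemExit instead of letting it propagate.

    Returns a (finished, value) pair: (True, result) on a normal return,
    (False, exit_code) if the callee tried to exit the interpreter.
    """
    try:
        return True, func(*args)
    except SystemExit as exc:
        # Raised by sys.exit() and by a bare "raise SystemExit" alike.
        return False, exc.code

def untrusted(how):
    # Hypothetical stand-in for restricted code that may try to exit.
    if how == 'function':
        sys.exit(3)
    elif how == 'raise':
        raise SystemExit
    return 'done'
```

 Because the exception is caught at the call site, the sketch handles
 both \function{sys.exit()} and an explicit \code{raise SystemExit}
 without removing anything from the restricted environment.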


 \begin{seealso}
   \seetitle[http://grail.sourceforge.net/]{Grail Home Page}{Grail is a
             Web browser written entirely in Python.  It uses the
             \module{rexec} module as a foundation for supporting
             Python applets, and can be used as an example usage of
             this module.}
 \end{seealso}


 \subsection{RExec Objects \label{rexec-objects}}

 \class{RExec} instances support the following methods:

 \begin{methoddesc}{r_eval}{code}
 \var{code} must either be a string containing a Python expression, or
 a compiled code object, which will be evaluated in the restricted
 environment's \module{__main__} module.  The value of the expression or
 code object will be returned.
 \end{methoddesc}

 \begin{methoddesc}{r_exec}{code}
 \var{code} must either be a string containing one or more lines of
 Python code, or a compiled code object, which will be executed in the
 restricted environment's \module{__main__} module.
 \end{methoddesc}

 \begin{methoddesc}{r_execfile}{filename}
 Execute the Python code contained in the file \var{filename} in the
 restricted environment's \module{__main__} module.
 \end{methoddesc}

 Methods whose names begin with \samp{s_} are similar to the methods
 beginning with \samp{r_}, but the code will be granted access to
 restricted versions of the standard I/O streams \code{sys.stdin},
 \code{sys.stderr}, and \code{sys.stdout}.

 \begin{methoddesc}{s_eval}{code}
 \var{code} must be a string containing a Python expression, which will
 be evaluated in the restricted environment.
 \end{methoddesc}

 \begin{methoddesc}{s_exec}{code}
 \var{code} must be a string containing one or more lines of Python code,
 which will be executed in the restricted environment.
 \end{methoddesc}

 \begin{methoddesc}{s_execfile}{filename}
 Execute the Python code contained in the file \var{filename} in the
 restricted environment.
 \end{methoddesc}

 \class{RExec} objects must also support various methods which will be
 implicitly called by code executing in the restricted environment.
 Override these methods in a subclass to change the policies
 enforced by a restricted environment.

 \begin{methoddesc}{r_import}{modulename\optional{, globals\optional{,
                              locals\optional{, fromlist}}}}
 Import the module \var{modulename}, raising an \exception{ImportError}
 exception if the module is considered unsafe.
 \end{methoddesc}

 \begin{methoddesc}{r_open}{filename\optional{, mode\optional{, bufsize}}}
 Method called when \function{open()} is called in the restricted
 environment.  The arguments are identical to those of \function{open()},
 and a file object (or a class instance compatible with file objects)
 should be returned.  \class{RExec}'s default behaviour is to allow opening
 any file for reading, but to forbid any attempt to write a file.  See
 the example below for an implementation of a less restrictive
 \method{r_open()}.
 \end{methoddesc}

 \begin{methoddesc}{r_reload}{module}
 Reload the module object \var{module}, re-parsing and re-initializing it.
 \end{methoddesc}

 \begin{methoddesc}{r_unload}{module}
 Unload the module object \var{module} (remove it from the
 restricted environment's \code{sys.modules} dictionary).
 \end{methoddesc}

 The following are the equivalents of \method{r_import()},
 \method{r_reload()}, and \method{r_unload()} with access to restricted
 standard I/O streams:

 \begin{methoddesc}{s_import}{modulename\optional{, globals\optional{,
                              locals\optional{, fromlist}}}}
 Import the module \var{modulename}, raising an \exception{ImportError}
 exception if the module is considered unsafe.
 \end{methoddesc}

 \begin{methoddesc}{s_reload}{module}
 Reload the module object \var{module}, re-parsing and re-initializing it.
 \end{methoddesc}

 \begin{methoddesc}{s_unload}{module}
 Unload the module object \var{module}.
 % XXX what are the semantics of this?
 \end{methoddesc}


 \subsection{Defining restricted environments \label{rexec-extension}}

 The \class{RExec} class has the following class attributes, which are
 used by the \method{__init__()} method.  Changing them on an existing
 instance won't have any effect; instead, create a subclass of
 \class{RExec} and assign them new values in the class definition.
 Instances of the new class will then use those new values.  All these
 attributes are tuples of strings, except \code{ok_file_types}, which
 is a tuple of integer constants.

 \begin{memberdesc}{nok_builtin_names}
 Contains the names of built-in functions which will \emph{not} be
 available to programs running in the restricted environment.  The
 value for \class{RExec} is \code{('open', 'reload', '__import__')}.
 (This gives the exceptions, because by far the majority of built-in
 functions are harmless.  A subclass that wants to override this
 variable should probably start with the value from the base class and
 concatenate additional forbidden functions --- when new dangerous
 built-in functions are added to Python, they will also be added to
 this module.)
 \end{memberdesc}
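
 The advice above, to start from the base class value and concatenate,
 might look like the following sketch.  \class{BaseRExec} is a stand-in
 that carries only the default tuple quoted above (with the real module
 it would be \class{rexec.RExec}), and the extra forbidden names are an
 arbitrary choice made for illustration.

```python
class BaseRExec(object):
    # Stand-in for rexec.RExec, holding the default value documented above.
    nok_builtin_names = ('open', 'reload', '__import__')

class StricterRExec(BaseRExec):
    # Extend, rather than replace, the inherited tuple so that any name
    # the base class forbids stays forbidden in the subclass.
    # 'raw_input' and 'input' are illustrative additions only.
    nok_builtin_names = BaseRExec.nok_builtin_names + ('raw_input', 'input')
```

 Building the tuple from the base class attribute means the subclass
 picks up any names added to the base class later.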

 \begin{memberdesc}{ok_builtin_modules}
 Contains the names of built-in modules which can be safely imported.
 The value for \class{RExec} is \code{('audioop', 'array', 'binascii',
 'cmath', 'errno', 'imageop', 'marshal', 'math', 'md5', 'operator',
 'parser', 'regex', 'select', 'sha', '_sre', 'strop',
 'struct', 'time')}.  A similar remark about overriding this variable
 applies --- use the value from the base class as a starting point.
 \end{memberdesc}

 \begin{memberdesc}{ok_path}
 Contains the directories which will be searched when an \keyword{import}
 is performed in the restricted environment.
 The value for \class{RExec} is the same as \code{sys.path} (at the time
 the module is loaded) for unrestricted code.
 \end{memberdesc}

 \begin{memberdesc}{ok_posix_names}
 % Should this be called ok_os_names?
 Contains the names of the functions in the \refmodule{os} module which will be
 available to programs running in the restricted environment.  The
 value for \class{RExec} is \code{('error', 'fstat', 'listdir',
 'lstat', 'readlink', 'stat', 'times', 'uname', 'getpid', 'getppid',
 'getcwd', 'getuid', 'getgid', 'geteuid', 'getegid')}.
 \end{memberdesc}

 \begin{memberdesc}{ok_sys_names}
 Contains the names of the functions and variables in the \refmodule{sys}
 module which will be available to programs running in the restricted
 environment.  The value for \class{RExec} is \code{('ps1', 'ps2',
 'copyright', 'version', 'platform', 'exit', 'maxint')}.
 \end{memberdesc}

 \begin{memberdesc}{ok_file_types}
 Contains the file types from which modules are allowed to be loaded.
 Each file type is an integer constant defined in the \refmodule{imp} module.
 The meaningful values are \constant{PY_SOURCE}, \constant{PY_COMPILED}, and
 \constant{C_EXTENSION}.  The value for \class{RExec} is \code{(C_EXTENSION,
 PY_SOURCE)}.  Adding \constant{PY_COMPILED} in subclasses is not recommended;
 an attacker could exit the restricted execution mode by putting a forged
 byte-compiled file (\file{.pyc}) anywhere in your file system, for example
 by writing it to \file{/tmp} or uploading it to the \file{/incoming}
 directory of your public FTP server.
 \end{memberdesc}


 \subsection{An example}

 Let us say that we want a slightly more relaxed policy than the
 standard \class{RExec} class.  For example, if we're willing to allow
 files in \file{/tmp} to be written, we can subclass the \class{RExec}
 class:

 \begin{verbatim}
 import rexec
 import string

 class TmpWriterRExec(rexec.RExec):
     def r_open(self, file, mode='r', buf=-1):
         if mode in ('r', 'rb'):
             # Reading is allowed anywhere, as in the default policy.
             pass
         elif mode in ('w', 'wb', 'a', 'ab'):
             # Writing is allowed only for filenames beginning with /tmp/ ...
             if file[:5] != '/tmp/':
                 raise IOError, "can't write outside /tmp"
             # ... and containing no '..' component that could escape it.
             elif (string.find(file, '/../') >= 0 or
                   file[:3] == '../' or file[-3:] == '/..'):
                 raise IOError, "'..' in filename forbidden"
         else:
             raise IOError, "Illegal open() mode"
         return open(file, mode, buf)
 \end{verbatim}
 %
 Note that the above code will occasionally forbid a perfectly valid
 filename; for example, code in the restricted environment won't be
 able to write to a file called \file{/tmp/foo/../bar}.  To fix this, the
 \method{r_open()} method would have to simplify the filename to
 \file{/tmp/bar}, which would require splitting apart the filename and
 performing various operations on it.  In cases where security is at
 stake, it may be preferable to write simple code which is sometimes
 overly restrictive, instead of more general code that is also more
 complex and may harbor a subtle security hole.
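
 For illustration only, the simplification mentioned above could be
 sketched with \function{posixpath.normpath()}.  The helper name
 \function{is_under_tmp()} is hypothetical and is not part of
 \module{rexec}.  The sketch works on the text of the filename alone, so
 it does not resolve symbolic links and should not be taken as a
 complete check.

```python
import posixpath

def is_under_tmp(filename):
    """Return True if filename, once simplified, lies strictly inside /tmp."""
    # normpath() collapses 'foo/..', '.', and repeated slashes textually;
    # it never consults the file system, so symbolic links are not resolved.
    simplified = posixpath.normpath(filename)
    return simplified.startswith('/tmp/')
```

 With this helper, \file{/tmp/foo/../bar} is accepted because it
 simplifies to \file{/tmp/bar}, while \file{/tmp/../etc/passwd} is
 rejected because it simplifies to \file{/etc/passwd}.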