\section{Standard Module \sectcode{rexec}}
\stmodindex{rexec}
\renewcommand{\indexsubitem}{(in module rexec)}

This module contains the \code{RExec} class, which supports
\code{r_exec()}, \code{r_eval()}, \code{r_execfile()}, and
\code{r_import()} methods, which are restricted versions of the
\code{exec} statement, the built-in functions \code{eval()} and
\code{execfile()}, and the \code{import} statement.
Code executed in this restricted environment will
only have access to modules and functions that are deemed safe; you
can subclass \code{RExec} to add or remove capabilities as desired.

\emph{Note:} The \code{RExec} class can prevent code from performing
unsafe operations like reading or writing disk files, or using TCP/IP
sockets.  However, it does not protect against code using extremely
large amounts of memory or CPU time.

\begin{funcdesc}{RExec}{\optional{hooks\optional{\, verbose}}}
Returns an instance of the \code{RExec} class.

\var{hooks} is an instance of the \code{RHooks} class or a subclass of it.
If it is omitted or \code{None}, the default \code{RHooks} class is
instantiated.
Whenever the \code{rexec} module searches for a module (even a built-in one)
or reads a module's code, it doesn't actually go out to the file
system itself.  Rather, it calls methods of an \code{RHooks} instance that
was passed to or created by its constructor.  (Actually, the \code{RExec}
object doesn't make these calls; they are made by a module loader
object that's part of the \code{RExec} object.  This allows another level of
flexibility, e.g. using packages.)

By providing an alternate \code{RHooks} object, we can control the
file system accesses made to import a module, without changing the
actual algorithm that controls the order in which those accesses are
made.  For instance, we could substitute an \code{RHooks} object that passes
all file system requests to a file server elsewhere, via some RPC
mechanism such as ILU.  Grail's applet loader uses this to support
importing applets from a URL for a directory.

If \var{verbose} is true, additional debugging output may be sent to
standard output.
\end{funcdesc}

The \code{RExec} class has the following class attributes, which are used by the
\code{__init__} method.  Changing them on an existing instance won't
have any effect; instead, create a subclass of \code{RExec} and assign
them new values in the class definition.  Instances of the new class
will then use those new values.  All these attributes are tuples of
strings.

\renewcommand{\indexsubitem}{(RExec object attribute)}
\begin{datadesc}{nok_builtin_names}
Contains the names of built-in functions which will \emph{not} be
available to programs running in the restricted environment.  The
value for \code{RExec} is \code{('open',} \code{'reload',}
\code{'__import__')}.  (This gives the exceptions, because by far the
majority of built-in functions are harmless.  A subclass that wants to
override this variable should probably start with the value from the
base class and concatenate additional forbidden functions.  That way,
when new dangerous built-in functions are added to Python and to this
module's list, the subclass forbids them as well.)
\end{datadesc}
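
The advice about extending the base-class value can be sketched as
follows.  To keep the sketch self-contained, \code{BaseRExec} is a
hypothetical stand-in that carries only the tuple quoted above; real
code would subclass \code{rexec.RExec} and refer to
\code{rexec.RExec.nok_builtin_names} instead.  The extra names
\code{'eval'} and \code{'compile'} are chosen purely for illustration.

```python
class BaseRExec:
    """Hypothetical stand-in for rexec.RExec (an assumption, not the real class)."""
    nok_builtin_names = ('open', 'reload', '__import__')


class StricterRExec(BaseRExec):
    # Extend the inherited tuple instead of writing a fresh one, so that
    # names added to the base class later remain forbidden here as well.
    nok_builtin_names = BaseRExec.nok_builtin_names + ('eval', 'compile')
```

Because the attribute is a tuple, the concatenation builds a new tuple
for the subclass and leaves the base class value untouched.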

\begin{datadesc}{ok_builtin_modules}
Contains the names of built-in modules which can be safely imported.
The value for \code{RExec} is \code{('audioop',} \code{'array',}
\code{'binascii',} \code{'cmath',} \code{'errno',} \code{'imageop',}
\code{'marshal',} \code{'math',} \code{'md5',} \code{'operator',}
\code{'parser',} \code{'regex',} \code{'rotor',} \code{'select',}
\code{'strop',} \code{'struct',} \code{'time')}.  A similar remark
about overriding this variable applies: use the value from the base
class as a starting point.
\end{datadesc}

\begin{datadesc}{ok_path}
Contains the directories which will be searched when an \code{import}
is performed in the restricted environment.
The value for \code{RExec} is the same as \code{sys.path} (at the time
the module is loaded) for unrestricted code.
\end{datadesc}

\begin{datadesc}{ok_posix_names}
% Should this be called ok_os_names?
Contains the names of the functions in the \code{os} module which will be
available to programs running in the restricted environment.  The
value for \code{RExec} is \code{('error',} \code{'fstat',}
\code{'listdir',} \code{'lstat',} \code{'readlink',} \code{'stat',}
\code{'times',} \code{'uname',} \code{'getpid',} \code{'getppid',}
\code{'getcwd',} \code{'getuid',} \code{'getgid',} \code{'geteuid',}
\code{'getegid')}.
\end{datadesc}

\begin{datadesc}{ok_sys_names}
Contains the names of the functions and variables in the \code{sys}
module which will be available to programs running in the restricted
environment.  The value for \code{RExec} is \code{('ps1',}
\code{'ps2',} \code{'copyright',} \code{'version',} \code{'platform',}
\code{'exit',} \code{'maxint')}.
\end{datadesc}

\code{RExec} instances support the following methods:
\renewcommand{\indexsubitem}{(RExec object method)}

\begin{funcdesc}{r_eval}{code}
\var{code} must either be a string containing a Python expression, or
a compiled code object, which will be evaluated in the restricted
environment's \code{__main__} module.  The value of the expression or
code object will be returned.
\end{funcdesc}

\begin{funcdesc}{r_exec}{code}
\var{code} must either be a string containing one or more lines of
Python code, or a compiled code object, which will be executed in the
restricted environment's \code{__main__} module.
\end{funcdesc}

\begin{funcdesc}{r_execfile}{filename}
Execute the Python code contained in the file \var{filename} in the
restricted environment's \code{__main__} module.
\end{funcdesc}

Methods whose names begin with \code{s_} are similar to the methods
beginning with \code{r_}, but the code will be granted access to
restricted versions of the standard I/O streams \code{sys.stdin},
\code{sys.stderr}, and \code{sys.stdout}.

\begin{funcdesc}{s_eval}{code}
\var{code} must be a string containing a Python expression, which will
be evaluated in the restricted environment.
\end{funcdesc}

\begin{funcdesc}{s_exec}{code}
\var{code} must be a string containing one or more lines of Python code,
which will be executed in the restricted environment.
\end{funcdesc}

\begin{funcdesc}{s_execfile}{filename}
Execute the Python code contained in the file \var{filename} in the
restricted environment.
\end{funcdesc}

\code{RExec} objects must also support various methods which will be
implicitly called by code executing in the restricted environment.
Overriding these methods in a subclass changes the policies
enforced by a restricted environment.

\begin{funcdesc}{r_import}{modulename\optional{\, globals\, locals\, fromlist}}
Import the module \var{modulename}, raising an \code{ImportError}
exception if the module is considered unsafe.
\end{funcdesc}

\begin{funcdesc}{r_open}{filename\optional{\, mode\optional{\, bufsize}}}
Method called when \code{open()} is called in the restricted
environment.  The arguments are identical to those of \code{open()},
and a file object (or a class instance compatible with file objects)
should be returned.  \code{RExec}'s default behaviour is to allow opening
any file for reading, but to forbid any attempt to write a file.  See
the example below for an implementation of a less restrictive
\code{r_open()}.
\end{funcdesc}
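
The default policy described for \code{r_open()} can be pictured with
the standalone sketch below.  It is not the module's source:
\code{read_only_open} is a hypothetical helper, and both the error
message and the assumption that only the modes \code{'r'} and
\code{'rb'} count as reading are made up for this illustration.

```python
def read_only_open(filename, mode='r', bufsize=-1):
    """Hypothetical helper: permit reading, refuse every other mode."""
    # Only the plain read modes pass; 'w', 'a', 'r+' and the like could
    # modify a file, so they are rejected before open() is ever called.
    if mode not in ('r', 'rb'):
        raise IOError("can't open files for writing in restricted mode")
    return open(filename, mode, bufsize)
```

Checking the mode against a short list of permitted values, rather
than a list of forbidden ones, means an unfamiliar mode string is
refused by default.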

\begin{funcdesc}{r_reload}{module}
Reload the module object \var{module}, re-parsing and re-initializing it.
\end{funcdesc}

\begin{funcdesc}{r_unload}{module}
Unload the module object \var{module} (i.e., remove it from the
restricted environment's \code{sys.modules} dictionary).
\end{funcdesc}

And their equivalents with access to restricted standard I/O streams:

\begin{funcdesc}{s_import}{modulename\optional{\, globals\, locals\, fromlist}}
Import the module \var{modulename}, raising an \code{ImportError}
exception if the module is considered unsafe.
\end{funcdesc}

\begin{funcdesc}{s_reload}{module}
Reload the module object \var{module}, re-parsing and re-initializing it.
\end{funcdesc}

\begin{funcdesc}{s_unload}{module}
Unload the module object \var{module}.
% XXX what are the semantics of this?
\end{funcdesc}

\subsection{An example}

Let us say that we want a slightly more relaxed policy than the
standard \code{RExec} class.  For example, if we're willing to allow files in
\file{/tmp} to be written, we can subclass the \code{RExec} class:

\bcode\begin{verbatim}
import rexec, string

class TmpWriterRExec(rexec.RExec):
    def r_open(self, file, mode='r', buf=-1):
        if mode in ('r', 'rb'):
            # reading is always allowed
            pass
        elif mode in ('w', 'wb', 'a', 'ab'):
            # check filename : must begin with /tmp/
            if file[:5]!='/tmp/':
                raise IOError, "can't write outside /tmp"
            elif (string.find(file, '/../') >= 0 or
                  file[:3] == '../' or file[-3:] == '/..'):
                raise IOError, "'..' in filename forbidden"
        else: raise IOError, "Illegal open() mode"
        return open(file, mode, buf)
\end{verbatim}\ecode

Notice that the above code will occasionally forbid a perfectly valid
filename; for example, code in the restricted environment won't be
able to open a file called \file{/tmp/foo/../bar}.  To fix this, the
\code{r_open} method would have to simplify the filename to
\file{/tmp/bar}, which would require splitting apart the filename and
performing various operations on it.  In cases where security is at
stake, it may be preferable to write simple code which is sometimes
overly restrictive, instead of more general code that is also more
complex and may harbor a subtle security hole.
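
For comparison, here is a minimal sketch of what the more general
check might look like if the filename is simplified first.  The helper
\code{tmp_write_allowed} is hypothetical and not part of
\code{rexec}; it relies on \code{posixpath.normpath()}, which
rewrites the path purely as text.  It therefore does not follow
symbolic links, so a link inside \file{/tmp} that points elsewhere
would still pass, which is exactly the kind of subtle hole the
paragraph above warns about.

```python
import posixpath


def tmp_write_allowed(filename):
    """Return true if filename, once simplified, lies inside /tmp.

    Hypothetical helper for illustration only; not part of rexec.
    """
    # normpath collapses '.', '..' and repeated slashes without touching
    # the file system, so symbolic links are NOT resolved here.
    simplified = posixpath.normpath(filename)
    return simplified[:5] == '/tmp/'
```

With this check, \file{/tmp/foo/../bar} is accepted because it
simplifies to \file{/tmp/bar}, while \file{/tmp/../etc/passwd} is
rejected because it simplifies to \file{/etc/passwd}.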