\section{\module{marshal} ---
         Internal Python object serialization}

\declaremodule{builtin}{marshal}
\modulesynopsis{Convert Python objects to streams of bytes and back
                (with different constraints).}


This module contains functions that can read and write Python
values in a binary format.  The format is specific to Python, but
independent of machine architecture issues (e.g., you can write a
Python value to a file on a PC, transport the file to a Sun, and read
it back there).  Details of the format are undocumented on purpose;
it may change between Python versions (although it rarely
does).\footnote{The name of this module stems from a bit of
  terminology used by the designers of Modula-3 (amongst others), who
  use the term ``marshalling'' for shipping of data around in a
  self-contained form.  Strictly speaking, ``to marshal'' means to
  convert some data from internal to external form (in an RPC buffer for
  instance) and ``to unmarshal'' means the reverse process.}

This is not a general ``persistence'' module.  For general persistence
and transfer of Python objects through RPC calls, see the modules
\refmodule{pickle} and \refmodule{shelve}.  The \module{marshal} module exists
mainly to support reading and writing the ``pseudo-compiled'' code for
Python modules stored in \file{.pyc} files.  Therefore, the Python
maintainers reserve the right to modify the marshal format in
backward-incompatible ways should the need arise.  If you're serializing and
de-serializing Python objects, use the \module{pickle} module instead:
the performance is comparable, version independence is guaranteed,
and \module{pickle} supports a substantially wider range of objects than
\module{marshal}.
\refstmodindex{pickle}
\refstmodindex{shelve}
\obindex{code}

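As a rough illustration of the \file{.pyc} use case described above, the
following sketch marshals a code object and then executes the unmarshalled
copy.  It assumes a current Python interpreter, where \function{dumps()}
returns a bytes object; the source text and the \code{<sketch>} file name
are invented for the example.

```python
import marshal

# Compile a tiny, made-up module body into a code object.
code = compile("answer = 6 * 7", "<sketch>", "exec")

# Serialize the code object and read it back.
data = marshal.dumps(code)
restored = marshal.loads(data)

# Run the restored code object in a fresh namespace.
namespace = {}
exec(restored, namespace)
print(namespace["answer"])
```

The marshalled data is only meaningful to a compatible Python version, so
this is not a substitute for \module{pickle} when storing ordinary data.
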
\begin{notice}[warning]
The \module{marshal} module is not intended to be secure against
erroneous or maliciously constructed data.  Never unmarshal data
received from an untrusted or unauthenticated source.
\end{notice}

Not all Python object types are supported; in general, only objects
whose value is independent of a particular invocation of Python can
be written and read by this module.  The following types are supported:
\code{None}, integers, long integers, floating point numbers,
strings, Unicode objects, tuples, lists, sets, dictionaries, and code
objects.  Tuples, lists, sets and dictionaries are only supported as
long as the values contained therein are themselves supported.
Recursive lists and dictionaries should not be written (they will
cause infinite loops).

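One way to see the rule about contained values is to probe
\function{dumps()} and catch the resulting \exception{ValueError}.  The
sketch below assumes a current Python interpreter; \code{can_marshal} is a
hypothetical helper written for this example, not part of the module.

```python
import marshal

def can_marshal(value):
    """Hypothetical helper: True if marshal.dumps() accepts the value."""
    try:
        marshal.dumps(value)
    except ValueError:
        return False
    return True

# Containers holding only supported values are accepted.
supported = can_marshal({"numbers": [1, 2.5], "pair": (None, "text")})

# A plain object() has no marshal representation, so the list is rejected.
unsupported = can_marshal([1, object()])

print(supported, unsupported)
```
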
\begin{notice}[warning]
Some unsupported types, such as subclasses of builtins, will appear to marshal
and unmarshal correctly, but in fact their type will change and the
additional subclass functionality and instance attributes will be lost.
\end{notice}

\strong{Caveat:} On machines where C's \code{long int} type has more than
32 bits (such as the DEC Alpha), it is possible to create plain Python
integers that are longer than 32 bits.
If such an integer is marshalled and read back in on a machine where
C's \code{long int} type has only 32 bits, a Python long integer object
is returned instead.  While of a different type, the numeric value is
the same.  (This behavior is new in Python 2.2.  In earlier versions,
all but the least-significant 32 bits of the value were lost, and a
warning message was printed.)

There are functions that read and write files as well as functions
that operate on strings.

The module defines these functions:

\begin{funcdesc}{dump}{value, file\optional{, version}}
  Write the value to the open file.  The value must be a supported
  type.  The file must be an open file object such as
  \code{sys.stdout} or one returned by \function{open()} or
  \function{os.popen()}.  It must be opened in binary mode
  (\code{'wb'} or \code{'w+b'}).

  If the value has (or contains an object that has) an unsupported type,
  a \exception{ValueError} exception is raised, but garbage data
  will also be written to the file.  The object will not be properly
  read back by \function{load()}.

  \versionadded[The \var{version} argument indicates the data
  format that \code{dump} should use (see below)]{2.4}
\end{funcdesc}

\begin{funcdesc}{load}{file}
  Read one value from the open file and return it.  If no valid value
  is read, raise \exception{EOFError}, \exception{ValueError} or
  \exception{TypeError}.  The file must be an open file object opened
  in binary mode (\code{'rb'} or \code{'r+b'}).

  \warning{If an object containing an unsupported type was
  marshalled with \function{dump()}, \function{load()} will substitute
  \code{None} for the unmarshallable type.}
\end{funcdesc}

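A minimal sketch of a file round trip with \function{dump()} and
\function{load()} follows.  It assumes a current Python interpreter; the
sample value and the temporary file (created with \module{tempfile}) are
placeholders chosen for the example.

```python
import marshal
import os
import tempfile

value = {"name": "example", "sizes": [1, 2, 3], "ratio": 0.5}

# Hypothetical scratch file; any file object opened in binary mode will do.
fd, path = tempfile.mkstemp(suffix=".marshal")
os.close(fd)
try:
    # Write one value in binary mode ...
    with open(path, "wb") as f:
        marshal.dump(value, f)
    # ... and read that one value back, also in binary mode.
    with open(path, "rb") as f:
        loaded = marshal.load(f)
finally:
    os.remove(path)

print(loaded == value)
```
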
\begin{funcdesc}{dumps}{value\optional{, version}}
  Return the string that would be written to a file by
  \code{dump(\var{value}, \var{file})}.  The value must be a supported
  type.  Raise a \exception{ValueError} exception if \var{value} has (or
  contains an object that has) an unsupported type.

  \versionadded[The \var{version} argument indicates the data
  format that \code{dumps} should use (see below)]{2.4}
\end{funcdesc}

\begin{funcdesc}{loads}{string}
  Convert the string to a value.  If no valid value is found, raise
  \exception{EOFError}, \exception{ValueError} or
  \exception{TypeError}.  Extra characters in the string are ignored.
\end{funcdesc}

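The in-memory pair \function{dumps()} and \function{loads()} can be
sketched the same way.  This example assumes a current Python interpreter,
where the serialized form is a bytes object rather than a string; the
trailing bytes are arbitrary and only show that extra data is ignored.

```python
import marshal

value = (1, "two", [3.0, None])

# Serialize to an in-memory byte string and convert it back.
data = marshal.dumps(value)
restored = marshal.loads(data)

# Anything after the first complete value is ignored by loads().
restored_with_extra = marshal.loads(data + b"trailing bytes")

print(restored == value, restored_with_extra == value)
```
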
In addition, the following constant is defined:

\begin{datadesc}{version}
  Indicates the format that the module uses.  Version 0 is the
  historical format, version 1 (added in Python 2.4) shares interned
  strings, and version 2 (added in Python 2.5) uses a binary format for
  floating point numbers.  The current version is 2.

  \versionadded{2.4}
\end{datadesc}
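
The optional \var{version} argument of \function{dumps()} can be exercised
as in the sketch below.  It assumes a current Python interpreter, on which
\code{marshal.version} may be higher than the value 2 described above; the
sample value is arbitrary.

```python
import marshal

value = [1, 1.5, "text"]

# Serialize once in the historical format and once in the newest format
# that this interpreter supports.
historical = marshal.dumps(value, 0)
newest = marshal.dumps(value, marshal.version)

# Both byte strings should load back to an equal value.
from_historical = marshal.loads(historical)
from_newest = marshal.loads(newest)

print(marshal.version, from_historical == value, from_newest == value)
```
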