:mod:`marshal` --- Internal Python object serialization
=======================================================

.. module:: marshal
   :synopsis: Convert Python objects to streams of bytes and back (with different
              constraints).


This module contains functions that can read and write Python values in a binary
format. The format is specific to Python, but independent of machine
architecture issues (e.g., you can write a Python value to a file on a PC,
transport the file to a Sun, and read it back there). Details of the format are
undocumented on purpose; it may change between Python versions (although it
rarely does). [#]_

.. index::
   module: pickle
   module: shelve
   object: code

This is not a general "persistence" module. For general persistence and
transfer of Python objects through RPC calls, see the modules :mod:`pickle` and
:mod:`shelve`. The :mod:`marshal` module exists mainly to support reading and
writing the "pseudo-compiled" code for Python modules in :file:`.pyc` files.
Therefore, the Python maintainers reserve the right to modify the marshal format
in backward-incompatible ways should the need arise. If you're serializing and
de-serializing Python objects, use the :mod:`pickle` module instead -- the
performance is comparable, version independence is guaranteed, and pickle
supports a substantially wider range of objects than marshal.

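As a rough illustration of the module's main use, the following sketch
marshals a code object and runs the restored copy. The source string and the
``<example>`` filename are made up for this example. Real :file:`.pyc` files
also carry a header in front of the marshalled data, which is not shown here.

```python
import marshal

# Hypothetical one-line module source, compiled to a code object.
source = "answer = 6 * 7\n"
code = compile(source, "<example>", "exec")

# Serialize the code object and restore it.
data = marshal.dumps(code)
restored = marshal.loads(data)

# Running the restored code object populates the namespace.
namespace = {}
exec(restored, namespace)
```

After this runs, ``namespace["answer"]`` holds the value computed by the
restored code. The data is only meant to be read back by the same Python
version that wrote it.
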

.. warning::

   The :mod:`marshal` module is not intended to be secure against erroneous or
   maliciously constructed data. Never unmarshal data received from an
   untrusted or unauthenticated source.

Not all Python object types are supported; in general, only objects whose value
is independent of a particular invocation of Python can be written and read by
this module. The following types are supported: booleans, integers, long
integers, floating point numbers, complex numbers, strings, Unicode objects,
tuples, lists, sets, frozensets, dictionaries, and code objects. Tuples, lists,
sets, frozensets and dictionaries are supported only as long as the values they
contain are themselves supported. Recursive lists, sets and dictionaries should
not be written (they will cause infinite loops). The singletons :const:`None`,
:const:`Ellipsis` and :exc:`StopIteration` can also be marshalled and
unmarshalled.

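The rule about containers can be tried out with a small sketch. The sample
dictionary is invented for illustration, and a bare ``object()`` stands in for
"a value tied to one interpreter run"; the sketch assumes :func:`dumps` rejects
it with :exc:`ValueError`, as described for unsupported types.

```python
import marshal

# Nested containers are fine as long as every contained value is supported.
value = {
    "name": "spam",
    "sizes": (1, 2.5, 3 + 4j),
    "tags": [None, True, frozenset([1, 2])],
}
restored = marshal.loads(marshal.dumps(value))

# An arbitrary instance has no value independent of this process,
# so marshalling it is refused.
try:
    marshal.dumps(object())
    rejected = False
except ValueError:
    rejected = True
```

Here ``restored`` compares equal to ``value``, while the attempt to marshal
``object()`` is refused.
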

.. warning::

   On machines where C's ``long int`` type has more than 32 bits (such as the
   DEC Alpha), it is possible to create plain Python integers that are longer
   than 32 bits. If such an integer is marshalled and read back in on a machine
   where C's ``long int`` type has only 32 bits, a Python long integer object
   is returned instead. While of a different type, the numeric value is the
   same. (This behavior is new in Python 2.2. In earlier versions, all but the
   least-significant 32 bits of the value were lost, and a warning message was
   printed.)

There are functions that read/write files as well as functions operating on
strings.

The module defines these functions:


.. function:: dump(value, file[, version])

   Write the value on the open file. The value must be a supported type. The
   file must be an open file object such as ``sys.stdout`` or one returned by
   :func:`open` or :func:`os.popen`. It may not be a wrapper such as
   ``TemporaryFile`` on Windows. It must be opened in binary mode (``'wb'``
   or ``'w+b'``).

   If the value has (or contains an object that has) an unsupported type, a
   :exc:`ValueError` exception is raised, but garbage data will also be written
   to the file. The object will not be properly read back by :func:`load`.

   .. versionadded:: 2.4
      The *version* argument indicates the data format that ``dump`` should use
      (see below).


.. function:: load(file)

   Read one value from the open file and return it. If no valid value is read
   (e.g. because the data has a different Python version's incompatible marshal
   format), raise :exc:`EOFError`, :exc:`ValueError` or :exc:`TypeError`. The
   file must be an open file object opened in binary mode (``'rb'`` or
   ``'r+b'``).

   .. note::

      If an object containing an unsupported type was marshalled with :func:`dump`,
      :func:`load` will substitute ``None`` for the unmarshallable type.


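A minimal sketch of the file-based pair, using an ordinary file opened in
binary mode as required above. The scratch directory, the file name
``example.marshal`` and the sample value are hypothetical.

```python
import marshal
import os
import tempfile

value = {"answer": 42, "items": [1, 2, 3]}

# Hypothetical scratch location; any path to an ordinary file works.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "example.marshal")

# Write with a real file object opened in binary mode ('wb').
with open(path, "wb") as f:
    marshal.dump(value, f)

# Read the value back, again in binary mode ('rb').
with open(path, "rb") as f:
    restored = marshal.load(f)

# Clean up the scratch file and directory.
os.remove(path)
os.rmdir(workdir)
```

The sketch uses :func:`open` directly instead of a temporary-file wrapper, in
line with the restriction on wrapper objects noted for :func:`dump`.
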

.. function:: dumps(value[, version])

   Return the string that would be written to a file by ``dump(value, file)``. The
   value must be a supported type. Raise a :exc:`ValueError` exception if value
   has (or contains an object that has) an unsupported type.

   .. versionadded:: 2.4
      The *version* argument indicates the data format that ``dumps`` should use
      (see below).


.. function:: loads(string)

   Convert the string to a value. If no valid value is found, raise
   :exc:`EOFError`, :exc:`ValueError` or :exc:`TypeError`. Extra characters in the
   string are ignored.


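The in-memory pair can be sketched as follows. The sample list is arbitrary,
and the snippet is written so that it also runs on Python versions where
:func:`dumps` returns a bytes object instead of a string.

```python
import marshal

data = marshal.dumps([1, "two", 3.0])

# The serialized form round-trips through loads().
restored = marshal.loads(data)

# Trailing data after the first complete value is ignored.
with_extra = marshal.loads(data + b"ignored trailing bytes")

# Truncated input holds no valid value; loads() is documented to raise
# one of EOFError, ValueError or TypeError in that case.
try:
    marshal.loads(data[:-1])
    failed = False
except (EOFError, ValueError, TypeError):
    failed = True
```

Catching all three exception types together is a deliberate choice: the
documentation does not promise which one a given kind of bad input produces.
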

In addition, the following constant is defined:

.. data:: version

   Indicates the format that the module uses. Version 0 is the historical format,
   version 1 (added in Python 2.4) shares interned strings and version 2 (added in
   Python 2.5) uses a binary format for floating point numbers. The current version
   is 2.

   .. versionadded:: 2.4


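The *version* argument of :func:`dumps` can be exercised with a short sketch.
It assumes the running interpreter still accepts formats 0 and 2 and that its
own ``marshal.version`` may be higher than the value listed above; the float
``1.5`` is an arbitrary sample.

```python
import marshal

# The format version this interpreter writes by default.
current = marshal.version

# Request the historical format (0) and format 2 explicitly.
old_style = marshal.dumps(1.5, 0)
new_style = marshal.dumps(1.5, 2)

# Both encodings decode to the same float, whatever their byte layout.
from_old = marshal.loads(old_style)
from_new = marshal.loads(new_style)
```

Because format 2 stores floats in binary while format 0 does not,
``old_style`` and ``new_style`` are expected to differ byte for byte even
though they decode to equal values.
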

.. rubric:: Footnotes

.. [#] The name of this module stems from a bit of terminology used by the designers of
   Modula-3 (amongst others), who use the term "marshalling" for shipping of data
   around in a self-contained form. Strictly speaking, "to marshal" means to
   convert some data from internal to external form (in an RPC buffer for instance)
   and "to unmarshal" means to perform the reverse process.
