:mod:`pickle` --- Python object serialization
=============================================

.. index::
   single: persistence
   pair: persistent; objects
   pair: serializing; objects
   pair: marshalling; objects
   pair: flattening; objects
   pair: pickling; objects

.. module:: pickle
   :synopsis: Convert Python objects to streams of bytes and back.
.. sectionauthor:: Jim Kerr <jbkerr@sr.hp.com>
.. sectionauthor:: Barry Warsaw <barry@zope.com>


The :mod:`pickle` module implements a fundamental but powerful algorithm for
serializing and de-serializing a Python object structure. "Pickling" is the
process whereby a Python object hierarchy is converted into a byte stream, and
"unpickling" is the inverse operation, whereby a byte stream is converted back
into an object hierarchy. Pickling (and unpickling) is alternatively known as
"serialization", "marshalling", [#]_ or "flattening"; however, to avoid
confusion, the terms used here are "pickling" and "unpickling".

.. warning::

   The :mod:`pickle` module is not intended to be secure against erroneous or
   maliciously constructed data. Never unpickle data received from an untrusted
   or unauthenticated source.


Relationship to other Python modules
------------------------------------

The :mod:`pickle` module has a transparent optimizer (:mod:`_pickle`) written
in C. It is used whenever available; otherwise, the pure Python implementation
is used.

Python has a more primitive serialization module called :mod:`marshal`, but in
general :mod:`pickle` should always be the preferred way to serialize Python
objects. :mod:`marshal` exists primarily to support Python's :file:`.pyc`
files.

The :mod:`pickle` module differs from :mod:`marshal` in several significant ways:

* The :mod:`pickle` module keeps track of the objects it has already serialized,
  so that later references to the same object won't be serialized again.
  :mod:`marshal` doesn't do this.

  This has implications both for recursive objects and object sharing. Recursive
  objects are objects that contain references to themselves. These are not
  handled by marshal, and in fact, attempting to marshal recursive objects will
  crash your Python interpreter. Object sharing happens when there are multiple
  references to the same object in different places in the object hierarchy being
  serialized. :mod:`pickle` stores such objects only once, and ensures that all
  other references point to the master copy. Shared objects remain shared, which
  can be very important for mutable objects.

* :mod:`marshal` cannot be used to serialize user-defined classes and their
  instances. :mod:`pickle` can save and restore class instances transparently;
  however, the class definition must be importable and live in the same module as
  when the object was stored.

* The :mod:`marshal` serialization format is not guaranteed to be portable
  across Python versions. Because its primary job in life is to support
  :file:`.pyc` files, the Python implementers reserve the right to change the
  serialization format in non-backwards compatible ways should the need arise.
  The :mod:`pickle` serialization format is guaranteed to be backwards compatible
  across Python releases.

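The handling of shared and recursive objects described above can be checked
directly. The following is a minimal sketch; the variable names are
illustrative only. It pickles a dictionary that refers to the same list twice,
and a list that contains itself, and then inspects the identity of the restored
objects:

```python
import pickle

# Two references to one mutable list (object sharing).
shared = [1, 2, 3]
container = {"a": shared, "b": shared}

# A list that contains a reference to itself (a recursive object).
recursive = []
recursive.append(recursive)

restored = pickle.loads(pickle.dumps(container))
restored_recursive = pickle.loads(pickle.dumps(recursive))

# The shared list is stored once, so both keys still refer to one object.
still_shared = restored["a"] is restored["b"]

# The self-reference survives the round trip as well.
still_recursive = restored_recursive[0] is restored_recursive
```

Because the list is stored only once, appending to ``restored["a"]`` is also
visible through ``restored["b"]``, just as it was before pickling.
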
Note that serialization is a more primitive notion than persistence; although
:mod:`pickle` reads and writes file objects, it does not handle the issue of
naming persistent objects, nor the (even more complicated) issue of concurrent
access to persistent objects. The :mod:`pickle` module can transform a complex
object into a byte stream and it can transform the byte stream into an object
with the same internal structure. Perhaps the most obvious thing to do with
these byte streams is to write them to a file, but it is also conceivable to
send them across a network or store them in a database. The module
:mod:`shelve` provides a simple interface to pickle and unpickle objects on
DBM-style database files.


Data stream format
------------------

.. index::
   single: XDR
   single: External Data Representation

The data format used by :mod:`pickle` is Python-specific. This has the
advantage that there are no restrictions imposed by external standards such as
XDR (which can't represent pointer sharing); however, it means that non-Python
programs may not be able to reconstruct pickled Python objects.

By default, the :mod:`pickle` data format uses a compact binary representation.
The module :mod:`pickletools` contains tools for analyzing data streams
generated by :mod:`pickle`.

There are currently four protocols that can be used for pickling:

* Protocol version 0 is the original human-readable protocol and is
  backwards compatible with earlier versions of Python.

* Protocol version 1 is the old binary format which is also compatible with
  earlier versions of Python.

* Protocol version 2 was introduced in Python 2.3. It provides much more
  efficient pickling of :term:`new-style class`\es.

* Protocol version 3 was added in Python 3.0. It has explicit support for
  bytes and cannot be unpickled by Python 2.x pickle modules. This is
  the currently recommended protocol; use it whenever possible.

Refer to :pep:`307` for information about improvements brought by
protocol 2. See the source code of :mod:`pickletools` for extensive
comments about opcodes used by pickle protocols.


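To see how the protocols differ in practice, the same object can be pickled
once per protocol and the results compared. The following is a small sketch
rather than a benchmark; the sample object and the ``sizes`` dictionary are
illustrative assumptions. It relies only on :func:`pickle.dumps`,
:func:`pickle.loads`, ``pickle.HIGHEST_PROTOCOL`` and :func:`pickletools.dis`:

```python
import pickle
import pickletools

data = {"spam": [1, 2, 3]}  # illustrative payload

# Pickle the same object with every protocol this interpreter supports.
sizes = {}
for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    payload = pickle.dumps(data, protocol)
    sizes[protocol] = len(payload)
    # No protocol argument is needed when loading; it is detected.
    assert pickle.loads(payload) == data

# Protocol 0 is the human-readable one. Disassembling it prints the
# opcodes that make up the stream, one per line.
text_pickle = pickle.dumps(data, 0)
pickletools.dis(text_pickle)
```

The exact sizes depend on the Python version, so they are collected here rather
than stated.
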
Module Interface
----------------

To serialize an object hierarchy, you first create a pickler, then you call the
pickler's :meth:`dump` method. To de-serialize a data stream, you first create
an unpickler, then you call the unpickler's :meth:`load` method. The
:mod:`pickle` module provides the following constants:


.. data:: HIGHEST_PROTOCOL

   The highest protocol version available. This value can be passed as a
   *protocol* value.

.. data:: DEFAULT_PROTOCOL

   The default protocol used for pickling. May be less than
   :data:`HIGHEST_PROTOCOL`. Currently the default protocol is 3, a
   backward-incompatible protocol designed for Python 3.0.


The :mod:`pickle` module provides the following functions to make the pickling
process more convenient:

.. function:: dump(obj, file, protocol=None, \*, fix_imports=True)

   Write a pickled representation of *obj* to the open :term:`file object` *file*.
   This is equivalent to ``Pickler(file, protocol).dump(obj)``.

   The optional *protocol* argument tells the pickler to use the given protocol;
   supported protocols are 0, 1, 2, 3. The default protocol is 3, a
   backward-incompatible protocol designed for Python 3.0.

   Specifying a negative protocol version selects the highest protocol version
   supported. The higher the protocol used, the more recent the version of
   Python needed to read the pickle produced.

   The *file* argument must have a write() method that accepts a single bytes
   argument. It can thus be an on-disk file opened for binary writing, a
   :class:`io.BytesIO` instance, or any other custom object that meets this
   interface.

   If *fix_imports* is True and *protocol* is less than 3, pickle will try to
   map the new Python 3.x names to the old module names used in Python 2.x,
   so that the pickle data stream is readable with Python 2.x.

.. function:: dumps(obj, protocol=None, \*, fix_imports=True)

   Return the pickled representation of the object as a :class:`bytes`
   object, instead of writing it to a file.

   The optional *protocol* argument tells the pickler to use the given protocol;
   supported protocols are 0, 1, 2, 3. The default protocol is 3, a
   backward-incompatible protocol designed for Python 3.0.

   Specifying a negative protocol version selects the highest protocol version
   supported. The higher the protocol used, the more recent the version of
   Python needed to read the pickle produced.

   If *fix_imports* is True and *protocol* is less than 3, pickle will try to
   map the new Python 3.x names to the old module names used in Python 2.x,
   so that the pickle data stream is readable with Python 2.x.

.. function:: load(file, \*, fix_imports=True, encoding="ASCII", errors="strict")

   Read a pickled object representation from the open :term:`file object` *file*
   and return the reconstituted object hierarchy specified therein. This is
   equivalent to ``Unpickler(file).load()``.

   The protocol version of the pickle is detected automatically, so no protocol
   argument is needed. Bytes past the pickled object's representation are
   ignored.

   The argument *file* must have two methods, a read() method that takes an
   integer argument, and a readline() method that requires no arguments. Both
   methods should return bytes. Thus *file* can be an on-disk file opened
   for binary reading, a :class:`io.BytesIO` object, or any other custom object
   that meets this interface.

   Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
   which are used to control compatibility support for pickle streams generated
   by Python 2.x. If *fix_imports* is True, pickle will try to map the old
   Python 2.x names to the new names used in Python 3.x. The *encoding* and
   *errors* tell pickle how to decode 8-bit string instances pickled by Python
   2.x; these default to 'ASCII' and 'strict', respectively.

.. function:: loads(bytes_object, \*, fix_imports=True, encoding="ASCII", errors="strict")

   Read a pickled object hierarchy from a :class:`bytes` object and return the
   reconstituted object hierarchy specified therein.

   The protocol version of the pickle is detected automatically, so no protocol
   argument is needed. Bytes past the pickled object's representation are
   ignored.

   Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
   which are used to control compatibility support for pickle streams generated
   by Python 2.x. If *fix_imports* is True, pickle will try to map the old
   Python 2.x names to the new names used in Python 3.x. The *encoding* and
   *errors* tell pickle how to decode 8-bit string instances pickled by Python
   2.x; these default to 'ASCII' and 'strict', respectively.


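The four functions above can be exercised with an in-memory file. The
following is a minimal sketch; the ``record`` dictionary is made-up sample
data. It uses :class:`io.BytesIO` as the binary file object, which provides the
``write()``, ``read()`` and ``readline()`` methods the functions require:

```python
import io
import pickle

record = {"name": "spam", "values": (1, 2.5, None), "raw": b"\x00\x01"}

# dump() and load() work on any binary file-like object.
buffer = io.BytesIO()
pickle.dump(record, buffer, pickle.HIGHEST_PROTOCOL)
buffer.seek(0)  # rewind before reading the pickle back
from_file = pickle.load(buffer)

# dumps() and loads() do the same through a bytes object.
payload = pickle.dumps(record, protocol=2)
from_bytes = pickle.loads(payload)

# Bytes past the end of the pickled representation are ignored.
with_trailer = pickle.loads(payload + b"trailing bytes")
```

An on-disk file works the same way, provided it is opened in binary mode
(``"wb"`` for writing, ``"rb"`` for reading).
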
The :mod:`pickle` module defines three exceptions:

.. exception:: PickleError

   Common base class for the other pickling exceptions. It inherits from
   :exc:`Exception`.

.. exception:: PicklingError

   Error raised when an unpicklable object is encountered by :class:`Pickler`.
   It inherits from :exc:`PickleError`.

   Refer to :ref:`pickle-picklable` to learn what kinds of objects can be
   pickled.

.. exception:: UnpicklingError

   Error raised when there is a problem unpickling an object, such as data
   corruption or a security violation. It inherits from :exc:`PickleError`.

   Note that other exceptions may also be raised during unpickling, including
   (but not necessarily limited to) AttributeError, EOFError, ImportError, and
   IndexError.


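Since unpickling can fail with exceptions other than
:exc:`UnpicklingError`, code that handles damaged input usually catches a
group of them. The following sketch shows one way to do that; ``safe_loads``
is a hypothetical helper, not part of the module, and the set of exceptions it
catches is taken from the list above rather than being exhaustive. The exact
exception raised when pickling fails can vary between Python versions, so the
last part catches both :exc:`PicklingError` and :exc:`AttributeError`:

```python
import pickle


def safe_loads(data):
    """Return (object, None) on success or (None, exception) on failure."""
    try:
        return pickle.loads(data), None
    except (pickle.UnpicklingError, AttributeError, EOFError,
            ImportError, IndexError) as exc:
        return None, exc


good = pickle.dumps({"answer": 42})
value, error = safe_loads(good)

# A byte that is not a valid pickle opcode.
_, bad_key_error = safe_loads(b"\x00")

# No data at all.
_, empty_error = safe_loads(b"")

# A lambda has no importable name, so it cannot be pickled by reference.
try:
    pickle.dumps(lambda: None)
    pickling_error = None
except (pickle.PicklingError, AttributeError) as exc:
    pickling_error = exc
```

Catching these exceptions guards against corrupted data only. It does not make
it safe to unpickle data from an untrusted source.
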
The :mod:`pickle` module exports two classes, :class:`Pickler` and
:class:`Unpickler`:

.. class:: Pickler(file, protocol=None, \*, fix_imports=True)

   This takes a binary file for writing a pickle data stream.

   The optional *protocol* argument tells the pickler to use the given protocol;
   supported protocols are 0, 1, 2, 3. The default protocol is 3, a
   backward-incompatible protocol designed for Python 3.0.

   Specifying a negative protocol version selects the highest protocol version
   supported. The higher the protocol used, the more recent the version of
   Python needed to read the pickle produced.

   The *file* argument must have a write() method that accepts a single bytes
   argument. It can thus be an on-disk file opened for binary writing, a
   :class:`io.BytesIO` instance, or any other custom object that meets this
   interface.

   If *fix_imports* is True and *protocol* is less than 3, pickle will try to
   map the new Python 3.x names to the old module names used in Python 2.x,
   so that the pickle data stream is readable with Python 2.x.

   .. method:: dump(obj)

      Write a pickled representation of *obj* to the open file object given in
      the constructor.

   .. method:: persistent_id(obj)

      Do nothing by default. This exists so a subclass can override it.

      If :meth:`persistent_id` returns ``None``, *obj* is pickled as usual. Any
      other value causes :class:`Pickler` to emit the returned value as a
      persistent ID for *obj*. The meaning of this persistent ID should be
      defined by :meth:`Unpickler.persistent_load`. Note that the value
      returned by :meth:`persistent_id` cannot itself have a persistent ID.

      See :ref:`pickle-persistent` for details and examples of uses.

   .. attribute:: fast

      Deprecated. Enable fast mode if set to a true value. The fast mode
      disables the usage of memo, therefore speeding the pickling process by not
      generating superfluous PUT opcodes. It should not be used with
      self-referential objects; doing otherwise will cause :class:`Pickler` to
      recurse infinitely.

      Use :func:`pickletools.optimize` if you need more compact pickles.


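The :meth:`persistent_id` hook only becomes useful together with
:meth:`Unpickler.persistent_load`. The following sketch illustrates the
pairing under a few assumptions: ``Record``, ``DATABASE``, ``DBPickler`` and
``DBUnpickler`` are hypothetical names, the "database" is a plain dictionary,
and the default binary protocol is used so that the persistent ID may be a
tuple. Records are not stored in the pickle; only a key is, and the unpickler
resolves that key back to the existing object:

```python
import io
import pickle


class Record:
    """Hypothetical object that lives in an external store."""

    def __init__(self, key):
        self.key = key


# Stand-in for an external database, keyed by record key.
DATABASE = {"alpha": Record("alpha")}


class DBPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Emit a persistent ID for records; pickle everything else as usual.
        if isinstance(obj, Record):
            return ("Record", obj.key)
        return None


class DBUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Resolve the persistent ID back to the stored object.
        tag, key = pid
        if tag == "Record" and key in DATABASE:
            return DATABASE[key]
        raise pickle.UnpicklingError("unsupported persistent object: %r" % (pid,))


buffer = io.BytesIO()
DBPickler(buffer).dump(["header", DATABASE["alpha"]])
buffer.seek(0)
restored = DBUnpickler(buffer).load()
```

After loading, ``restored[1]`` is the very object held in ``DATABASE``, not a
copy, because the pickle contained only its persistent ID.
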
.. class:: Unpickler(file, \*, fix_imports=True, encoding="ASCII", errors="strict")

   This takes a binary file for reading a pickle data stream.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed.

   The argument *file* must have two methods, a read() method that takes an
   integer argument, and a readline() method that requires no arguments. Both
   methods should return bytes. Thus *file* can be an on-disk file object opened
   for binary reading, a :class:`io.BytesIO` object, or any other custom object
   that meets this interface.

   Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
   which are used to control compatibility support for pickle streams generated
   by Python 2.x. If *fix_imports* is True, pickle will try to map the old
   Python 2.x names to the new names used in Python 3.x. The *encoding* and
   *errors* tell pickle how to decode 8-bit string instances pickled by Python
   2.x; these default to 'ASCII' and 'strict', respectively.

   .. method:: load()

      Read a pickled object representation from the open file object given in
      the constructor, and return the reconstituted object hierarchy specified
      therein. Bytes past the pickled object's representation are ignored.

   .. method:: persistent_load(pid)

      Raise an :exc:`UnpicklingError` by default.

      If defined, :meth:`persistent_load` should return the object specified by
      the persistent ID *pid*. If an invalid persistent ID is encountered, an
      :exc:`UnpicklingError` should be raised.

      See :ref:`pickle-persistent` for details and examples of uses.

   .. method:: find_class(module, name)

      Import *module* if necessary and return the object called *name* from it,
      where the *module* and *name* arguments are :class:`str` objects. Note
      that, despite its name, :meth:`find_class` is also used for finding
      functions.

      Subclasses may override this to gain control over what type of objects and
      how they can be loaded, potentially reducing security risks. Refer to
      :ref:`pickle-restrict` for details.


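Overriding :meth:`Unpickler.find_class` is the usual way to limit which
globals a pickle may refer to. The following is a sketch of the idea, not a
complete security measure; ``RestrictedUnpickler``, ``restricted_loads`` and
the ``SAFE_BUILTINS`` whitelist are illustrative names and choices. It assumes
the pickle was written with protocol 3 or later, so built-ins are recorded
under the module name ``builtins``:

```python
import builtins
import io
import pickle

# Illustrative whitelist of built-in names that may be loaded.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Allow only whitelisted names from the builtins module.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Refuse everything else.
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))


def restricted_loads(data):
    """Like pickle.loads(), but only whitelisted globals are resolved."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


allowed = restricted_loads(pickle.dumps([1, 2, range(15)]))

try:
    restricted_loads(pickle.dumps(len))
    rejected = None
except pickle.UnpicklingError as exc:
    rejected = exc
```

Plain containers and numbers never reach :meth:`find_class`, so they load
normally; only references to globals such as ``range`` or ``len`` are checked.
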
.. _pickle-picklable:

What can be pickled and unpickled?
----------------------------------

The following types can be pickled:

* ``None``, ``True``, and ``False``

* integers, floating point numbers, complex numbers

* strings, bytes, bytearrays

* tuples, lists, sets, and dictionaries containing only picklable objects

* functions defined at the top level of a module

* built-in functions defined at the top level of a module

* classes that are defined at the top level of a module

* instances of such classes whose :attr:`__dict__` or the result of calling
  :meth:`__getstate__` is picklable (see section :ref:`pickle-inst` for details)

Attempts to pickle unpicklable objects will raise the :exc:`PicklingError`
exception; when this happens, an unspecified number of bytes may have already
been written to the underlying file. Trying to pickle a highly recursive data
structure may exceed the maximum recursion depth; a :exc:`RuntimeError` will be
raised in this case. You can carefully raise this limit with
:func:`sys.setrecursionlimit`.

Note that functions (built-in and user-defined) are pickled by "fully qualified"
name reference, not by value. This means that only the function name is
pickled, along with the name of the module the function is defined in. Neither
the function's code, nor any of its function attributes are pickled. Thus the
defining module must be importable in the unpickling environment, and the module
must contain the named object, otherwise an exception will be raised. [#]_

Similarly, classes are pickled by named reference, so the same restrictions in
the unpickling environment apply. Note that none of the class's code or data is
pickled, so in the following example the class attribute ``attr`` is not
restored in the unpickling environment::

   class Foo:
       attr = 'A class attribute'

   picklestring = pickle.dumps(Foo)

These restrictions are why picklable functions and classes must be defined in
the top level of a module.

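Pickling by reference can be observed with objects from the standard library.
In the following sketch (the variable names are illustrative), the pickles of
a built-in function and of a class contain the object's name rather than its
code, and unpickling returns the very same object by importing it again:

```python
import collections
import pickle

function_pickle = pickle.dumps(len)
class_pickle = pickle.dumps(collections.OrderedDict)

# Only the names are stored in the byte stream.
function_name_stored = b"len" in function_pickle
class_name_stored = b"OrderedDict" in class_pickle

# Unpickling looks the names up again instead of rebuilding the objects.
same_function = pickle.loads(function_pickle) is len
same_class = pickle.loads(class_pickle) is collections.OrderedDict
```

This is also why the result depends on the unpickling environment: if the
module or the name were missing there, the lookup would fail.
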
Similarly, when class instances are pickled, their class's code and data are not
pickled along with them. Only the instance data are pickled. This is done on
purpose, so you can fix bugs in a class or add methods to the class and still
load objects that were created with an earlier version of the class. If you
plan to have long-lived objects that will see many versions of a class, it may
be worthwhile to put a version number in the objects so that suitable
conversions can be made by the class's :meth:`__setstate__` method.


.. _pickle-inst:

Pickling Class Instances
------------------------

In this section, we describe the general mechanisms available to you to define,
customize, and control how class instances are pickled and unpickled.

In most cases, no additional code is needed to make instances picklable. By
default, pickle will retrieve the class and the attributes of an instance via
introspection. When a class instance is unpickled, its :meth:`__init__` method
is usually *not* invoked. The default behaviour first creates an uninitialized
instance and then restores the saved attributes. The following code shows an
implementation of this behaviour::

   def save(obj):
       return (obj.__class__, obj.__dict__)

   def load(cls, attributes):
       obj = cls.__new__(cls)
       obj.__dict__.update(attributes)
       return obj

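The ``save`` and ``load`` functions above can be tried on a small class to
confirm that :meth:`__init__` is bypassed when the instance is rebuilt. In this
sketch, ``Point`` and its ``init_calls`` counter are hypothetical additions
used only to make the effect visible:

```python
class Point:
    init_calls = 0  # counts how often __init__ runs

    def __init__(self, x, y):
        Point.init_calls += 1
        self.x = x
        self.y = y


def save(obj):
    return (obj.__class__, obj.__dict__)


def load(cls, attributes):
    # Create an uninitialized instance, then restore its attributes.
    obj = cls.__new__(cls)
    obj.__dict__.update(attributes)
    return obj


original = Point(1, 2)
clone = load(*save(original))
```

After this runs, ``Point.init_calls`` is still 1: only the explicit
``Point(1, 2)`` call went through :meth:`__init__`, while ``clone`` received
its attributes directly.
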
Classes can alter the default behaviour by providing one or several special
methods:

.. method:: object.__getnewargs__()

   In protocol 2 and newer, classes that implement the :meth:`__getnewargs__`
   method can dictate the values passed to the :meth:`__new__` method upon
   unpickling. This is often needed for classes whose :meth:`__new__` method
   requires arguments.


.. method:: object.__getstate__()

   Classes can further influence how their instances are pickled; if the class
   defines the method :meth:`__getstate__`, it is called and the returned object
   is pickled as the contents for the instance, instead of the contents of the
   instance's dictionary. If the :meth:`__getstate__` method is absent, the
   instance's :attr:`__dict__` is pickled as usual.


.. method:: object.__setstate__(state)

   Upon unpickling, if the class defines :meth:`__setstate__`, it is called with
   the unpickled state. In that case, there is no requirement for the state
   object to be a dictionary. Otherwise, the pickled state must be a dictionary
   and its items are assigned to the new instance's dictionary.

   .. note::

      If :meth:`__getstate__` returns a false value, the :meth:`__setstate__`
      method will not be called upon unpickling.


Refer to the section :ref:`pickle-state` for more information about how to use
the methods :meth:`__getstate__` and :meth:`__setstate__`.

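As a brief illustration of :meth:`__getstate__` and :meth:`__setstate__`
working together, the following sketch drops a transient attribute before
pickling and recreates it afterwards. ``Tally``, its ``_cache`` attribute and
the ``version`` entry are hypothetical; the version number follows the
suggestion made earlier for long-lived objects. The example assumes it is run
as a script or module, so that the class can be found by name when unpickling:

```python
import pickle


class Tally:
    """Hypothetical class with one attribute that should not be pickled."""

    def __init__(self, name):
        self.name = name
        self.count = 0
        self._cache = {"expensive": "value"}  # transient, rebuilt on load

    def __getstate__(self):
        # Copy the instance dictionary so the live object is not modified.
        state = self.__dict__.copy()
        del state["_cache"]
        state["version"] = 1
        return state

    def __setstate__(self, state):
        state = dict(state)
        state.pop("version", None)  # a real class could convert old versions here
        self.__dict__.update(state)
        self._cache = {}  # start with an empty cache after unpickling


tally = Tally("visits")
tally.count = 3
restored = pickle.loads(pickle.dumps(tally))
```

The restored object keeps ``name`` and ``count`` but starts with an empty
``_cache``, and :meth:`__init__` is not called during unpickling.
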
Benjamin Petersond23f8222009-04-05 19:13:16 +0000466.. note::
Georg Brandle720c0a2009-04-27 16:20:50 +0000467
Benjamin Petersond23f8222009-04-05 19:13:16 +0000468 At unpickling time, some methods like :meth:`__getattr__`,
469 :meth:`__getattribute__`, or :meth:`__setattr__` may be called upon the
Georg Brandlc8148262010-10-17 11:13:37 +0000470 instance. In case those methods rely on some internal invariant being true,
471 the type should implement :meth:`__getnewargs__` to establish such an
472 invariant; otherwise, neither :meth:`__new__` nor :meth:`__init__` will be
473 called.
Benjamin Petersond23f8222009-04-05 19:13:16 +0000474
Georg Brandlc8148262010-10-17 11:13:37 +0000475.. index:: pair: copy; protocol
Christian Heimes05e8be12008-02-23 18:30:17 +0000476
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000477As we shall see, pickle does not use directly the methods described above. In
478fact, these methods are part of the copy protocol which implements the
479:meth:`__reduce__` special method. The copy protocol provides a unified
480interface for retrieving the data necessary for pickling and copying
Georg Brandl48310cd2009-01-03 21:18:54 +0000481objects. [#]_
Georg Brandl116aa622007-08-15 14:28:22 +0000482
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000483Although powerful, implementing :meth:`__reduce__` directly in your classes is
484error prone. For this reason, class designers should use the high-level
485interface (i.e., :meth:`__getnewargs__`, :meth:`__getstate__` and
Georg Brandlc8148262010-10-17 11:13:37 +0000486:meth:`__setstate__`) whenever possible. We will show, however, cases where
487using :meth:`__reduce__` is the only option or leads to more efficient pickling
488or both.
Georg Brandl116aa622007-08-15 14:28:22 +0000489
.. method:: object.__reduce__()

   The interface is currently defined as follows. The :meth:`__reduce__` method
   takes no argument and shall return either a string or preferably a tuple (the
   returned object is often referred to as the "reduce value").

   If a string is returned, the string should be interpreted as the name of a
   global variable. It should be the object's local name relative to its
   module; the pickle module searches the module namespace to determine the
   object's module. This behaviour is typically useful for singletons.

   When a tuple is returned, it must be between two and five items long.
   Optional items can either be omitted, or ``None`` can be provided as their
   value. The semantics of each item are, in order:

   .. XXX Mention __newobj__ special-case?

   * A callable object that will be called to create the initial version of the
     object.

   * A tuple of arguments for the callable object. An empty tuple must be given
     if the callable does not accept any argument.

   * Optionally, the object's state, which will be passed to the object's
     :meth:`__setstate__` method as previously described. If the object has no
     such method, then the value must be a dictionary and it will be added to
     the object's :attr:`__dict__` attribute.

   * Optionally, an iterator (and not a sequence) yielding successive items.
     These items will be appended to the object either using
     ``obj.append(item)`` or, in batch, using ``obj.extend(list_of_items)``.
     This is primarily used for list subclasses, but may be used by other
     classes as long as they have :meth:`append` and :meth:`extend` methods with
     the appropriate signature. (Whether :meth:`append` or :meth:`extend` is
     used depends on which pickle protocol version is used as well as the number
     of items to append, so both must be supported.)

   * Optionally, an iterator (not a sequence) yielding successive key-value
     pairs. These items will be stored to the object using ``obj[key] =
     value``. This is primarily used for dictionary subclasses, but may be used
     by other classes as long as they implement :meth:`__setitem__`.


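The tuple form of the reduce value can be illustrated with a minimal sketch.
The ``Interval`` class and its attributes are hypothetical and exist only for
this illustration; the reduce value uses the first three items described above
(callable, argument tuple, state dictionary).

```python
import pickle


class Interval:
    """Hypothetical class used only for illustration."""

    def __init__(self, low, high):
        self.low = low
        self.high = high
        self.label = None  # extra attribute set after construction

    def __reduce__(self):
        # Item 1: the callable that recreates the object.
        # Item 2: the argument tuple passed to that callable.
        # Item 3: the state; Interval defines no __setstate__, so this
        #         dictionary is added to the new object's __dict__.
        return (Interval, (self.low, self.high), {'label': self.label})


original = Interval(1, 5)
original.label = 'working hours'

clone = pickle.loads(pickle.dumps(original))
print(clone.low, clone.high, clone.label)  # 1 5 working hours
```

Because the constructor arguments travel in the second item, the unpickler
calls ``Interval(1, 5)`` first and only then applies the state dictionary.
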
.. method:: object.__reduce_ex__(protocol)

   Alternatively, a :meth:`__reduce_ex__` method may be defined. The only
   difference is that this method takes a single integer argument, the protocol
   version. When defined, pickle will prefer it over the :meth:`__reduce__`
   method. In addition, :meth:`__reduce__` automatically becomes a synonym for
   the extended version. The main use for this method is to provide
   backwards-compatible reduce values for older Python releases.

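As a rough sketch of a protocol-dependent reduce value, the hypothetical
``Version`` class below returns a different callable for protocols older than
2. The class, the ``version_from_string`` helper and the assumption that old
consumers only understand a string-based constructor are all invented for this
illustration.

```python
import pickle


def version_from_string(text):
    """Hypothetical legacy constructor that parses 'major.minor'."""
    major, minor = text.split('.')
    return Version(int(major), int(minor))


class Version:
    """Hypothetical class used only for illustration."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def __reduce_ex__(self, protocol):
        if protocol < 2:
            # Assumed requirement: consumers of old-protocol pickles
            # only know the string-based constructor.
            return (version_from_string,
                    ('%d.%d' % (self.major, self.minor),))
        return (Version, (self.major, self.minor))


# Every protocol round-trips, whichever reduce value was chosen.
for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    clone = pickle.loads(pickle.dumps(Version(3, 2), protocol))
    assert (clone.major, clone.minor) == (3, 2)
```
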
.. _pickle-persistent:

Persistence of External Objects
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. index::
   single: persistent_id (pickle protocol)
   single: persistent_load (pickle protocol)

For the benefit of object persistence, the :mod:`pickle` module supports the
notion of a reference to an object outside the pickled data stream. Such
objects are referenced by a persistent ID, which should be either a string of
alphanumeric characters (for protocol 0) [#]_ or just an arbitrary object (for
any newer protocol).

The resolution of such persistent IDs is not defined by the :mod:`pickle`
module; it will delegate this resolution to the user-defined methods on the
pickler and unpickler, :meth:`persistent_id` and :meth:`persistent_load`
respectively.

To pickle objects that have an external persistent ID, the pickler must have a
custom :meth:`persistent_id` method that takes an object as an argument and
returns either ``None`` or the persistent ID for that object. When ``None`` is
returned, the pickler simply pickles the object as normal. When a persistent ID
is returned, the pickler will pickle that ID in place of the object, along with
a marker so that the unpickler will recognize it as a persistent ID.

To unpickle external objects, the unpickler must have a custom
:meth:`persistent_load` method that takes a persistent ID object and returns the
referenced object.

Here is a comprehensive example showing how persistent IDs can be used to
pickle external objects by reference.

.. literalinclude:: ../includes/dbpickle.py


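For a smaller, self-contained illustration of the same mechanism, the sketch
below replaces the database with a plain dictionary. The ``Record`` class, the
``STORE`` dictionary and the ``('Record', key)`` ID layout are assumptions made
for this example, not part of the :mod:`pickle` API.

```python
import io
import pickle


class Record:
    """Hypothetical object that lives in an external store."""

    def __init__(self, key, payload):
        self.key = key
        self.payload = payload


# Hypothetical stand-in for a database, indexed by record key.
STORE = {'r1': Record('r1', 'first'), 'r2': Record('r2', 'second')}


class StorePickler(pickle.Pickler):

    def persistent_id(self, obj):
        # Records are written as a reference; everything else
        # (None is returned) is pickled as usual.
        if isinstance(obj, Record):
            return ('Record', obj.key)
        return None


class StoreUnpickler(pickle.Unpickler):

    def persistent_load(self, pid):
        # Resolve the reference written by StorePickler.persistent_id().
        kind, key = pid
        if kind == 'Record':
            return STORE[key]
        raise pickle.UnpicklingError(
            'unsupported persistent object: %r' % (pid,))


buffer = io.BytesIO()
StorePickler(buffer).dump([STORE['r1'], 'plain string', STORE['r2']])
buffer.seek(0)
restored = StoreUnpickler(buffer).load()
```

The restored list holds the very objects kept in ``STORE`` rather than copies,
since only their IDs were written to the stream.
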
.. _pickle-state:

Handling Stateful Objects
^^^^^^^^^^^^^^^^^^^^^^^^^

.. index::
   single: __getstate__() (copy protocol)
   single: __setstate__() (copy protocol)

Here's an example that shows how to modify pickling behavior for a class.
The :class:`TextReader` class opens a text file, and returns the line number and
line contents each time its :meth:`readline` method is called. If a
:class:`TextReader` instance is pickled, all attributes *except* the file object
member are saved. When the instance is unpickled, the file is reopened, and
reading resumes from the last location. The :meth:`__setstate__` and
:meth:`__getstate__` methods are used to implement this behavior. ::

   class TextReader:
       """Print and number lines in a text file."""

       def __init__(self, filename):
           self.filename = filename
           self.file = open(filename)
           self.lineno = 0

       def readline(self):
           self.lineno += 1
           line = self.file.readline()
           if not line:
               return None
           if line.endswith('\n'):
               line = line[:-1]
           return "%i: %s" % (self.lineno, line)

       def __getstate__(self):
           # Copy the object's state from self.__dict__ which contains
           # all our instance attributes. Always use the dict.copy()
           # method to avoid modifying the original state.
           state = self.__dict__.copy()
           # Remove the unpicklable entries.
           del state['file']
           return state

       def __setstate__(self, state):
           # Restore instance attributes (i.e., filename and lineno).
           self.__dict__.update(state)
           # Restore the previously opened file's state. To do so, we need to
           # reopen it and read from it until the line count is restored.
           file = open(self.filename)
           for _ in range(self.lineno):
               file.readline()
           # Finally, save the file.
           self.file = file


A sample usage might be something like this::

   >>> reader = TextReader("hello.txt")
   >>> reader.readline()
   '1: Hello world!'
   >>> reader.readline()
   '2: I am line number two.'
   >>> new_reader = pickle.loads(pickle.dumps(reader))
   >>> new_reader.readline()
   '3: Goodbye!'


.. _pickle-restrict:

Restricting Globals
-------------------

.. index::
   single: find_class() (pickle protocol)

By default, unpickling will import any class or function that it finds in the
pickle data. For many applications, this behaviour is unacceptable as it
permits the unpickler to import and invoke arbitrary code. Just consider what
this hand-crafted pickle data stream does when loaded::

   >>> import pickle
   >>> pickle.loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
   hello world
   0

In this example, the unpickler imports the :func:`os.system` function and then
calls it with the string argument "echo hello world". Although this example is
inoffensive, it is not difficult to imagine one that could damage your system.

For this reason, you may want to control what gets unpickled by customizing
:meth:`Unpickler.find_class`. Contrary to what its name suggests,
:meth:`find_class` is called whenever a global (i.e., a class or a function) is
requested. Thus it is possible to either forbid globals completely or restrict
them to a safe subset.

Here is an example of an unpickler allowing only a few safe classes from the
:mod:`builtins` module to be loaded::

   import builtins
   import io
   import pickle

   safe_builtins = {
       'range',
       'complex',
       'set',
       'frozenset',
       'slice',
   }

   class RestrictedUnpickler(pickle.Unpickler):

       def find_class(self, module, name):
           # Only allow safe classes from builtins.
           if module == "builtins" and name in safe_builtins:
               return getattr(builtins, name)
           # Forbid everything else.
           raise pickle.UnpicklingError("global '%s.%s' is forbidden" %
                                        (module, name))

   def restricted_loads(s):
       """Helper function analogous to pickle.loads()."""
       return RestrictedUnpickler(io.BytesIO(s)).load()

A sample usage of our unpickler working as intended::

   >>> restricted_loads(pickle.dumps([1, 2, range(15)]))
   [1, 2, range(0, 15)]
   >>> restricted_loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
   Traceback (most recent call last):
     ...
   pickle.UnpicklingError: global 'os.system' is forbidden
   >>> restricted_loads(b'cbuiltins\neval\n'
   ...                  b'(S\'getattr(__import__("os"), "system")'
   ...                  b'("echo hello world")\'\ntR.')
   Traceback (most recent call last):
     ...
   pickle.UnpicklingError: global 'builtins.eval' is forbidden


.. XXX Add note about how extension codes could evade our protection
   mechanism (e.g. cached classes do not invoke find_class()).

As our examples show, you have to be careful with what you allow to be
unpickled. Therefore, if security is a concern, you may want to consider
alternatives such as the marshalling API in :mod:`xmlrpc.client` or
third-party solutions.


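As a rough sketch of the :mod:`xmlrpc.client` alternative mentioned above, the
module-level :func:`xmlrpc.client.dumps` and :func:`xmlrpc.client.loads`
functions can round-trip simple data (numbers, strings, booleans, lists and
dictionaries with string keys) without importing arbitrary globals. The
``payload`` value is made up for this illustration.

```python
import xmlrpc.client

# Only basic types are supported; arbitrary classes are not.
payload = {'name': 'example', 'values': [1, 2.5, 'three'], 'ok': True}

# dumps() expects a tuple of parameters and returns an XML string.
xml_text = xmlrpc.client.dumps((payload,))

# loads() returns the parameter tuple and the method name
# (None here, because no method name was given to dumps()).
params, method_name = xmlrpc.client.loads(xml_text)
restored = params[0]
```

The trade-off is a much narrower set of supported types than pickle offers.
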
.. _pickle-example:

Examples
--------

For the simplest code, use the :func:`dump` and :func:`load` functions. ::

   import pickle

   # An arbitrary collection of objects supported by pickle.
   data = {
       'a': [1, 2.0, 3, 4+6j],
       'b': ("character string", b"byte string"),
       'c': set([None, True, False])
   }

   with open('data.pickle', 'wb') as f:
       # Pickle the 'data' dictionary using the highest protocol available.
       pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)


The following example reads the resulting pickled data. ::

   import pickle

   with open('data.pickle', 'rb') as f:
       # The protocol version used is detected automatically, so we do not
       # have to specify it.
       data = pickle.load(f)


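When the size of the pickled data matters, one possible approach is to remove
unused memo opcodes with :func:`pickletools.optimize` and then compress the
result with the :mod:`gzip` module. The sketch below works in memory; the
sample ``data`` value is made up for this illustration, and the actual savings
depend heavily on the data being pickled.

```python
import gzip
import pickle
import pickletools

# Made-up sample data with some repetition.
data = {'numbers': list(range(1000)), 'words': ['spam', 'eggs'] * 100}

raw = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)

# Drop memo entries that are never referenced again.
optimized = pickletools.optimize(raw)

# Compress the optimized pickle; it must be decompressed before loading.
compressed = gzip.compress(optimized)

restored = pickle.loads(gzip.decompress(compressed))
print(len(raw), len(optimized), len(compressed))
```
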
.. XXX: Add examples showing how to optimize pickles for size (like using
.. pickletools.optimize() or the gzip module).


.. seealso::

   Module :mod:`copyreg`
      Pickle interface constructor registration for extension types.

   Module :mod:`pickletools`
      Tools for working with and analyzing pickled data.

   Module :mod:`shelve`
      Indexed databases of objects; uses :mod:`pickle`.

   Module :mod:`copy`
      Shallow and deep object copying.

   Module :mod:`marshal`
      High-performance serialization of built-in types.


.. rubric:: Footnotes

.. [#] Don't confuse this with the :mod:`marshal` module.

.. [#] The exception raised will likely be an :exc:`ImportError` or an
   :exc:`AttributeError` but it could be something else.

.. [#] The :mod:`copy` module uses this protocol for shallow and deep copying
   operations.

.. [#] The limitation on alphanumeric characters is due to the fact that
   persistent IDs, in protocol 0, are delimited by the newline
   character. Therefore, if any kind of newline character occurs in a
   persistent ID, the resulting pickle will become unreadable.