:mod:`pickle` --- Python object serialization
=============================================

.. index::
   single: persistence
   pair: persistent; objects
   pair: serializing; objects
   pair: marshalling; objects
   pair: flattening; objects
   pair: pickling; objects

.. module:: pickle
   :synopsis: Convert Python objects to streams of bytes and back.
.. sectionauthor:: Jim Kerr <jbkerr@sr.hp.com>
.. sectionauthor:: Barry Warsaw <barry@zope.com>

The :mod:`pickle` module implements a fundamental, but powerful algorithm for
serializing and de-serializing a Python object structure. "Pickling" is the
process whereby a Python object hierarchy is converted into a byte stream, and
"unpickling" is the inverse operation, whereby a byte stream is converted back
into an object hierarchy. Pickling (and unpickling) is alternatively known as
"serialization", "marshalling," [#]_ or "flattening"; however, to avoid
confusion, the terms used here are "pickling" and "unpickling".


Relationship to other Python modules
------------------------------------

The :mod:`pickle` module has a transparent optimizer (:mod:`_pickle`) written
in C. It is used whenever available; otherwise, the pure Python implementation
is used.

Python has a more primitive serialization module called :mod:`marshal`, but in
general :mod:`pickle` should always be the preferred way to serialize Python
objects. :mod:`marshal` exists primarily to support Python's :file:`.pyc`
files.

The :mod:`pickle` module differs from :mod:`marshal` in several significant
ways:

* The :mod:`pickle` module keeps track of the objects it has already serialized,
  so that later references to the same object won't be serialized again.
  :mod:`marshal` doesn't do this.

  This has implications both for recursive objects and object sharing. Recursive
  objects are objects that contain references to themselves. These are not
  handled by marshal, and in fact, attempting to marshal recursive objects will
  crash your Python interpreter. Object sharing happens when there are multiple
  references to the same object in different places in the object hierarchy being
  serialized. :mod:`pickle` stores such objects only once, and ensures that all
  other references point to the master copy. Shared objects remain shared, which
  can be very important for mutable objects.

* :mod:`marshal` cannot be used to serialize user-defined classes and their
  instances. :mod:`pickle` can save and restore class instances transparently;
  however, the class definition must be importable and live in the same module
  as when the object was stored.

* The :mod:`marshal` serialization format is not guaranteed to be portable
  across Python versions. Because its primary job in life is to support
  :file:`.pyc` files, the Python implementers reserve the right to change the
  serialization format in non-backwards compatible ways should the need arise.
  The :mod:`pickle` serialization format is guaranteed to be backwards compatible
  across Python releases.
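
As a minimal sketch of the object-sharing behaviour described in the list
above (the variable names are illustrative only), a round trip through
:mod:`pickle` keeps shared and self-referential objects intact:

```python
import pickle

# One list referenced from two places in the same hierarchy.
shared = [1, 2, 3]
container = {"a": shared, "b": shared}

# A list that contains a reference to itself.
recursive = []
recursive.append(recursive)

restored = pickle.loads(pickle.dumps(container))
restored_recursive = pickle.loads(pickle.dumps(recursive))

# Both keys still point at a single list object after unpickling.
shared_preserved = restored["a"] is restored["b"]

# The restored list still contains itself.
cycle_preserved = restored_recursive[0] is restored_recursive
```

Because the two references are restored as one object, a later change made
through ``restored["a"]`` is also visible through ``restored["b"]``.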

.. warning::

   The :mod:`pickle` module is not intended to be secure against erroneous or
   maliciously constructed data. Never unpickle data received from an untrusted
   or unauthenticated source.

Note that serialization is a more primitive notion than persistence; although
:mod:`pickle` reads and writes file objects, it does not handle the issue of
naming persistent objects, nor the (even more complicated) issue of concurrent
access to persistent objects. The :mod:`pickle` module can transform a complex
object into a byte stream and it can transform the byte stream into an object
with the same internal structure. Perhaps the most obvious thing to do with
these byte streams is to write them onto a file, but it is also conceivable to
send them across a network or store them in a database. The module
:mod:`shelve` provides a simple interface to pickle and unpickle objects on
DBM-style database files.


Data stream format
------------------

.. index::
   single: XDR
   single: External Data Representation

The data format used by :mod:`pickle` is Python-specific. This has the
advantage that there are no restrictions imposed by external standards such as
XDR (which can't represent pointer sharing); however, it means that non-Python
programs may not be able to reconstruct pickled Python objects.

By default, the :mod:`pickle` data format uses a compact binary representation.
The module :mod:`pickletools` contains tools for analyzing data streams
generated by :mod:`pickle`.
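
One way to look inside a data stream is :func:`pickletools.genops`, which
yields one ``(opcode, argument, position)`` triple per opcode. The following is
a small sketch rather than a full analysis tool; the sample dictionary and the
choice of protocol 2 are arbitrary:

```python
import pickle
import pickletools

data = pickle.dumps({"answer": 42}, 2)

# Collect the symbolic name of every opcode in the stream, in order.
opcode_names = [opcode.name for opcode, arg, pos in pickletools.genops(data)]

first_opcode = opcode_names[0]   # protocol 2 streams start with PROTO
last_opcode = opcode_names[-1]   # every pickle ends with STOP
```

:func:`pickletools.dis` prints a more detailed, human-readable disassembly of
the same stream.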

There are currently 4 different protocols which can be used for pickling.

* Protocol version 0 is the original ASCII protocol and is backwards compatible
  with earlier versions of Python.

* Protocol version 1 is the old binary format which is also compatible with
  earlier versions of Python.

* Protocol version 2 was introduced in Python 2.3. It provides much more
  efficient pickling of :term:`new-style class`\es.

* Protocol version 3 was added in Python 3.0. It has explicit support for
  bytes and cannot be unpickled by Python 2.x pickle modules. This is
  the current recommended protocol; use it whenever possible.

Refer to :pep:`307` for more information.

If *protocol* is not specified, protocol 3 is used. If *protocol* is
specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest
protocol version available will be used.
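
The effect of the *protocol* argument can be sketched as follows. The sample
list is arbitrary, and the example deliberately makes no assumption about
which protocol a given interpreter uses by default:

```python
import pickle

record = [1, "two", 3.0]

ascii_pickle = pickle.dumps(record, 0)    # original ASCII protocol
binary_pickle = pickle.dumps(record, 2)   # binary protocol from Python 2.3
newest_pickle = pickle.dumps(record, -1)  # negative: highest available

# Protocol 0 output decodes cleanly as ASCII text.
ascii_text = ascii_pickle.decode("ascii")

# The protocol is detected on loading, so no argument is needed here.
results = [
    pickle.loads(p) for p in (ascii_pickle, binary_pickle, newest_pickle)
]
```

All three byte strings unpickle to an equal list; only their encoding differs.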


Usage
-----

To serialize an object hierarchy, you first create a pickler, then you call the
pickler's :meth:`dump` method. To de-serialize a data stream, you first create
an unpickler, then you call the unpickler's :meth:`load` method. The
:mod:`pickle` module provides the following constants:


.. data:: HIGHEST_PROTOCOL

   The highest protocol version available. This value can be passed as a
   *protocol* value.

.. note::

   Be sure to always open pickle files created with protocols >= 1 in binary mode.
   For the old ASCII-based pickle protocol 0 you can use either text mode or binary
   mode as long as you stay consistent.

   A pickle file written with protocol 0 in binary mode will contain lone linefeeds
   as line terminators and therefore will look "funny" when viewed in Notepad or
   other editors which do not support this format.

.. data:: DEFAULT_PROTOCOL

   The default protocol used for pickling. May be less than
   :data:`HIGHEST_PROTOCOL`. Currently the default protocol is 3, a
   backward-incompatible protocol designed for Python 3.0.


The :mod:`pickle` module provides the following functions to make the pickling
process more convenient:

.. function:: dump(obj, file[, protocol])

   Write a pickled representation of *obj* to the open file object *file*. This
   is equivalent to ``Pickler(file, protocol).dump(obj)``.

   The optional *protocol* argument tells the pickler to use the given protocol;
   supported protocols are 0, 1, 2, 3. The default protocol is 3, a
   backward-incompatible protocol designed for Python 3.0.

   Specifying a negative protocol version selects the highest protocol version
   supported. The higher the protocol used, the more recent the version of
   Python needed to read the pickle produced.

   The *file* argument must have a ``write()`` method that accepts a single bytes
   argument. It can thus be a file object opened for binary writing, an
   :class:`io.BytesIO` instance, or any other custom object that meets this
   interface.

.. function:: dumps(obj[, protocol])

   Return the pickled representation of the object as a :class:`bytes`
   object, instead of writing it to a file.

   The optional *protocol* argument tells the pickler to use the given protocol;
   supported protocols are 0, 1, 2, 3. The default protocol is 3, a
   backward-incompatible protocol designed for Python 3.0.

   Specifying a negative protocol version selects the highest protocol version
   supported. The higher the protocol used, the more recent the version of
   Python needed to read the pickle produced.

.. function:: load(file, [\*, encoding="ASCII", errors="strict"])

   Read a pickled object representation from the open file object *file* and
   return the reconstituted object hierarchy specified therein. This is
   equivalent to ``Unpickler(file).load()``.

   The protocol version of the pickle is detected automatically, so no protocol
   argument is needed. Bytes past the pickled object's representation are
   ignored.

   The argument *file* must have two methods, a ``read()`` method that takes an
   integer argument, and a ``readline()`` method that requires no arguments. Both
   methods should return bytes. Thus *file* can be a binary file object opened
   for reading, an :class:`io.BytesIO` object, or any other custom object that
   meets this interface.

   Optional keyword arguments are *encoding* and *errors*, which are used to
   decode 8-bit string instances pickled by Python 2.x. These default to 'ASCII'
   and 'strict', respectively.

.. function:: loads(bytes_object, [\*, encoding="ASCII", errors="strict"])

   Read a pickled object hierarchy from a :class:`bytes` object and return the
   reconstituted object hierarchy specified therein.

   The protocol version of the pickle is detected automatically, so no protocol
   argument is needed. Bytes past the pickled object's representation are
   ignored.

   Optional keyword arguments are *encoding* and *errors*, which are used to
   decode 8-bit string instances pickled by Python 2.x. These default to 'ASCII'
   and 'strict', respectively.
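
The four convenience functions can be exercised together in a short sketch.
The ``config`` dictionary is made up for illustration, and an in-memory
:class:`io.BytesIO` buffer stands in for a file opened in binary mode:

```python
import io
import pickle

config = {"name": "example", "sizes": (1, 2, 3)}

# dump() needs an object with a write() method that accepts bytes.
stream = io.BytesIO()
pickle.dump(config, stream)

# Rewind, then let load() read the pickle back from the same buffer.
stream.seek(0)
from_stream = pickle.load(stream)

# dumps() and loads() do the same work through a bytes object.
payload = pickle.dumps(config)
from_bytes = pickle.loads(payload)

# Bytes after the end of the pickle are ignored by loads().
with_trailing = pickle.loads(payload + b"ignored")
```

With a real file, the same calls apply once the file has been opened with mode
``'wb'`` for writing or ``'rb'`` for reading.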


The :mod:`pickle` module defines three exceptions:

.. exception:: PickleError

   Common base class for the other pickling exceptions. It inherits from
   :exc:`Exception`.

.. exception:: PicklingError

   Error raised when an unpicklable object is encountered by :class:`Pickler`.
   It inherits from :exc:`PickleError`.

.. exception:: UnpicklingError

   Error raised when there is a problem unpickling an object, such as data
   corruption or a security violation. It inherits from :exc:`PickleError`.

   Note that other exceptions may also be raised during unpickling, including
   (but not necessarily limited to) :exc:`AttributeError`, :exc:`EOFError`,
   :exc:`ImportError`, and :exc:`IndexError`.
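
A defensive loader therefore has to catch more than :exc:`UnpicklingError`.
The sketch below uses a hypothetical helper, ``try_loads``, that is not part of
the module; it only illustrates which exception types can surface for
malformed input:

```python
import pickle


def try_loads(data):
    """Return a (status, value) pair instead of letting load errors escape."""
    try:
        return ("ok", pickle.loads(data))
    except pickle.UnpicklingError as exc:
        return ("unpickling-error", str(exc))
    except (AttributeError, EOFError, ImportError, IndexError) as exc:
        # Other exception types listed above can surface as well.
        return (type(exc).__name__, str(exc))


good = try_loads(pickle.dumps([1, 2, 3]))
corrupt = try_loads(b"\x00 not a pickle")   # first byte is not a valid opcode
truncated = try_loads(b"")                  # stream ends before any opcode
```

Catching these exceptions guards against accidental corruption only; it does
not make unpickling untrusted data safe.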


The :mod:`pickle` module exports two classes, :class:`Pickler` and
:class:`Unpickler`:

.. class:: Pickler(file[, protocol])

   This takes a binary file for writing a pickle data stream.

   The optional *protocol* argument tells the pickler to use the given protocol;
   supported protocols are 0, 1, 2, 3. The default protocol is 3, a
   backward-incompatible protocol designed for Python 3.0.

   Specifying a negative protocol version selects the highest protocol version
   supported. The higher the protocol used, the more recent the version of
   Python needed to read the pickle produced.

   The *file* argument must have a ``write()`` method that accepts a single bytes
   argument. It can thus be a file object opened for binary writing, an
   :class:`io.BytesIO` instance, or any other custom object that meets this
   interface.

   .. method:: dump(obj)

      Write a pickled representation of *obj* to the open file object given in
      the constructor.

   .. method:: persistent_id(obj)

      Do nothing by default. This exists so a subclass can override it.

      If :meth:`persistent_id` returns ``None``, *obj* is pickled as usual. Any
      other value causes :class:`Pickler` to emit the returned value as a
      persistent ID for *obj*. The meaning of this persistent ID should be
      defined by :meth:`Unpickler.persistent_load`. Note that the value
      returned by :meth:`persistent_id` cannot itself have a persistent ID.

      See :ref:`pickle-persistent` for details and examples of uses.

   .. method:: clear_memo()

      Deprecated. Use the :meth:`clear` method of :attr:`memo` instead. Clears
      the pickler's memo, which is useful when reusing picklers.

   .. attribute:: fast

      Enable fast mode if set to a true value. The fast mode disables the usage
      of memo, therefore speeding the pickling process by not generating
      superfluous PUT opcodes. It should not be used with self-referential
      objects; doing so will cause :class:`Pickler` to recurse infinitely.

      Use :func:`pickletools.optimize` if you need more compact pickles.

   .. attribute:: memo

      Dictionary holding previously pickled objects to allow shared or
      recursive objects to be pickled by reference as opposed to by value.


It is possible to make multiple calls to the :meth:`dump` method of the same
:class:`Pickler` instance. These must then be matched to the same number of
calls to the :meth:`load` method of the corresponding :class:`Unpickler`
instance. If the same object is pickled by multiple :meth:`dump` calls, the
:meth:`load` calls will all yield references to the same object.

Please note, this is intended for pickling multiple objects without intervening
modifications to the objects or their parts. If you modify an object and then
pickle it again using the same :class:`Pickler` instance, the object is not
pickled again; only a reference to it is pickled, and the :class:`Unpickler`
will return the old value, not the modified one.
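
The caveat about intervening modifications can be reproduced with a short
sketch. The names ``stream`` and ``scores`` are illustrative only:

```python
import io
import pickle

stream = io.BytesIO()
pickler = pickle.Pickler(stream)

scores = [10, 20]
pickler.dump(scores)   # first dump writes the list contents
scores.append(30)
pickler.dump(scores)   # second dump only writes a reference to the memo entry

stream.seek(0)
unpickler = pickle.Unpickler(stream)
first = unpickler.load()
second = unpickler.load()

# Both loads return the same object, holding the value from the first dump.
same_object = second is first
```

Here ``second`` does not contain ``30``: the modification made between the two
:meth:`dump` calls is lost. Using a fresh :class:`Pickler` for each snapshot is
one way to avoid this.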


.. class:: Unpickler(file, [\*, encoding="ASCII", errors="strict"])

   This takes a binary file for reading a pickle data stream.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed.

   The argument *file* must have two methods, a ``read()`` method that takes an
   integer argument, and a ``readline()`` method that requires no arguments. Both
   methods should return bytes. Thus *file* can be a binary file object opened
   for reading, an :class:`io.BytesIO` object, or any other custom object that
   meets this interface.

   Optional keyword arguments are *encoding* and *errors*, which are used to
   decode 8-bit string instances pickled by Python 2.x. These default to 'ASCII'
   and 'strict', respectively.

   .. method:: load()

      Read a pickled object representation from the open file object given in
      the constructor, and return the reconstituted object hierarchy specified
      therein. Bytes past the pickled object's representation are ignored.

   .. method:: persistent_load(pid)

      Raise an :exc:`UnpicklingError` by default.

      If defined, :meth:`persistent_load` should return the object specified by
      the persistent ID *pid*. On errors, such as if an invalid persistent ID is
      encountered, an :exc:`UnpicklingError` should be raised.

      See :ref:`pickle-persistent` for details and examples of uses.

   .. method:: find_class(module, name)

      Import *module* if necessary and return the object called *name* from it.
      Subclasses may override this to gain control over what type of objects can
      be loaded, potentially reducing security risks.
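
As a hedged sketch of overriding :meth:`find_class`, the subclass below only
resolves a small allow-list of built-in names and rejects every other global.
``RestrictedUnpickler``, ``SAFE_BUILTINS`` and ``restricted_loads`` are
hypothetical names chosen for this example, and the allow-list itself is
arbitrary:

```python
import builtins
import io
import pickle

# Assumption for this sketch: only these built-in names may be resolved.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Resolve allow-listed builtins; refuse everything else.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))


def restricted_loads(data):
    """Counterpart of pickle.loads() that uses the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


allowed = restricted_loads(pickle.dumps([1, 2, range(15)]))

rejected = None
try:
    restricted_loads(pickle.dumps(print))
except pickle.UnpicklingError as exc:
    rejected = str(exc)
```

Restricting globals narrows what a pickle can reference, but it is a
mitigation rather than a guarantee; the warning about untrusted data still
applies.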


What can be pickled and unpickled?
----------------------------------

The following types can be pickled:

* ``None``, ``True``, and ``False``

* integers, floating point numbers, complex numbers

* strings, bytes, bytearrays

* tuples, lists, sets, and dictionaries containing only picklable objects

* functions defined at the top level of a module

* built-in functions defined at the top level of a module

* classes that are defined at the top level of a module

* instances of such classes whose :attr:`__dict__` or the result of calling
  :meth:`__getstate__` is picklable (see section :ref:`pickle-protocol` for
  details)

Attempts to pickle unpicklable objects will raise the :exc:`PicklingError`
exception; when this happens, an unspecified number of bytes may have already
been written to the underlying file. Trying to pickle a highly recursive data
structure may exceed the maximum recursion depth; a :exc:`RuntimeError` will be
raised in this case. You can carefully raise this limit with
:func:`sys.setrecursionlimit`.

Note that functions (built-in and user-defined) are pickled by "fully qualified"
name reference, not by value. This means that only the function name is
pickled, along with the name of the module the function is defined in. Neither
the function's code, nor any of its function attributes are pickled. Thus the
defining module must be importable in the unpickling environment, and the module
must contain the named object, otherwise an exception will be raised. [#]_
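
Pickling by reference is easy to observe with a built-in function. In this
small sketch the pickle of :func:`len` carries the function's name, and
unpickling looks that name up again instead of rebuilding any code:

```python
import pickle

payload = pickle.dumps(len)
restored = pickle.loads(payload)

# The stream stores the name of the function, not its implementation.
name_in_stream = b"len" in payload

# Unpickling resolves the name to the very same function object.
same_function = restored is len
```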

Similarly, classes are pickled by named reference, so the same restrictions in
the unpickling environment apply. Note that none of the class's code or data is
pickled, so in the following example the class attribute ``attr`` is not
restored in the unpickling environment::

   import pickle

   class Foo:
       attr = 'a class attr'

   picklestring = pickle.dumps(Foo)

These restrictions are why picklable functions and classes must be defined in
the top level of a module.

Similarly, when class instances are pickled, their class's code and data are not
pickled along with them. Only the instance data are pickled. This is done on
purpose, so you can fix bugs in a class or add methods to the class and still
load objects that were created with an earlier version of the class. If you
plan to have long-lived objects that will see many versions of a class, it may
be worthwhile to put a version number in the objects so that suitable
conversions can be made by the class's :meth:`__setstate__` method.


.. _pickle-protocol:

The pickle protocol
-------------------

This section describes the "pickling protocol" that defines the interface
between the pickler/unpickler and the objects that are being serialized. This
protocol provides a standard way for you to define, customize, and control how
your objects are serialized and de-serialized. The description in this section
doesn't cover specific customizations that you can employ to make the unpickling
environment slightly safer from untrusted pickle data streams; see section
:ref:`pickle-sub` for more details.


.. _pickle-inst:

Pickling and unpickling normal class instances
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. index::
   single: __getinitargs__() (copy protocol)
   single: __init__() (instance constructor)

.. XXX is __getinitargs__ only used with old-style classes?
.. XXX update w.r.t Py3k's classes

When a pickled class instance is unpickled, its :meth:`__init__` method is
normally *not* invoked. If it is desirable that the :meth:`__init__` method be
called on unpickling, an old-style class can define a method
:meth:`__getinitargs__`, which should return a *tuple* containing the arguments
to be passed to the class constructor (:meth:`__init__` for example). The
:meth:`__getinitargs__` method is called at pickle time; the tuple it returns is
incorporated in the pickle for the instance.

.. index:: single: __getnewargs__() (copy protocol)

New-style types can provide a :meth:`__getnewargs__` method that is used for
protocol 2. Implementing this method is needed if the type establishes some
internal invariants when the instance is created, or if the memory allocation is
affected by the values passed to the :meth:`__new__` method for the type (as it
is for tuples and strings). Instances of a :term:`new-style class` :class:`C`
are created using ::

   obj = C.__new__(C, *args)


where *args* is the result of calling :meth:`__getnewargs__` on the original
object; if there is no :meth:`__getnewargs__`, an empty tuple is assumed.

.. index::
   single: __getstate__() (copy protocol)
   single: __setstate__() (copy protocol)
   single: __dict__ (instance attribute)

Classes can further influence how their instances are pickled; if the class
defines the method :meth:`__getstate__`, it is called and the returned state is
pickled as the contents for the instance, instead of the contents of the
instance's dictionary. If there is no :meth:`__getstate__` method, the
instance's :attr:`__dict__` is pickled.

Upon unpickling, if the class also defines the method :meth:`__setstate__`, it
is called with the unpickled state. [#]_ If there is no :meth:`__setstate__`
method, the pickled state must be a dictionary and its items are assigned to the
new instance's dictionary. If a class defines both :meth:`__getstate__` and
:meth:`__setstate__`, the state object needn't be a dictionary and these methods
can do what they want. [#]_

.. warning::

   If :meth:`__getstate__` returns a false value, the :meth:`__setstate__`
   method will not be called.
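
The following sketch combines :meth:`__getstate__` and :meth:`__setstate__`
with the version-number idea mentioned earlier. The ``Account`` class, its
attributes, and the older "version 1" layout (a float ``balance``) are all
invented for illustration, and the class is assumed to live at the top level
of an importable module or script:

```python
import pickle


class Account:
    VERSION = 2

    def __init__(self, owner, balance_cents):
        self.owner = owner
        self.balance_cents = balance_cents

    def __getstate__(self):
        # Pickle a copy of the instance dictionary plus a version marker.
        state = self.__dict__.copy()
        state["version"] = self.VERSION
        return state

    def __setstate__(self, state):
        version = state.pop("version", 1)
        if version < 2:
            # Hypothetical version 1 layout: a float "balance" in whole units.
            state["balance_cents"] = round(state.pop("balance") * 100)
        self.__dict__.update(state)


restored = pickle.loads(pickle.dumps(Account("ann", 1250)))

# Simulate state produced by the hypothetical older class version.
legacy = Account.__new__(Account)
legacy.__setstate__({"owner": "bob", "balance": 19.99})
```

Note that :meth:`__init__` is not involved in restoring ``restored``; the
instance is created first and then handed its state through
:meth:`__setstate__`.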


Pickling and unpickling extension types
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. index::
   single: __reduce__() (pickle protocol)
   single: __reduce_ex__() (pickle protocol)
   single: __safe_for_unpickling__ (pickle protocol)

When the :class:`Pickler` encounters an object of a type it knows nothing about
(such as an extension type), it looks in two places for a hint of how to
pickle it. One alternative is for the object to implement a :meth:`__reduce__`
method. If provided, at pickling time :meth:`__reduce__` will be called with no
arguments, and it must return either a string or a tuple.

If a string is returned, it names a global variable whose contents are pickled
as normal. The string returned by :meth:`__reduce__` should be the object's
local name relative to its module; the pickle module searches the module
namespace to determine the object's module.

When a tuple is returned, it must be between two and five elements long.
Optional elements can either be omitted, or ``None`` can be provided as their
value. The contents of this tuple are pickled as normal and used to
reconstruct the object at unpickling time. The semantics of each element are:

* A callable object that will be called to create the initial version of the
  object. The next element of the tuple will provide arguments for this callable,
  and later elements provide additional state information that will subsequently
  be used to fully reconstruct the pickled data.

  In the unpickling environment this object must be either a class, a callable
  registered as a "safe constructor" (see below), or it must have an attribute
  :attr:`__safe_for_unpickling__` with a true value. Otherwise, an
  :exc:`UnpicklingError` will be raised in the unpickling environment. Note that
  as usual, the callable itself is pickled by name.

* A tuple of arguments for the callable object, not ``None``.

* Optionally, the object's state, which will be passed to the object's
  :meth:`__setstate__` method as described in section :ref:`pickle-inst`. If the
  object has no :meth:`__setstate__` method, then, as above, the value must be a
  dictionary and it will be added to the object's :attr:`__dict__`.

* Optionally, an iterator (and not a sequence) yielding successive list items.
  These list items will be pickled, and appended to the object using either
  ``obj.append(item)`` or ``obj.extend(list_of_items)``. This is primarily used
  for list subclasses, but may be used by other classes as long as they have
  :meth:`append` and :meth:`extend` methods with the appropriate signature.
  (Whether :meth:`append` or :meth:`extend` is used depends on which pickle
  protocol version is used as well as the number of items to append, so both must
  be supported.)

* Optionally, an iterator (not a sequence) yielding successive dictionary items,
  which should be tuples of the form ``(key, value)``. These items will be
  pickled and stored to the object using ``obj[key] = value``. This is primarily
  used for dictionary subclasses, but may be used by other classes as long as they
  implement :meth:`__setitem__`.
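
A three-element return value from :meth:`__reduce__` can be sketched as
follows. ``Temperature`` is a made-up class, assumed to be defined at the top
level of an importable module or script; it has no :meth:`__setstate__`, so the
state dictionary is added to the instance's :attr:`__dict__`:

```python
import pickle


class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius
        self.history = []

    def __reduce__(self):
        # (callable, argument tuple, state dictionary)
        return (Temperature, (self.celsius,), {"history": self.history})


original = Temperature(21.5)
original.history.append("created")

restored = pickle.loads(pickle.dumps(original))
```

On unpickling, ``Temperature(21.5)`` is called first, and the ``history`` entry
from the state dictionary then replaces the empty list created by
:meth:`__init__`.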

It is sometimes useful to know the protocol version when implementing
:meth:`__reduce__`. This can be done by implementing a method named
:meth:`__reduce_ex__` instead of :meth:`__reduce__`. :meth:`__reduce_ex__`, when
it exists, is called in preference over :meth:`__reduce__` (you may still
provide :meth:`__reduce__` for backwards compatibility). The
:meth:`__reduce_ex__` method will be called with a single integer argument, the
protocol version.

The :class:`object` class implements both :meth:`__reduce__` and
:meth:`__reduce_ex__`; however, if a subclass overrides :meth:`__reduce__` but
not :meth:`__reduce_ex__`, the :meth:`__reduce_ex__` implementation detects this
and calls :meth:`__reduce__`.

An alternative to implementing a :meth:`__reduce__` method on the object to be
pickled is to register the callable with the :mod:`copyreg` module. This
module provides a way for programs to register "reduction functions" and
constructors for user-defined types. Reduction functions have the same
semantics and interface as the :meth:`__reduce__` method described above, except
that they are called with a single argument, the object to be pickled.

The registered constructor is deemed a "safe constructor" for purposes of
unpickling as described above.
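
Registering a reduction function with :func:`copyreg.pickle` might look like
the sketch below. ``Point`` and ``reduce_point`` are hypothetical names, and
the class is assumed to be defined at the top level of an importable module or
script; the ``created`` counter exists only to show that the constructor runs
again on unpickling:

```python
import copyreg
import pickle


class Point:
    created = 0  # counts constructor calls, for demonstration only

    def __init__(self, x, y):
        Point.created += 1
        self.x = x
        self.y = y


def reduce_point(point):
    # Same contract as __reduce__, but the object arrives as an argument.
    return Point, (point.x, point.y)


# Tell pickle to use reduce_point for every Point instance.
copyreg.pickle(Point, reduce_point)

original = Point(1, 2)
restored = pickle.loads(pickle.dumps(original))
calls = Point.created
```

This keeps the pickling logic outside the class, which can be useful for types
that cannot be modified, such as those defined in extension modules.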
561
562
Alexandre Vassalotti758bca62008-10-18 19:25:07 +0000563.. _pickle-persistent:
564
Georg Brandl116aa622007-08-15 14:28:22 +0000565Pickling and unpickling external objects
566^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
567
Christian Heimes05e8be12008-02-23 18:30:17 +0000568.. index::
569 single: persistent_id (pickle protocol)
570 single: persistent_load (pickle protocol)
571
For the benefit of object persistence, the :mod:`pickle` module supports the
notion of a reference to an object outside the pickled data stream. Such
objects are referenced by a "persistent id", which is just an arbitrary string
of printable ASCII characters. The resolution of such names is not defined by
the :mod:`pickle` module; it will delegate this resolution to user-defined
functions on the pickler and unpickler.

To define external persistent id resolution, you need to set the
:attr:`persistent_id` attribute of the pickler object and the
:attr:`persistent_load` attribute of the unpickler object.

To pickle objects that have an external persistent id, the pickler must have a
custom :func:`persistent_id` method that takes an object as an argument and
returns either ``None`` or the persistent id for that object. When ``None`` is
returned, the pickler simply pickles the object as normal. When a persistent id
string is returned, the pickler will pickle that string, along with a marker so
that the unpickler will recognize the string as a persistent id.

To unpickle external objects, the unpickler must have a custom
:func:`persistent_load` function that takes a persistent id string and returns
the referenced object.

Here's a silly example that *might* shed more light::

   import pickle
   from io import BytesIO

   # Pickle data is a byte stream, so use a binary in-memory buffer.
   src = BytesIO()
   p = pickle.Pickler(src)

   def persistent_id(obj):
       if hasattr(obj, 'x'):
           return 'the value %d' % obj.x
       else:
           return None

   p.persistent_id = persistent_id

   class Integer:
       def __init__(self, x):
           self.x = x
       def __str__(self):
           return 'My name is integer %d' % self.x

   i = Integer(7)
   print(i)
   p.dump(i)

   datastream = src.getvalue()
   print(repr(datastream))
   dst = BytesIO(datastream)

   up = pickle.Unpickler(dst)

   class FancyInteger(Integer):
       def __str__(self):
           return 'I am the integer %d' % self.x

   def persistent_load(persid):
       if persid.startswith('the value '):
           value = int(persid.split()[2])
           return FancyInteger(value)
       else:
           raise pickle.UnpicklingError('Invalid persistent id')

   up.persistent_load = persistent_load

   j = up.load()
   print(j)


.. BAW: pickle supports something called inst_persistent_id()
   which appears to give unknown types a second shot at producing a persistent
   id. Since Jim Fulton can't remember why it was added or what it's for, I'm
   leaving it undocumented.


.. _pickle-sub:

Subclassing Unpicklers
----------------------

.. index::
   single: load_global() (pickle protocol)
   single: find_global() (pickle protocol)

By default, unpickling will import any class that it finds in the pickle data.
You can control exactly what gets unpickled and what gets called by customizing
your unpickler.

You need to derive a subclass from :class:`Unpickler`, overriding the
:meth:`load_global` method. :meth:`load_global` should read two lines from the
pickle data stream, where the first line will be the name of the module
containing the class and the second line will be the name of the instance's
class. It then looks up the class, possibly importing the module and digging out
the attribute, and appends what it finds to the unpickler's stack. Later on,
this class will be assigned to the :attr:`__class__` attribute of an empty
class, as a way of magically creating an instance without calling its class's
:meth:`__init__`. Your job (should you choose to accept it) would be to have
:meth:`load_global` push onto the unpickler's stack a known safe version of any
class you deem safe to unpickle. It is up to you to produce such a class. Or
you could raise an error if you want to disallow all unpickling of instances.
If this sounds like a hack, you're right. Refer to the source code to make this
work.

The moral of the story is that you should be really careful about the source of
the data your application unpickles.


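The sketch below illustrates the idea of restricting what an unpickler may
load. It does not override :meth:`load_global`; it assumes a Python 3 release
in which :class:`Unpickler` exposes the :meth:`find_class` hook, which receives
the module and class names already read from the stream. The names
``SAFE_BUILTINS``, ``RestrictedUnpickler`` and ``restricted_loads`` are chosen
for this example only. ::

   import builtins
   import io
   import pickle

   # Assumption for this example: only these builtins are considered safe.
   SAFE_BUILTINS = {'range', 'complex', 'set', 'frozenset'}

   class RestrictedUnpickler(pickle.Unpickler):
       def find_class(self, module, name):
           # Allow only the whitelisted names from the builtins module.
           if module == 'builtins' and name in SAFE_BUILTINS:
               return getattr(builtins, name)
           # Refuse everything else.
           raise pickle.UnpicklingError(
               "global '%s.%s' is forbidden" % (module, name))

   def restricted_loads(data):
       """Like pickle.loads(), but using the restricted unpickler."""
       return RestrictedUnpickler(io.BytesIO(data)).load()

   print(restricted_loads(pickle.dumps([1, 2, range(15)])))

Plain containers and numbers never reach :meth:`find_class`, so they load as
usual; any global outside the whitelist raises :exc:`UnpicklingError`.
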
.. _pickle-example:

Example
-------

For the simplest code, use the :func:`dump` and :func:`load` functions. Note
that a self-referencing list is pickled and restored correctly. ::

   import pickle

   data1 = {'a': [1, 2.0, 3, 4+6j],
            'b': ("string", "string using Unicode features \u0394"),
            'c': None}

   selfref_list = [1, 2, 3]
   selfref_list.append(selfref_list)

   output = open('data.pkl', 'wb')

   # Pickle dictionary using protocol 2.
   pickle.dump(data1, output, 2)

   # Pickle the list using the highest protocol available.
   pickle.dump(selfref_list, output, -1)

   output.close()

The following example reads the resulting pickled data. When reading a
pickle-containing file, open the file in binary mode: pickle data is a byte
stream, whichever protocol was used to write it. ::

   import pprint, pickle

   pkl_file = open('data.pkl', 'rb')

   data1 = pickle.load(pkl_file)
   pprint.pprint(data1)

   data2 = pickle.load(pkl_file)
   pprint.pprint(data2)

   pkl_file.close()

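To check in memory that a self-referencing list really is restored with its
cycle intact, a short sketch using :func:`dumps` and :func:`loads` (rather than
a file) could look like this::

   import pickle

   selfref_list = [1, 2, 3]
   selfref_list.append(selfref_list)

   restored = pickle.loads(pickle.dumps(selfref_list))

   # The ordinary elements survive the round trip.
   print(restored[:3])               # [1, 2, 3]
   # The last element refers back to the restored list itself.
   print(restored[3] is restored)    # True
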
Here's a larger example that shows how to modify pickling behavior for a class.
The :class:`TextReader` class opens a text file, and returns the line number and
line contents each time its :meth:`readline` method is called. If a
:class:`TextReader` instance is pickled, all attributes *except* the file object
member are saved. When the instance is unpickled, the file is reopened, and
reading resumes from the last location. The :meth:`__setstate__` and
:meth:`__getstate__` methods are used to implement this behavior. ::

   #!/usr/local/bin/python

   class TextReader:
       """Print and number lines in a text file."""
       def __init__(self, file):
           self.file = file
           self.fh = open(file)
           self.lineno = 0

       def readline(self):
           self.lineno = self.lineno + 1
           line = self.fh.readline()
           if not line:
               return None
           if line.endswith("\n"):
               line = line[:-1]
           return "%d: %s" % (self.lineno, line)

       def __getstate__(self):
           odict = self.__dict__.copy() # copy the dict since we change it
           del odict['fh']              # remove filehandle entry
           return odict

       def __setstate__(self, dict):
           fh = open(dict['file'])      # reopen file
           count = dict['lineno']       # read from file...
           while count:                 # until line count is restored
               fh.readline()
               count = count - 1
           self.__dict__.update(dict)   # update attributes
           self.fh = fh                 # save the file object

A sample usage might be something like this::

   >>> import TextReader
   >>> obj = TextReader.TextReader("TextReader.py")
   >>> obj.readline()
   '1: #!/usr/local/bin/python'
   >>> obj.readline()
   '2: '
   >>> obj.readline()
   '3: class TextReader:'
   >>> import pickle
   >>> pickle.dump(obj, open('save.p', 'wb'))

If you want to see that :mod:`pickle` works across Python processes, start
another Python session before continuing. What follows can happen from either
the same process or a new process. ::

   >>> import pickle
   >>> reader = pickle.load(open('save.p', 'rb'))
   >>> reader.readline()
   '4:     """Print and number lines in a text file."""'


.. seealso::

   Module :mod:`copyreg`
      Pickle interface constructor registration for extension types.

   Module :mod:`shelve`
      Indexed databases of objects; uses :mod:`pickle`.

   Module :mod:`copy`
      Shallow and deep object copying.

   Module :mod:`marshal`
      High-performance serialization of built-in types.


.. rubric:: Footnotes

.. [#] Don't confuse this with the :mod:`marshal` module.

.. [#] The exception raised will likely be an :exc:`ImportError` or an
   :exc:`AttributeError` but it could be something else.

.. [#] These methods can also be used to implement copying class instances.

.. [#] This protocol is also used by the shallow and deep copying operations defined in
   the :mod:`copy` module.