:mod:`pickle` --- Python object serialization
=============================================

.. module:: pickle
   :synopsis: Convert Python objects to streams of bytes and back.

.. sectionauthor:: Jim Kerr <jbkerr@sr.hp.com>.
.. sectionauthor:: Barry Warsaw <barry@python.org>

**Source code:** :source:`Lib/pickle.py`

.. index::
   single: persistence
   pair: persistent; objects
   pair: serializing; objects
   pair: marshalling; objects
   pair: flattening; objects
   pair: pickling; objects

--------------

The :mod:`pickle` module implements binary protocols for serializing and
de-serializing a Python object structure. *"Pickling"* is the process
whereby a Python object hierarchy is converted into a byte stream, and
*"unpickling"* is the inverse operation, whereby a byte stream
(from a :term:`binary file` or :term:`bytes-like object`) is converted
back into an object hierarchy. Pickling (and unpickling) is alternatively
known as "serialization", "marshalling," [#]_ or "flattening"; however, to
avoid confusion, the terms used here are "pickling" and "unpickling".

.. warning::

   The ``pickle`` module **is not secure**. Only unpickle data you trust.

   It is possible to construct malicious pickle data which will **execute
   arbitrary code during unpickling**. Never unpickle data that could have come
   from an untrusted source, or that could have been tampered with.

   Consider signing data with :mod:`hmac` if you need to ensure that it has not
   been tampered with.

   Safer serialization formats such as :mod:`json` may be more appropriate if
   you are processing untrusted data. See :ref:`comparison-with-json`.

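
As a rough illustration of the signing advice in the warning above, the
following sketch prepends an HMAC-SHA256 digest to the pickled bytes and checks
it before unpickling. The helper names ``sign_pickle`` and ``verify_and_load``
and the hard-coded key are assumptions made for this example; they are not part
of the :mod:`pickle` API. Signing only detects tampering by parties who do not
know the key::

   import hashlib
   import hmac
   import pickle

   SECRET_KEY = b"replace-with-a-real-secret"   # hypothetical key for this sketch
   DIGEST_SIZE = hashlib.sha256().digest_size   # 32 bytes for SHA-256

   def sign_pickle(obj):
       """Return the HMAC-SHA256 digest followed by the pickled bytes."""
       payload = pickle.dumps(obj)
       digest = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
       return digest + payload

   def verify_and_load(blob):
       """Unpickle *blob* only if its leading digest matches the payload."""
       digest, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
       expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
       # compare_digest() avoids leaking information through timing.
       if not hmac.compare_digest(digest, expected):
           raise ValueError("signature mismatch; refusing to unpickle")
       return pickle.loads(payload)

   blob = sign_pickle({"answer": 42})
   restored = verify_and_load(blob)
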

Relationship to other Python modules
------------------------------------

Comparison with ``marshal``
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Python has a more primitive serialization module called :mod:`marshal`, but in
general :mod:`pickle` should always be the preferred way to serialize Python
objects. :mod:`marshal` exists primarily to support Python's :file:`.pyc`
files.

The :mod:`pickle` module differs from :mod:`marshal` in several significant ways:

* The :mod:`pickle` module keeps track of the objects it has already serialized,
  so that later references to the same object won't be serialized again.
  :mod:`marshal` doesn't do this.

  This has implications both for recursive objects and object sharing. Recursive
  objects are objects that contain references to themselves. These are not
  handled by marshal, and in fact, attempting to marshal recursive objects will
  crash your Python interpreter. Object sharing happens when there are multiple
  references to the same object in different places in the object hierarchy being
  serialized. :mod:`pickle` stores such objects only once, and ensures that all
  other references point to the master copy. Shared objects remain shared, which
  can be very important for mutable objects.

* :mod:`marshal` cannot be used to serialize user-defined classes and their
  instances. :mod:`pickle` can save and restore class instances transparently;
  however, the class definition must be importable and live in the same module as
  when the object was stored.

* The :mod:`marshal` serialization format is not guaranteed to be portable
  across Python versions. Because its primary job in life is to support
  :file:`.pyc` files, the Python implementers reserve the right to change the
  serialization format in non-backwards compatible ways should the need arise.
  The :mod:`pickle` serialization format is guaranteed to be backwards compatible
  across Python releases, provided that a compatible pickle protocol is chosen
  and that the pickling and unpickling code handles Python 2 to Python 3 type
  differences if your data crosses that language boundary.

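
The handling of shared and recursive objects described in the first point above
can be checked with a small experiment. This is only a sketch; the variable
names are arbitrary::

   import pickle

   shared = [1, 2, 3]
   container = {"a": shared, "b": shared}   # two references to one list

   recursive = []
   recursive.append(recursive)              # a list that contains itself

   restored = pickle.loads(pickle.dumps(container))
   restored_recursive = pickle.loads(pickle.dumps(recursive))

   # Sharing survives the round trip: both keys still name a single list,
   # so a mutation through one reference is visible through the other.
   restored["a"].append(4)
   print(restored["b"])                                 # [1, 2, 3, 4]
   print(restored["a"] is restored["b"])                # True
   print(restored_recursive[0] is restored_recursive)   # True
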

.. _comparison-with-json:

Comparison with ``json``
^^^^^^^^^^^^^^^^^^^^^^^^

There are fundamental differences between the pickle protocols and
`JSON (JavaScript Object Notation) <http://json.org>`_:

* JSON is a text serialization format (it outputs unicode text, although
  most of the time it is then encoded to ``utf-8``), while pickle is
  a binary serialization format;

* JSON is human-readable, while pickle is not;

* JSON is interoperable and widely used outside of the Python ecosystem,
  while pickle is Python-specific;

* JSON, by default, can only represent a subset of the Python built-in
  types, and no custom classes; pickle can represent an extremely large
  number of Python types (many of them automatically, by clever usage
  of Python's introspection facilities; complex cases can be tackled by
  implementing :ref:`specific object APIs <pickle-inst>`);

* Unlike pickle, deserializing untrusted JSON does not in itself create an
  arbitrary code execution vulnerability.

.. seealso::
   The :mod:`json` module: a standard library module allowing JSON
   serialization and deserialization.

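
The difference in the range of representable types can be seen with a few
built-in values. The following is a minimal sketch using only the default
encoders of both modules; the sample dictionary is made up for the example::

   import json
   import pickle

   value = {"tags": {"a", "b"}, "point": (1, 2), "ratio": complex(1, 2)}

   # pickle round-trips the set, the tuple and the complex number unchanged.
   assert pickle.loads(pickle.dumps(value)) == value

   # The default JSON encoder rejects sets and complex numbers outright...
   try:
       json.dumps(value)
   except TypeError as exc:
       print("json.dumps failed:", exc)

   # ...and silently turns tuples into lists.
   print(json.loads(json.dumps({"point": (1, 2)})))   # {'point': [1, 2]}
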

.. _pickle-protocols:

Data stream format
------------------

.. index::
   single: External Data Representation

The data format used by :mod:`pickle` is Python-specific. This has the
advantage that there are no restrictions imposed by external standards such as
JSON or XDR (which can't represent pointer sharing); however, it means that
non-Python programs may not be able to reconstruct pickled Python objects.

By default, the :mod:`pickle` data format uses a relatively compact binary
representation. If you need optimal size characteristics, you can efficiently
:doc:`compress <archiving>` pickled data.

The module :mod:`pickletools` contains tools for analyzing data streams
generated by :mod:`pickle`. :mod:`pickletools` source code has extensive
comments about opcodes used by pickle protocols.

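
As a brief sketch of the two points above, the example below disassembles a
pickle with :func:`pickletools.dis` and compresses it with :mod:`zlib`. The
choice of :mod:`zlib` and of the sample data is an assumption made for the
example; any general-purpose compressor could be used instead::

   import io
   import pickle
   import pickletools
   import zlib

   data = pickle.dumps({"zeros": [0] * 1000})

   # Disassemble the opcode stream into a string instead of printing it.
   listing = io.StringIO()
   pickletools.dis(data, out=listing)
   print(listing.getvalue().splitlines()[0])   # first opcode of the stream

   # Pickled data is ordinary bytes, so it can be compressed like any other.
   compressed = zlib.compress(data)
   print(len(data), len(compressed))
   restored = pickle.loads(zlib.decompress(compressed))
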
There are currently 6 different protocols which can be used for pickling.
The higher the protocol used, the more recent the version of Python needed
to read the pickle produced.

* Protocol version 0 is the original "human-readable" protocol and is
  backwards compatible with earlier versions of Python.

* Protocol version 1 is an old binary format which is also compatible with
  earlier versions of Python.

* Protocol version 2 was introduced in Python 2.3. It provides much more
  efficient pickling of :term:`new-style class`\es. Refer to :pep:`307` for
  information about improvements brought by protocol 2.

* Protocol version 3 was added in Python 3.0. It has explicit support for
  :class:`bytes` objects and cannot be unpickled by Python 2.x. This was
  the default protocol in Python 3.0--3.7.

* Protocol version 4 was added in Python 3.4. It adds support for very large
  objects, pickling more kinds of objects, and some data format
  optimizations. It is the default protocol starting with Python 3.8.
  Refer to :pep:`3154` for information about improvements brought by
  protocol 4.

* Protocol version 5 was added in Python 3.8. It adds support for
  out-of-band data and speedup for in-band data. Refer to :pep:`574` for
  information about improvements brought by protocol 5.

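
To see how the protocol choice affects the output, the following sketch pickles
one object with every protocol the running interpreter supports. The sample
object is arbitrary, and the sizes printed depend on the Python version::

   import pickle

   obj = {"spam": [1, 2, 3]}

   for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
       data = pickle.dumps(obj, protocol=protocol)
       # Protocols 2 and later start with the PROTO opcode (0x80) followed
       # by the protocol number; protocols 0 and 1 have no such header.
       has_header = data[:2] == bytes([0x80, protocol])
       print(protocol, len(data), has_header)
       # The reader detects the protocol by itself.
       assert pickle.loads(data) == obj
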
.. note::
   Serialization is a more primitive notion than persistence; although
   :mod:`pickle` reads and writes file objects, it does not handle the issue of
   naming persistent objects, nor the (even more complicated) issue of concurrent
   access to persistent objects. The :mod:`pickle` module can transform a complex
   object into a byte stream and it can transform the byte stream into an object
   with the same internal structure. Perhaps the most obvious thing to do with
   these byte streams is to write them onto a file, but it is also conceivable to
   send them across a network or store them in a database. The :mod:`shelve`
   module provides a simple interface to pickle and unpickle objects on
   DBM-style database files.


Module Interface
----------------

To serialize an object hierarchy, you simply call the :func:`dumps` function.
Similarly, to de-serialize a data stream, you call the :func:`loads` function.
However, if you want more control over serialization and de-serialization,
you can create a :class:`Pickler` or an :class:`Unpickler` object, respectively.

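
A minimal round trip through :func:`dumps` and :func:`loads` might look like
the sketch below; the ``record`` dictionary is just sample data::

   import pickle

   record = {"name": "spam", "counts": [1, 2, 3], "ratio": 0.5}

   data = pickle.dumps(record)        # serialize to a bytes object
   restored = pickle.loads(data)      # rebuild an equal, independent object

   print(type(data).__name__)         # bytes
   print(restored == record)          # True
   print(restored is record)          # False
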
The :mod:`pickle` module provides the following constants:


.. data:: HIGHEST_PROTOCOL

   An integer, the highest :ref:`protocol version <pickle-protocols>`
   available. This value can be passed as a *protocol* value to functions
   :func:`dump` and :func:`dumps` as well as the :class:`Pickler`
   constructor.

.. data:: DEFAULT_PROTOCOL

   An integer, the default :ref:`protocol version <pickle-protocols>` used
   for pickling. May be less than :data:`HIGHEST_PROTOCOL`. Currently the
   default protocol is 4, first introduced in Python 3.4 and incompatible
   with previous versions.

   .. versionchanged:: 3.0

      The default protocol is 3.

   .. versionchanged:: 3.8

      The default protocol is 4.

The :mod:`pickle` module provides the following functions to make the pickling
process more convenient:

.. function:: dump(obj, file, protocol=None, \*, fix_imports=True, buffer_callback=None)

   Write the pickled representation of the object *obj* to the open
   :term:`file object` *file*. This is equivalent to
   ``Pickler(file, protocol).dump(obj)``.

   Arguments *file*, *protocol*, *fix_imports* and *buffer_callback* have
   the same meaning as in the :class:`Pickler` constructor.

   .. versionchanged:: 3.8
      The *buffer_callback* argument was added.

.. function:: dumps(obj, protocol=None, \*, fix_imports=True, buffer_callback=None)

   Return the pickled representation of the object *obj* as a :class:`bytes` object,
   instead of writing it to a file.

   Arguments *protocol*, *fix_imports* and *buffer_callback* have the same
   meaning as in the :class:`Pickler` constructor.

   .. versionchanged:: 3.8
      The *buffer_callback* argument was added.

.. function:: load(file, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)

   Read the pickled representation of an object from the open :term:`file object`
   *file* and return the reconstituted object hierarchy specified therein.
   This is equivalent to ``Unpickler(file).load()``.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed. Bytes past the pickled representation
   of the object are ignored.

   Arguments *file*, *fix_imports*, *encoding*, *errors* and *buffers*
   have the same meaning as in the :class:`Unpickler` constructor.

   .. versionchanged:: 3.8
      The *buffers* argument was added.

.. function:: loads(bytes_object, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)

   Return the reconstituted object hierarchy of the pickled representation
   *bytes_object* of an object.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed. Bytes past the pickled representation
   of the object are ignored.

   Arguments *fix_imports*, *encoding*, *errors* and *buffers*
   have the same meaning as in the :class:`Unpickler` constructor.

   .. versionchanged:: 3.8
      The *buffers* argument was added.


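
The file-based functions can be sketched with an in-memory stream. Here
:class:`io.BytesIO` stands in for a file opened in binary mode, which is an
assumption made to keep the example self-contained::

   import io
   import pickle

   buffer = io.BytesIO()
   pickle.dump("first", buffer)
   pickle.dump([2, "second"], buffer, protocol=pickle.HIGHEST_PROTOCOL)

   buffer.seek(0)
   first = pickle.load(buffer)      # reads exactly one pickle...
   second = pickle.load(buffer)     # ...leaving the stream at the next one

   # loads() stops at the end of the first pickle and ignores what follows.
   trailing = pickle.loads(pickle.dumps(42) + b"ignored bytes")
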
The :mod:`pickle` module defines three exceptions:

.. exception:: PickleError

   Common base class for the other pickling exceptions. It inherits from
   :exc:`Exception`.

.. exception:: PicklingError

   Error raised when an unpicklable object is encountered by :class:`Pickler`.
   It inherits from :exc:`PickleError`.

   Refer to :ref:`pickle-picklable` to learn what kinds of objects can be
   pickled.

.. exception:: UnpicklingError

   Error raised when there is a problem unpickling an object, such as data
   corruption or a security violation. It inherits from :exc:`PickleError`.

   Note that other exceptions may also be raised during unpickling, including
   (but not necessarily limited to) :exc:`AttributeError`, :exc:`EOFError`,
   :exc:`ImportError`, and :exc:`IndexError`.


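
The sketch below triggers both specific exceptions with deliberately bad
input. The inputs are chosen for illustration only; as noted above, real code
that unpickles data should be prepared for other exception types as well::

   import pickle

   # A lambda has no importable name, so it cannot be pickled by reference.
   try:
       pickle.dumps(lambda x: x)
   except pickle.PicklingError as exc:
       print("PicklingError:", exc)

   # Bytes that do not start with a valid opcode are rejected when loading.
   try:
       pickle.loads(b"\x00 not a pickle")
   except pickle.UnpicklingError as exc:
       print("UnpicklingError:", exc)

   # Both derive from PickleError, so one handler can cover either direction.
   print(issubclass(pickle.PicklingError, pickle.PickleError))     # True
   print(issubclass(pickle.UnpicklingError, pickle.PickleError))   # True
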
The :mod:`pickle` module exports three classes, :class:`Pickler`,
:class:`Unpickler` and :class:`PickleBuffer`:

.. class:: Pickler(file, protocol=None, \*, fix_imports=True, buffer_callback=None)

   This takes a binary file for writing a pickle data stream.

   The optional *protocol* argument, an integer, tells the pickler to use
   the given protocol; supported protocols are 0 to :data:`HIGHEST_PROTOCOL`.
   If not specified, the default is :data:`DEFAULT_PROTOCOL`. If a negative
   number is specified, :data:`HIGHEST_PROTOCOL` is selected.

   The *file* argument must have a write() method that accepts a single bytes
   argument. It can thus be an on-disk file opened for binary writing, an
   :class:`io.BytesIO` instance, or any other custom object that meets this
   interface.

   If *fix_imports* is true and *protocol* is less than 3, pickle will try to
   map the new Python 3 names to the old module names used in Python 2, so
   that the pickle data stream is readable with Python 2.

   If *buffer_callback* is None (the default), buffer views are
   serialized into *file* as part of the pickle stream.

   If *buffer_callback* is not None, then it can be called any number
   of times with a buffer view. If the callback returns a false value
   (such as None), the given buffer is :ref:`out-of-band <pickle-oob>`;
   otherwise the buffer is serialized in-band, i.e. inside the pickle stream.

   It is an error if *buffer_callback* is not None and *protocol* is
   None or smaller than 5.

   .. versionchanged:: 3.8
      The *buffer_callback* argument was added.

   .. method:: dump(obj)

      Write the pickled representation of *obj* to the open file object given in
      the constructor.

   .. method:: persistent_id(obj)

      Do nothing by default. This exists so a subclass can override it.

      If :meth:`persistent_id` returns ``None``, *obj* is pickled as usual. Any
      other value causes :class:`Pickler` to emit the returned value as a
      persistent ID for *obj*. The meaning of this persistent ID should be
      defined by :meth:`Unpickler.persistent_load`. Note that the value
      returned by :meth:`persistent_id` cannot itself have a persistent ID.

      See :ref:`pickle-persistent` for details and examples of uses.

   .. attribute:: dispatch_table

      A pickler object's dispatch table is a registry of *reduction
      functions* of the kind which can be declared using
      :func:`copyreg.pickle`. It is a mapping whose keys are classes
      and whose values are reduction functions. A reduction function
      takes a single argument of the associated class and should
      conform to the same interface as a :meth:`__reduce__`
      method.

      By default, a pickler object will not have a
      :attr:`dispatch_table` attribute, and it will instead use the
      global dispatch table managed by the :mod:`copyreg` module.
      However, to customize the pickling for a specific pickler object
      one can set the :attr:`dispatch_table` attribute to a dict-like
      object. Alternatively, if a subclass of :class:`Pickler` has a
      :attr:`dispatch_table` attribute then this will be used as the
      default dispatch table for instances of that class.

      See :ref:`pickle-dispatch` for usage examples.

      .. versionadded:: 3.3

   .. method:: reducer_override(self, obj)

      Special reducer that can be defined in :class:`Pickler` subclasses. This
      method has priority over any reducer in the :attr:`dispatch_table`. It
      should conform to the same interface as a :meth:`__reduce__` method, and
      can optionally return ``NotImplemented`` to fall back on
      :attr:`dispatch_table`-registered reducers to pickle ``obj``.

      For a detailed example, see :ref:`reducer_override`.

      .. versionadded:: 3.8

   .. attribute:: fast

      Deprecated. Enable fast mode if set to a true value. The fast mode
      disables the usage of memo, therefore speeding the pickling process by not
      generating superfluous PUT opcodes. It should not be used with
      self-referential objects; doing otherwise will cause :class:`Pickler` to
      recurse infinitely.

      Use :func:`pickletools.optimize` if you need more compact pickles.


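
The :meth:`~Pickler.persistent_id` hook can be sketched as follows, together
with the matching :meth:`Unpickler.persistent_load` override it relies on. The
class names, the ``blob_store`` dictionary and the 64-byte threshold are all
hypothetical choices for this example, which keeps large :class:`bytes` objects
out of the pickle stream and stores only a short key in their place::

   import io
   import pickle

   blob_store = {}    # hypothetical external storage, keyed by persistent ID

   class BlobPickler(pickle.Pickler):
       def persistent_id(self, obj):
           # Keep large bytes objects out of the stream; pickle a key instead.
           if isinstance(obj, bytes) and len(obj) > 64:
               key = "blob-%d" % len(blob_store)
               blob_store[key] = obj
               return key
           return None    # everything else is pickled as usual

   class BlobUnpickler(pickle.Unpickler):
       def persistent_load(self, pid):
           try:
               return blob_store[pid]
           except KeyError:
               raise pickle.UnpicklingError(
                   "unknown persistent ID %r" % (pid,)) from None

   stream = io.BytesIO()
   BlobPickler(stream).dump({"name": "report", "body": b"x" * 10000})

   stream.seek(0)
   restored = BlobUnpickler(stream).load()
   print(len(stream.getvalue()) < 10000)    # True: the body is not in the stream
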
.. class:: Unpickler(file, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)

   This takes a binary file for reading a pickle data stream.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed.

   The argument *file* must have three methods, a read() method that takes an
   integer argument, a readinto() method that takes a buffer argument
   and a readline() method that requires no arguments, as in the
   :class:`io.BufferedIOBase` interface. Thus *file* can be an on-disk file
   opened for binary reading, an :class:`io.BytesIO` object, or any other
   custom object that meets this interface.

   The optional arguments *fix_imports*, *encoding* and *errors* are used
   to control compatibility support for pickle streams generated by Python 2.
   If *fix_imports* is true, pickle will try to map the old Python 2 names
   to the new names used in Python 3. The *encoding* and *errors* tell
   pickle how to decode 8-bit string instances pickled by Python 2;
   these default to 'ASCII' and 'strict', respectively. The *encoding* can
   be 'bytes' to read these 8-bit string instances as bytes objects.
   Using ``encoding='latin1'`` is required for unpickling NumPy arrays and
   instances of :class:`~datetime.datetime`, :class:`~datetime.date` and
   :class:`~datetime.time` pickled by Python 2.

   If *buffers* is None (the default), then all data necessary for
   deserialization must be contained in the pickle stream. This means
   that the *buffer_callback* argument was None when a :class:`Pickler`
   was instantiated (or when :func:`dump` or :func:`dumps` was called).

   If *buffers* is not None, it should be an iterable of buffer-enabled
   objects that is consumed each time the pickle stream references
   an :ref:`out-of-band <pickle-oob>` buffer view. Such buffers have been
   given in order to the *buffer_callback* of a Pickler object.

   .. versionchanged:: 3.8
      The *buffers* argument was added.

   .. method:: load()

      Read the pickled representation of an object from the open file object
      given in the constructor, and return the reconstituted object hierarchy
      specified therein. Bytes past the pickled representation of the object
      are ignored.

   .. method:: persistent_load(pid)

      Raise an :exc:`UnpicklingError` by default.

      If defined, :meth:`persistent_load` should return the object specified by
      the persistent ID *pid*. If an invalid persistent ID is encountered, an
      :exc:`UnpicklingError` should be raised.

      See :ref:`pickle-persistent` for details and examples of uses.

   .. method:: find_class(module, name)

      Import *module* if necessary and return the object called *name* from it,
      where the *module* and *name* arguments are :class:`str` objects. Note
      that, despite its name, :meth:`find_class` is also used for finding
      functions.

      Subclasses may override this to gain control over what type of objects and
      how they can be loaded, potentially reducing security risks. Refer to
      :ref:`pickle-restrict` for details.

      .. audit-event:: pickle.find_class module,name pickle.Unpickler.find_class

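
One way to override :meth:`~Unpickler.find_class` is with an allow-list, as in
the sketch below. The names ``SAFE_BUILTINS``, ``RestrictedUnpickler`` and
``restricted_loads`` are assumptions made for this example. Such a filter
narrows what a pickle can load, but it should not be taken as a guarantee that
untrusted data becomes safe::

   import builtins
   import io
   import os
   import pickle

   SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

   class RestrictedUnpickler(pickle.Unpickler):
       def find_class(self, module, name):
           # Resolve only an explicit allow-list of harmless built-ins.
           if module == "builtins" and name in SAFE_BUILTINS:
               return getattr(builtins, name)
           raise pickle.UnpicklingError(
               "global '%s.%s' is forbidden" % (module, name))

   def restricted_loads(data):
       """Like pickle.loads(), but refuse globals outside SAFE_BUILTINS."""
       return RestrictedUnpickler(io.BytesIO(data)).load()

   print(restricted_loads(pickle.dumps([1, 2, range(3)])))   # [1, 2, range(0, 3)]

   try:
       restricted_loads(pickle.dumps(os.system))
   except pickle.UnpicklingError as exc:
       print("rejected:", exc)
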
.. class:: PickleBuffer(buffer)

   A wrapper for a buffer representing picklable data. *buffer* must be a
   :ref:`buffer-providing <bufferobjects>` object, such as a
   :term:`bytes-like object` or an N-dimensional array.

   :class:`PickleBuffer` is itself a buffer provider, therefore it is
   possible to pass it to other APIs expecting a buffer-providing object,
   such as :class:`memoryview`.

   :class:`PickleBuffer` objects can only be serialized using pickle
   protocol 5 or higher. They are eligible for
   :ref:`out-of-band serialization <pickle-oob>`.

   .. versionadded:: 3.8

   .. method:: raw()

      Return a :class:`memoryview` of the memory area underlying this buffer.
      The returned object is a one-dimensional, C-contiguous memoryview
      with format ``B`` (unsigned bytes). :exc:`BufferError` is raised if
      the buffer is neither C- nor Fortran-contiguous.

   .. method:: release()

      Release the underlying buffer exposed by the PickleBuffer object.


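
The following sketch shows :class:`PickleBuffer` with out-of-band
serialization, assuming Python 3.8 or later. Collecting the buffers with
``list.append`` is just one convenient choice of *buffer_callback*; it returns
``None``, which marks each buffer as out-of-band::

   import pickle

   data = bytearray(b"abc" * 1000)

   # raw() exposes the underlying memory as a flat view of unsigned bytes.
   view = pickle.PickleBuffer(data).raw()
   print(view.format, view.ndim, view.nbytes)       # B 1 3000

   # The callback receives the buffer, so its contents stay out of the stream.
   buffers = []
   payload = pickle.dumps(pickle.PickleBuffer(data), protocol=5,
                          buffer_callback=buffers.append)
   print(len(buffers), len(payload) < len(data))    # 1 True

   # The same buffers must be supplied, in order, when unpickling.
   restored = pickle.loads(payload, buffers=buffers)
   print(bytes(memoryview(restored)) == bytes(data))   # True
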
.. _pickle-picklable:

What can be pickled and unpickled?
----------------------------------

The following types can be pickled:

* ``None``, ``True``, and ``False``

* integers, floating point numbers, complex numbers

* strings, bytes, bytearrays

* tuples, lists, sets, and dictionaries containing only picklable objects

* functions defined at the top level of a module (using :keyword:`def`, not
  :keyword:`lambda`)

* built-in functions defined at the top level of a module

* classes that are defined at the top level of a module

* instances of such classes whose :attr:`~object.__dict__` or the result of
  calling :meth:`__getstate__` is picklable (see section :ref:`pickle-inst` for
  details).

Attempts to pickle unpicklable objects will raise the :exc:`PicklingError`
exception; when this happens, an unspecified number of bytes may have already
been written to the underlying file. Trying to pickle a highly recursive data
structure may exceed the maximum recursion depth; a :exc:`RecursionError` will be
raised in this case. You can carefully raise this limit with
:func:`sys.setrecursionlimit`.

Note that functions (built-in and user-defined) are pickled by "fully qualified"
name reference, not by value. [#]_ This means that only the function name is
pickled, along with the name of the module the function is defined in. Neither
the function's code, nor any of its function attributes are pickled. Thus the
defining module must be importable in the unpickling environment, and the module
must contain the named object, otherwise an exception will be raised. [#]_

Similarly, classes are pickled by named reference, so the same restrictions in
the unpickling environment apply. Note that none of the class's code or data is
pickled, so in the following example the class attribute ``attr`` is not
restored in the unpickling environment::

   import pickle

   class Foo:
       attr = 'A class attribute'

   picklestring = pickle.dumps(Foo)

These restrictions are why picklable functions and classes must be defined in
the top level of a module.

539Similarly, when class instances are pickled, their class's code and data are not
540pickled along with them. Only the instance data are pickled. This is done on
541purpose, so you can fix bugs in a class or add methods to the class and still
542load objects that were created with an earlier version of the class. If you
543plan to have long-lived objects that will see many versions of a class, it may
544be worthwhile to put a version number in the objects so that suitable
545conversions can be made by the class's :meth:`__setstate__` method.
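
One way to apply the version-number advice is sketched below. The ``Point``
class, the ``_version`` key and the older ``coords`` layout are all
hypothetical; they only illustrate how :meth:`__setstate__` might convert state
written by an earlier release of a class::

   import pickle

   class Point:
       """Hypothetical class whose attribute layout changed over time."""
       STATE_VERSION = 2

       def __init__(self, x, y):
           self.x, self.y = x, y

       def __getstate__(self):
           state = self.__dict__.copy()
           state["_version"] = self.STATE_VERSION
           return state

       def __setstate__(self, state):
           state = dict(state)
           version = state.pop("_version", 1)
           if version < 2:
               # Assumed old layout: a single 'coords' tuple.
               state["x"], state["y"] = state.pop("coords")
           self.__dict__.update(state)

   new_point = pickle.loads(pickle.dumps(Point(1, 2)))

   # Simulate restoring state that was written with the old layout.
   old_point = Point.__new__(Point)
   old_point.__setstate__({"coords": (3, 4)})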
546
547
Georg Brandl116aa622007-08-15 14:28:22 +0000548.. _pickle-inst:
549
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000550Pickling Class Instances
551------------------------
Georg Brandl116aa622007-08-15 14:28:22 +0000552
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300553.. currentmodule:: None
554
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000555In this section, we describe the general mechanisms available to you to define,
556customize, and control how class instances are pickled and unpickled.
Georg Brandl116aa622007-08-15 14:28:22 +0000557
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000558In most cases, no additional code is needed to make instances picklable. By
559default, pickle will retrieve the class and the attributes of an instance via
560introspection. When a class instance is unpickled, its :meth:`__init__` method
561is usually *not* invoked. The default behaviour first creates an uninitialized
562instance and then restores the saved attributes. The following code shows an
563implementation of this behaviour::
Georg Brandl85eb8c12007-08-31 16:33:38 +0000564
   def save(obj):
       return (obj.__class__, obj.__dict__)

   def load(cls, attributes):
       obj = cls.__new__(cls)
       obj.__dict__.update(attributes)
       return obj
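
The following sketch, which uses a hypothetical ``Tracker`` class, shows the
practical consequence: a round trip through pickle restores the attributes
without running :meth:`__init__` a second time::

   import pickle

   class Tracker:
       init_calls = 0  # class-level counter of __init__ invocations

       def __init__(self, name):
           Tracker.init_calls += 1
           self.name = name

   original = Tracker("a")
   clone = pickle.loads(pickle.dumps(original))
   # Tracker.init_calls is still 1: only the explicit construction ran __init__.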
Georg Brandl116aa622007-08-15 14:28:22 +0000572
Georg Brandl6faee4e2010-09-21 14:48:28 +0000573Classes can alter the default behaviour by providing one or several special
Georg Brandlc8148262010-10-17 11:13:37 +0000574methods:
Georg Brandl116aa622007-08-15 14:28:22 +0000575
Antoine Pitrouc9dc4a22013-11-23 18:59:12 +0100576.. method:: object.__getnewargs_ex__()
577
   In protocols 2 and newer, classes that implement the
   :meth:`__getnewargs_ex__` method can dictate the values passed to the
   :meth:`__new__` method upon unpickling. The method must return a pair
   ``(args, kwargs)`` where *args* is a tuple of positional arguments
   and *kwargs* a dictionary of named arguments for constructing the
   object. Those will be passed to the :meth:`__new__` method upon
   unpickling.
585
586 You should implement this method if the :meth:`__new__` method of your
587 class requires keyword-only arguments. Otherwise, it is recommended for
588 compatibility to implement :meth:`__getnewargs__`.
589
Serhiy Storchakab6d84832015-10-13 21:26:35 +0300590 .. versionchanged:: 3.6
591 :meth:`__getnewargs_ex__` is now used in protocols 2 and 3.
592
Antoine Pitrouc9dc4a22013-11-23 18:59:12 +0100593
Georg Brandlc8148262010-10-17 11:13:37 +0000594.. method:: object.__getnewargs__()
Georg Brandl116aa622007-08-15 14:28:22 +0000595
Andrés Delfino0e0534c2018-06-09 21:41:09 -0300596 This method serves a similar purpose as :meth:`__getnewargs_ex__`, but
Serhiy Storchakab6d84832015-10-13 21:26:35 +0300597 supports only positional arguments. It must return a tuple of arguments
598 ``args`` which will be passed to the :meth:`__new__` method upon unpickling.
Antoine Pitrouc9dc4a22013-11-23 18:59:12 +0100599
Serhiy Storchakab6d84832015-10-13 21:26:35 +0300600 :meth:`__getnewargs__` will not be called if :meth:`__getnewargs_ex__` is
601 defined.
602
603 .. versionchanged:: 3.6
604 Before Python 3.6, :meth:`__getnewargs__` was called instead of
605 :meth:`__getnewargs_ex__` in protocols 2 and 3.
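
As an illustration, assume a hypothetical ``Temperature`` class whose
:meth:`__new__` requires a keyword-only argument. Without
:meth:`__getnewargs_ex__`, unpickling would call :meth:`__new__` with no
arguments and fail; with it, the keyword is supplied again::

   import pickle

   class Temperature:
       """Hypothetical class that is built entirely in __new__."""

       def __new__(cls, *, kelvin):
           self = super().__new__(cls)
           self.kelvin = kelvin
           return self

       def __getnewargs_ex__(self):
           # (args, kwargs) handed to __new__ when unpickling.
           return (), {"kelvin": self.kelvin}

   t = pickle.loads(pickle.dumps(Temperature(kelvin=273.15)))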
Georg Brandl116aa622007-08-15 14:28:22 +0000606
Georg Brandl116aa622007-08-15 14:28:22 +0000607
Georg Brandlc8148262010-10-17 11:13:37 +0000608.. method:: object.__getstate__()
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000609
Georg Brandlc8148262010-10-17 11:13:37 +0000610 Classes can further influence how their instances are pickled; if the class
611 defines the method :meth:`__getstate__`, it is called and the returned object
612 is pickled as the contents for the instance, instead of the contents of the
613 instance's dictionary. If the :meth:`__getstate__` method is absent, the
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300614 instance's :attr:`~object.__dict__` is pickled as usual.
Georg Brandl116aa622007-08-15 14:28:22 +0000615
Georg Brandlc8148262010-10-17 11:13:37 +0000616
617.. method:: object.__setstate__(state)
618
619 Upon unpickling, if the class defines :meth:`__setstate__`, it is called with
620 the unpickled state. In that case, there is no requirement for the state
621 object to be a dictionary. Otherwise, the pickled state must be a dictionary
622 and its items are assigned to the new instance's dictionary.
623
624 .. note::
625
626 If :meth:`__getstate__` returns a false value, the :meth:`__setstate__`
627 method will not be called upon unpickling.
628
Georg Brandl116aa622007-08-15 14:28:22 +0000629
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000630Refer to the section :ref:`pickle-state` for more information about how to use
631the methods :meth:`__getstate__` and :meth:`__setstate__`.
Georg Brandl116aa622007-08-15 14:28:22 +0000632
Benjamin Petersond23f8222009-04-05 19:13:16 +0000633.. note::
Georg Brandle720c0a2009-04-27 16:20:50 +0000634
Benjamin Petersond23f8222009-04-05 19:13:16 +0000635 At unpickling time, some methods like :meth:`__getattr__`,
636 :meth:`__getattribute__`, or :meth:`__setattr__` may be called upon the
Antoine Pitrouc9dc4a22013-11-23 18:59:12 +0100637 instance. In case those methods rely on some internal invariant being
638 true, the type should implement :meth:`__getnewargs__` or
639 :meth:`__getnewargs_ex__` to establish such an invariant; otherwise,
640 neither :meth:`__new__` nor :meth:`__init__` will be called.
Benjamin Petersond23f8222009-04-05 19:13:16 +0000641
Georg Brandlc8148262010-10-17 11:13:37 +0000642.. index:: pair: copy; protocol
Christian Heimes05e8be12008-02-23 18:30:17 +0000643
As we shall see, pickle does not directly use the methods described above. In
fact, these methods are part of the copy protocol which implements the
:meth:`__reduce__` special method. The copy protocol provides a unified
interface for retrieving the data necessary for pickling and copying
objects. [#]_
Georg Brandl116aa622007-08-15 14:28:22 +0000649
Although powerful, implementing :meth:`__reduce__` directly in your classes is
error-prone. For this reason, class designers should use the high-level
interface (i.e., :meth:`__getnewargs_ex__`, :meth:`__getstate__` and
:meth:`__setstate__`) whenever possible. We will show, however, cases where
using :meth:`__reduce__` is the only option or leads to more efficient pickling
or both.
Georg Brandl116aa622007-08-15 14:28:22 +0000656
Georg Brandlc8148262010-10-17 11:13:37 +0000657.. method:: object.__reduce__()
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000658
Georg Brandlc8148262010-10-17 11:13:37 +0000659 The interface is currently defined as follows. The :meth:`__reduce__` method
660 takes no argument and shall return either a string or preferably a tuple (the
661 returned object is often referred to as the "reduce value").
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000662
Georg Brandlc8148262010-10-17 11:13:37 +0000663 If a string is returned, the string should be interpreted as the name of a
664 global variable. It should be the object's local name relative to its
665 module; the pickle module searches the module namespace to determine the
666 object's module. This behaviour is typically useful for singletons.
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000667
Pierre Glaser65d98d02019-05-08 21:40:25 +0200668 When a tuple is returned, it must be between two and six items long.
Georg Brandlc8148262010-10-17 11:13:37 +0000669 Optional items can either be omitted, or ``None`` can be provided as their
670 value. The semantics of each item are in order:
Georg Brandl116aa622007-08-15 14:28:22 +0000671
Georg Brandlc8148262010-10-17 11:13:37 +0000672 .. XXX Mention __newobj__ special-case?
Georg Brandl116aa622007-08-15 14:28:22 +0000673
Georg Brandlc8148262010-10-17 11:13:37 +0000674 * A callable object that will be called to create the initial version of the
675 object.
Georg Brandl116aa622007-08-15 14:28:22 +0000676
Georg Brandlc8148262010-10-17 11:13:37 +0000677 * A tuple of arguments for the callable object. An empty tuple must be given
678 if the callable does not accept any argument.
Georg Brandl116aa622007-08-15 14:28:22 +0000679
   * Optionally, the object's state, which will be passed to the object's
     :meth:`__setstate__` method as previously described. If the object has no
     such method, then the value must be a dictionary and it will be added to
     the object's :attr:`~object.__dict__` attribute.
Georg Brandl116aa622007-08-15 14:28:22 +0000684
Georg Brandlc8148262010-10-17 11:13:37 +0000685 * Optionally, an iterator (and not a sequence) yielding successive items.
686 These items will be appended to the object either using
687 ``obj.append(item)`` or, in batch, using ``obj.extend(list_of_items)``.
688 This is primarily used for list subclasses, but may be used by other
689 classes as long as they have :meth:`append` and :meth:`extend` methods with
690 the appropriate signature. (Whether :meth:`append` or :meth:`extend` is
691 used depends on which pickle protocol version is used as well as the number
692 of items to append, so both must be supported.)
Georg Brandl116aa622007-08-15 14:28:22 +0000693
Georg Brandlc8148262010-10-17 11:13:37 +0000694 * Optionally, an iterator (not a sequence) yielding successive key-value
695 pairs. These items will be stored to the object using ``obj[key] =
696 value``. This is primarily used for dictionary subclasses, but may be used
697 by other classes as long as they implement :meth:`__setitem__`.
Georg Brandl116aa622007-08-15 14:28:22 +0000698
Pierre Glaser65d98d02019-05-08 21:40:25 +0200699 * Optionally, a callable with a ``(obj, state)`` signature. This
Xtreak9b5a0ef2019-05-16 10:04:24 +0530700 callable allows the user to programmatically control the state-updating
Pierre Glaser65d98d02019-05-08 21:40:25 +0200701 behavior of a specific object, instead of using ``obj``'s static
702 :meth:`__setstate__` method. If not ``None``, this callable will have
703 priority over ``obj``'s :meth:`__setstate__`.
704
705 .. versionadded:: 3.8
706 The optional sixth tuple item, ``(obj, state)``, was added.
707
Georg Brandlc8148262010-10-17 11:13:37 +0000708
709.. method:: object.__reduce_ex__(protocol)
710
711 Alternatively, a :meth:`__reduce_ex__` method may be defined. The only
712 difference is this method should take a single integer argument, the protocol
713 version. When defined, pickle will prefer it over the :meth:`__reduce__`
714 method. In addition, :meth:`__reduce__` automatically becomes a synonym for
715 the extended version. The main use for this method is to provide
716 backwards-compatible reduce values for older Python releases.
Georg Brandl116aa622007-08-15 14:28:22 +0000717
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300718.. currentmodule:: pickle
719
Alexandre Vassalotti758bca62008-10-18 19:25:07 +0000720.. _pickle-persistent:
721
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000722Persistence of External Objects
723^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Georg Brandl116aa622007-08-15 14:28:22 +0000724
Christian Heimes05e8be12008-02-23 18:30:17 +0000725.. index::
726 single: persistent_id (pickle protocol)
727 single: persistent_load (pickle protocol)
728
Georg Brandl116aa622007-08-15 14:28:22 +0000729For the benefit of object persistence, the :mod:`pickle` module supports the
730notion of a reference to an object outside the pickled data stream. Such
Alexandre Vassalotti5f3b63a2008-10-18 20:47:58 +0000731objects are referenced by a persistent ID, which should be either a string of
732alphanumeric characters (for protocol 0) [#]_ or just an arbitrary object (for
733any newer protocol).
Georg Brandl116aa622007-08-15 14:28:22 +0000734
Alexandre Vassalotti5f3b63a2008-10-18 20:47:58 +0000735The resolution of such persistent IDs is not defined by the :mod:`pickle`
Géry Ogam362f5352019-08-07 07:02:23 +0200736module; it will delegate this resolution to the user-defined methods on the
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300737pickler and unpickler, :meth:`~Pickler.persistent_id` and
738:meth:`~Unpickler.persistent_load` respectively.
Georg Brandl116aa622007-08-15 14:28:22 +0000739
To pickle objects that have an external persistent ID, the pickler must have a
custom :meth:`~Pickler.persistent_id` method that takes an object as an
argument and returns either ``None`` or the persistent ID for that object.
When ``None`` is returned, the pickler simply pickles the object as normal.
When a persistent ID is returned, the pickler will pickle that ID in place of
the object, along with a marker so that the unpickler will recognize it as a
persistent ID.
Georg Brandl116aa622007-08-15 14:28:22 +0000746
747To unpickle external objects, the unpickler must have a custom
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300748:meth:`~Unpickler.persistent_load` method that takes a persistent ID object and
749returns the referenced object.
Georg Brandl116aa622007-08-15 14:28:22 +0000750
Here is a comprehensive example presenting how persistent IDs can be used to
pickle external objects by reference.
Georg Brandl116aa622007-08-15 14:28:22 +0000753
Alexandre Vassalotti5f3b63a2008-10-18 20:47:58 +0000754.. literalinclude:: ../includes/dbpickle.py
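
For a shorter, self-contained illustration of the same mechanism, consider the
sketch below. The ``User`` class, the ``STORE`` dictionary standing in for a
database, and the ``("User", key)`` ID format are assumptions made for this
example only::

   import io
   import pickle

   class User:
       """Hypothetical record that lives in an external store."""
       def __init__(self, key):
           self.key = key

   STORE = {"u1": User("u1")}  # stand-in for a real database

   class StorePickler(pickle.Pickler):
       def persistent_id(self, obj):
           if isinstance(obj, User):
               return ("User", obj.key)  # store a reference, not the contents
           return None                   # pickle everything else as usual

   class StoreUnpickler(pickle.Unpickler):
       def persistent_load(self, pid):
           kind, key = pid
           if kind != "User":
               raise pickle.UnpicklingError("unsupported persistent object")
           return STORE[key]

   buf = io.BytesIO()
   StorePickler(buf).dump([STORE["u1"], "plain data"])
   buf.seek(0)
   loaded = StoreUnpickler(buf).load()

After loading, ``loaded[0]`` is the object held in ``STORE`` rather than a
copy, because only its ID travelled through the pickle stream.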
Alexandre Vassalottibcd1e3a2009-01-23 05:28:16 +0000755
Antoine Pitrou8d3c2902012-03-04 18:31:48 +0100756.. _pickle-dispatch:
757
758Dispatch Tables
759^^^^^^^^^^^^^^^
760
761If one wants to customize pickling of some classes without disturbing
762any other code which depends on pickling, then one can create a
763pickler with a private dispatch table.
764
765The global dispatch table managed by the :mod:`copyreg` module is
766available as :data:`copyreg.dispatch_table`. Therefore, one may
767choose to use a modified copy of :data:`copyreg.dispatch_table` as a
768private dispatch table.
769
770For example ::
771
   f = io.BytesIO()
   p = pickle.Pickler(f)
   p.dispatch_table = copyreg.dispatch_table.copy()
   p.dispatch_table[SomeClass] = reduce_SomeClass
776
777creates an instance of :class:`pickle.Pickler` with a private dispatch
778table which handles the ``SomeClass`` class specially. Alternatively,
779the code ::
780
   class MyPickler(pickle.Pickler):
       dispatch_table = copyreg.dispatch_table.copy()
       dispatch_table[SomeClass] = reduce_SomeClass
   f = io.BytesIO()
   p = MyPickler(f)
786
787does the same, but all instances of ``MyPickler`` will by default
788share the same dispatch table. The equivalent code using the
789:mod:`copyreg` module is ::
790
   copyreg.pickle(SomeClass, reduce_SomeClass)
   f = io.BytesIO()
   p = pickle.Pickler(f)
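
The snippets above leave ``SomeClass`` and ``reduce_SomeClass`` undefined. A
runnable sketch of the private-table variant might look as follows, where the
``Celsius`` class and the ``reduce_celsius`` function are hypothetical
stand-ins::

   import copyreg
   import io
   import pickle

   class Celsius:
       def __init__(self, degrees):
           self.degrees = degrees

   def reduce_celsius(obj):
       # A reduction function returns the same kind of value as __reduce__().
       return (Celsius, (round(obj.degrees, 1),))

   buf = io.BytesIO()
   pickler = pickle.Pickler(buf)
   pickler.dispatch_table = copyreg.dispatch_table.copy()
   pickler.dispatch_table[Celsius] = reduce_celsius
   pickler.dump(Celsius(21.456))

   restored = pickle.loads(buf.getvalue())
   # The global table is untouched, so other picklers are not affected.
   private_only = Celsius not in copyreg.dispatch_table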
Georg Brandl116aa622007-08-15 14:28:22 +0000794
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000795.. _pickle-state:
796
797Handling Stateful Objects
798^^^^^^^^^^^^^^^^^^^^^^^^^
799
800.. index::
801 single: __getstate__() (copy protocol)
802 single: __setstate__() (copy protocol)
803
804Here's an example that shows how to modify pickling behavior for a class.
805The :class:`TextReader` class opens a text file, and returns the line number and
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300806line contents each time its :meth:`!readline` method is called. If a
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000807:class:`TextReader` instance is pickled, all attributes *except* the file object
808member are saved. When the instance is unpickled, the file is reopened, and
809reading resumes from the last location. The :meth:`__setstate__` and
810:meth:`__getstate__` methods are used to implement this behavior. ::
811
   class TextReader:
       """Print and number lines in a text file."""

       def __init__(self, filename):
           self.filename = filename
           self.file = open(filename)
           self.lineno = 0

       def readline(self):
           self.lineno += 1
           line = self.file.readline()
           if not line:
               return None
           if line.endswith('\n'):
               line = line[:-1]
           return "%i: %s" % (self.lineno, line)

       def __getstate__(self):
           # Copy the object's state from self.__dict__ which contains
           # all our instance attributes. Always use the dict.copy()
           # method to avoid modifying the original state.
           state = self.__dict__.copy()
           # Remove the unpicklable entries.
           del state['file']
           return state

       def __setstate__(self, state):
           # Restore instance attributes (i.e., filename and lineno).
           self.__dict__.update(state)
           # Restore the previously opened file's state. To do so, we need to
           # reopen it and read from it until the line count is restored.
           file = open(self.filename)
           for _ in range(self.lineno):
               file.readline()
           # Finally, save the file.
           self.file = file
848
849
850A sample usage might be something like this::
851
   >>> reader = TextReader("hello.txt")
   >>> reader.readline()
   '1: Hello world!'
   >>> reader.readline()
   '2: I am line number two.'
   >>> new_reader = pickle.loads(pickle.dumps(reader))
   >>> new_reader.readline()
   '3: Goodbye!'
860
Pierre Glaser289f1f82019-05-08 23:08:25 +0200861.. _reducer_override:
862
863Custom Reduction for Types, Functions, and Other Objects
864--------------------------------------------------------
865
866.. versionadded:: 3.8
867
868Sometimes, :attr:`~Pickler.dispatch_table` may not be flexible enough.
869In particular we may want to customize pickling based on another criterion
870than the object's type, or we may want to customize the pickling of
871functions and classes.
872
For those cases, it is possible to subclass from the :class:`Pickler` class and
implement a :meth:`~Pickler.reducer_override` method. This method can return an
arbitrary reduction tuple (see :meth:`__reduce__`). It can alternatively return
``NotImplemented`` to fall back to the traditional behavior.

If both the :attr:`~Pickler.dispatch_table` and
:meth:`~Pickler.reducer_override` are defined, then the
:meth:`~Pickler.reducer_override` method takes priority.
881
882.. Note::
883 For performance reasons, :meth:`~Pickler.reducer_override` may not be
884 called for the following objects: ``None``, ``True``, ``False``, and
885 exact instances of :class:`int`, :class:`float`, :class:`bytes`,
886 :class:`str`, :class:`dict`, :class:`set`, :class:`frozenset`, :class:`list`
887 and :class:`tuple`.
888
889Here is a simple example where we allow pickling and reconstructing
890a given class::
891
   import io
   import pickle

   class MyClass:
       my_attribute = 1

   class MyPickler(pickle.Pickler):
       def reducer_override(self, obj):
           """Custom reducer for MyClass."""
           if getattr(obj, "__name__", None) == "MyClass":
               return type, (obj.__name__, obj.__bases__,
                             {'my_attribute': obj.my_attribute})
           else:
               # For any other object, fall back to the usual reduction.
               return NotImplemented

   f = io.BytesIO()
   p = MyPickler(f)
   p.dump(MyClass)

   del MyClass

   unpickled_class = pickle.loads(f.getvalue())

   assert isinstance(unpickled_class, type)
   assert unpickled_class.__name__ == "MyClass"
   assert unpickled_class.my_attribute == 1
919
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000920
Antoine Pitrou91f43802019-05-26 17:10:09 +0200921.. _pickle-oob:
922
923Out-of-band Buffers
924-------------------
925
926.. versionadded:: 3.8
927
928In some contexts, the :mod:`pickle` module is used to transfer massive amounts
929of data. Therefore, it can be important to minimize the number of memory
930copies, to preserve performance and resource consumption. However, normal
931operation of the :mod:`pickle` module, as it transforms a graph-like structure
932of objects into a sequential stream of bytes, intrinsically involves copying
933data to and from the pickle stream.
934
935This constraint can be eschewed if both the *provider* (the implementation
936of the object types to be transferred) and the *consumer* (the implementation
937of the communications system) support the out-of-band transfer facilities
938provided by pickle protocol 5 and higher.
939
940Provider API
941^^^^^^^^^^^^
942
943The large data objects to be pickled must implement a :meth:`__reduce_ex__`
944method specialized for protocol 5 and higher, which returns a
945:class:`PickleBuffer` instance (instead of e.g. a :class:`bytes` object)
946for any large data.
947
A :class:`PickleBuffer` object *signals* that the underlying buffer is
eligible for out-of-band data transfer. Those objects remain compatible
with normal usage of the :mod:`pickle` module. However, consumers can also
opt in to tell :mod:`pickle` that they will handle those buffers by
themselves.
953
954Consumer API
955^^^^^^^^^^^^
956
957A communications system can enable custom handling of the :class:`PickleBuffer`
958objects generated when serializing an object graph.
959
960On the sending side, it needs to pass a *buffer_callback* argument to
961:class:`Pickler` (or to the :func:`dump` or :func:`dumps` function), which
962will be called with each :class:`PickleBuffer` generated while pickling
963the object graph. Buffers accumulated by the *buffer_callback* will not
964see their data copied into the pickle stream, only a cheap marker will be
965inserted.
966
967On the receiving side, it needs to pass a *buffers* argument to
968:class:`Unpickler` (or to the :func:`load` or :func:`loads` function),
969which is an iterable of the buffers which were passed to *buffer_callback*.
970That iterable should produce buffers in the same order as they were passed
971to *buffer_callback*. Those buffers will provide the data expected by the
972reconstructors of the objects whose pickling produced the original
973:class:`PickleBuffer` objects.
974
975Between the sending side and the receiving side, the communications system
976is free to implement its own transfer mechanism for out-of-band buffers.
977Potential optimizations include the use of shared memory or datatype-dependent
978compression.
979
980Example
981^^^^^^^
982
983Here is a trivial example where we implement a :class:`bytearray` subclass
984able to participate in out-of-band buffer pickling::
985
   class ZeroCopyByteArray(bytearray):

       def __reduce_ex__(self, protocol):
           if protocol >= 5:
               return type(self)._reconstruct, (PickleBuffer(self),), None
           else:
               # PickleBuffer is forbidden with pickle protocols <= 4.
               return type(self)._reconstruct, (bytearray(self),)

       @classmethod
       def _reconstruct(cls, obj):
           with memoryview(obj) as m:
               # Get a handle over the original buffer object
               obj = m.obj
               if type(obj) is cls:
                   # Original buffer object is a ZeroCopyByteArray, return it
                   # as-is.
                   return obj
               else:
                   return cls(obj)
1006
1007The reconstructor (the ``_reconstruct`` class method) returns the buffer's
1008providing object if it has the right type. This is an easy way to simulate
1009zero-copy behaviour on this toy example.
1010
1011On the consumer side, we can pickle those objects the usual way, which
1012when unserialized will give us a copy of the original object::
1013
   b = ZeroCopyByteArray(b"abc")
   data = pickle.dumps(b, protocol=5)
   new_b = pickle.loads(data)
   print(b == new_b)  # True
   print(b is new_b)  # False: a copy was made
1019
1020But if we pass a *buffer_callback* and then give back the accumulated
1021buffers when unserializing, we are able to get back the original object::
1022
   b = ZeroCopyByteArray(b"abc")
   buffers = []
   data = pickle.dumps(b, protocol=5, buffer_callback=buffers.append)
   new_b = pickle.loads(data, buffers=buffers)
   print(b == new_b)  # True
   print(b is new_b)  # True: no copy was made
1029
1030This example is limited by the fact that :class:`bytearray` allocates its
1031own memory: you cannot create a :class:`bytearray` instance that is backed
1032by another object's memory. However, third-party datatypes such as NumPy
1033arrays do not have this limitation, and allow use of zero-copy pickling
1034(or making as few copies as possible) when transferring between distinct
1035processes or systems.
1036
1037.. seealso:: :pep:`574` -- Pickle protocol 5 with out-of-band data
1038
1039
Alexandre Vassalottid0392862008-10-24 01:32:40 +00001040.. _pickle-restrict:
Georg Brandl116aa622007-08-15 14:28:22 +00001041
Alexandre Vassalottid0392862008-10-24 01:32:40 +00001042Restricting Globals
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +00001043-------------------
Georg Brandl116aa622007-08-15 14:28:22 +00001044
Christian Heimes05e8be12008-02-23 18:30:17 +00001045.. index::
Alexandre Vassalottid0392862008-10-24 01:32:40 +00001046 single: find_class() (pickle protocol)
Christian Heimes05e8be12008-02-23 18:30:17 +00001047
Alexandre Vassalottid0392862008-10-24 01:32:40 +00001048By default, unpickling will import any class or function that it finds in the
1049pickle data. For many applications, this behaviour is unacceptable as it
1050permits the unpickler to import and invoke arbitrary code. Just consider what
1051this hand-crafted pickle data stream does when loaded::
Georg Brandl116aa622007-08-15 14:28:22 +00001052
   >>> import pickle
   >>> pickle.loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
   hello world
   0
Georg Brandl116aa622007-08-15 14:28:22 +00001057
In this example, the unpickler imports the :func:`os.system` function and then
applies the string argument "echo hello world". Although this example is
inoffensive, it is not difficult to imagine one that could damage your system.
Georg Brandl116aa622007-08-15 14:28:22 +00001061
For this reason, you may want to control what gets unpickled by customizing
:meth:`Unpickler.find_class`. Contrary to what its name suggests,
:meth:`Unpickler.find_class` is called whenever a global (i.e., a class or
a function) is requested. Thus it is possible to either completely forbid
globals or restrict them to a safe subset.
Alexandre Vassalottid0392862008-10-24 01:32:40 +00001067
Here is an example of an unpickler allowing only a few safe classes from the
:mod:`builtins` module to be loaded::
1070
   import builtins
   import io
   import pickle

   safe_builtins = {
       'range',
       'complex',
       'set',
       'frozenset',
       'slice',
   }

   class RestrictedUnpickler(pickle.Unpickler):

       def find_class(self, module, name):
           # Only allow safe classes from builtins.
           if module == "builtins" and name in safe_builtins:
               return getattr(builtins, name)
           # Forbid everything else.
           raise pickle.UnpicklingError("global '%s.%s' is forbidden" %
                                        (module, name))

   def restricted_loads(s):
       """Helper function analogous to pickle.loads()."""
       return RestrictedUnpickler(io.BytesIO(s)).load()
1096
A sample usage of our unpickler working as intended::
1098
   >>> restricted_loads(pickle.dumps([1, 2, range(15)]))
   [1, 2, range(0, 15)]
   >>> restricted_loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
   Traceback (most recent call last):
     ...
   pickle.UnpicklingError: global 'os.system' is forbidden
   >>> restricted_loads(b'cbuiltins\neval\n'
   ...                  b'(S\'getattr(__import__("os"), "system")'
   ...                  b'("echo hello world")\'\ntR.')
   Traceback (most recent call last):
     ...
   pickle.UnpicklingError: global 'builtins.eval' is forbidden
1111
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +00001112
1113.. XXX Add note about how extension codes could evade our protection
Georg Brandl48310cd2009-01-03 21:18:54 +00001114 mechanism (e.g. cached classes do not invokes find_class()).
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +00001115
As our examples show, you have to be careful with what you allow to be
unpickled. Therefore if security is a concern, you may want to consider
alternatives such as the marshalling API in :mod:`xmlrpc.client` or
third-party solutions.
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +00001120
Georg Brandl116aa622007-08-15 14:28:22 +00001121
Antoine Pitroud4d60552013-12-07 00:56:59 +01001122Performance
1123-----------
1124
1125Recent versions of the pickle protocol (from protocol 2 and upwards) feature
1126efficient binary encodings for several common features and built-in types.
1127Also, the :mod:`pickle` module has a transparent optimizer written in C.
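
A rough way to observe the effect of the binary encodings is to compare payload
sizes across protocols. This is only a sketch; the sample data is arbitrary and
the exact sizes depend on the Python version::

   import pickle

   data = list(range(1000))
   size_by_protocol = {
       proto: len(pickle.dumps(data, protocol=proto))
       for proto in range(pickle.HIGHEST_PROTOCOL + 1)
   }
   # For this data, the text-based protocol 0 yields a larger payload than
   # the binary protocol 2.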
1128
1129
Georg Brandl116aa622007-08-15 14:28:22 +00001130.. _pickle-example:
1131
Alexandre Vassalotti9d7665d2009-04-03 06:13:29 +00001132Examples
1133--------
Georg Brandl116aa622007-08-15 14:28:22 +00001134
Alexandre Vassalotti9d7665d2009-04-03 06:13:29 +00001135For the simplest code, use the :func:`dump` and :func:`load` functions. ::
Georg Brandl116aa622007-08-15 14:28:22 +00001136
   import pickle

   # An arbitrary collection of objects supported by pickle.
   data = {
       'a': [1, 2.0, 3, 4+6j],
       'b': ("character string", b"byte string"),
       'c': {None, True, False}
   }

   with open('data.pickle', 'wb') as f:
       # Pickle the 'data' dictionary using the highest protocol available.
       pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)
Georg Brandl116aa622007-08-15 14:28:22 +00001149
Georg Brandl116aa622007-08-15 14:28:22 +00001150
Alexandre Vassalottibcd1e3a2009-01-23 05:28:16 +00001151The following example reads the resulting pickled data. ::
Georg Brandl116aa622007-08-15 14:28:22 +00001152
   import pickle

   with open('data.pickle', 'rb') as f:
       # The protocol version used is detected automatically, so we do not
       # have to specify it.
       data = pickle.load(f)
Georg Brandl116aa622007-08-15 14:28:22 +00001159
Georg Brandl116aa622007-08-15 14:28:22 +00001160
.. XXX: Add examples showing how to optimize pickles for size (like using
.. pickletools.optimize() or the gzip module).

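One possible way to reduce the size of a pickle is sketched below, combining
:func:`pickletools.optimize` with the :mod:`gzip` module. The sample data and
the variable names are assumptions made for this illustration, and the savings
depend heavily on the data being pickled.

```python
import gzip
import pickle
import pickletools

# Hypothetical sample data with many repeated values, so it compresses well.
records = [{"id": i, "tag": "example"} for i in range(1000)]

raw = pickle.dumps(records, protocol=pickle.HIGHEST_PROTOCOL)

# Drop memo entries that are never referenced; the optimized pickle
# still unpickles to an equal object.
optimized = pickletools.optimize(raw)

# Compress the optimized pickle. It must be decompressed before loading.
compressed = gzip.compress(optimized)

print(len(raw), len(optimized), len(compressed))

restored = pickle.loads(gzip.decompress(compressed))
print(restored == records)
```

Compression trades CPU time for space, so it is most useful for pickles that
are stored or transmitted rather than immediately reloaded. The usual security
caveat still applies: only decompress and unpickle data you trust.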

.. seealso::

   Module :mod:`copyreg`
      Pickle interface constructor registration for extension types.

   Module :mod:`pickletools`
      Tools for working with and analyzing pickled data.

   Module :mod:`shelve`
      Indexed databases of objects; uses :mod:`pickle`.

   Module :mod:`copy`
      Shallow and deep object copying.

   Module :mod:`marshal`
      High-performance serialization of built-in types.


.. rubric:: Footnotes

.. [#] Don't confuse this with the :mod:`marshal` module.

.. [#] This is why :keyword:`lambda` functions cannot be pickled: all
   :keyword:`!lambda` functions share the same name, ``<lambda>``.

.. [#] The exception raised will likely be an :exc:`ImportError` or an
   :exc:`AttributeError`, but it could be something else.

.. [#] The :mod:`copy` module uses this protocol for shallow and deep copying
   operations.

.. [#] The limitation on alphanumeric characters is due to the fact
   that persistent IDs, in protocol 0, are delimited by the newline
   character. Therefore, if any kind of newline character occurs in
   persistent IDs, the resulting pickle will become unreadable.