:mod:`pickle` --- Python object serialization
=============================================

.. module:: pickle
   :synopsis: Convert Python objects to streams of bytes and back.

.. sectionauthor:: Jim Kerr <jbkerr@sr.hp.com>.
.. sectionauthor:: Barry Warsaw <barry@python.org>

**Source code:** :source:`Lib/pickle.py`

.. index::
   single: persistence
   pair: persistent; objects
   pair: serializing; objects
   pair: marshalling; objects
   pair: flattening; objects
   pair: pickling; objects

--------------

The :mod:`pickle` module implements binary protocols for serializing and
de-serializing a Python object structure. *"Pickling"* is the process
whereby a Python object hierarchy is converted into a byte stream, and
*"unpickling"* is the inverse operation, whereby a byte stream
(from a :term:`binary file` or :term:`bytes-like object`) is converted
back into an object hierarchy. Pickling (and unpickling) is alternatively
known as "serialization", "marshalling," [#]_ or "flattening"; however, to
avoid confusion, the terms used here are "pickling" and "unpickling".

.. warning::

   The ``pickle`` module **is not secure**. Only unpickle data you trust.

   It is possible to construct malicious pickle data which will **execute
   arbitrary code during unpickling**. Never unpickle data that could have come
   from an untrusted source, or that could have been tampered with.

   Consider signing data with :mod:`hmac` if you need to ensure that it has not
   been tampered with.

   Safer serialization formats such as :mod:`json` may be more appropriate if
   you are processing untrusted data. See :ref:`comparison-with-json`.

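As a minimal sketch of the signing advice in the warning above, the pickled
payload can be prefixed with an HMAC digest that is checked before anything is
unpickled. The names ``SECRET_KEY``, ``sign_pickle`` and ``load_signed_pickle``
are hypothetical and not part of :mod:`pickle`; the sketch assumes both sides
share the key and that HMAC-SHA256 is an acceptable choice.

```python
import hashlib
import hmac
import pickle

# Hypothetical shared secret; in practice, load it from secure configuration.
SECRET_KEY = b"replace-with-a-real-secret"


def sign_pickle(obj):
    """Return the HMAC-SHA256 digest (32 bytes) followed by the pickle."""
    payload = pickle.dumps(obj)
    digest = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return digest + payload


def load_signed_pickle(blob):
    """Verify the digest and unpickle only if it matches."""
    digest, payload = blob[:32], blob[32:]
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    # compare_digest() avoids leaking information through timing.
    if not hmac.compare_digest(digest, expected):
        raise ValueError("signature mismatch: refusing to unpickle")
    return pickle.loads(payload)
```

Signing only detects tampering by parties who do not hold the key; it does not
make unpickling safe for data produced by an untrusted key holder.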

Relationship to other Python modules
------------------------------------

Comparison with ``marshal``
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Python has a more primitive serialization module called :mod:`marshal`, but in
general :mod:`pickle` should always be the preferred way to serialize Python
objects. :mod:`marshal` exists primarily to support Python's :file:`.pyc`
files.

The :mod:`pickle` module differs from :mod:`marshal` in several significant ways:

* The :mod:`pickle` module keeps track of the objects it has already serialized,
  so that later references to the same object won't be serialized again.
  :mod:`marshal` doesn't do this.

  This has implications both for recursive objects and object sharing. Recursive
  objects are objects that contain references to themselves. These are not
  handled by marshal, and in fact, attempting to marshal recursive objects will
  crash your Python interpreter. Object sharing happens when there are multiple
  references to the same object in different places in the object hierarchy being
  serialized. :mod:`pickle` stores such objects only once, and ensures that all
  other references point to the master copy. Shared objects remain shared, which
  can be very important for mutable objects.

* :mod:`marshal` cannot be used to serialize user-defined classes and their
  instances. :mod:`pickle` can save and restore class instances transparently;
  however, the class definition must be importable and live in the same module as
  when the object was stored.

* The :mod:`marshal` serialization format is not guaranteed to be portable
  across Python versions. Because its primary job in life is to support
  :file:`.pyc` files, the Python implementers reserve the right to change the
  serialization format in non-backwards compatible ways should the need arise.
  The :mod:`pickle` serialization format is guaranteed to be backwards compatible
  across Python releases provided a compatible pickle protocol is chosen and
  pickling and unpickling code deals with Python 2 to Python 3 type differences
  if your data crosses that language boundary.

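The object-tracking behaviour described in the first point above can be
observed directly. The following sketch uses only built-in containers; the
variable names are illustrative.

```python
import pickle

shared = [1, 2, 3]
container = {"a": shared, "b": shared}   # two references to one list

recursive = []
recursive.append(recursive)              # a list that contains itself

restored = pickle.loads(pickle.dumps(container))
restored_recursive = pickle.loads(pickle.dumps(recursive))

# Sharing survives the round trip: both keys still refer to one list object,
# so a mutation through one key is visible through the other.
restored["a"].append(4)
print(restored["b"])                                  # [1, 2, 3, 4]
print(restored["a"] is restored["b"])                 # True

# The self-reference is rebuilt as well.
print(restored_recursive[0] is restored_recursive)    # True
```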

.. _comparison-with-json:

Comparison with ``json``
^^^^^^^^^^^^^^^^^^^^^^^^

There are fundamental differences between the pickle protocols and
`JSON (JavaScript Object Notation) <http://json.org>`_:

* JSON is a text serialization format (it outputs unicode text, although
  most of the time it is then encoded to ``utf-8``), while pickle is
  a binary serialization format;

* JSON is human-readable, while pickle is not;

* JSON is interoperable and widely used outside of the Python ecosystem,
  while pickle is Python-specific;

* JSON, by default, can only represent a subset of the Python built-in
  types, and no custom classes; pickle can represent an extremely large
  number of Python types (many of them automatically, by clever usage
  of Python's introspection facilities; complex cases can be tackled by
  implementing :ref:`specific object APIs <pickle-inst>`);

* Unlike pickle, deserializing untrusted JSON does not in itself create an
  arbitrary code execution vulnerability.

.. seealso::
   The :mod:`json` module: a standard library module allowing JSON
   serialization and deserialization.

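A short illustration of the type-coverage difference listed above, using a
made-up ``record`` dictionary: pickle preserves sets, bytes and tuples, while
the default JSON encoder rejects or converts them.

```python
import json
import pickle

record = {"tags": {"a", "b"}, "raw": b"\x00\x01", "point": (1, 2)}

# pickle round-trips sets, bytes and tuples unchanged.
assert pickle.loads(pickle.dumps(record)) == record

# json rejects types outside its data model (here: set and bytes) ...
try:
    json.dumps(record)
except TypeError as exc:
    print("json failed:", exc)

# ... and silently turns tuples into lists.
print(json.loads(json.dumps({"point": (1, 2)})))   # {'point': [1, 2]}
```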

.. _pickle-protocols:

Data stream format
------------------

.. index::
   single: External Data Representation

The data format used by :mod:`pickle` is Python-specific. This has the
advantage that there are no restrictions imposed by external standards such as
JSON or XDR (which can't represent pointer sharing); however, it means that
non-Python programs may not be able to reconstruct pickled Python objects.

By default, the :mod:`pickle` data format uses a relatively compact binary
representation. If you need optimal size characteristics, you can efficiently
:doc:`compress <archiving>` pickled data.

The module :mod:`pickletools` contains tools for analyzing data streams
generated by :mod:`pickle`. :mod:`pickletools` source code has extensive
comments about opcodes used by pickle protocols.

There are currently 6 different protocols which can be used for pickling.
The higher the protocol used, the more recent the version of Python needed
to read the pickle produced.

* Protocol version 0 is the original "human-readable" protocol and is
  backwards compatible with earlier versions of Python.

* Protocol version 1 is an old binary format which is also compatible with
  earlier versions of Python.

* Protocol version 2 was introduced in Python 2.3. It provides much more
  efficient pickling of :term:`new-style class`\es. Refer to :pep:`307` for
  information about improvements brought by protocol 2.

* Protocol version 3 was added in Python 3.0. It has explicit support for
  :class:`bytes` objects and cannot be unpickled by Python 2.x. This was
  the default protocol in Python 3.0--3.7.

* Protocol version 4 was added in Python 3.4. It adds support for very large
  objects, pickling more kinds of objects, and some data format
  optimizations. It is the default protocol starting with Python 3.8.
  Refer to :pep:`3154` for information about improvements brought by
  protocol 4.

* Protocol version 5 was added in Python 3.8. It adds support for out-of-band
  data and a speedup for in-band data. Refer to :pep:`574` for information about
  improvements brought by protocol 5.

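The protocols listed above can be compared with a short sketch. It assumes
Python 3.8 or later (so that protocol 5 is available) and uses an arbitrary
sample dictionary; the sizes printed will vary with the data.

```python
import pickle
import pickletools

data = {"name": "spam", "values": [1, 2, 3]}

for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    payload = pickle.dumps(data, protocol=protocol)
    # Every protocol round-trips the same object; only the encoding differs.
    assert pickle.loads(payload) == data
    print(protocol, len(payload))

# Protocol 0 emits only ASCII bytes; protocols 2 and later start with a
# PROTO opcode (b"\x80") followed by the protocol number.
assert pickle.dumps(data, protocol=0).isascii()
assert pickle.dumps(data, protocol=2)[:2] == b"\x80\x02"

# pickletools.dis() prints a symbolic disassembly of the opcode stream.
pickletools.dis(pickle.dumps(data, protocol=2))
```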
.. note::
   Serialization is a more primitive notion than persistence; although
   :mod:`pickle` reads and writes file objects, it does not handle the issue of
   naming persistent objects, nor the (even more complicated) issue of concurrent
   access to persistent objects. The :mod:`pickle` module can transform a complex
   object into a byte stream and it can transform the byte stream into an object
   with the same internal structure. Perhaps the most obvious thing to do with
   these byte streams is to write them onto a file, but it is also conceivable to
   send them across a network or store them in a database. The :mod:`shelve`
   module provides a simple interface to pickle and unpickle objects on
   DBM-style database files.


Module Interface
----------------

To serialize an object hierarchy, you simply call the :func:`dumps` function.
Similarly, to de-serialize a data stream, you call the :func:`loads` function.
However, if you want more control over serialization and de-serialization,
you can create a :class:`Pickler` or an :class:`Unpickler` object, respectively.

The :mod:`pickle` module provides the following constants:


.. data:: HIGHEST_PROTOCOL

   An integer, the highest :ref:`protocol version <pickle-protocols>`
   available. This value can be passed as a *protocol* value to functions
   :func:`dump` and :func:`dumps` as well as the :class:`Pickler`
   constructor.

.. data:: DEFAULT_PROTOCOL

   An integer, the default :ref:`protocol version <pickle-protocols>` used
   for pickling. May be less than :data:`HIGHEST_PROTOCOL`. Currently the
   default protocol is 4, first introduced in Python 3.4 and incompatible
   with previous versions.

   .. versionchanged:: 3.0

      The default protocol is 3.

   .. versionchanged:: 3.8

      The default protocol is 4.

The :mod:`pickle` module provides the following functions to make the pickling
process more convenient:

.. function:: dump(obj, file, protocol=None, \*, fix_imports=True, buffer_callback=None)

   Write the pickled representation of the object *obj* to the open
   :term:`file object` *file*. This is equivalent to
   ``Pickler(file, protocol).dump(obj)``.

   Arguments *file*, *protocol*, *fix_imports* and *buffer_callback* have
   the same meaning as in the :class:`Pickler` constructor.

   .. versionchanged:: 3.8
      The *buffer_callback* argument was added.

.. function:: dumps(obj, protocol=None, \*, fix_imports=True, buffer_callback=None)

   Return the pickled representation of the object *obj* as a :class:`bytes` object,
   instead of writing it to a file.

   Arguments *protocol*, *fix_imports* and *buffer_callback* have the same
   meaning as in the :class:`Pickler` constructor.

   .. versionchanged:: 3.8
      The *buffer_callback* argument was added.

.. function:: load(file, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)

   Read the pickled representation of an object from the open :term:`file object`
   *file* and return the reconstituted object hierarchy specified therein.
   This is equivalent to ``Unpickler(file).load()``.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed. Bytes past the pickled representation
   of the object are ignored.

   Arguments *file*, *fix_imports*, *encoding*, *errors* and *buffers*
   have the same meaning as in the :class:`Unpickler` constructor.

   .. versionchanged:: 3.8
      The *buffers* argument was added.

.. function:: loads(bytes_object, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)

   Return the reconstituted object hierarchy of the pickled representation
   *bytes_object* of an object.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed. Bytes past the pickled representation
   of the object are ignored.

   Arguments *fix_imports*, *encoding*, *errors* and *buffers*
   have the same meaning as in the :class:`Unpickler` constructor.

   .. versionchanged:: 3.8
      The *buffers* argument was added.

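The four functions above can be exercised together in a short sketch. An
:class:`io.BytesIO` object stands in for a file opened in binary mode, and the
sample objects are arbitrary.

```python
import io
import pickle

first = {"id": 1, "tags": ("a", "b")}
second = [3.14, None, b"raw"]

# dumps()/loads() work on bytes objects held in memory.
payload = pickle.dumps(first)
assert pickle.loads(payload) == first

# Bytes past the end of the pickled object are ignored by loads().
assert pickle.loads(payload + b"trailing junk") == first

# dump()/load() work on binary file objects; io.BytesIO stands in for a
# file opened with open(path, "wb") or open(path, "rb").
stream = io.BytesIO()
pickle.dump(first, stream)
pickle.dump(second, stream)

# Each load() call reads one pickled object and stops after it.
stream.seek(0)
assert pickle.load(stream) == first
assert pickle.load(stream) == second
```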

The :mod:`pickle` module defines three exceptions:

.. exception:: PickleError

   Common base class for the other pickling exceptions. It inherits
   :exc:`Exception`.

.. exception:: PicklingError

   Error raised when an unpicklable object is encountered by :class:`Pickler`.
   It inherits :exc:`PickleError`.

   Refer to :ref:`pickle-picklable` to learn what kinds of objects can be
   pickled.

.. exception:: UnpicklingError

   Error raised when there is a problem unpickling an object, such as data
   corruption or a security violation. It inherits :exc:`PickleError`.

   Note that other exceptions may also be raised during unpickling, including
   (but not necessarily limited to) :exc:`AttributeError`, :exc:`EOFError`,
   :exc:`ImportError`, and :exc:`IndexError`.

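The following sketch shows how these exceptions typically surface. The helper
``describe_failure`` is hypothetical, and the exact exception raised for an
unpicklable object can differ between Python versions, which is why the last
handler catches more than one type.

```python
import pickle


def describe_failure(blob):
    """Return the name of the exception raised while unpickling *blob*."""
    try:
        pickle.loads(blob)
    except pickle.UnpicklingError:
        return "UnpicklingError"
    except EOFError:
        return "EOFError"
    return None


# 0xff is not a valid pickle opcode, so the stream is rejected as corrupt.
print(describe_failure(b"\xff"))
# An empty stream ends before any object has been read.
print(describe_failure(b""))

# Pickling failures: a lambda cannot be looked up by name in its module.
try:
    pickle.dumps(lambda x: x)
except (pickle.PicklingError, AttributeError) as exc:
    # Depending on the Python version and where the lambda is defined,
    # either exception type may be raised.
    print(type(exc).__name__, exc)
```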

The :mod:`pickle` module exports three classes, :class:`Pickler`,
:class:`Unpickler` and :class:`PickleBuffer`:

.. class:: Pickler(file, protocol=None, \*, fix_imports=True, buffer_callback=None)

   This takes a binary file for writing a pickle data stream.

   The optional *protocol* argument, an integer, tells the pickler to use
   the given protocol; supported protocols are 0 to :data:`HIGHEST_PROTOCOL`.
   If not specified, the default is :data:`DEFAULT_PROTOCOL`. If a negative
   number is specified, :data:`HIGHEST_PROTOCOL` is selected.

   The *file* argument must have a write() method that accepts a single bytes
   argument. It can thus be an on-disk file opened for binary writing, an
   :class:`io.BytesIO` instance, or any other custom object that meets this
   interface.

   If *fix_imports* is true and *protocol* is less than 3, pickle will try to
   map the new Python 3 names to the old module names used in Python 2, so
   that the pickle data stream is readable with Python 2.

   If *buffer_callback* is None (the default), buffer views are
   serialized into *file* as part of the pickle stream.

   If *buffer_callback* is not None, then it can be called any number
   of times with a buffer view. If the callback returns a false value
   (such as None), the given buffer is :ref:`out-of-band <pickle-oob>`;
   otherwise the buffer is serialized in-band, i.e. inside the pickle stream.

   It is an error if *buffer_callback* is not None and *protocol* is
   None or smaller than 5.

   .. versionchanged:: 3.8
      The *buffer_callback* argument was added.

   .. method:: dump(obj)

      Write the pickled representation of *obj* to the open file object given in
      the constructor.

   .. method:: persistent_id(obj)

      Do nothing by default. This exists so a subclass can override it.

      If :meth:`persistent_id` returns ``None``, *obj* is pickled as usual. Any
      other value causes :class:`Pickler` to emit the returned value as a
      persistent ID for *obj*. The meaning of this persistent ID should be
      defined by :meth:`Unpickler.persistent_load`. Note that the value
      returned by :meth:`persistent_id` cannot itself have a persistent ID.

      See :ref:`pickle-persistent` for details and examples of uses.

   .. attribute:: dispatch_table

      A pickler object's dispatch table is a registry of *reduction
      functions* of the kind which can be declared using
      :func:`copyreg.pickle`. It is a mapping whose keys are classes
      and whose values are reduction functions. A reduction function
      takes a single argument of the associated class and should
      conform to the same interface as a :meth:`__reduce__`
      method.

      By default, a pickler object will not have a
      :attr:`dispatch_table` attribute, and it will instead use the
      global dispatch table managed by the :mod:`copyreg` module.
      However, to customize the pickling for a specific pickler object
      one can set the :attr:`dispatch_table` attribute to a dict-like
      object. Alternatively, if a subclass of :class:`Pickler` has a
      :attr:`dispatch_table` attribute then this will be used as the
      default dispatch table for instances of that class.

      See :ref:`pickle-dispatch` for usage examples.

      .. versionadded:: 3.3

   .. method:: reducer_override(obj)

      Special reducer that can be defined in :class:`Pickler` subclasses. This
      method has priority over any reducer in the :attr:`dispatch_table`. It
      should conform to the same interface as a :meth:`__reduce__` method, and
      can optionally return ``NotImplemented`` to fall back on
      :attr:`dispatch_table`-registered reducers to pickle ``obj``.

      For a detailed example, see :ref:`reducer_override`.

      .. versionadded:: 3.8

   .. attribute:: fast

      Deprecated. Enable fast mode if set to a true value. The fast mode
      disables the usage of memo, therefore speeding up the pickling process by
      not generating superfluous PUT opcodes. It should not be used with
      self-referential objects; doing otherwise will cause :class:`Pickler` to
      recurse infinitely.

      Use :func:`pickletools.optimize` if you need more compact pickles.

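As a rough sketch of a per-pickler ``dispatch_table``, the example below
registers a reduction function for a made-up ``Celsius`` class. Both
``Celsius`` and ``reduce_celsius`` are hypothetical names chosen for
illustration; the reduction deliberately stores each value as a plain
:class:`float`, so the class itself is not needed when loading.

```python
import io
import pickle


class Celsius:
    """Hypothetical value type, used only for this illustration."""

    def __init__(self, degrees):
        self.degrees = degrees


def reduce_celsius(obj):
    # A reduction function returns (callable, args), like __reduce__.
    # Here the value is deliberately stored as a plain float.
    return float, (obj.degrees,)


stream = io.BytesIO()
# A negative protocol selects pickle.HIGHEST_PROTOCOL.
pickler = pickle.Pickler(stream, protocol=-1)
# The private table affects only this Pickler, not the global copyreg table.
pickler.dispatch_table = {Celsius: reduce_celsius}
pickler.dump([Celsius(21.5), Celsius(-3.0)])

stream.seek(0)
print(pickle.Unpickler(stream).load())   # [21.5, -3.0]
```

Because the table is set on one pickler instance, other picklers and the
module-level functions keep using the global :mod:`copyreg` table.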

.. class:: Unpickler(file, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)

   This takes a binary file for reading a pickle data stream.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed.

   The argument *file* must have three methods, a read() method that takes an
   integer argument, a readinto() method that takes a buffer argument
   and a readline() method that requires no arguments, as in the
   :class:`io.BufferedIOBase` interface. Thus *file* can be an on-disk file
   opened for binary reading, an :class:`io.BytesIO` object, or any other
   custom object that meets this interface.

   The optional arguments *fix_imports*, *encoding* and *errors* are used
   to control compatibility support for pickle streams generated by Python 2.
   If *fix_imports* is true, pickle will try to map the old Python 2 names
   to the new names used in Python 3. The *encoding* and *errors* tell
   pickle how to decode 8-bit string instances pickled by Python 2;
   these default to 'ASCII' and 'strict', respectively. The *encoding* can
   be 'bytes' to read these 8-bit string instances as bytes objects.
   Using ``encoding='latin1'`` is required for unpickling NumPy arrays and
   instances of :class:`~datetime.datetime`, :class:`~datetime.date` and
   :class:`~datetime.time` pickled by Python 2.

   If *buffers* is None (the default), then all data necessary for
   deserialization must be contained in the pickle stream. This means
   that the *buffer_callback* argument was None when a :class:`Pickler`
   was instantiated (or when :func:`dump` or :func:`dumps` was called).

   If *buffers* is not None, it should be an iterable of buffer-enabled
   objects that is consumed each time the pickle stream references
   an :ref:`out-of-band <pickle-oob>` buffer view. Such buffers have been
   given in order to the *buffer_callback* of a Pickler object.

   .. versionchanged:: 3.8
      The *buffers* argument was added.

   .. method:: load()

      Read the pickled representation of an object from the open file object
      given in the constructor, and return the reconstituted object hierarchy
      specified therein. Bytes past the pickled representation of the object
      are ignored.

   .. method:: persistent_load(pid)

      Raise an :exc:`UnpicklingError` by default.

      If defined, :meth:`persistent_load` should return the object specified by
      the persistent ID *pid*. If an invalid persistent ID is encountered, an
      :exc:`UnpicklingError` should be raised.

      See :ref:`pickle-persistent` for details and examples of uses.

   .. method:: find_class(module, name)

      Import *module* if necessary and return the object called *name* from it,
      where the *module* and *name* arguments are :class:`str` objects. Note,
      unlike its name suggests, :meth:`find_class` is also used for finding
      functions.

      Subclasses may override this to gain control over what type of objects and
      how they can be loaded, potentially reducing security risks. Refer to
      :ref:`pickle-restrict` for details.

      .. audit-event:: pickle.find_class module,name pickle.Unpickler.find_class

.. class:: PickleBuffer(buffer)

   A wrapper for a buffer representing picklable data. *buffer* must be a
   :ref:`buffer-providing <bufferobjects>` object, such as a
   :term:`bytes-like object` or an N-dimensional array.

   :class:`PickleBuffer` is itself a buffer provider, therefore it is
   possible to pass it to other APIs expecting a buffer-providing object,
   such as :class:`memoryview`.

   :class:`PickleBuffer` objects can only be serialized using pickle
   protocol 5 or higher. They are eligible for
   :ref:`out-of-band serialization <pickle-oob>`.

   .. versionadded:: 3.8

   .. method:: raw()

      Return a :class:`memoryview` of the memory area underlying this buffer.
      The returned object is a one-dimensional, C-contiguous memoryview
      with format ``B`` (unsigned bytes). :exc:`BufferError` is raised if
      the buffer is neither C- nor Fortran-contiguous.

   .. method:: release()

      Release the underlying buffer exposed by the PickleBuffer object.

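A minimal sketch of out-of-band serialization with :class:`PickleBuffer`,
assuming Python 3.8 or later. The buffer size and variable names are
arbitrary; ``list.append`` serves as the *buffer_callback* because it returns
``None``, which marks each buffer as out-of-band.

```python
import pickle

data = bytearray(b"\x00" * 1_000_000)   # a large, writable buffer

# The callback receives each PickleBuffer. list.append returns None (a false
# value), so every buffer is kept out of the pickle stream.
out_of_band = []
payload = pickle.dumps(pickle.PickleBuffer(data), protocol=5,
                       buffer_callback=out_of_band.append)

print(len(payload))        # tiny: the stream only marks where the buffer goes
print(len(out_of_band))    # 1

# The consumer must supply the same buffers, in the same order.
restored = pickle.loads(payload, buffers=out_of_band)
assert memoryview(restored).tobytes() == bytes(data)

# raw() exposes the underlying memory as a flat view of unsigned bytes.
view = pickle.PickleBuffer(data).raw()
print(view.format, view.ndim, view.nbytes)   # B 1 1000000
```

How the out-of-band buffers travel to the consumer (shared memory, a separate
socket message, and so on) is up to the application.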

.. _pickle-picklable:

What can be pickled and unpickled?
----------------------------------

The following types can be pickled:

* ``None``, ``True``, and ``False``

* integers, floating point numbers, complex numbers

* strings, bytes, bytearrays

* tuples, lists, sets, and dictionaries containing only picklable objects

* functions defined at the top level of a module (using :keyword:`def`, not
  :keyword:`lambda`)

* built-in functions defined at the top level of a module

* classes that are defined at the top level of a module

* instances of such classes whose :attr:`~object.__dict__` or the result of
  calling :meth:`__getstate__` is picklable (see section :ref:`pickle-inst` for
  details).

Attempts to pickle unpicklable objects will raise the :exc:`PicklingError`
exception; when this happens, an unspecified number of bytes may have already
been written to the underlying file. Trying to pickle a highly recursive data
structure may exceed the maximum recursion depth; a :exc:`RecursionError` will be
raised in this case. You can carefully raise this limit with
:func:`sys.setrecursionlimit`.

Note that functions (built-in and user-defined) are pickled by "fully qualified"
name reference, not by value. [#]_ This means that only the function name is
pickled, along with the name of the module the function is defined in. Neither
the function's code, nor any of its function attributes are pickled. Thus the
defining module must be importable in the unpickling environment, and the module
must contain the named object, otherwise an exception will be raised. [#]_

530Similarly, classes are pickled by named reference, so the same restrictions in
531the unpickling environment apply. Note that none of the class's code or data is
532pickled, so in the following example the class attribute ``attr`` is not
533restored in the unpickling environment::
534
   class Foo:
       attr = 'A class attribute'

   picklestring = pickle.dumps(Foo)
539
540These restrictions are why picklable functions and classes must be defined in
541the top level of a module.
542
543Similarly, when class instances are pickled, their class's code and data are not
544pickled along with them. Only the instance data are pickled. This is done on
545purpose, so you can fix bugs in a class or add methods to the class and still
546load objects that were created with an earlier version of the class. If you
547plan to have long-lived objects that will see many versions of a class, it may
548be worthwhile to put a version number in the objects so that suitable
549conversions can be made by the class's :meth:`__setstate__` method.
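One possible way to carry a version number is sketched below. The ``Point`` class, the ``VERSION`` attribute, and the older ``coords`` layout are all hypothetical; they only illustrate how :meth:`__setstate__` might convert state saved by an earlier version of a class.

```python
import pickle

class Point:
    """Hypothetical class whose attribute layout changed between versions."""

    VERSION = 2

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __getstate__(self):
        # Store the current layout version alongside the instance data.
        state = self.__dict__.copy()
        state["version"] = self.VERSION
        return state

    def __setstate__(self, state):
        state = dict(state)
        # State without a version marker is assumed to be version 1.
        version = state.pop("version", 1)
        if version < 2:
            # Assumed old layout: coordinates stored as one tuple.
            state["x"], state["y"] = state.pop("coords")
        self.__dict__.update(state)

# Current objects round-trip unchanged.
new = pickle.loads(pickle.dumps(Point(3, 4)))

# Simulate restoring state written by the assumed version 1 layout.
legacy = Point.__new__(Point)
legacy.__setstate__({"coords": (1, 2)})
```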
550
551
Georg Brandl116aa622007-08-15 14:28:22 +0000552.. _pickle-inst:
553
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000554Pickling Class Instances
555------------------------
Georg Brandl116aa622007-08-15 14:28:22 +0000556
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300557.. currentmodule:: None
558
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000559In this section, we describe the general mechanisms available to you to define,
560customize, and control how class instances are pickled and unpickled.
Georg Brandl116aa622007-08-15 14:28:22 +0000561
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000562In most cases, no additional code is needed to make instances picklable. By
563default, pickle will retrieve the class and the attributes of an instance via
564introspection. When a class instance is unpickled, its :meth:`__init__` method
565is usually *not* invoked. The default behaviour first creates an uninitialized
566instance and then restores the saved attributes. The following code shows an
567implementation of this behaviour::
Georg Brandl85eb8c12007-08-31 16:33:38 +0000568
   def save(obj):
       return (obj.__class__, obj.__dict__)

   def load(cls, attributes):
       obj = cls.__new__(cls)
       obj.__dict__.update(attributes)
       return obj
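The fact that :meth:`__init__` is bypassed can be checked with a small experiment. The ``Tracked`` class and its counter are hypothetical names used only for this sketch.

```python
import pickle

class Tracked:
    init_calls = 0  # counts how often __init__ runs

    def __init__(self, value):
        type(self).init_calls += 1
        self.value = value

original = Tracked(10)
clone = pickle.loads(pickle.dumps(original))

# __init__ ran once, for the original only; the clone still has its attribute.
print(Tracked.init_calls)
print(clone.value)
```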
Georg Brandl116aa622007-08-15 14:28:22 +0000576
Georg Brandl6faee4e2010-09-21 14:48:28 +0000577Classes can alter the default behaviour by providing one or several special
Georg Brandlc8148262010-10-17 11:13:37 +0000578methods:
Georg Brandl116aa622007-08-15 14:28:22 +0000579
Antoine Pitrouc9dc4a22013-11-23 18:59:12 +0100580.. method:: object.__getnewargs_ex__()
581
   In protocols 2 and newer, classes that implement the
   :meth:`__getnewargs_ex__` method can dictate the values passed to the
   :meth:`__new__` method upon unpickling. The method must return a pair
   ``(args, kwargs)`` where *args* is a tuple of positional arguments
   and *kwargs* a dictionary of named arguments for constructing the
   object.
589
590 You should implement this method if the :meth:`__new__` method of your
591 class requires keyword-only arguments. Otherwise, it is recommended for
592 compatibility to implement :meth:`__getnewargs__`.
593
Serhiy Storchakab6d84832015-10-13 21:26:35 +0300594 .. versionchanged:: 3.6
595 :meth:`__getnewargs_ex__` is now used in protocols 2 and 3.
596
Antoine Pitrouc9dc4a22013-11-23 18:59:12 +0100597
Georg Brandlc8148262010-10-17 11:13:37 +0000598.. method:: object.__getnewargs__()
Georg Brandl116aa622007-08-15 14:28:22 +0000599
   This method serves a similar purpose to :meth:`__getnewargs_ex__`, but
   supports only positional arguments. It must return a tuple of arguments
   ``args`` which will be passed to the :meth:`__new__` method upon unpickling.
Antoine Pitrouc9dc4a22013-11-23 18:59:12 +0100603
Serhiy Storchakab6d84832015-10-13 21:26:35 +0300604 :meth:`__getnewargs__` will not be called if :meth:`__getnewargs_ex__` is
605 defined.
606
607 .. versionchanged:: 3.6
608 Before Python 3.6, :meth:`__getnewargs__` was called instead of
609 :meth:`__getnewargs_ex__` in protocols 2 and 3.
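The two hooks above might be used as follows. ``Pair`` and ``Tagged`` are hypothetical classes: the first has a :meth:`__new__` that takes positional arguments, the second requires a keyword-only argument. This is a sketch that relies on the default protocol being 2 or newer.

```python
import pickle

class Pair(tuple):
    """Tuple subclass whose __new__ takes two positional arguments."""

    def __new__(cls, first, second):
        return super().__new__(cls, (first, second))

    def __getnewargs__(self):
        # Passed to __new__ as positional arguments upon unpickling.
        return (self[0], self[1])

class Tagged(str):
    """String subclass whose __new__ requires a keyword-only argument."""

    def __new__(cls, text, *, tag):
        self = super().__new__(cls, text)
        self.tag = tag
        return self

    def __getnewargs_ex__(self):
        # A pair (args, kwargs) passed to __new__ upon unpickling.
        return (str(self),), {"tag": self.tag}

pair = pickle.loads(pickle.dumps(Pair(1, 2)))
tagged = pickle.loads(pickle.dumps(Tagged("hello", tag="greeting")))
```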
Georg Brandl116aa622007-08-15 14:28:22 +0000610
Georg Brandl116aa622007-08-15 14:28:22 +0000611
Georg Brandlc8148262010-10-17 11:13:37 +0000612.. method:: object.__getstate__()
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000613
Georg Brandlc8148262010-10-17 11:13:37 +0000614 Classes can further influence how their instances are pickled; if the class
615 defines the method :meth:`__getstate__`, it is called and the returned object
616 is pickled as the contents for the instance, instead of the contents of the
617 instance's dictionary. If the :meth:`__getstate__` method is absent, the
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300618 instance's :attr:`~object.__dict__` is pickled as usual.
Georg Brandl116aa622007-08-15 14:28:22 +0000619
Georg Brandlc8148262010-10-17 11:13:37 +0000620
621.. method:: object.__setstate__(state)
622
623 Upon unpickling, if the class defines :meth:`__setstate__`, it is called with
624 the unpickled state. In that case, there is no requirement for the state
625 object to be a dictionary. Otherwise, the pickled state must be a dictionary
626 and its items are assigned to the new instance's dictionary.
627
628 .. note::
629
630 If :meth:`__getstate__` returns a false value, the :meth:`__setstate__`
631 method will not be called upon unpickling.
632
Georg Brandl116aa622007-08-15 14:28:22 +0000633
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000634Refer to the section :ref:`pickle-state` for more information about how to use
635the methods :meth:`__getstate__` and :meth:`__setstate__`.
Georg Brandl116aa622007-08-15 14:28:22 +0000636
Benjamin Petersond23f8222009-04-05 19:13:16 +0000637.. note::
Georg Brandle720c0a2009-04-27 16:20:50 +0000638
Benjamin Petersond23f8222009-04-05 19:13:16 +0000639 At unpickling time, some methods like :meth:`__getattr__`,
640 :meth:`__getattribute__`, or :meth:`__setattr__` may be called upon the
Antoine Pitrouc9dc4a22013-11-23 18:59:12 +0100641 instance. In case those methods rely on some internal invariant being
642 true, the type should implement :meth:`__getnewargs__` or
643 :meth:`__getnewargs_ex__` to establish such an invariant; otherwise,
644 neither :meth:`__new__` nor :meth:`__init__` will be called.
Benjamin Petersond23f8222009-04-05 19:13:16 +0000645
Georg Brandlc8148262010-10-17 11:13:37 +0000646.. index:: pair: copy; protocol
Christian Heimes05e8be12008-02-23 18:30:17 +0000647
As we shall see, pickle does not directly use the methods described above. In
fact, these methods are part of the copy protocol, which implements the
:meth:`__reduce__` special method. The copy protocol provides a unified
interface for retrieving the data necessary for pickling and copying
objects. [#]_
Georg Brandl116aa622007-08-15 14:28:22 +0000653
Although powerful, implementing :meth:`__reduce__` directly in your classes is
error-prone. For this reason, class designers should use the high-level
interface (i.e., :meth:`__getnewargs_ex__`, :meth:`__getstate__` and
:meth:`__setstate__`) whenever possible. We will show, however, cases where
using :meth:`__reduce__` is the only option or leads to more efficient pickling
or both.
Georg Brandl116aa622007-08-15 14:28:22 +0000660
Georg Brandlc8148262010-10-17 11:13:37 +0000661.. method:: object.__reduce__()
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000662
Georg Brandlc8148262010-10-17 11:13:37 +0000663 The interface is currently defined as follows. The :meth:`__reduce__` method
664 takes no argument and shall return either a string or preferably a tuple (the
665 returned object is often referred to as the "reduce value").
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000666
Georg Brandlc8148262010-10-17 11:13:37 +0000667 If a string is returned, the string should be interpreted as the name of a
668 global variable. It should be the object's local name relative to its
669 module; the pickle module searches the module namespace to determine the
670 object's module. This behaviour is typically useful for singletons.
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000671
Pierre Glaser65d98d02019-05-08 21:40:25 +0200672 When a tuple is returned, it must be between two and six items long.
Georg Brandlc8148262010-10-17 11:13:37 +0000673 Optional items can either be omitted, or ``None`` can be provided as their
674 value. The semantics of each item are in order:
Georg Brandl116aa622007-08-15 14:28:22 +0000675
Georg Brandlc8148262010-10-17 11:13:37 +0000676 .. XXX Mention __newobj__ special-case?
Georg Brandl116aa622007-08-15 14:28:22 +0000677
Georg Brandlc8148262010-10-17 11:13:37 +0000678 * A callable object that will be called to create the initial version of the
679 object.
Georg Brandl116aa622007-08-15 14:28:22 +0000680
Georg Brandlc8148262010-10-17 11:13:37 +0000681 * A tuple of arguments for the callable object. An empty tuple must be given
682 if the callable does not accept any argument.
Georg Brandl116aa622007-08-15 14:28:22 +0000683
   * Optionally, the object's state, which will be passed to the object's
     :meth:`__setstate__` method as previously described. If the object has no
     such method, then the value must be a dictionary and it will be added to
     the object's :attr:`~object.__dict__` attribute.
Georg Brandl116aa622007-08-15 14:28:22 +0000688
Georg Brandlc8148262010-10-17 11:13:37 +0000689 * Optionally, an iterator (and not a sequence) yielding successive items.
690 These items will be appended to the object either using
691 ``obj.append(item)`` or, in batch, using ``obj.extend(list_of_items)``.
692 This is primarily used for list subclasses, but may be used by other
693 classes as long as they have :meth:`append` and :meth:`extend` methods with
694 the appropriate signature. (Whether :meth:`append` or :meth:`extend` is
695 used depends on which pickle protocol version is used as well as the number
696 of items to append, so both must be supported.)
Georg Brandl116aa622007-08-15 14:28:22 +0000697
Georg Brandlc8148262010-10-17 11:13:37 +0000698 * Optionally, an iterator (not a sequence) yielding successive key-value
699 pairs. These items will be stored to the object using ``obj[key] =
700 value``. This is primarily used for dictionary subclasses, but may be used
701 by other classes as long as they implement :meth:`__setitem__`.
Georg Brandl116aa622007-08-15 14:28:22 +0000702
Pierre Glaser65d98d02019-05-08 21:40:25 +0200703 * Optionally, a callable with a ``(obj, state)`` signature. This
Xtreak9b5a0ef2019-05-16 10:04:24 +0530704 callable allows the user to programmatically control the state-updating
Pierre Glaser65d98d02019-05-08 21:40:25 +0200705 behavior of a specific object, instead of using ``obj``'s static
706 :meth:`__setstate__` method. If not ``None``, this callable will have
707 priority over ``obj``'s :meth:`__setstate__`.
708
709 .. versionadded:: 3.8
710 The optional sixth tuple item, ``(obj, state)``, was added.
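A minimal sketch of a three-item reduce value follows. The ``Counter`` class is hypothetical; it holds a :class:`threading.Lock`, which cannot be pickled, so :meth:`__reduce__` asks the unpickler to call the class again and then restore the remaining state.

```python
import pickle
import threading

class Counter:
    def __init__(self, start, step=1):
        self.start = start
        self.step = step
        self.value = start
        self.lock = threading.Lock()  # locks cannot be pickled

    def __reduce__(self):
        # (callable, arguments for the callable, state)
        # The class has no __setstate__, so the state must be a dictionary;
        # it is merged into the new instance's __dict__.
        return (type(self), (self.start, self.step), {"value": self.value})

counter = Counter(10, step=5)
counter.value = 25

clone = pickle.loads(pickle.dumps(counter))
print(clone.start, clone.step, clone.value)
```

Because the callable here is the class itself, :meth:`__init__` does run on unpickling, which is how the clone gets a fresh lock.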
711
Georg Brandlc8148262010-10-17 11:13:37 +0000712
713.. method:: object.__reduce_ex__(protocol)
714
715 Alternatively, a :meth:`__reduce_ex__` method may be defined. The only
716 difference is this method should take a single integer argument, the protocol
717 version. When defined, pickle will prefer it over the :meth:`__reduce__`
718 method. In addition, :meth:`__reduce__` automatically becomes a synonym for
719 the extended version. The main use for this method is to provide
720 backwards-compatible reduce values for older Python releases.
Georg Brandl116aa622007-08-15 14:28:22 +0000721
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300722.. currentmodule:: pickle
723
Alexandre Vassalotti758bca62008-10-18 19:25:07 +0000724.. _pickle-persistent:
725
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000726Persistence of External Objects
727^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Georg Brandl116aa622007-08-15 14:28:22 +0000728
Christian Heimes05e8be12008-02-23 18:30:17 +0000729.. index::
730 single: persistent_id (pickle protocol)
731 single: persistent_load (pickle protocol)
732
Georg Brandl116aa622007-08-15 14:28:22 +0000733For the benefit of object persistence, the :mod:`pickle` module supports the
734notion of a reference to an object outside the pickled data stream. Such
Alexandre Vassalotti5f3b63a2008-10-18 20:47:58 +0000735objects are referenced by a persistent ID, which should be either a string of
736alphanumeric characters (for protocol 0) [#]_ or just an arbitrary object (for
737any newer protocol).
Georg Brandl116aa622007-08-15 14:28:22 +0000738
Alexandre Vassalotti5f3b63a2008-10-18 20:47:58 +0000739The resolution of such persistent IDs is not defined by the :mod:`pickle`
Géry Ogam362f5352019-08-07 07:02:23 +0200740module; it will delegate this resolution to the user-defined methods on the
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300741pickler and unpickler, :meth:`~Pickler.persistent_id` and
742:meth:`~Unpickler.persistent_load` respectively.
Georg Brandl116aa622007-08-15 14:28:22 +0000743
Géry Ogam362f5352019-08-07 07:02:23 +0200744To pickle objects that have an external persistent ID, the pickler must have a
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300745custom :meth:`~Pickler.persistent_id` method that takes an object as an
Géry Ogam362f5352019-08-07 07:02:23 +0200746argument and returns either ``None`` or the persistent ID for that object.
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300747When ``None`` is returned, the pickler simply pickles the object as normal.
When a persistent ID is returned, the pickler pickles that ID in place of the
object, along with a marker so that the unpickler will recognize it as a
persistent ID.
Georg Brandl116aa622007-08-15 14:28:22 +0000750
751To unpickle external objects, the unpickler must have a custom
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300752:meth:`~Unpickler.persistent_load` method that takes a persistent ID object and
753returns the referenced object.
Georg Brandl116aa622007-08-15 14:28:22 +0000754
Here is a comprehensive example showing how persistent IDs can be used to
pickle external objects by reference.
Georg Brandl116aa622007-08-15 14:28:22 +0000757
Alexandre Vassalotti5f3b63a2008-10-18 20:47:58 +0000758.. literalinclude:: ../includes/dbpickle.py
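A smaller, self-contained sketch of the same idea follows. The ``Record`` class, the in-memory ``DATABASE`` dictionary, and the ``("Record", key)`` ID format are assumptions made for illustration and are not taken from the included file.

```python
import io
import pickle

class Record:
    """Hypothetical object that lives in an external store."""

    def __init__(self, key, payload):
        self.key = key
        self.payload = payload

# Stand-in for an external database, keyed by record key.
DATABASE = {"a": Record("a", "first"), "b": Record("b", "second")}

class DBPickler(pickle.Pickler):
    def persistent_id(self, obj):
        if isinstance(obj, Record):
            # Pickle a reference instead of the record itself.
            return ("Record", obj.key)
        # Everything else is pickled as usual.
        return None

class DBUnpickler(pickle.Unpickler):
    def __init__(self, file, database):
        super().__init__(file)
        self.database = database

    def persistent_load(self, pid):
        kind, key = pid
        if kind != "Record":
            raise pickle.UnpicklingError("unsupported persistent object")
        return self.database[key]

buffer = io.BytesIO()
DBPickler(buffer).dump({"favourites": [DATABASE["a"], DATABASE["b"]]})

buffer.seek(0)
loaded = DBUnpickler(buffer, DATABASE).load()
```

The unpickled list refers to the records already held in ``DATABASE``; their payloads never enter the pickle stream.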
Alexandre Vassalottibcd1e3a2009-01-23 05:28:16 +0000759
Antoine Pitrou8d3c2902012-03-04 18:31:48 +0100760.. _pickle-dispatch:
761
762Dispatch Tables
763^^^^^^^^^^^^^^^
764
765If one wants to customize pickling of some classes without disturbing
766any other code which depends on pickling, then one can create a
767pickler with a private dispatch table.
768
769The global dispatch table managed by the :mod:`copyreg` module is
770available as :data:`copyreg.dispatch_table`. Therefore, one may
771choose to use a modified copy of :data:`copyreg.dispatch_table` as a
772private dispatch table.
773
774For example ::
775
776 f = io.BytesIO()
777 p = pickle.Pickler(f)
778 p.dispatch_table = copyreg.dispatch_table.copy()
779 p.dispatch_table[SomeClass] = reduce_SomeClass
780
781creates an instance of :class:`pickle.Pickler` with a private dispatch
782table which handles the ``SomeClass`` class specially. Alternatively,
783the code ::
784
   class MyPickler(pickle.Pickler):
       dispatch_table = copyreg.dispatch_table.copy()
       dispatch_table[SomeClass] = reduce_SomeClass
   f = io.BytesIO()
   p = MyPickler(f)
790
791does the same, but all instances of ``MyPickler`` will by default
792share the same dispatch table. The equivalent code using the
793:mod:`copyreg` module is ::
794
795 copyreg.pickle(SomeClass, reduce_SomeClass)
796 f = io.BytesIO()
797 p = pickle.Pickler(f)
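The snippets above leave ``SomeClass`` and ``reduce_SomeClass`` undefined. A runnable sketch of the private-table variant might look like this, with a hypothetical ``Celsius`` class and a reduction function that records its calls so that its use can be observed.

```python
import copyreg
import io
import pickle

class Celsius:
    def __init__(self, degrees):
        self.degrees = degrees

calls = []

def reduce_celsius(obj):
    """Reduction function: returns a reduce value for Celsius instances."""
    calls.append(obj)
    return Celsius, (obj.degrees,)

buffer = io.BytesIO()
pickler = pickle.Pickler(buffer)
# Private copy: the global copyreg table is left untouched.
pickler.dispatch_table = copyreg.dispatch_table.copy()
pickler.dispatch_table[Celsius] = reduce_celsius
pickler.dump(Celsius(21.5))

restored = pickle.loads(buffer.getvalue())
print(restored.degrees, len(calls))
```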
Georg Brandl116aa622007-08-15 14:28:22 +0000798
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000799.. _pickle-state:
800
801Handling Stateful Objects
802^^^^^^^^^^^^^^^^^^^^^^^^^
803
804.. index::
805 single: __getstate__() (copy protocol)
806 single: __setstate__() (copy protocol)
807
808Here's an example that shows how to modify pickling behavior for a class.
809The :class:`TextReader` class opens a text file, and returns the line number and
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300810line contents each time its :meth:`!readline` method is called. If a
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000811:class:`TextReader` instance is pickled, all attributes *except* the file object
812member are saved. When the instance is unpickled, the file is reopened, and
813reading resumes from the last location. The :meth:`__setstate__` and
814:meth:`__getstate__` methods are used to implement this behavior. ::
815
   class TextReader:
       """Print and number lines in a text file."""

       def __init__(self, filename):
           self.filename = filename
           self.file = open(filename)
           self.lineno = 0

       def readline(self):
           self.lineno += 1
           line = self.file.readline()
           if not line:
               return None
           if line.endswith('\n'):
               line = line[:-1]
           return "%i: %s" % (self.lineno, line)

       def __getstate__(self):
           # Copy the object's state from self.__dict__ which contains
           # all our instance attributes. Always use the dict.copy()
           # method to avoid modifying the original state.
           state = self.__dict__.copy()
           # Remove the unpicklable entries.
           del state['file']
           return state

       def __setstate__(self, state):
           # Restore instance attributes (i.e., filename and lineno).
           self.__dict__.update(state)
           # Restore the previously opened file's state. To do so, we need to
           # reopen it and read from it until the line count is restored.
           file = open(self.filename)
           for _ in range(self.lineno):
               file.readline()
           # Finally, save the file.
           self.file = file
852
853
854A sample usage might be something like this::
855
856 >>> reader = TextReader("hello.txt")
857 >>> reader.readline()
858 '1: Hello world!'
859 >>> reader.readline()
860 '2: I am line number two.'
861 >>> new_reader = pickle.loads(pickle.dumps(reader))
862 >>> new_reader.readline()
863 '3: Goodbye!'
864
Pierre Glaser289f1f82019-05-08 23:08:25 +0200865.. _reducer_override:
866
867Custom Reduction for Types, Functions, and Other Objects
868--------------------------------------------------------
869
870.. versionadded:: 3.8
871
872Sometimes, :attr:`~Pickler.dispatch_table` may not be flexible enough.
873In particular we may want to customize pickling based on another criterion
874than the object's type, or we may want to customize the pickling of
875functions and classes.
876
For those cases, it is possible to subclass from the :class:`Pickler` class and
implement a :meth:`~Pickler.reducer_override` method. This method can return an
arbitrary reduction tuple (see :meth:`__reduce__`). It can alternatively return
``NotImplemented`` to fall back to the traditional behavior.
881
If both the :attr:`~Pickler.dispatch_table` and
:meth:`~Pickler.reducer_override` are defined, then the
:meth:`~Pickler.reducer_override` method takes priority.
885
886.. Note::
887 For performance reasons, :meth:`~Pickler.reducer_override` may not be
888 called for the following objects: ``None``, ``True``, ``False``, and
889 exact instances of :class:`int`, :class:`float`, :class:`bytes`,
890 :class:`str`, :class:`dict`, :class:`set`, :class:`frozenset`, :class:`list`
891 and :class:`tuple`.
892
893Here is a simple example where we allow pickling and reconstructing
894a given class::
895
   import io
   import pickle

   class MyClass:
       my_attribute = 1

   class MyPickler(pickle.Pickler):
       def reducer_override(self, obj):
           """Custom reducer for MyClass."""
           if getattr(obj, "__name__", None) == "MyClass":
               return type, (obj.__name__, obj.__bases__,
                             {'my_attribute': obj.my_attribute})
           else:
               # For any other object, fallback to usual reduction
               return NotImplemented

   f = io.BytesIO()
   p = MyPickler(f)
   p.dump(MyClass)

   del MyClass

   unpickled_class = pickle.loads(f.getvalue())

   assert isinstance(unpickled_class, type)
   assert unpickled_class.__name__ == "MyClass"
   assert unpickled_class.my_attribute == 1
923
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000924
Antoine Pitrou91f43802019-05-26 17:10:09 +0200925.. _pickle-oob:
926
927Out-of-band Buffers
928-------------------
929
930.. versionadded:: 3.8
931
932In some contexts, the :mod:`pickle` module is used to transfer massive amounts
933of data. Therefore, it can be important to minimize the number of memory
934copies, to preserve performance and resource consumption. However, normal
935operation of the :mod:`pickle` module, as it transforms a graph-like structure
936of objects into a sequential stream of bytes, intrinsically involves copying
937data to and from the pickle stream.
938
939This constraint can be eschewed if both the *provider* (the implementation
940of the object types to be transferred) and the *consumer* (the implementation
941of the communications system) support the out-of-band transfer facilities
942provided by pickle protocol 5 and higher.
943
944Provider API
945^^^^^^^^^^^^
946
947The large data objects to be pickled must implement a :meth:`__reduce_ex__`
948method specialized for protocol 5 and higher, which returns a
949:class:`PickleBuffer` instance (instead of e.g. a :class:`bytes` object)
950for any large data.
951
952A :class:`PickleBuffer` object *signals* that the underlying buffer is
953eligible for out-of-band data transfer. Those objects remain compatible
954with normal usage of the :mod:`pickle` module. However, consumers can also
955opt-in to tell :mod:`pickle` that they will handle those buffers by
956themselves.
957
958Consumer API
959^^^^^^^^^^^^
960
961A communications system can enable custom handling of the :class:`PickleBuffer`
962objects generated when serializing an object graph.
963
964On the sending side, it needs to pass a *buffer_callback* argument to
965:class:`Pickler` (or to the :func:`dump` or :func:`dumps` function), which
966will be called with each :class:`PickleBuffer` generated while pickling
967the object graph. Buffers accumulated by the *buffer_callback* will not
968see their data copied into the pickle stream, only a cheap marker will be
969inserted.
970
971On the receiving side, it needs to pass a *buffers* argument to
972:class:`Unpickler` (or to the :func:`load` or :func:`loads` function),
973which is an iterable of the buffers which were passed to *buffer_callback*.
974That iterable should produce buffers in the same order as they were passed
975to *buffer_callback*. Those buffers will provide the data expected by the
976reconstructors of the objects whose pickling produced the original
977:class:`PickleBuffer` objects.
978
979Between the sending side and the receiving side, the communications system
980is free to implement its own transfer mechanism for out-of-band buffers.
981Potential optimizations include the use of shared memory or datatype-dependent
982compression.
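The effect of *buffer_callback* on the pickle stream can be seen by comparing sizes. This is a minimal sketch that wraps a plain :class:`bytearray` in a :class:`PickleBuffer` directly; the one-megabyte payload is an arbitrary choice.

```python
import pickle

payload = bytearray(1_000_000)

# In-band: the payload bytes are copied into the pickle stream.
in_band = pickle.dumps(pickle.PickleBuffer(payload), protocol=5)

# Out-of-band: the callback collects the buffer and the stream only
# carries a marker for it.
buffers = []
out_of_band = pickle.dumps(pickle.PickleBuffer(payload), protocol=5,
                           buffer_callback=buffers.append)

print(len(in_band), len(out_of_band), len(buffers))

# The receiving side must supply the same buffers, in the same order.
restored = pickle.loads(out_of_band, buffers=buffers)
print(memoryview(restored).nbytes)
```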
983
984Example
985^^^^^^^
986
987Here is a trivial example where we implement a :class:`bytearray` subclass
988able to participate in out-of-band buffer pickling::
989
   import pickle
   from pickle import PickleBuffer

   class ZeroCopyByteArray(bytearray):

       def __reduce_ex__(self, protocol):
           if protocol >= 5:
               return type(self)._reconstruct, (PickleBuffer(self),), None
           else:
               # PickleBuffer is forbidden with pickle protocols <= 4.
               return type(self)._reconstruct, (bytearray(self),)

       @classmethod
       def _reconstruct(cls, obj):
           with memoryview(obj) as m:
               # Get a handle over the original buffer object
               obj = m.obj
               if type(obj) is cls:
                   # Original buffer object is a ZeroCopyByteArray, return it
                   # as-is.
                   return obj
               else:
                   return cls(obj)
1010
1011The reconstructor (the ``_reconstruct`` class method) returns the buffer's
1012providing object if it has the right type. This is an easy way to simulate
1013zero-copy behaviour on this toy example.
1014
1015On the consumer side, we can pickle those objects the usual way, which
1016when unserialized will give us a copy of the original object::
1017
1018 b = ZeroCopyByteArray(b"abc")
1019 data = pickle.dumps(b, protocol=5)
1020 new_b = pickle.loads(data)
1021 print(b == new_b) # True
1022 print(b is new_b) # False: a copy was made
1023
1024But if we pass a *buffer_callback* and then give back the accumulated
1025buffers when unserializing, we are able to get back the original object::
1026
1027 b = ZeroCopyByteArray(b"abc")
1028 buffers = []
1029 data = pickle.dumps(b, protocol=5, buffer_callback=buffers.append)
1030 new_b = pickle.loads(data, buffers=buffers)
1031 print(b == new_b) # True
1032 print(b is new_b) # True: no copy was made
1033
1034This example is limited by the fact that :class:`bytearray` allocates its
1035own memory: you cannot create a :class:`bytearray` instance that is backed
1036by another object's memory. However, third-party datatypes such as NumPy
1037arrays do not have this limitation, and allow use of zero-copy pickling
1038(or making as few copies as possible) when transferring between distinct
1039processes or systems.
1040
1041.. seealso:: :pep:`574` -- Pickle protocol 5 with out-of-band data
1042
1043
Alexandre Vassalottid0392862008-10-24 01:32:40 +00001044.. _pickle-restrict:
Georg Brandl116aa622007-08-15 14:28:22 +00001045
Alexandre Vassalottid0392862008-10-24 01:32:40 +00001046Restricting Globals
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +00001047-------------------
Georg Brandl116aa622007-08-15 14:28:22 +00001048
Christian Heimes05e8be12008-02-23 18:30:17 +00001049.. index::
Alexandre Vassalottid0392862008-10-24 01:32:40 +00001050 single: find_class() (pickle protocol)
Christian Heimes05e8be12008-02-23 18:30:17 +00001051
Alexandre Vassalottid0392862008-10-24 01:32:40 +00001052By default, unpickling will import any class or function that it finds in the
1053pickle data. For many applications, this behaviour is unacceptable as it
1054permits the unpickler to import and invoke arbitrary code. Just consider what
1055this hand-crafted pickle data stream does when loaded::
Georg Brandl116aa622007-08-15 14:28:22 +00001056
Alexandre Vassalottid0392862008-10-24 01:32:40 +00001057 >>> import pickle
1058 >>> pickle.loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
1059 hello world
1060 0
Georg Brandl116aa622007-08-15 14:28:22 +00001061
In this example, the unpickler imports the :func:`os.system` function and then
applies it to the string argument "echo hello world". Although this example is
inoffensive, it is not difficult to imagine one that could damage your system.
Georg Brandl116aa622007-08-15 14:28:22 +00001065
For this reason, you may want to control what gets unpickled by customizing
:meth:`Unpickler.find_class`. Contrary to what its name suggests,
:meth:`Unpickler.find_class` is called whenever a global (i.e., a class or
a function) is requested. Thus it is possible to either completely forbid
globals or restrict them to a safe subset.
Alexandre Vassalottid0392862008-10-24 01:32:40 +00001071
Here is an example of an unpickler allowing only a few safe classes from the
:mod:`builtins` module to be loaded::
1074
   import builtins
   import io
   import pickle

   safe_builtins = {
       'range',
       'complex',
       'set',
       'frozenset',
       'slice',
   }

   class RestrictedUnpickler(pickle.Unpickler):

       def find_class(self, module, name):
           # Only allow safe classes from builtins.
           if module == "builtins" and name in safe_builtins:
               return getattr(builtins, name)
           # Forbid everything else.
           raise pickle.UnpicklingError("global '%s.%s' is forbidden" %
                                        (module, name))

   def restricted_loads(s):
       """Helper function analogous to pickle.loads()."""
       return RestrictedUnpickler(io.BytesIO(s)).load()
1100
A sample usage of our unpickler working as intended::
1102
1103 >>> restricted_loads(pickle.dumps([1, 2, range(15)]))
1104 [1, 2, range(0, 15)]
1105 >>> restricted_loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
1106 Traceback (most recent call last):
1107 ...
1108 pickle.UnpicklingError: global 'os.system' is forbidden
1109 >>> restricted_loads(b'cbuiltins\neval\n'
1110 ... b'(S\'getattr(__import__("os"), "system")'
1111 ... b'("echo hello world")\'\ntR.')
1112 Traceback (most recent call last):
1113 ...
1114 pickle.UnpicklingError: global 'builtins.eval' is forbidden
1115
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +00001116
1117.. XXX Add note about how extension codes could evade our protection
Georg Brandl48310cd2009-01-03 21:18:54 +00001118 mechanism (e.g. cached classes do not invokes find_class()).
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +00001119
As our examples show, you have to be careful with what you allow to be
unpickled. Therefore, if security is a concern, you may want to consider
alternatives such as the marshalling API in :mod:`xmlrpc.client` or
third-party solutions.
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +00001124
Georg Brandl116aa622007-08-15 14:28:22 +00001125
Antoine Pitroud4d60552013-12-07 00:56:59 +01001126Performance
1127-----------
1128
1129Recent versions of the pickle protocol (from protocol 2 and upwards) feature
1130efficient binary encodings for several common features and built-in types.
1131Also, the :mod:`pickle` module has a transparent optimizer written in C.
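The size difference between protocols is easy to measure. This sketch pickles a list of small integers (an arbitrary test payload) with every protocol the running interpreter supports; the exact byte counts depend on the Python version and are therefore not listed here.

```python
import pickle

data = list(range(1000))

# Map each supported protocol number to the size of the resulting pickle.
sizes = {
    protocol: len(pickle.dumps(data, protocol))
    for protocol in range(pickle.HIGHEST_PROTOCOL + 1)
}

for protocol, size in sizes.items():
    print(f"protocol {protocol}: {size} bytes")
```

Protocol 0 writes each integer as text, so its output is noticeably larger than that of the binary protocols.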
1132
1133
Georg Brandl116aa622007-08-15 14:28:22 +00001134.. _pickle-example:
1135
Alexandre Vassalotti9d7665d2009-04-03 06:13:29 +00001136Examples
1137--------
Georg Brandl116aa622007-08-15 14:28:22 +00001138
Alexandre Vassalotti9d7665d2009-04-03 06:13:29 +00001139For the simplest code, use the :func:`dump` and :func:`load` functions. ::
Georg Brandl116aa622007-08-15 14:28:22 +00001140
   import pickle

   # An arbitrary collection of objects supported by pickle.
   data = {
       'a': [1, 2.0, 3, 4+6j],
       'b': ("character string", b"byte string"),
       'c': {None, True, False}
   }

   with open('data.pickle', 'wb') as f:
       # Pickle the 'data' dictionary using the highest protocol available.
       pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)


The following example reads the resulting pickled data. ::

   import pickle

   with open('data.pickle', 'rb') as f:
       # The protocol version used is detected automatically, so we do not
       # have to specify it.
       data = pickle.load(f)
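If the size of the pickle matters, one possible approach (a sketch, not the
only option) is to strip unused memo opcodes with
:func:`pickletools.optimize` and then compress the result with :mod:`gzip`.
The temporary directory and the ``data.pickle.gz`` file name are assumptions
made for this example.

```python
import gzip
import os
import pickle
import pickletools
import tempfile

data = {
    'a': [1, 2.0, 3, 4+6j],
    'b': ("character string", b"byte string"),
    'c': {None, True, False}
}

raw = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)

# Drop memo opcodes that nothing in the pickle refers back to.
optimized = pickletools.optimize(raw)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'data.pickle.gz')

    # gzip.open() returns a binary file object, so the bytes can be
    # written directly.
    with gzip.open(path, 'wb') as f:
        f.write(optimized)

    # pickle.load() reads from the decompressing file object.
    with gzip.open(path, 'rb') as f:
        restored = pickle.load(f)
```

Whether compression pays off depends on the data: small or already compact
pickles may not shrink, so it is worth measuring before adopting this.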


.. XXX: Add examples showing how to optimize pickles for size (like using
.. pickletools.optimize() or the gzip module).


.. seealso::

   Module :mod:`copyreg`
      Pickle interface constructor registration for extension types.

   Module :mod:`pickletools`
      Tools for working with and analyzing pickled data.

   Module :mod:`shelve`
      Indexed databases of objects; uses :mod:`pickle`.

   Module :mod:`copy`
      Shallow and deep object copying.

   Module :mod:`marshal`
      High-performance serialization of built-in types.


.. rubric:: Footnotes

.. [#] Don't confuse this with the :mod:`marshal` module.

.. [#] This is why :keyword:`lambda` functions cannot be pickled: all
    :keyword:`!lambda` functions share the same name: ``<lambda>``.

.. [#] The exception raised will likely be an :exc:`ImportError` or an
    :exc:`AttributeError`, but it could be something else.

.. [#] The :mod:`copy` module uses this protocol for shallow and deep copying
    operations.

.. [#] The limitation on alphanumeric characters is due to the fact that
    persistent IDs in protocol 0 are delimited by the newline
    character. Therefore, if any kind of newline character occurs in a
    persistent ID, the resulting pickle will become unreadable.