:mod:`pickle` --- Python object serialization
=============================================

.. module:: pickle
   :synopsis: Convert Python objects to streams of bytes and back.

.. sectionauthor:: Jim Kerr <jbkerr@sr.hp.com>
.. sectionauthor:: Barry Warsaw <barry@python.org>

**Source code:** :source:`Lib/pickle.py`

.. index::
   single: persistence
   pair: persistent; objects
   pair: serializing; objects
   pair: marshalling; objects
   pair: flattening; objects
   pair: pickling; objects

--------------

The :mod:`pickle` module implements binary protocols for serializing and
de-serializing a Python object structure. *"Pickling"* is the process
whereby a Python object hierarchy is converted into a byte stream, and
*"unpickling"* is the inverse operation, whereby a byte stream
(from a :term:`binary file` or :term:`bytes-like object`) is converted
back into an object hierarchy. Pickling (and unpickling) is alternatively
known as "serialization", "marshalling," [#]_ or "flattening"; however, to
avoid confusion, the terms used here are "pickling" and "unpickling".

.. warning::

   The ``pickle`` module **is not secure**. Only unpickle data you trust.

   It is possible to construct malicious pickle data which will **execute
   arbitrary code during unpickling**. Never unpickle data that could have come
   from an untrusted source, or that could have been tampered with.

   Consider signing data with :mod:`hmac` if you need to ensure that it has not
   been tampered with.

   Safer serialization formats such as :mod:`json` may be more appropriate if
   you are processing untrusted data. See :ref:`comparison-with-json`.
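
As a minimal sketch of the :mod:`hmac` suggestion in the warning above, the
following wraps ``pickle.dumps()`` and ``pickle.loads()`` so that a payload is
only unpickled after its signature verifies. The helper names ``sign_pickle``
and ``verify_and_load`` and the key value are illustrative assumptions, not part
of the :mod:`pickle` API. Signing only detects tampering by parties who do not
hold the key; it does not make pickles from untrusted sources safe.

```python
import hashlib
import hmac
import pickle

# Hypothetical shared secret; in practice, load it from a secure location.
SECRET_KEY = b"replace-with-a-real-secret"
DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign_pickle(obj, key=SECRET_KEY):
    """Return an HMAC-SHA256 tag followed by the pickled payload."""
    payload = pickle.dumps(obj)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def verify_and_load(blob, key=SECRET_KEY):
    """Unpickle *blob* only if its tag matches; raise ValueError otherwise."""
    tag, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("pickle payload failed integrity check")
    return pickle.loads(payload)


blob = sign_pickle({"user": "alice", "roles": ["admin"]})
restored = verify_and_load(blob)
```

The verification step runs before ``pickle.loads()`` is ever called, so a
modified payload is rejected without being interpreted.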


Relationship to other Python modules
------------------------------------

Comparison with ``marshal``
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Python has a more primitive serialization module called :mod:`marshal`, but
:mod:`pickle` should generally be the preferred way to serialize Python
objects. :mod:`marshal` exists primarily to support Python's :file:`.pyc`
files.

The :mod:`pickle` module differs from :mod:`marshal` in several significant ways:

* The :mod:`pickle` module keeps track of the objects it has already serialized,
  so that later references to the same object won't be serialized again.
  :mod:`marshal` doesn't do this.

  This has implications both for recursive objects and object sharing. Recursive
  objects are objects that contain references to themselves. These are not
  handled by marshal, and in fact, attempting to marshal recursive objects will
  crash your Python interpreter. Object sharing happens when there are multiple
  references to the same object in different places in the object hierarchy being
  serialized. :mod:`pickle` stores such objects only once, and ensures that all
  other references point to the master copy. Shared objects remain shared, which
  can be very important for mutable objects.

* :mod:`marshal` cannot be used to serialize user-defined classes and their
  instances. :mod:`pickle` can save and restore class instances transparently;
  however, the class definition must be importable and live in the same module as
  when the object was stored.

* The :mod:`marshal` serialization format is not guaranteed to be portable
  across Python versions. Because its primary job in life is to support
  :file:`.pyc` files, the Python implementers reserve the right to change the
  serialization format in non-backwards compatible ways should the need arise.
  The :mod:`pickle` serialization format is guaranteed to be backwards compatible
  across Python releases provided a compatible pickle protocol is chosen and
  pickling and unpickling code deals with Python 2 to Python 3 type differences
  if your data crosses that one-time, backwards-incompatible language boundary.
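
The object tracking described in the first item above can be observed
directly. This short sketch (the variable names are illustrative) pickles a
container that references the same list twice, and a list that contains itself,
then checks that both relationships survive the round trip.

```python
import pickle

# Object sharing: the same list is referenced from two places.
shared = [1, 2, 3]
container = {"first": shared, "second": shared}
restored = pickle.loads(pickle.dumps(container))
# After unpickling, both keys still refer to one single list object.
still_shared = restored["first"] is restored["second"]

# Recursive object: a list that contains a reference to itself.
recursive = []
recursive.append(recursive)
restored_recursive = pickle.loads(pickle.dumps(recursive))
still_recursive = restored_recursive[0] is restored_recursive
```

Because the restored references point to one object, mutating
``restored["first"]`` is visible through ``restored["second"]`` as well.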


.. _comparison-with-json:

Comparison with ``json``
^^^^^^^^^^^^^^^^^^^^^^^^

There are fundamental differences between the pickle protocols and
`JSON (JavaScript Object Notation) <http://json.org>`_:

* JSON is a text serialization format (it outputs unicode text, although
  most of the time it is then encoded to ``utf-8``), while pickle is
  a binary serialization format;

* JSON is human-readable, while pickle is not;

* JSON is interoperable and widely used outside of the Python ecosystem,
  while pickle is Python-specific;

* JSON, by default, can only represent a subset of the Python built-in
  types, and no custom classes; pickle can represent an extremely large
  number of Python types (many of them automatically, by clever usage
  of Python's introspection facilities; complex cases can be tackled by
  implementing :ref:`specific object APIs <pickle-inst>`);

* Unlike pickle, deserializing untrusted JSON does not in itself create an
  arbitrary code execution vulnerability.
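
A small illustration of the type-coverage and text-versus-binary differences
listed above. The sample value is arbitrary; it was chosen because sets and
complex numbers are not among the types :mod:`json` handles by default.

```python
import json
import pickle

value = {"tags": {"a", "b"}, "z": 1 + 2j}

# json rejects sets and complex numbers unless a custom encoder is supplied.
try:
    json.dumps(value)
    json_ok = True
except TypeError:
    json_ok = False

# pickle round-trips the same value without extra code.
restored = pickle.loads(pickle.dumps(value))

# For data both can represent, json yields text and pickle yields bytes.
text = json.dumps({"n": 1, "items": [1, 2]})
blob = pickle.dumps({"n": 1, "items": [1, 2]})
```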

.. seealso::
   The :mod:`json` module: a standard library module allowing JSON
   serialization and deserialization.


.. _pickle-protocols:

Data stream format
------------------

.. index::
   single: External Data Representation

The data format used by :mod:`pickle` is Python-specific. This has the
advantage that there are no restrictions imposed by external standards such as
JSON or XDR (which can't represent pointer sharing); however, it means that
non-Python programs may not be able to reconstruct pickled Python objects.

By default, the :mod:`pickle` data format uses a relatively compact binary
representation. If you need optimal size characteristics, you can efficiently
:doc:`compress <archiving>` pickled data.

The module :mod:`pickletools` contains tools for analyzing data streams
generated by :mod:`pickle`. :mod:`pickletools` source code has extensive
comments about opcodes used by pickle protocols.
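
As a brief sketch of the analysis tools mentioned above,
``pickletools.dis()`` writes a symbolic disassembly of a pickle and
``pickletools.optimize()`` returns an equivalent pickle with unused ``PUT``
opcodes removed. The sample object and the choice of protocol 2 are arbitrary.

```python
import io
import pickle
import pickletools

data = pickle.dumps({"a": [1, 2]}, protocol=2)

# Capture the disassembly in a string instead of printing to stdout.
out = io.StringIO()
pickletools.dis(data, out=out)
listing = out.getvalue()

# Drop memo (PUT) opcodes that no later GET refers to.
optimized = pickletools.optimize(data)
```

The listing contains one line per opcode, which makes it a convenient way to
inspect a pickle without loading it.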

There are currently 6 different protocols which can be used for pickling.
The higher the protocol used, the more recent the version of Python needed
to read the pickle produced.

* Protocol version 0 is the original "human-readable" protocol and is
  backwards compatible with earlier versions of Python.

* Protocol version 1 is an old binary format which is also compatible with
  earlier versions of Python.

* Protocol version 2 was introduced in Python 2.3. It provides much more
  efficient pickling of :term:`new-style class`\es. Refer to :pep:`307` for
  information about improvements brought by protocol 2.

* Protocol version 3 was added in Python 3.0. It has explicit support for
  :class:`bytes` objects and cannot be unpickled by Python 2.x. This was
  the default protocol in Python 3.0--3.7.

* Protocol version 4 was added in Python 3.4. It adds support for very large
  objects, pickling more kinds of objects, and some data format
  optimizations. It is the default protocol starting with Python 3.8.
  Refer to :pep:`3154` for information about improvements brought by
  protocol 4.

* Protocol version 5 was added in Python 3.8. It adds support for out-of-band
  data and speedup for in-band data. Refer to :pep:`574` for information about
  improvements brought by protocol 5.
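
The following sketch pickles one arbitrary sample object with every protocol
the running interpreter supports and records the resulting sizes. It also
checks one detail of the stream layout: from protocol 2 onwards, a pickle
begins with the ``PROTO`` opcode (byte ``0x80``) followed by the protocol
number.

```python
import pickle

obj = {"numbers": list(range(10)), "text": "spam", "raw": b"\x00\x01"}

sizes = {}
for proto in range(pickle.HIGHEST_PROTOCOL + 1):
    data = pickle.dumps(obj, protocol=proto)
    # Every protocol must reproduce an equal object.
    assert pickle.loads(data) == obj
    sizes[proto] = len(data)
    if proto >= 2:
        # PROTO opcode, then the protocol number as a single byte.
        assert data[0] == 0x80 and data[1] == proto

# Without an explicit protocol, DEFAULT_PROTOCOL is used.
default_data = pickle.dumps(obj)
```

Printing ``sizes`` is a quick way to compare how compact each protocol is for
a given kind of data.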

.. note::
   Serialization is a more primitive notion than persistence; although
   :mod:`pickle` reads and writes file objects, it does not handle the issue of
   naming persistent objects, nor the (even more complicated) issue of concurrent
   access to persistent objects. The :mod:`pickle` module can transform a complex
   object into a byte stream and it can transform the byte stream into an object
   with the same internal structure. Perhaps the most obvious thing to do with
   these byte streams is to write them onto a file, but it is also conceivable to
   send them across a network or store them in a database. The :mod:`shelve`
   module provides a simple interface to pickle and unpickle objects on
   DBM-style database files.


Module Interface
----------------

To serialize an object hierarchy, you simply call the :func:`dumps` function.
Similarly, to de-serialize a data stream, you call the :func:`loads` function.
However, if you want more control over serialization and de-serialization,
you can create a :class:`Pickler` or an :class:`Unpickler` object, respectively.

The :mod:`pickle` module provides the following constants:


.. data:: HIGHEST_PROTOCOL

   An integer, the highest :ref:`protocol version <pickle-protocols>`
   available. This value can be passed as a *protocol* value to functions
   :func:`dump` and :func:`dumps` as well as the :class:`Pickler`
   constructor.

.. data:: DEFAULT_PROTOCOL

   An integer, the default :ref:`protocol version <pickle-protocols>` used
   for pickling. May be less than :data:`HIGHEST_PROTOCOL`. Currently the
   default protocol is 4, first introduced in Python 3.4 and incompatible
   with previous versions.

   .. versionchanged:: 3.0

      The default protocol is 3.

   .. versionchanged:: 3.8

      The default protocol is 4.

The :mod:`pickle` module provides the following functions to make the pickling
process more convenient:

.. function:: dump(obj, file, protocol=None, \*, fix_imports=True, buffer_callback=None)

   Write a pickled representation of *obj* to the open :term:`file object` *file*.
   This is equivalent to ``Pickler(file, protocol).dump(obj)``.

   Arguments *file*, *protocol*, *fix_imports* and *buffer_callback* have
   the same meaning as in the :class:`Pickler` constructor.

   .. versionchanged:: 3.8
      The *buffer_callback* argument was added.

.. function:: dumps(obj, protocol=None, \*, fix_imports=True, buffer_callback=None)

   Return the pickled representation of the object as a :class:`bytes` object,
   instead of writing it to a file.

   Arguments *protocol*, *fix_imports* and *buffer_callback* have the same
   meaning as in the :class:`Pickler` constructor.

   .. versionchanged:: 3.8
      The *buffer_callback* argument was added.

.. function:: load(file, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)

   Read a pickled object representation from the open :term:`file object`
   *file* and return the reconstituted object hierarchy specified therein.
   This is equivalent to ``Unpickler(file).load()``.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed. Bytes past the pickled object's
   representation are ignored.

   Arguments *file*, *fix_imports*, *encoding*, *errors* and *buffers*
   have the same meaning as in the :class:`Unpickler` constructor.

   .. versionchanged:: 3.8
      The *buffers* argument was added.

.. function:: loads(bytes_object, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)

   Read a pickled object hierarchy from a :class:`bytes` object and return the
   reconstituted object hierarchy specified therein.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed. Bytes past the pickled object's
   representation are ignored.

   Arguments *fix_imports*, *encoding*, *errors* and *buffers*
   have the same meaning as in the :class:`Unpickler` constructor.

   .. versionchanged:: 3.8
      The *buffers* argument was added.
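
A short usage sketch of the four functions above. An :class:`io.BytesIO`
object stands in for an on-disk file opened in binary mode, and the sample
data is arbitrary.

```python
import io
import pickle

record = {"id": 7, "tags": ("x", "y")}

# dumps()/loads() work with bytes objects.
data = pickle.dumps(record)
from_bytes = pickle.loads(data)

# dump()/load() work with binary file objects.
buffer = io.BytesIO()
pickle.dump(record, buffer)
pickle.dump([1, 2, 3], buffer)  # a second pickle appended to the same stream
buffer.seek(0)
first = pickle.load(buffer)     # each load() call reads one pickle
second = pickle.load(buffer)

# Bytes after the end of the pickled representation are ignored by loads().
with_trailing = pickle.loads(data + b"trailing bytes")
```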


The :mod:`pickle` module defines three exceptions:

.. exception:: PickleError

   Common base class for the other pickling exceptions. It inherits
   :exc:`Exception`.

.. exception:: PicklingError

   Error raised when an unpicklable object is encountered by :class:`Pickler`.
   It inherits :exc:`PickleError`.

   Refer to :ref:`pickle-picklable` to learn what kinds of objects can be
   pickled.

.. exception:: UnpicklingError

   Error raised when there is a problem unpickling an object, such as data
   corruption or a security violation. It inherits :exc:`PickleError`.

   Note that other exceptions may also be raised during unpickling, including
   (but not necessarily limited to) AttributeError, EOFError, ImportError, and
   IndexError.
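
One way to handle these exceptions is sketched below. The helper names
``try_dumps`` and ``try_loads`` are hypothetical. Because, as noted above,
failures are not limited to the :mod:`pickle` exception classes, the handlers
also list some of the built-in exceptions that can occur; the exact exception
type for a given failure is best treated as an implementation detail.

```python
import pickle


def try_dumps(obj):
    """Return (data, None) on success or (None, exception) on failure."""
    try:
        return pickle.dumps(obj), None
    except (pickle.PicklingError, AttributeError, TypeError) as exc:
        return None, exc


def try_loads(data):
    """Return (obj, None) on success or (None, exception) on failure."""
    try:
        return pickle.loads(data), None
    except pickle.UnpicklingError as exc:
        return None, exc
    except (AttributeError, EOFError, ImportError, IndexError) as exc:
        return None, exc


# A lambda cannot be pickled because it has no importable name.
_, dump_error = try_dumps(lambda: 0)

# Bytes that do not start with a valid pickle opcode cannot be loaded.
_, load_error = try_loads(b"\xffnot a pickle")

value, no_error = try_loads(pickle.dumps([1, 2, 3]))
```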


The :mod:`pickle` module exports three classes, :class:`Pickler`,
:class:`Unpickler` and :class:`PickleBuffer`:

.. class:: Pickler(file, protocol=None, \*, fix_imports=True, buffer_callback=None)

   This takes a binary file for writing a pickle data stream.

   The optional *protocol* argument, an integer, tells the pickler to use
   the given protocol; supported protocols are 0 to :data:`HIGHEST_PROTOCOL`.
   If not specified, the default is :data:`DEFAULT_PROTOCOL`. If a negative
   number is specified, :data:`HIGHEST_PROTOCOL` is selected.

   The *file* argument must have a write() method that accepts a single bytes
   argument. It can thus be an on-disk file opened for binary writing, an
   :class:`io.BytesIO` instance, or any other custom object that meets this
   interface.

   If *fix_imports* is true and *protocol* is less than 3, pickle will try to
   map the new Python 3 names to the old module names used in Python 2, so
   that the pickle data stream is readable with Python 2.

   If *buffer_callback* is None (the default), buffer views are
   serialized into *file* as part of the pickle stream.

   If *buffer_callback* is not None, then it can be called any number
   of times with a buffer view. If the callback returns a false value
   (such as None), the given buffer is :ref:`out-of-band <pickle-oob>`;
   otherwise the buffer is serialized in-band, i.e. inside the pickle stream.

   It is an error if *buffer_callback* is not None and *protocol* is
   None or smaller than 5.

   .. versionchanged:: 3.8
      The *buffer_callback* argument was added.

   .. method:: dump(obj)

      Write a pickled representation of *obj* to the open file object given in
      the constructor.

   .. method:: persistent_id(obj)

      Do nothing by default. This exists so a subclass can override it.

      If :meth:`persistent_id` returns ``None``, *obj* is pickled as usual. Any
      other value causes :class:`Pickler` to emit the returned value as a
      persistent ID for *obj*. The meaning of this persistent ID should be
      defined by :meth:`Unpickler.persistent_load`. Note that the value
      returned by :meth:`persistent_id` cannot itself have a persistent ID.

      See :ref:`pickle-persistent` for details and examples of uses.

   .. attribute:: dispatch_table

      A pickler object's dispatch table is a registry of *reduction
      functions* of the kind which can be declared using
      :func:`copyreg.pickle`. It is a mapping whose keys are classes
      and whose values are reduction functions. A reduction function
      takes a single argument of the associated class and should
      conform to the same interface as a :meth:`__reduce__`
      method.

      By default, a pickler object will not have a
      :attr:`dispatch_table` attribute, and it will instead use the
      global dispatch table managed by the :mod:`copyreg` module.
      However, to customize the pickling for a specific pickler object
      one can set the :attr:`dispatch_table` attribute to a dict-like
      object. Alternatively, if a subclass of :class:`Pickler` has a
      :attr:`dispatch_table` attribute then this will be used as the
      default dispatch table for instances of that class.

      See :ref:`pickle-dispatch` for usage examples.

      .. versionadded:: 3.3

   .. method:: reducer_override(self, obj)

      Special reducer that can be defined in :class:`Pickler` subclasses. This
      method has priority over any reducer in the :attr:`dispatch_table`. It
      should conform to the same interface as a :meth:`__reduce__` method, and
      can optionally return ``NotImplemented`` to fall back on
      :attr:`dispatch_table`-registered reducers to pickle ``obj``.

      For a detailed example, see :ref:`reducer_override`.

      .. versionadded:: 3.8

   .. attribute:: fast

      Deprecated. Enable fast mode if set to a true value. The fast mode
      disables the usage of memo, therefore speeding the pickling process by not
      generating superfluous PUT opcodes. It should not be used with
      self-referential objects; doing otherwise will cause :class:`Pickler` to
      recurse infinitely.

      Use :func:`pickletools.optimize` if you need more compact pickles.
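
The ``persistent_id()`` hook described above is typically paired with an
:class:`Unpickler` subclass that overrides ``persistent_load()``. The
following is a minimal sketch under stated assumptions: ``Resource``,
``REGISTRY``, ``RefPickler`` and ``RefUnpickler`` are hypothetical names, and
the registry stands in for whatever external store actually owns the objects.

```python
import io
import pickle


class Resource:
    """Hypothetical object that lives outside the pickle (e.g. a handle)."""

    def __init__(self, key):
        self.key = key


# Hypothetical external store, shared by the pickling and unpickling sides.
REGISTRY = {"db": Resource("db"), "cache": Resource("cache")}


class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Emit a persistent ID for resources instead of pickling them.
        if isinstance(obj, Resource):
            return ("Resource", obj.key)
        return None  # everything else is pickled as usual


class RefUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Resolve the persistent ID back to the externally stored object.
        tag, key = pid
        if tag == "Resource" and key in REGISTRY:
            return REGISTRY[key]
        raise pickle.UnpicklingError(f"unsupported persistent id: {pid!r}")


job = {"name": "nightly", "target": REGISTRY["db"]}
stream = io.BytesIO()
RefPickler(stream).dump(job)
stream.seek(0)
restored = RefUnpickler(stream).load()
```

After the round trip, ``restored["target"]`` is the very object held in the
registry, not a copy, because only its ID was written to the stream.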


.. class:: Unpickler(file, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)

   This takes a binary file for reading a pickle data stream.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed.

   The argument *file* must have three methods, a read() method that takes an
   integer argument, a readinto() method that takes a buffer argument
   and a readline() method that requires no arguments, as in the
   :class:`io.BufferedIOBase` interface. Thus *file* can be an on-disk file
   opened for binary reading, an :class:`io.BytesIO` object, or any other
   custom object that meets this interface.

   The optional arguments *fix_imports*, *encoding* and *errors* are used
   to control compatibility support for pickle stream generated by Python 2.
   If *fix_imports* is true, pickle will try to map the old Python 2 names
   to the new names used in Python 3. The *encoding* and *errors* tell
   pickle how to decode 8-bit string instances pickled by Python 2;
   these default to 'ASCII' and 'strict', respectively. The *encoding* can
   be 'bytes' to read these 8-bit string instances as bytes objects.
   Using ``encoding='latin1'`` is required for unpickling NumPy arrays and
   instances of :class:`~datetime.datetime`, :class:`~datetime.date` and
   :class:`~datetime.time` pickled by Python 2.

   If *buffers* is None (the default), then all data necessary for
   deserialization must be contained in the pickle stream. This means
   that the *buffer_callback* argument was None when a :class:`Pickler`
   was instantiated (or when :func:`dump` or :func:`dumps` was called).

   If *buffers* is not None, it should be an iterable of buffer-enabled
   objects that is consumed each time the pickle stream references
   an :ref:`out-of-band <pickle-oob>` buffer view. Such buffers have been
   given in order to the *buffer_callback* of a Pickler object.

   .. versionchanged:: 3.8
      The *buffers* argument was added.

   .. method:: load()

      Read a pickled object representation from the open file object given in
      the constructor, and return the reconstituted object hierarchy specified
      therein. Bytes past the pickled object's representation are ignored.

   .. method:: persistent_load(pid)

      Raise an :exc:`UnpicklingError` by default.

      If defined, :meth:`persistent_load` should return the object specified by
      the persistent ID *pid*. If an invalid persistent ID is encountered, an
      :exc:`UnpicklingError` should be raised.

      See :ref:`pickle-persistent` for details and examples of uses.

   .. method:: find_class(module, name)

      Import *module* if necessary and return the object called *name* from it,
      where the *module* and *name* arguments are :class:`str` objects. Note,
      unlike its name suggests, :meth:`find_class` is also used for finding
      functions.

      Subclasses may override this to gain control over what type of objects and
      how they can be loaded, potentially reducing security risks. Refer to
      :ref:`pickle-restrict` for details.

      .. audit-event:: pickle.find_class module,name pickle.Unpickler.find_class
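
As an illustration of overriding ``find_class()``, the sketch below only
permits a small allow-list of names from :mod:`builtins` and rejects every
other global. The names ``SAFE_BUILTINS``, ``RestrictedUnpickler`` and
``restricted_loads`` are illustrative, and the allow-list is an assumption
that would need to be adapted to the data actually expected. Restricting
globals reduces the attack surface but should not be read as making untrusted
pickles safe.

```python
import builtins
import io
import pickle

# Hypothetical allow-list; extend it only with names known to be harmless.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Only allow selected classes from the builtins module.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Forbid everything else.
        raise pickle.UnpicklingError(
            f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Analogous to pickle.loads(), but with restricted globals."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


allowed = restricted_loads(pickle.dumps([1, 2, range(15)]))
```

A pickle that references any other global, for example one produced by
``pickle.dumps(eval)``, makes ``restricted_loads()`` raise
:exc:`UnpicklingError` instead of returning the function.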

.. class:: PickleBuffer(buffer)

   A wrapper for a buffer representing picklable data. *buffer* must be a
   :ref:`buffer-providing <bufferobjects>` object, such as a
   :term:`bytes-like object` or an N-dimensional array.

   :class:`PickleBuffer` is itself a buffer provider, therefore it is
   possible to pass it to other APIs expecting a buffer-providing object,
   such as :class:`memoryview`.

   :class:`PickleBuffer` objects can only be serialized using pickle
   protocol 5 or higher. They are eligible for
   :ref:`out-of-band serialization <pickle-oob>`.

   .. versionadded:: 3.8

   .. method:: raw()

      Return a :class:`memoryview` of the memory area underlying this buffer.
      The returned object is a one-dimensional, C-contiguous memoryview
      with format ``B`` (unsigned bytes). :exc:`BufferError` is raised if
      the buffer is neither C- nor Fortran-contiguous.

   .. method:: release()

      Release the underlying buffer exposed by the PickleBuffer object.
481 Release the underlying buffer exposed by the PickleBuffer object.
482
483
Alexandre Vassalotti5f3b63a2008-10-18 20:47:58 +0000484.. _pickle-picklable:
Georg Brandl116aa622007-08-15 14:28:22 +0000485
486What can be pickled and unpickled?
487----------------------------------
488
489The following types can be pickled:
490
491* ``None``, ``True``, and ``False``
492
Georg Brandlba956ae2007-11-29 17:24:34 +0000493* integers, floating point numbers, complex numbers
Georg Brandl116aa622007-08-15 14:28:22 +0000494
Georg Brandlf6945182008-02-01 11:56:49 +0000495* strings, bytes, bytearrays
Georg Brandl116aa622007-08-15 14:28:22 +0000496
497* tuples, lists, sets, and dictionaries containing only picklable objects
498
Ethan Furman2498d9e2013-10-18 00:45:40 -0700499* functions defined at the top level of a module (using :keyword:`def`, not
500 :keyword:`lambda`)
Georg Brandl116aa622007-08-15 14:28:22 +0000501
502* built-in functions defined at the top level of a module
503
504* classes that are defined at the top level of a module
505
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300506* instances of such classes whose :attr:`~object.__dict__` or the result of
507 calling :meth:`__getstate__` is picklable (see section :ref:`pickle-inst` for
Eli Bendersky78f3ce52013-01-02 05:53:59 -0800508 details).
Georg Brandl116aa622007-08-15 14:28:22 +0000509
510Attempts to pickle unpicklable objects will raise the :exc:`PicklingError`
511exception; when this happens, an unspecified number of bytes may have already
Alexandre Vassalotti5f3b63a2008-10-18 20:47:58 +0000512been written to the underlying file. Trying to pickle a highly recursive data
Yury Selivanovf488fb42015-07-03 01:04:23 -0400513structure may exceed the maximum recursion depth, a :exc:`RecursionError` will be
Alexandre Vassalotti5f3b63a2008-10-18 20:47:58 +0000514raised in this case. You can carefully raise this limit with
Georg Brandl116aa622007-08-15 14:28:22 +0000515:func:`sys.setrecursionlimit`.
516
517Note that functions (built-in and user-defined) are pickled by "fully qualified"
Ethan Furman2498d9e2013-10-18 00:45:40 -0700518name reference, not by value. [#]_ This means that only the function name is
Eli Bendersky78f3ce52013-01-02 05:53:59 -0800519pickled, along with the name of the module the function is defined in. Neither
520the function's code, nor any of its function attributes are pickled. Thus the
Georg Brandl116aa622007-08-15 14:28:22 +0000521defining module must be importable in the unpickling environment, and the module
522must contain the named object, otherwise an exception will be raised. [#]_
523
524Similarly, classes are pickled by named reference, so the same restrictions in
525the unpickling environment apply. Note that none of the class's code or data is
526pickled, so in the following example the class attribute ``attr`` is not
527restored in the unpickling environment::
528
   class Foo:
       attr = 'A class attribute'

   picklestring = pickle.dumps(Foo)
533
534These restrictions are why picklable functions and classes must be defined in
535the top level of a module.
536
537Similarly, when class instances are pickled, their class's code and data are not
538pickled along with them. Only the instance data are pickled. This is done on
539purpose, so you can fix bugs in a class or add methods to the class and still
540load objects that were created with an earlier version of the class. If you
541plan to have long-lived objects that will see many versions of a class, it may
542be worthwhile to put a version number in the objects so that suitable
543conversions can be made by the class's :meth:`__setstate__` method.
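
One possible way to carry such a version number is sketched below. The
``Account`` class, its ``_version`` key and the assumed "version 1" layout
(a single ``name`` attribute) are all hypothetical and only serve to show the
conversion step::

   import pickle

   class Account:
       # Hypothetical layout history: version 1 stored a single ``name``
       # attribute, version 2 stores ``first`` and ``last`` instead.
       STATE_VERSION = 2

       def __init__(self, first, last):
           self.first = first
           self.last = last

       def __getstate__(self):
           state = self.__dict__.copy()
           state["_version"] = self.STATE_VERSION
           return state

       def __setstate__(self, state):
           state = dict(state)
           # States written before versioning was added count as version 1.
           version = state.pop("_version", 1)
           if version < 2:
               # Convert the old layout to the current one.
               first, _, last = state.pop("name").partition(" ")
               state["first"], state["last"] = first, last
           self.__dict__.update(state)

   # State as an old (version 1) pickle would have delivered it.
   legacy = Account.__new__(Account)
   legacy.__setstate__({"name": "Ada Lovelace"})

   # A round trip with the current layout.
   current = pickle.loads(pickle.dumps(Account("Grace", "Hopper")))

Keeping the version inside the state, rather than on the class, means each
pickled object records the layout it was written with.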
544
545
Georg Brandl116aa622007-08-15 14:28:22 +0000546.. _pickle-inst:
547
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000548Pickling Class Instances
549------------------------
Georg Brandl116aa622007-08-15 14:28:22 +0000550
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300551.. currentmodule:: None
552
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000553In this section, we describe the general mechanisms available to you to define,
554customize, and control how class instances are pickled and unpickled.
Georg Brandl116aa622007-08-15 14:28:22 +0000555
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000556In most cases, no additional code is needed to make instances picklable. By
557default, pickle will retrieve the class and the attributes of an instance via
558introspection. When a class instance is unpickled, its :meth:`__init__` method
559is usually *not* invoked. The default behaviour first creates an uninitialized
560instance and then restores the saved attributes. The following code shows an
561implementation of this behaviour::
Georg Brandl85eb8c12007-08-31 16:33:38 +0000562
   def save(obj):
       return (obj.__class__, obj.__dict__)

   def load(cls, attributes):
       obj = cls.__new__(cls)
       obj.__dict__.update(attributes)
       return obj
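
The effect of this default behaviour can be observed with a small sketch (the
``Sensor`` class and its call counter are made up for illustration): the
attributes come back, but :meth:`__init__` is not run a second time::

   import pickle

   class Sensor:
       init_calls = 0          # counts how often __init__ runs

       def __init__(self, name):
           Sensor.init_calls += 1
           self.name = name

   original = Sensor("thermo")
   clone = pickle.loads(pickle.dumps(original))

   # The attributes were restored, but __init__ ran only once (for
   # ``original``), not again for ``clone``.
   assert clone.name == "thermo"
   assert Sensor.init_calls == 1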
Georg Brandl116aa622007-08-15 14:28:22 +0000570
Georg Brandl6faee4e2010-09-21 14:48:28 +0000571Classes can alter the default behaviour by providing one or several special
Georg Brandlc8148262010-10-17 11:13:37 +0000572methods:
Georg Brandl116aa622007-08-15 14:28:22 +0000573
Antoine Pitrouc9dc4a22013-11-23 18:59:12 +0100574.. method:: object.__getnewargs_ex__()
575
   In protocols 2 and newer, classes that implement the
   :meth:`__getnewargs_ex__` method can dictate the values passed to the
   :meth:`__new__` method upon unpickling. The method must return a pair
   ``(args, kwargs)`` where *args* is a tuple of positional arguments
   and *kwargs* a dictionary of named arguments for constructing the
   object. Those will be passed to the :meth:`__new__` method upon
   unpickling.
583
584 You should implement this method if the :meth:`__new__` method of your
585 class requires keyword-only arguments. Otherwise, it is recommended for
586 compatibility to implement :meth:`__getnewargs__`.
587
Serhiy Storchakab6d84832015-10-13 21:26:35 +0300588 .. versionchanged:: 3.6
589 :meth:`__getnewargs_ex__` is now used in protocols 2 and 3.
590
Antoine Pitrouc9dc4a22013-11-23 18:59:12 +0100591
Georg Brandlc8148262010-10-17 11:13:37 +0000592.. method:: object.__getnewargs__()
Georg Brandl116aa622007-08-15 14:28:22 +0000593
Andrés Delfino0e0534c2018-06-09 21:41:09 -0300594 This method serves a similar purpose as :meth:`__getnewargs_ex__`, but
Serhiy Storchakab6d84832015-10-13 21:26:35 +0300595 supports only positional arguments. It must return a tuple of arguments
596 ``args`` which will be passed to the :meth:`__new__` method upon unpickling.
Antoine Pitrouc9dc4a22013-11-23 18:59:12 +0100597
Serhiy Storchakab6d84832015-10-13 21:26:35 +0300598 :meth:`__getnewargs__` will not be called if :meth:`__getnewargs_ex__` is
599 defined.
600
601 .. versionchanged:: 3.6
602 Before Python 3.6, :meth:`__getnewargs__` was called instead of
603 :meth:`__getnewargs_ex__` in protocols 2 and 3.
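
As a sketch of when these hooks matter, consider a hypothetical
``Temperature`` class whose :meth:`__new__` requires a keyword-only argument.
Without :meth:`__getnewargs_ex__`, unpickling would call :meth:`__new__` with
no arguments and fail::

   import pickle

   class Temperature:
       def __new__(cls, degrees, *, unit):
           self = super().__new__(cls)
           self.degrees = degrees
           self.unit = unit
           return self

       def __getnewargs_ex__(self):
           # (positional arguments, keyword arguments) for __new__
           return (self.degrees,), {"unit": self.unit}

   boiling = Temperature(100, unit="C")
   restored = pickle.loads(pickle.dumps(boiling))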
Georg Brandl116aa622007-08-15 14:28:22 +0000604
Georg Brandl116aa622007-08-15 14:28:22 +0000605
Georg Brandlc8148262010-10-17 11:13:37 +0000606.. method:: object.__getstate__()
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000607
Georg Brandlc8148262010-10-17 11:13:37 +0000608 Classes can further influence how their instances are pickled; if the class
609 defines the method :meth:`__getstate__`, it is called and the returned object
610 is pickled as the contents for the instance, instead of the contents of the
611 instance's dictionary. If the :meth:`__getstate__` method is absent, the
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300612 instance's :attr:`~object.__dict__` is pickled as usual.
Georg Brandl116aa622007-08-15 14:28:22 +0000613
Georg Brandlc8148262010-10-17 11:13:37 +0000614
615.. method:: object.__setstate__(state)
616
617 Upon unpickling, if the class defines :meth:`__setstate__`, it is called with
618 the unpickled state. In that case, there is no requirement for the state
619 object to be a dictionary. Otherwise, the pickled state must be a dictionary
620 and its items are assigned to the new instance's dictionary.
621
622 .. note::
623
624 If :meth:`__getstate__` returns a false value, the :meth:`__setstate__`
625 method will not be called upon unpickling.
626
Georg Brandl116aa622007-08-15 14:28:22 +0000627
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000628Refer to the section :ref:`pickle-state` for more information about how to use
629the methods :meth:`__getstate__` and :meth:`__setstate__`.
Georg Brandl116aa622007-08-15 14:28:22 +0000630
Benjamin Petersond23f8222009-04-05 19:13:16 +0000631.. note::
Georg Brandle720c0a2009-04-27 16:20:50 +0000632
Benjamin Petersond23f8222009-04-05 19:13:16 +0000633 At unpickling time, some methods like :meth:`__getattr__`,
634 :meth:`__getattribute__`, or :meth:`__setattr__` may be called upon the
Antoine Pitrouc9dc4a22013-11-23 18:59:12 +0100635 instance. In case those methods rely on some internal invariant being
636 true, the type should implement :meth:`__getnewargs__` or
637 :meth:`__getnewargs_ex__` to establish such an invariant; otherwise,
638 neither :meth:`__new__` nor :meth:`__init__` will be called.
Benjamin Petersond23f8222009-04-05 19:13:16 +0000639
Georg Brandlc8148262010-10-17 11:13:37 +0000640.. index:: pair: copy; protocol
Christian Heimes05e8be12008-02-23 18:30:17 +0000641
As we shall see, pickle does not directly use the methods described above. In
fact, these methods are part of the copy protocol which implements the
:meth:`__reduce__` special method. The copy protocol provides a unified
interface for retrieving the data necessary for pickling and copying
objects. [#]_
Georg Brandl116aa622007-08-15 14:28:22 +0000647
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000648Although powerful, implementing :meth:`__reduce__` directly in your classes is
649error prone. For this reason, class designers should use the high-level
Antoine Pitrouc9dc4a22013-11-23 18:59:12 +0100650interface (i.e., :meth:`__getnewargs_ex__`, :meth:`__getstate__` and
Georg Brandlc8148262010-10-17 11:13:37 +0000651:meth:`__setstate__`) whenever possible. We will show, however, cases where
652using :meth:`__reduce__` is the only option or leads to more efficient pickling
653or both.
Georg Brandl116aa622007-08-15 14:28:22 +0000654
Georg Brandlc8148262010-10-17 11:13:37 +0000655.. method:: object.__reduce__()
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000656
Georg Brandlc8148262010-10-17 11:13:37 +0000657 The interface is currently defined as follows. The :meth:`__reduce__` method
658 takes no argument and shall return either a string or preferably a tuple (the
659 returned object is often referred to as the "reduce value").
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000660
Georg Brandlc8148262010-10-17 11:13:37 +0000661 If a string is returned, the string should be interpreted as the name of a
662 global variable. It should be the object's local name relative to its
663 module; the pickle module searches the module namespace to determine the
664 object's module. This behaviour is typically useful for singletons.
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000665
Pierre Glaser65d98d02019-05-08 21:40:25 +0200666 When a tuple is returned, it must be between two and six items long.
Georg Brandlc8148262010-10-17 11:13:37 +0000667 Optional items can either be omitted, or ``None`` can be provided as their
668 value. The semantics of each item are in order:
Georg Brandl116aa622007-08-15 14:28:22 +0000669
Georg Brandlc8148262010-10-17 11:13:37 +0000670 .. XXX Mention __newobj__ special-case?
Georg Brandl116aa622007-08-15 14:28:22 +0000671
Georg Brandlc8148262010-10-17 11:13:37 +0000672 * A callable object that will be called to create the initial version of the
673 object.
Georg Brandl116aa622007-08-15 14:28:22 +0000674
Georg Brandlc8148262010-10-17 11:13:37 +0000675 * A tuple of arguments for the callable object. An empty tuple must be given
676 if the callable does not accept any argument.
Georg Brandl116aa622007-08-15 14:28:22 +0000677
   * Optionally, the object's state, which will be passed to the object's
     :meth:`__setstate__` method as previously described. If the object has no
     such method, then the value must be a dictionary and it will be added to
     the object's :attr:`~object.__dict__` attribute.
Georg Brandl116aa622007-08-15 14:28:22 +0000682
Georg Brandlc8148262010-10-17 11:13:37 +0000683 * Optionally, an iterator (and not a sequence) yielding successive items.
684 These items will be appended to the object either using
685 ``obj.append(item)`` or, in batch, using ``obj.extend(list_of_items)``.
686 This is primarily used for list subclasses, but may be used by other
687 classes as long as they have :meth:`append` and :meth:`extend` methods with
688 the appropriate signature. (Whether :meth:`append` or :meth:`extend` is
689 used depends on which pickle protocol version is used as well as the number
690 of items to append, so both must be supported.)
Georg Brandl116aa622007-08-15 14:28:22 +0000691
Georg Brandlc8148262010-10-17 11:13:37 +0000692 * Optionally, an iterator (not a sequence) yielding successive key-value
693 pairs. These items will be stored to the object using ``obj[key] =
694 value``. This is primarily used for dictionary subclasses, but may be used
695 by other classes as long as they implement :meth:`__setitem__`.
Georg Brandl116aa622007-08-15 14:28:22 +0000696
Pierre Glaser65d98d02019-05-08 21:40:25 +0200697 * Optionally, a callable with a ``(obj, state)`` signature. This
Xtreak9b5a0ef2019-05-16 10:04:24 +0530698 callable allows the user to programmatically control the state-updating
Pierre Glaser65d98d02019-05-08 21:40:25 +0200699 behavior of a specific object, instead of using ``obj``'s static
700 :meth:`__setstate__` method. If not ``None``, this callable will have
701 priority over ``obj``'s :meth:`__setstate__`.
702
703 .. versionadded:: 3.8
704 The optional sixth tuple item, ``(obj, state)``, was added.
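
The two kinds of reduce value can be sketched as follows. The ``Matrix`` class
and the ``MISSING`` sentinel are hypothetical names chosen for this
illustration; the first returns a three-item tuple, the second a string naming
a module-level global::

   import pickle

   class Matrix:
       """Toy class whose constructor needs its dimensions."""

       def __init__(self, rows, cols):
           self.rows = rows
           self.cols = cols
           self.cells = [[0] * cols for _ in range(rows)]

       def __reduce__(self):
           # (callable, arguments for the callable, state)
           return (Matrix, (self.rows, self.cols), {"cells": self.cells})

   class _Missing:
       """Sentinel type with a single, module-level instance."""

       def __reduce__(self):
           # A string reduce value names the global to look up again.
           return "MISSING"

   MISSING = _Missing()

   m = Matrix(2, 3)
   m.cells[1][2] = 7
   m2 = pickle.loads(pickle.dumps(m))
   same_sentinel = pickle.loads(pickle.dumps(MISSING)) is MISSING

Since ``Matrix`` defines no :meth:`__setstate__`, the state dictionary is
merged into the new instance's :attr:`~object.__dict__`.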
705
Georg Brandlc8148262010-10-17 11:13:37 +0000706
707.. method:: object.__reduce_ex__(protocol)
708
709 Alternatively, a :meth:`__reduce_ex__` method may be defined. The only
710 difference is this method should take a single integer argument, the protocol
711 version. When defined, pickle will prefer it over the :meth:`__reduce__`
712 method. In addition, :meth:`__reduce__` automatically becomes a synonym for
713 the extended version. The main use for this method is to provide
714 backwards-compatible reduce values for older Python releases.
Georg Brandl116aa622007-08-15 14:28:22 +0000715
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300716.. currentmodule:: pickle
717
Alexandre Vassalotti758bca62008-10-18 19:25:07 +0000718.. _pickle-persistent:
719
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000720Persistence of External Objects
721^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Georg Brandl116aa622007-08-15 14:28:22 +0000722
Christian Heimes05e8be12008-02-23 18:30:17 +0000723.. index::
724 single: persistent_id (pickle protocol)
725 single: persistent_load (pickle protocol)
726
Georg Brandl116aa622007-08-15 14:28:22 +0000727For the benefit of object persistence, the :mod:`pickle` module supports the
728notion of a reference to an object outside the pickled data stream. Such
Alexandre Vassalotti5f3b63a2008-10-18 20:47:58 +0000729objects are referenced by a persistent ID, which should be either a string of
730alphanumeric characters (for protocol 0) [#]_ or just an arbitrary object (for
731any newer protocol).
Georg Brandl116aa622007-08-15 14:28:22 +0000732
Alexandre Vassalotti5f3b63a2008-10-18 20:47:58 +0000733The resolution of such persistent IDs is not defined by the :mod:`pickle`
734module; it will delegate this resolution to the user defined methods on the
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300735pickler and unpickler, :meth:`~Pickler.persistent_id` and
736:meth:`~Unpickler.persistent_load` respectively.
Georg Brandl116aa622007-08-15 14:28:22 +0000737
To pickle objects that have an external persistent ID, the pickler must have a
custom :meth:`~Pickler.persistent_id` method that takes an object as an
argument and returns either ``None`` or the persistent ID for that object.
When ``None`` is returned, the pickler simply pickles the object as normal.
When a persistent ID is returned, the pickler will pickle that ID in place of
the object, along with a marker so that the unpickler will recognize it as a
persistent ID.
Georg Brandl116aa622007-08-15 14:28:22 +0000744
745To unpickle external objects, the unpickler must have a custom
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300746:meth:`~Unpickler.persistent_load` method that takes a persistent ID object and
747returns the referenced object.
Georg Brandl116aa622007-08-15 14:28:22 +0000748
Here is a comprehensive example presenting how persistent IDs can be used to
pickle external objects by reference.
Georg Brandl116aa622007-08-15 14:28:22 +0000751
Alexandre Vassalotti5f3b63a2008-10-18 20:47:58 +0000752.. literalinclude:: ../includes/dbpickle.py
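
For a smaller, purely in-memory illustration of the same mechanism, the sketch
below uses a hypothetical ``REGISTRY`` dictionary as the "external" store.
The names ``RegistryPickler`` and ``RegistryUnpickler`` and the
``("registry", key)`` ID format are assumptions of this example, not part of
the :mod:`pickle` API::

   import io
   import pickle

   # Hypothetical external store; its objects are pickled by reference.
   REGISTRY = {"cfg": {"debug": True}}

   class RegistryPickler(pickle.Pickler):
       def persistent_id(self, obj):
           for key, value in REGISTRY.items():
               if value is obj:
                   # Tuple IDs need a binary protocol (1 or newer).
                   return ("registry", key)
           # Everything else is pickled as usual.
           return None

   class RegistryUnpickler(pickle.Unpickler):
       def persistent_load(self, pid):
           kind, key = pid
           if kind == "registry":
               return REGISTRY[key]
           raise pickle.UnpicklingError("unsupported persistent ID: %r" % (pid,))

   buf = io.BytesIO()
   RegistryPickler(buf).dump({"settings": REGISTRY["cfg"], "n": 1})
   loaded = RegistryUnpickler(io.BytesIO(buf.getvalue())).load()

After loading, ``loaded["settings"]`` is the very object held in ``REGISTRY``
rather than a copy of it.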
Alexandre Vassalottibcd1e3a2009-01-23 05:28:16 +0000753
Antoine Pitrou8d3c2902012-03-04 18:31:48 +0100754.. _pickle-dispatch:
755
756Dispatch Tables
757^^^^^^^^^^^^^^^
758
759If one wants to customize pickling of some classes without disturbing
760any other code which depends on pickling, then one can create a
761pickler with a private dispatch table.
762
763The global dispatch table managed by the :mod:`copyreg` module is
764available as :data:`copyreg.dispatch_table`. Therefore, one may
765choose to use a modified copy of :data:`copyreg.dispatch_table` as a
766private dispatch table.
767
768For example ::
769
   f = io.BytesIO()
   p = pickle.Pickler(f)
   p.dispatch_table = copyreg.dispatch_table.copy()
   p.dispatch_table[SomeClass] = reduce_SomeClass
774
775creates an instance of :class:`pickle.Pickler` with a private dispatch
776table which handles the ``SomeClass`` class specially. Alternatively,
777the code ::
778
   class MyPickler(pickle.Pickler):
       dispatch_table = copyreg.dispatch_table.copy()
       dispatch_table[SomeClass] = reduce_SomeClass
   f = io.BytesIO()
   p = MyPickler(f)
784
785does the same, but all instances of ``MyPickler`` will by default
786share the same dispatch table. The equivalent code using the
787:mod:`copyreg` module is ::
788
   copyreg.pickle(SomeClass, reduce_SomeClass)
   f = io.BytesIO()
   p = pickle.Pickler(f)
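
The snippets above leave ``SomeClass`` and ``reduce_SomeClass`` undefined. A
self-contained sketch, with illustrative definitions for both and a
``reduced`` list added only to observe when the reducer runs, might look like
this::

   import copyreg
   import io
   import pickle

   class SomeClass:
       def __init__(self, value):
           self.value = value

   reduced = []   # records every object that went through the reducer

   def reduce_SomeClass(obj):
       reduced.append(obj)
       return SomeClass, (obj.value,)

   obj = SomeClass(42)

   # Pickler with a private dispatch table: the reducer is used.
   f = io.BytesIO()
   p = pickle.Pickler(f)
   p.dispatch_table = copyreg.dispatch_table.copy()
   p.dispatch_table[SomeClass] = reduce_SomeClass
   p.dump(obj)
   private_calls = len(reduced)

   # The module-level functions still use the global table, which was
   # never modified, so the reducer is not called again.
   pickle.dumps(obj)
   global_calls = len(reduced) - private_calls

   restored = pickle.loads(f.getvalue())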
Georg Brandl116aa622007-08-15 14:28:22 +0000792
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000793.. _pickle-state:
794
795Handling Stateful Objects
796^^^^^^^^^^^^^^^^^^^^^^^^^
797
798.. index::
799 single: __getstate__() (copy protocol)
800 single: __setstate__() (copy protocol)
801
802Here's an example that shows how to modify pickling behavior for a class.
803The :class:`TextReader` class opens a text file, and returns the line number and
Serhiy Storchaka5bbbc942013-10-14 10:43:46 +0300804line contents each time its :meth:`!readline` method is called. If a
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000805:class:`TextReader` instance is pickled, all attributes *except* the file object
806member are saved. When the instance is unpickled, the file is reopened, and
807reading resumes from the last location. The :meth:`__setstate__` and
808:meth:`__getstate__` methods are used to implement this behavior. ::
809
   class TextReader:
       """Print and number lines in a text file."""

       def __init__(self, filename):
           self.filename = filename
           self.file = open(filename)
           self.lineno = 0

       def readline(self):
           self.lineno += 1
           line = self.file.readline()
           if not line:
               return None
           if line.endswith('\n'):
               line = line[:-1]
           return "%i: %s" % (self.lineno, line)

       def __getstate__(self):
           # Copy the object's state from self.__dict__ which contains
           # all our instance attributes. Always use the dict.copy()
           # method to avoid modifying the original state.
           state = self.__dict__.copy()
           # Remove the unpicklable entries.
           del state['file']
           return state

       def __setstate__(self, state):
           # Restore instance attributes (i.e., filename and lineno).
           self.__dict__.update(state)
           # Restore the previously opened file's state. To do so, we need to
           # reopen it and read from it until the line count is restored.
           file = open(self.filename)
           for _ in range(self.lineno):
               file.readline()
           # Finally, save the file.
           self.file = file
846
847
848A sample usage might be something like this::
849
   >>> reader = TextReader("hello.txt")
   >>> reader.readline()
   '1: Hello world!'
   >>> reader.readline()
   '2: I am line number two.'
   >>> new_reader = pickle.loads(pickle.dumps(reader))
   >>> new_reader.readline()
   '3: Goodbye!'
858
Pierre Glaser289f1f82019-05-08 23:08:25 +0200859.. _reducer_override:
860
861Custom Reduction for Types, Functions, and Other Objects
862--------------------------------------------------------
863
864.. versionadded:: 3.8
865
866Sometimes, :attr:`~Pickler.dispatch_table` may not be flexible enough.
867In particular we may want to customize pickling based on another criterion
868than the object's type, or we may want to customize the pickling of
869functions and classes.
870
For those cases, it is possible to subclass from the :class:`Pickler` class and
implement a :meth:`~Pickler.reducer_override` method. This method can return an
arbitrary reduction tuple (see :meth:`__reduce__`). It can alternatively return
``NotImplemented`` to fall back to the traditional behavior.

If both the :attr:`~Pickler.dispatch_table` and
:meth:`~Pickler.reducer_override` are defined, then the
:meth:`~Pickler.reducer_override` method takes priority.
879
880.. Note::
881 For performance reasons, :meth:`~Pickler.reducer_override` may not be
882 called for the following objects: ``None``, ``True``, ``False``, and
883 exact instances of :class:`int`, :class:`float`, :class:`bytes`,
884 :class:`str`, :class:`dict`, :class:`set`, :class:`frozenset`, :class:`list`
885 and :class:`tuple`.
886
887Here is a simple example where we allow pickling and reconstructing
888a given class::
889
   import io
   import pickle

   class MyClass:
       my_attribute = 1

   class MyPickler(pickle.Pickler):
       def reducer_override(self, obj):
           """Custom reducer for MyClass."""
           if getattr(obj, "__name__", None) == "MyClass":
               return type, (obj.__name__, obj.__bases__,
                             {'my_attribute': obj.my_attribute})
           else:
               # For any other object, fall back to usual reduction
               return NotImplemented

   f = io.BytesIO()
   p = MyPickler(f)
   p.dump(MyClass)

   del MyClass

   unpickled_class = pickle.loads(f.getvalue())

   assert isinstance(unpickled_class, type)
   assert unpickled_class.__name__ == "MyClass"
   assert unpickled_class.my_attribute == 1
917
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +0000918
Antoine Pitrou91f43802019-05-26 17:10:09 +0200919.. _pickle-oob:
920
921Out-of-band Buffers
922-------------------
923
924.. versionadded:: 3.8
925
926In some contexts, the :mod:`pickle` module is used to transfer massive amounts
927of data. Therefore, it can be important to minimize the number of memory
928copies, to preserve performance and resource consumption. However, normal
929operation of the :mod:`pickle` module, as it transforms a graph-like structure
930of objects into a sequential stream of bytes, intrinsically involves copying
931data to and from the pickle stream.
932
933This constraint can be eschewed if both the *provider* (the implementation
934of the object types to be transferred) and the *consumer* (the implementation
935of the communications system) support the out-of-band transfer facilities
936provided by pickle protocol 5 and higher.
937
938Provider API
939^^^^^^^^^^^^
940
941The large data objects to be pickled must implement a :meth:`__reduce_ex__`
942method specialized for protocol 5 and higher, which returns a
943:class:`PickleBuffer` instance (instead of e.g. a :class:`bytes` object)
944for any large data.
945
946A :class:`PickleBuffer` object *signals* that the underlying buffer is
947eligible for out-of-band data transfer. Those objects remain compatible
948with normal usage of the :mod:`pickle` module. However, consumers can also
949opt-in to tell :mod:`pickle` that they will handle those buffers by
950themselves.
951
952Consumer API
953^^^^^^^^^^^^
954
955A communications system can enable custom handling of the :class:`PickleBuffer`
956objects generated when serializing an object graph.
957
On the sending side, it needs to pass a *buffer_callback* argument to
:class:`Pickler` (or to the :func:`dump` or :func:`dumps` function), which
will be called with each :class:`PickleBuffer` generated while pickling
the object graph. Buffers accumulated by the *buffer_callback* will not
see their data copied into the pickle stream; only a cheap marker will be
inserted.
964
965On the receiving side, it needs to pass a *buffers* argument to
966:class:`Unpickler` (or to the :func:`load` or :func:`loads` function),
967which is an iterable of the buffers which were passed to *buffer_callback*.
968That iterable should produce buffers in the same order as they were passed
969to *buffer_callback*. Those buffers will provide the data expected by the
970reconstructors of the objects whose pickling produced the original
971:class:`PickleBuffer` objects.
972
973Between the sending side and the receiving side, the communications system
974is free to implement its own transfer mechanism for out-of-band buffers.
975Potential optimizations include the use of shared memory or datatype-dependent
976compression.
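
The difference between in-band and out-of-band transfer can be sketched by
pickling a :class:`PickleBuffer` directly. The one-megabyte ``payload`` below
is an arbitrary choice for illustration, and the example assumes a Python
version that supports protocol 5::

   import pickle

   payload = bytearray(b"x" * 1_000_000)

   # In-band: the data is copied into the pickle stream.
   in_band = pickle.dumps(pickle.PickleBuffer(payload), protocol=5)

   # Out-of-band: the callback collects the buffers, and only a small
   # marker is written to the stream.  list.append() returns None, which
   # tells pickle to keep the buffer out-of-band.
   buffers = []
   out_of_band = pickle.dumps(pickle.PickleBuffer(payload), protocol=5,
                              buffer_callback=buffers.append)

   # The receiver hands the buffers back in the same order.
   restored = pickle.loads(out_of_band, buffers=buffers)

Here the "transfer" is just a Python list; a real communications system would
ship the collected buffers by whatever mechanism it prefers.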
977
978Example
979^^^^^^^
980
981Here is a trivial example where we implement a :class:`bytearray` subclass
982able to participate in out-of-band buffer pickling::
983
   from pickle import PickleBuffer

   class ZeroCopyByteArray(bytearray):

       def __reduce_ex__(self, protocol):
           if protocol >= 5:
               return type(self)._reconstruct, (PickleBuffer(self),), None
           else:
               # PickleBuffer is forbidden with pickle protocols <= 4.
               return type(self)._reconstruct, (bytearray(self),)

       @classmethod
       def _reconstruct(cls, obj):
           with memoryview(obj) as m:
               # Get a handle over the original buffer object
               obj = m.obj
               if type(obj) is cls:
                   # Original buffer object is a ZeroCopyByteArray, return it
                   # as-is.
                   return obj
               else:
                   return cls(obj)
1004
1005The reconstructor (the ``_reconstruct`` class method) returns the buffer's
1006providing object if it has the right type. This is an easy way to simulate
1007zero-copy behaviour on this toy example.
1008
1009On the consumer side, we can pickle those objects the usual way, which
1010when unserialized will give us a copy of the original object::
1011
   b = ZeroCopyByteArray(b"abc")
   data = pickle.dumps(b, protocol=5)
   new_b = pickle.loads(data)
   print(b == new_b)  # True
   print(b is new_b)  # False: a copy was made
1017
1018But if we pass a *buffer_callback* and then give back the accumulated
1019buffers when unserializing, we are able to get back the original object::
1020
   b = ZeroCopyByteArray(b"abc")
   buffers = []
   data = pickle.dumps(b, protocol=5, buffer_callback=buffers.append)
   new_b = pickle.loads(data, buffers=buffers)
   print(b == new_b)  # True
   print(b is new_b)  # True: no copy was made
1027
1028This example is limited by the fact that :class:`bytearray` allocates its
1029own memory: you cannot create a :class:`bytearray` instance that is backed
1030by another object's memory. However, third-party datatypes such as NumPy
1031arrays do not have this limitation, and allow use of zero-copy pickling
1032(or making as few copies as possible) when transferring between distinct
1033processes or systems.
1034
1035.. seealso:: :pep:`574` -- Pickle protocol 5 with out-of-band data
1036
1037
Alexandre Vassalottid0392862008-10-24 01:32:40 +00001038.. _pickle-restrict:
Georg Brandl116aa622007-08-15 14:28:22 +00001039
Alexandre Vassalottid0392862008-10-24 01:32:40 +00001040Restricting Globals
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +00001041-------------------
Georg Brandl116aa622007-08-15 14:28:22 +00001042
Christian Heimes05e8be12008-02-23 18:30:17 +00001043.. index::
Alexandre Vassalottid0392862008-10-24 01:32:40 +00001044 single: find_class() (pickle protocol)
Christian Heimes05e8be12008-02-23 18:30:17 +00001045
Alexandre Vassalottid0392862008-10-24 01:32:40 +00001046By default, unpickling will import any class or function that it finds in the
1047pickle data. For many applications, this behaviour is unacceptable as it
1048permits the unpickler to import and invoke arbitrary code. Just consider what
1049this hand-crafted pickle data stream does when loaded::
Georg Brandl116aa622007-08-15 14:28:22 +00001050
   >>> import pickle
   >>> pickle.loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
   hello world
   0
Georg Brandl116aa622007-08-15 14:28:22 +00001055
In this example, the unpickler imports the :func:`os.system` function and then
applies it to the string argument "echo hello world". Although this example is
inoffensive, it is not difficult to imagine one that could damage your system.
Georg Brandl116aa622007-08-15 14:28:22 +00001059
For this reason, you may want to control what gets unpickled by customizing
:meth:`Unpickler.find_class`. Contrary to what its name suggests,
:meth:`Unpickler.find_class` is called whenever a global (i.e., a class or
a function) is requested. Thus it is possible to either completely forbid
globals or restrict them to a safe subset.

Here is an example of an unpickler allowing only a few safe classes from the
:mod:`builtins` module to be loaded::
1068
   import builtins
   import io
   import pickle

   safe_builtins = {
       'range',
       'complex',
       'set',
       'frozenset',
       'slice',
   }

   class RestrictedUnpickler(pickle.Unpickler):

       def find_class(self, module, name):
           # Only allow safe classes from builtins.
           if module == "builtins" and name in safe_builtins:
               return getattr(builtins, name)
           # Forbid everything else.
           raise pickle.UnpicklingError("global '%s.%s' is forbidden" %
                                        (module, name))

   def restricted_loads(s):
       """Helper function analogous to pickle.loads()."""
       return RestrictedUnpickler(io.BytesIO(s)).load()
1094
A sample usage of our unpickler working as intended::

   >>> restricted_loads(pickle.dumps([1, 2, range(15)]))
   [1, 2, range(0, 15)]
   >>> restricted_loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
   Traceback (most recent call last):
     ...
   pickle.UnpicklingError: global 'os.system' is forbidden
   >>> restricted_loads(b'cbuiltins\neval\n'
   ...                  b'(S\'getattr(__import__("os"), "system")'
   ...                  b'("echo hello world")\'\ntR.')
   Traceback (most recent call last):
     ...
   pickle.UnpicklingError: global 'builtins.eval' is forbidden
1109
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +00001110
.. XXX Add note about how extension codes could evade our protection
   mechanism (e.g. cached classes do not invoke find_class()).
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +00001113
As our examples show, you have to be careful with what you allow to be
unpickled. Therefore if security is a concern, you may want to consider
alternatives such as the marshalling API in :mod:`xmlrpc.client` or
third-party solutions.
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +00001118
Georg Brandl116aa622007-08-15 14:28:22 +00001119
Antoine Pitroud4d60552013-12-07 00:56:59 +01001120Performance
1121-----------
1122
1123Recent versions of the pickle protocol (from protocol 2 and upwards) feature
1124efficient binary encodings for several common features and built-in types.
1125Also, the :mod:`pickle` module has a transparent optimizer written in C.
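
The effect of the binary encodings can be seen by comparing output sizes. The
following sketch uses an arbitrary list of small integers as sample data;
actual savings depend on the objects being pickled::

   import pickle

   data = list(range(1000))

   # Size in bytes of the same object under every available protocol.
   sizes = {proto: len(pickle.dumps(data, protocol=proto))
            for proto in range(pickle.HIGHEST_PROTOCOL + 1)}

   # The text-based protocol 0 needs noticeably more space than the
   # binary protocol 2 for this data.
   assert sizes[0] > sizes[2]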
1126
1127
Georg Brandl116aa622007-08-15 14:28:22 +00001128.. _pickle-example:
1129
Alexandre Vassalotti9d7665d2009-04-03 06:13:29 +00001130Examples
1131--------
Georg Brandl116aa622007-08-15 14:28:22 +00001132
Alexandre Vassalotti9d7665d2009-04-03 06:13:29 +00001133For the simplest code, use the :func:`dump` and :func:`load` functions. ::
Georg Brandl116aa622007-08-15 14:28:22 +00001134
   import pickle

   # An arbitrary collection of objects supported by pickle.
   data = {
       'a': [1, 2.0, 3, 4+6j],
       'b': ("character string", b"byte string"),
       'c': {None, True, False}
   }

   with open('data.pickle', 'wb') as f:
       # Pickle the 'data' dictionary using the highest protocol available.
       pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)
Georg Brandl116aa622007-08-15 14:28:22 +00001147
Georg Brandl116aa622007-08-15 14:28:22 +00001148
Alexandre Vassalottibcd1e3a2009-01-23 05:28:16 +00001149The following example reads the resulting pickled data. ::
Georg Brandl116aa622007-08-15 14:28:22 +00001150
   import pickle

   with open('data.pickle', 'rb') as f:
       # The protocol version used is detected automatically, so we do not
       # have to specify it.
       data = pickle.load(f)
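
When the size of a pickle matters, two options are
:func:`pickletools.optimize` and compressing the stream, for instance with
:mod:`gzip`. The sketch below applies both to some made-up sample records; how
much either step saves depends on the data::

   import gzip
   import pickle
   import pickletools

   records = [{"id": i, "name": "item-%d" % i} for i in range(500)]
   raw = pickle.dumps(records, protocol=pickle.HIGHEST_PROTOCOL)

   # Remove memo opcodes that are never referenced.
   optimized = pickletools.optimize(raw)

   # Compress the stream; the receiver must decompress before unpickling.
   compressed = gzip.compress(optimized)

   assert pickle.loads(optimized) == records
   assert pickle.loads(gzip.decompress(compressed)) == records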
Georg Brandl116aa622007-08-15 14:28:22 +00001157
Georg Brandl116aa622007-08-15 14:28:22 +00001158
Alexandre Vassalotti9d7665d2009-04-03 06:13:29 +00001159.. XXX: Add examples showing how to optimize pickles for size (like using
1160.. pickletools.optimize() or the gzip module).
1161
1162
Georg Brandl116aa622007-08-15 14:28:22 +00001163.. seealso::
1164
Alexandre Vassalottif7fa63d2008-05-11 08:55:36 +00001165 Module :mod:`copyreg`
Georg Brandl116aa622007-08-15 14:28:22 +00001166 Pickle interface constructor registration for extension types.
1167
Alexandre Vassalotti9d7665d2009-04-03 06:13:29 +00001168 Module :mod:`pickletools`
1169 Tools for working with and analyzing pickled data.
1170
Georg Brandl116aa622007-08-15 14:28:22 +00001171 Module :mod:`shelve`
1172 Indexed databases of objects; uses :mod:`pickle`.
1173
1174 Module :mod:`copy`
1175 Shallow and deep object copying.
1176
1177 Module :mod:`marshal`
1178 High-performance serialization of built-in types.
1179
1180
Georg Brandl116aa622007-08-15 14:28:22 +00001181.. rubric:: Footnotes
1182
1183.. [#] Don't confuse this with the :mod:`marshal` module
1184
Ethan Furman2498d9e2013-10-18 00:45:40 -07001185.. [#] This is why :keyword:`lambda` functions cannot be pickled: all
Serhiy Storchaka2b57c432018-12-19 08:09:46 +02001186 :keyword:`!lambda` functions share the same name: ``<lambda>``.
Ethan Furman2498d9e2013-10-18 00:45:40 -07001187
Georg Brandl116aa622007-08-15 14:28:22 +00001188.. [#] The exception raised will likely be an :exc:`ImportError` or an
1189 :exc:`AttributeError` but it could be something else.
1190
Alexandre Vassalotti73b90a82008-10-29 23:32:33 +00001191.. [#] The :mod:`copy` module uses this protocol for shallow and deep copying
1192 operations.
Alexandre Vassalotti5f3b63a2008-10-18 20:47:58 +00001193
.. [#] The limitation on alphanumeric characters is due to the fact that
   persistent IDs in protocol 0 are delimited by the newline
   character. Therefore if any kind of newline character occurs in
   persistent IDs, the resulting pickle will become unreadable.