:mod:`pickle` --- Python object serialization
=============================================

.. module:: pickle
   :synopsis: Convert Python objects to streams of bytes and back.

.. sectionauthor:: Jim Kerr <jbkerr@sr.hp.com>.
.. sectionauthor:: Barry Warsaw <barry@python.org>

**Source code:** :source:`Lib/pickle.py`

.. index::
   single: persistence
   pair: persistent; objects
   pair: serializing; objects
   pair: marshalling; objects
   pair: flattening; objects
   pair: pickling; objects

--------------

The :mod:`pickle` module implements binary protocols for serializing and
de-serializing a Python object structure. *"Pickling"* is the process
whereby a Python object hierarchy is converted into a byte stream, and
*"unpickling"* is the inverse operation, whereby a byte stream
(from a :term:`binary file` or :term:`bytes-like object`) is converted
back into an object hierarchy. Pickling (and unpickling) is alternatively
known as "serialization", "marshalling," [#]_ or "flattening"; however, to
avoid confusion, the terms used here are "pickling" and "unpickling".

.. warning::

   The :mod:`pickle` module is not secure against erroneous or maliciously
   constructed data. Never unpickle data received from an untrusted or
   unauthenticated source.


Relationship to other Python modules
------------------------------------

Comparison with ``marshal``
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Python has a more primitive serialization module called :mod:`marshal`, but in
general :mod:`pickle` should always be the preferred way to serialize Python
objects. :mod:`marshal` exists primarily to support Python's :file:`.pyc`
files.

The :mod:`pickle` module differs from :mod:`marshal` in several significant ways:

* The :mod:`pickle` module keeps track of the objects it has already serialized,
  so that later references to the same object won't be serialized again.
  :mod:`marshal` doesn't do this.

  This has implications both for recursive objects and object sharing. Recursive
  objects are objects that contain references to themselves. These are not
  handled by marshal, and in fact, attempting to marshal recursive objects will
  crash your Python interpreter. Object sharing happens when there are multiple
  references to the same object in different places in the object hierarchy being
  serialized. :mod:`pickle` stores such objects only once, and ensures that all
  other references point to the master copy. Shared objects remain shared, which
  can be very important for mutable objects.

* :mod:`marshal` cannot be used to serialize user-defined classes and their
  instances. :mod:`pickle` can save and restore class instances transparently;
  however, the class definition must be importable and live in the same module
  as when the object was stored.

* The :mod:`marshal` serialization format is not guaranteed to be portable
  across Python versions. Because its primary job in life is to support
  :file:`.pyc` files, the Python implementers reserve the right to change the
  serialization format in non-backwards compatible ways should the need arise.
  The :mod:`pickle` serialization format is guaranteed to be backwards compatible
  across Python releases, provided that a compatible pickle protocol is chosen
  and that the pickling and unpickling code handles Python 2 to Python 3 type
  differences if your data crosses that uniquely breaking language boundary.
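
As a small illustration of the object-tracking behaviour described in the first
point, the following sketch pickles a container that references one list twice
and also contains itself. The variable names are made up for the example.

```python
import pickle

# One list referenced from two places, plus a dict that contains itself.
shared = [1, 2, 3]
container = {"first": shared, "second": shared}
container["self"] = container

restored = pickle.loads(pickle.dumps(container))

# Sharing survives the round trip: both keys refer to a single list object,
# so a change made through one key is visible through the other.
restored["first"].append(4)
print(restored["second"])
print(restored["first"] is restored["second"])

# The self-reference comes back as a reference, not as an endless copy.
print(restored["self"] is restored)
```

The restored structure is a copy of the original one, so ``shared`` itself is
left unchanged by the ``append`` call.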

Comparison with ``json``
^^^^^^^^^^^^^^^^^^^^^^^^

There are fundamental differences between the pickle protocols and
`JSON (JavaScript Object Notation) <http://json.org>`_:

* JSON is a text serialization format (it outputs unicode text, although
  most of the time it is then encoded to ``utf-8``), while pickle is
  a binary serialization format;

* JSON is human-readable, while pickle is not;

* JSON is interoperable and widely used outside of the Python ecosystem,
  while pickle is Python-specific;

* JSON, by default, can only represent a subset of the Python built-in
  types, and no custom classes; pickle can represent an extremely large
  number of Python types (many of them automatically, by clever usage
  of Python's introspection facilities; complex cases can be tackled by
  implementing :ref:`specific object APIs <pickle-inst>`).
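
The last point can be seen with a value that mixes a few built-in types. The
sketch below is only an illustration; the dictionary contents are arbitrary.

```python
import json
import pickle

value = {"tags": {"a", "b"}, "point": (1, 2), "ratio": complex(1, 2)}

# The default JSON encoder rejects sets and complex numbers.
try:
    json.dumps(value)
    json_ok = True
except TypeError:
    json_ok = False

# pickle restores the same types that went in.
restored = pickle.loads(pickle.dumps(value))
print(json_ok, restored == value)

# Even where JSON succeeds, some type information is lost:
# a tuple comes back as a list.
as_json = json.loads(json.dumps({"point": (1, 2)}))
print(type(as_json["point"]).__name__, type(restored["point"]).__name__)
```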

.. seealso::
   The :mod:`json` module: a standard library module allowing JSON
   serialization and deserialization.


.. _pickle-protocols:

Data stream format
------------------

.. index::
   single: External Data Representation

The data format used by :mod:`pickle` is Python-specific. This has the
advantage that there are no restrictions imposed by external standards such as
JSON or XDR (which can't represent pointer sharing); however it means that
non-Python programs may not be able to reconstruct pickled Python objects.

By default, the :mod:`pickle` data format uses a relatively compact binary
representation. If you need optimal size characteristics, you can efficiently
:doc:`compress <archiving>` pickled data.

The module :mod:`pickletools` contains tools for analyzing data streams
generated by :mod:`pickle`. :mod:`pickletools` source code has extensive
comments about opcodes used by pickle protocols.
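
A minimal sketch of both ideas follows. It assumes :mod:`zlib` as the
compressor (any of the standard compression modules would do) and uses
:func:`pickletools.dis` to print the opcodes of a small pickle; the sample data
is invented for the example.

```python
import io
import pickle
import pickletools
import zlib

# Repetitive data compresses well once it has been pickled.
records = [{"id": i, "name": "item"} for i in range(1000)]
raw = pickle.dumps(records)
packed = zlib.compress(raw)
print(len(raw), len(packed))

# Decompress first, then unpickle.
roundtrip = pickle.loads(zlib.decompress(packed))
print(roundtrip == records)

# Disassemble a small pickle into a readable opcode listing.
listing = io.StringIO()
pickletools.dis(pickle.dumps((1, "two")), out=listing)
print(listing.getvalue())
```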

There are currently 6 different protocols which can be used for pickling.
The higher the protocol used, the more recent the version of Python needed
to read the pickle produced.

* Protocol version 0 is the original "human-readable" protocol and is
  backwards compatible with earlier versions of Python.

* Protocol version 1 is an old binary format which is also compatible with
  earlier versions of Python.

* Protocol version 2 was introduced in Python 2.3. It provides much more
  efficient pickling of :term:`new-style class`\es. Refer to :pep:`307` for
  information about improvements brought by protocol 2.

* Protocol version 3 was added in Python 3.0. It has explicit support for
  :class:`bytes` objects and cannot be unpickled by Python 2.x. This was
  the default protocol in Python 3.0--3.7.

* Protocol version 4 was added in Python 3.4. It adds support for very large
  objects, pickling more kinds of objects, and some data format
  optimizations. It is the default protocol starting with Python 3.8.
  Refer to :pep:`3154` for information about improvements brought by
  protocol 4.

* Protocol version 5 was added in Python 3.8. It adds support for out-of-band
  data and speedup for in-band data. Refer to :pep:`574` for information about
  improvements brought by protocol 5.
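
The differences between protocols are visible in the first bytes of a pickle.
The sketch below pickles one value with every protocol the running interpreter
supports; it relies on the fact that protocols 2 and newer start with the
``PROTO`` opcode (``0x80``) followed by the version byte. The ``summary``
dictionary is just a helper for the example.

```python
import pickle

value = {"answer": 42}
summary = {}

for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    data = pickle.dumps(value, protocol=protocol)
    # Record the two leading bytes and whether the stream is pure ASCII.
    summary[protocol] = (data[:2], all(byte < 128 for byte in data))
    # Every protocol can be read back without naming it explicitly.
    assert pickle.loads(data) == value

for protocol, (header, is_ascii) in summary.items():
    print(protocol, header, is_ascii)
```

Protocol 0 produces printable ASCII, which is why it is called
"human-readable"; the later protocols are binary.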

.. note::
   Serialization is a more primitive notion than persistence; although
   :mod:`pickle` reads and writes file objects, it does not handle the issue of
   naming persistent objects, nor the (even more complicated) issue of concurrent
   access to persistent objects. The :mod:`pickle` module can transform a complex
   object into a byte stream and it can transform the byte stream into an object
   with the same internal structure. Perhaps the most obvious thing to do with
   these byte streams is to write them onto a file, but it is also conceivable to
   send them across a network or store them in a database. The :mod:`shelve`
   module provides a simple interface to pickle and unpickle objects on
   DBM-style database files.


Module Interface
----------------

To serialize an object hierarchy, you simply call the :func:`dumps` function.
Similarly, to de-serialize a data stream, you call the :func:`loads` function.
However, if you want more control over serialization and de-serialization,
you can create a :class:`Pickler` or an :class:`Unpickler` object, respectively.

The :mod:`pickle` module provides the following constants:


.. data:: HIGHEST_PROTOCOL

   An integer, the highest :ref:`protocol version <pickle-protocols>`
   available. This value can be passed as a *protocol* value to functions
   :func:`dump` and :func:`dumps` as well as the :class:`Pickler`
   constructor.

.. data:: DEFAULT_PROTOCOL

   An integer, the default :ref:`protocol version <pickle-protocols>` used
   for pickling. May be less than :data:`HIGHEST_PROTOCOL`. Currently the
   default protocol is 4, first introduced in Python 3.4 and incompatible
   with previous versions.

   .. versionchanged:: 3.0

      The default protocol is 3.

   .. versionchanged:: 3.8

      The default protocol is 4.
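
A short sketch of how the two constants relate to the *protocol* argument. It
assumes a Python 3 interpreter, where the default protocol is at least 3 and
the pickle therefore starts with a protocol marker.

```python
import pickle

print(pickle.DEFAULT_PROTOCOL, pickle.HIGHEST_PROTOCOL)

default_data = pickle.dumps([1, 2, 3])
highest_data = pickle.dumps([1, 2, 3], protocol=pickle.HIGHEST_PROTOCOL)

# A negative protocol number is shorthand for HIGHEST_PROTOCOL.
negative_data = pickle.dumps([1, 2, 3], protocol=-1)
print(negative_data == highest_data)

# The second byte of a protocol 2+ pickle is the protocol number.
print(default_data[1] == pickle.DEFAULT_PROTOCOL)
```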

The :mod:`pickle` module provides the following functions to make the pickling
process more convenient:

.. function:: dump(obj, file, protocol=None, \*, fix_imports=True, buffer_callback=None)

   Write the pickled representation of the object *obj* to the open
   :term:`file object` *file*. This is equivalent to
   ``Pickler(file, protocol).dump(obj)``.

   Arguments *file*, *protocol*, *fix_imports* and *buffer_callback* have
   the same meaning as in the :class:`Pickler` constructor.

   .. versionchanged:: 3.8
      The *buffer_callback* argument was added.

.. function:: dumps(obj, protocol=None, \*, fix_imports=True, buffer_callback=None)

   Return the pickled representation of the object *obj* as a :class:`bytes` object,
   instead of writing it to a file.

   Arguments *protocol*, *fix_imports* and *buffer_callback* have the same
   meaning as in the :class:`Pickler` constructor.

   .. versionchanged:: 3.8
      The *buffer_callback* argument was added.

.. function:: load(file, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)

   Read the pickled representation of an object from the open :term:`file object`
   *file* and return the reconstituted object hierarchy specified therein.
   This is equivalent to ``Unpickler(file).load()``.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed. Bytes past the pickled representation
   of the object are ignored.

   Arguments *file*, *fix_imports*, *encoding*, *errors* and *buffers*
   have the same meaning as in the :class:`Unpickler` constructor.

   .. versionchanged:: 3.8
      The *buffers* argument was added.

.. function:: loads(bytes_object, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)

   Return the reconstituted object hierarchy of the pickled representation
   *bytes_object* of an object.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed. Bytes past the pickled representation
   of the object are ignored.

   Arguments *fix_imports*, *encoding*, *errors* and *buffers*
   have the same meaning as in the :class:`Unpickler` constructor.

   .. versionchanged:: 3.8
      The *buffers* argument was added.
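
The four functions can be tried without touching the file system. The sketch
below uses an in-memory :class:`io.BytesIO` object as the file; the pickled
values are arbitrary.

```python
import io
import pickle

# dump() appends one pickle per call; load() reads them back one at a time.
stream = io.BytesIO()
pickle.dump({"a": 1}, stream)
pickle.dump([1, 2, 3], stream)

stream.seek(0)
first = pickle.load(stream)
second = pickle.load(stream)
print(first, second)

# dumps()/loads() work on bytes objects instead of files.
data = pickle.dumps("hello")
with_trailing = pickle.loads(data + b"ignored trailing bytes")
print(with_trailing)
```

The last call shows that bytes following the end of the pickled representation
are ignored by :func:`loads`.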


The :mod:`pickle` module defines three exceptions:

.. exception:: PickleError

   Common base class for the other pickling exceptions. It inherits
   :exc:`Exception`.

.. exception:: PicklingError

   Error raised when an unpicklable object is encountered by :class:`Pickler`.
   It inherits :exc:`PickleError`.

   Refer to :ref:`pickle-picklable` to learn what kinds of objects can be
   pickled.

.. exception:: UnpicklingError

   Error raised when there is a problem unpickling an object, such as data
   corruption or a security violation. It inherits :exc:`PickleError`.

   Note that other exceptions may also be raised during unpickling, including
   (but not necessarily limited to) :exc:`AttributeError`, :exc:`EOFError`,
   :exc:`ImportError`, and :exc:`IndexError`.
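
The sketch below shows one way to handle these exceptions together. The helper
``failure_kind`` is hypothetical, and the exact exception raised for a given
input can depend on the Python implementation and version, so the code catches
the pickle-specific base class as well as the other exceptions named above.

```python
import pickle


def failure_kind(action):
    """Call *action* and return the name of the exception it raises."""
    try:
        action()
    except pickle.PickleError as exc:
        # Covers both PicklingError and UnpicklingError.
        return type(exc).__name__
    except (AttributeError, EOFError, ImportError, IndexError) as exc:
        return type(exc).__name__
    return "no error"


# An empty byte string ends before any pickle data was read.
empty_result = failure_kind(lambda: pickle.loads(b""))

# A byte that is not a valid pickle opcode (behaviour observed on CPython).
garbage_result = failure_kind(lambda: pickle.loads(b"\x00"))

# Lambdas cannot be pickled; depending on the Python version this surfaces
# as PicklingError or as AttributeError.
lambda_result = failure_kind(lambda: pickle.dumps(lambda: None))

print(empty_result, garbage_result, lambda_result)
```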


The :mod:`pickle` module exports three classes, :class:`Pickler`,
:class:`Unpickler` and :class:`PickleBuffer`:

.. class:: Pickler(file, protocol=None, \*, fix_imports=True, buffer_callback=None)

   This takes a binary file for writing a pickle data stream.

   The optional *protocol* argument, an integer, tells the pickler to use
   the given protocol; supported protocols are 0 to :data:`HIGHEST_PROTOCOL`.
   If not specified, the default is :data:`DEFAULT_PROTOCOL`. If a negative
   number is specified, :data:`HIGHEST_PROTOCOL` is selected.

   The *file* argument must have a write() method that accepts a single bytes
   argument. It can thus be an on-disk file opened for binary writing, an
   :class:`io.BytesIO` instance, or any other custom object that meets this
   interface.

   If *fix_imports* is true and *protocol* is less than 3, pickle will try to
   map the new Python 3 names to the old module names used in Python 2, so
   that the pickle data stream is readable with Python 2.

   If *buffer_callback* is None (the default), buffer views are
   serialized into *file* as part of the pickle stream.

   If *buffer_callback* is not None, then it can be called any number
   of times with a buffer view. If the callback returns a false value
   (such as None), the given buffer is :ref:`out-of-band <pickle-oob>`;
   otherwise the buffer is serialized in-band, i.e. inside the pickle stream.

   It is an error if *buffer_callback* is not None and *protocol* is
   None or smaller than 5.

   .. versionchanged:: 3.8
      The *buffer_callback* argument was added.

   .. method:: dump(obj)

      Write the pickled representation of *obj* to the open file object given in
      the constructor.

   .. method:: persistent_id(obj)

      Do nothing by default. This exists so a subclass can override it.

      If :meth:`persistent_id` returns ``None``, *obj* is pickled as usual. Any
      other value causes :class:`Pickler` to emit the returned value as a
      persistent ID for *obj*. The meaning of this persistent ID should be
      defined by :meth:`Unpickler.persistent_load`. Note that the value
      returned by :meth:`persistent_id` cannot itself have a persistent ID.

      See :ref:`pickle-persistent` for details and examples of uses.

   .. attribute:: dispatch_table

      A pickler object's dispatch table is a registry of *reduction
      functions* of the kind which can be declared using
      :func:`copyreg.pickle`. It is a mapping whose keys are classes
      and whose values are reduction functions. A reduction function
      takes a single argument of the associated class and should
      conform to the same interface as a :meth:`__reduce__`
      method.

      By default, a pickler object will not have a
      :attr:`dispatch_table` attribute, and it will instead use the
      global dispatch table managed by the :mod:`copyreg` module.
      However, to customize the pickling for a specific pickler object
      one can set the :attr:`dispatch_table` attribute to a dict-like
      object. Alternatively, if a subclass of :class:`Pickler` has a
      :attr:`dispatch_table` attribute then this will be used as the
      default dispatch table for instances of that class.

      See :ref:`pickle-dispatch` for usage examples.

      .. versionadded:: 3.3

   .. method:: reducer_override(self, obj)

      Special reducer that can be defined in :class:`Pickler` subclasses. This
      method has priority over any reducer in the :attr:`dispatch_table`. It
      should conform to the same interface as a :meth:`__reduce__` method, and
      can optionally return ``NotImplemented`` to fall back on
      :attr:`dispatch_table`-registered reducers to pickle ``obj``.

      For a detailed example, see :ref:`reducer_override`.

      .. versionadded:: 3.8

   .. attribute:: fast

      Deprecated. Enable fast mode if set to a true value. The fast mode
      disables the usage of memo, therefore speeding the pickling process by not
      generating superfluous PUT opcodes. It should not be used with
      self-referential objects; doing otherwise will cause :class:`Pickler` to
      recurse infinitely.

      Use :func:`pickletools.optimize` if you need more compact pickles.
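
To make the :meth:`~Pickler.persistent_id` hook more concrete, here is a
minimal sketch that pairs it with :meth:`Unpickler.persistent_load`. The
``REGISTRY`` dictionary and both subclasses are hypothetical names invented
for the example; they stand in for objects that are stored somewhere outside
the pickle.

```python
import io
import pickle

# Hypothetical registry of objects that live outside the pickle stream.
REGISTRY = {"settings": {"debug": True}}


class RegistryPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Emit a persistent ID for registered objects; pickle the rest as usual.
        for key, value in REGISTRY.items():
            if obj is value:
                return key
        return None


class RegistryUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Resolve the persistent ID, or reject it if it is unknown.
        try:
            return REGISTRY[pid]
        except KeyError:
            raise pickle.UnpicklingError(f"unknown persistent ID: {pid!r}") from None


buffer = io.BytesIO()
RegistryPickler(buffer).dump({"job": "nightly", "settings": REGISTRY["settings"]})

buffer.seek(0)
restored = RegistryUnpickler(buffer).load()

# The registered object is looked up again rather than copied.
print(restored["settings"] is REGISTRY["settings"])
```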


.. class:: Unpickler(file, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)

   This takes a binary file for reading a pickle data stream.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed.

   The argument *file* must have three methods, a read() method that takes an
   integer argument, a readinto() method that takes a buffer argument
   and a readline() method that requires no arguments, as in the
   :class:`io.BufferedIOBase` interface. Thus *file* can be an on-disk file
   opened for binary reading, an :class:`io.BytesIO` object, or any other
   custom object that meets this interface.

   The optional arguments *fix_imports*, *encoding* and *errors* are used
   to control compatibility support for pickle streams generated by Python 2.
   If *fix_imports* is true, pickle will try to map the old Python 2 names
   to the new names used in Python 3. The *encoding* and *errors* tell
   pickle how to decode 8-bit string instances pickled by Python 2;
   these default to 'ASCII' and 'strict', respectively. The *encoding* can
   be 'bytes' to read these 8-bit string instances as bytes objects.
   Using ``encoding='latin1'`` is required for unpickling NumPy arrays and
   instances of :class:`~datetime.datetime`, :class:`~datetime.date` and
   :class:`~datetime.time` pickled by Python 2.

   If *buffers* is None (the default), then all data necessary for
   deserialization must be contained in the pickle stream. This means
   that the *buffer_callback* argument was None when a :class:`Pickler`
   was instantiated (or when :func:`dump` or :func:`dumps` was called).

   If *buffers* is not None, it should be an iterable of buffer-enabled
   objects that is consumed each time the pickle stream references
   an :ref:`out-of-band <pickle-oob>` buffer view. Such buffers have been
   given in order to the *buffer_callback* of a Pickler object.

   .. versionchanged:: 3.8
      The *buffers* argument was added.

   .. method:: load()

      Read the pickled representation of an object from the open file object
      given in the constructor, and return the reconstituted object hierarchy
      specified therein. Bytes past the pickled representation of the object
      are ignored.

   .. method:: persistent_load(pid)

      Raise an :exc:`UnpicklingError` by default.

      If defined, :meth:`persistent_load` should return the object specified by
      the persistent ID *pid*. If an invalid persistent ID is encountered, an
      :exc:`UnpicklingError` should be raised.

      See :ref:`pickle-persistent` for details and examples of uses.

   .. method:: find_class(module, name)

      Import *module* if necessary and return the object called *name* from it,
      where the *module* and *name* arguments are :class:`str` objects. Note,
      unlike its name suggests, :meth:`find_class` is also used for finding
      functions.

      Subclasses may override this to gain control over what type of objects and
      how they can be loaded, potentially reducing security risks. Refer to
      :ref:`pickle-restrict` for details.

      .. audit-event:: pickle.find_class module,name pickle.Unpickler.find_class
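
A minimal sketch of overriding :meth:`~Unpickler.find_class` is shown below.
The allow-list ``SAFE_BUILTINS``, the class ``RestrictedUnpickler`` and the
helper ``restricted_loads`` are names chosen for this example. Such a filter
narrows what a pickle may reference; it is not a guarantee that unpickling
untrusted data becomes safe.

```python
import builtins
import io
import os
import pickle

# Hypothetical allow-list of globals that a pickle may reference.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Only hand out the builtins that were explicitly allowed.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Refuse everything else, including functions from other modules.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Counterpart of pickle.loads() that uses the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


allowed = restricted_loads(pickle.dumps([1, 2, range(3)]))
print(allowed)

# A pickle that refers to a function in another module is rejected.
try:
    restricted_loads(pickle.dumps(os.getcwd))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
print(blocked)
```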

.. class:: PickleBuffer(buffer)

   A wrapper for a buffer representing picklable data. *buffer* must be a
   :ref:`buffer-providing <bufferobjects>` object, such as a
   :term:`bytes-like object` or an N-dimensional array.

   :class:`PickleBuffer` is itself a buffer provider, therefore it is
   possible to pass it to other APIs expecting a buffer-providing object,
   such as :class:`memoryview`.

   :class:`PickleBuffer` objects can only be serialized using pickle
   protocol 5 or higher. They are eligible for
   :ref:`out-of-band serialization <pickle-oob>`.

   .. versionadded:: 3.8

   .. method:: raw()

      Return a :class:`memoryview` of the memory area underlying this buffer.
      The returned object is a one-dimensional, C-contiguous memoryview
      with format ``B`` (unsigned bytes). :exc:`BufferError` is raised if
      the buffer is neither C- nor Fortran-contiguous.

   .. method:: release()

      Release the underlying buffer exposed by the PickleBuffer object.
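
The following sketch contrasts in-band and out-of-band serialization of a
:class:`PickleBuffer`. It assumes Python 3.8 or newer (protocol 5), and uses a
plain list's ``append`` method as the *buffer_callback*; since ``append``
returns ``None``, every buffer is treated as out-of-band.

```python
import pickle

payload = bytearray(b"\x00" * 4096)
wrapped = pickle.PickleBuffer(payload)

# In-band: the buffer contents are copied into the pickle stream.
in_band = pickle.dumps(wrapped, protocol=5)

# Out-of-band: the callback receives the buffer and the stream only
# records that a buffer belongs at this position.
collected = []
out_of_band = pickle.dumps(wrapped, protocol=5, buffer_callback=collected.append)
print(len(in_band), len(out_of_band), len(collected))

# The collected buffers must be supplied again, in order, when loading.
restored_in_band = pickle.loads(in_band)
restored_oob = pickle.loads(out_of_band, buffers=collected)
print(bytes(memoryview(restored_oob)) == bytes(payload))

# raw() exposes the underlying memory as a flat view of unsigned bytes.
view = wrapped.raw()
print(view.format, view.ndim, view.nbytes)
```

How the collected buffers are transported to the receiving side is left to the
application; the sketch simply keeps them in a list.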


.. _pickle-picklable:

What can be pickled and unpickled?
----------------------------------

The following types can be pickled:

* ``None``, ``True``, and ``False``

* integers, floating point numbers, complex numbers

* strings, bytes, bytearrays

* tuples, lists, sets, and dictionaries containing only picklable objects

* functions defined at the top level of a module (using :keyword:`def`, not
  :keyword:`lambda`)

* built-in functions defined at the top level of a module

* classes that are defined at the top level of a module

* instances of such classes whose :attr:`~object.__dict__` or the result of
  calling :meth:`__getstate__` is picklable (see section :ref:`pickle-inst` for
  details).

Attempts to pickle unpicklable objects will raise the :exc:`PicklingError`
exception; when this happens, an unspecified number of bytes may have already
been written to the underlying file. Trying to pickle a highly recursive data
structure may exceed the maximum recursion depth; a :exc:`RecursionError` will be
raised in this case. You can carefully raise this limit with
:func:`sys.setrecursionlimit`.

Note that functions (built-in and user-defined) are pickled by "fully qualified"
name reference, not by value. [#]_ This means that only the function name is
pickled, along with the name of the module the function is defined in. Neither
the function's code, nor any of its function attributes are pickled. Thus the
defining module must be importable in the unpickling environment, and the module
must contain the named object, otherwise an exception will be raised. [#]_
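
Pickling by reference can be observed directly. The sketch below pickles
:func:`json.dumps`, chosen only because it is an ordinary module-level function
from the standard library, and inspects the resulting bytes.

```python
import json
import pickle

data = pickle.dumps(json.dumps)

# The pickle contains the module name and the function name, not the code.
print(data)
print(b"json" in data, b"dumps" in data, len(data))

# Unpickling imports the module and looks the name up again, so the
# result is the very same function object.
print(pickle.loads(data) is json.dumps)
```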

Similarly, classes are pickled by named reference, so the same restrictions in
the unpickling environment apply. Note that none of the class's code or data is
pickled, so in the following example the class attribute ``attr`` is not
restored in the unpickling environment::

   import pickle

   class Foo:
       attr = 'A class attribute'

   # Only a reference to Foo (its module and name) is stored here.
   picklestring = pickle.dumps(Foo)

These restrictions are why picklable functions and classes must be defined at
the top level of a module.

Similarly, when class instances are pickled, their class's code and data are not
pickled along with them. Only the instance data are pickled. This is done on
purpose, so you can fix bugs in a class or add methods to the class and still
load objects that were created with an earlier version of the class. If you
plan to have long-lived objects that will see many versions of a class, it may
be worthwhile to put a version number in the objects so that suitable
conversions can be made by the class's :meth:`__setstate__` method.
532
533
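One possible way to apply the version-number advice is sketched below. The
``Account`` class, its ``_version`` key and the "version 1" layout are all
hypothetical and exist only for illustration.

```python
import pickle


class Account:
    STATE_VERSION = 2  # bump whenever the attribute layout changes

    def __init__(self, owner, balance_cents):
        self.owner = owner
        self.balance_cents = balance_cents

    def __getstate__(self):
        # Store the layout version next to the instance attributes.
        state = self.__dict__.copy()
        state["_version"] = self.STATE_VERSION
        return state

    def __setstate__(self, state):
        state = dict(state)
        version = state.pop("_version", 1)
        if version < 2:
            # Hypothetical version 1 stored a float ``balance`` in whole
            # currency units; convert it to integer cents.
            state["balance_cents"] = round(state.pop("balance") * 100)
        self.__dict__.update(state)


# Simulate a state saved by the (hypothetical) version 1 of the class.
old = Account.__new__(Account)
old.__setstate__({"owner": "alice", "balance": 12.5})

# A current object round-trips through pickle unchanged.
new = pickle.loads(pickle.dumps(Account("bob", 300)))
```

Keeping the conversion inside :meth:`__setstate__` means that callers never
see the old layout.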
.. _pickle-inst:

Pickling Class Instances
------------------------

.. currentmodule:: None

In this section, we describe the general mechanisms available to you to define,
customize, and control how class instances are pickled and unpickled.

In most cases, no additional code is needed to make instances picklable. By
default, pickle will retrieve the class and the attributes of an instance via
introspection. When a class instance is unpickled, its :meth:`__init__` method
is usually *not* invoked. The default behaviour first creates an uninitialized
instance and then restores the saved attributes. The following code shows an
implementation of this behaviour::

   def save(obj):
       return (obj.__class__, obj.__dict__)

   def load(cls, attributes):
       obj = cls.__new__(cls)
       obj.__dict__.update(attributes)
       return obj

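The fact that :meth:`__init__` is skipped can be observed directly. In this
minimal sketch, the ``Tracked`` class and its ``init_calls`` counter are
hypothetical names chosen for the demonstration.

```python
import pickle


class Tracked:
    init_calls = 0  # class attribute: counts __init__ invocations

    def __init__(self, value):
        type(self).init_calls += 1
        self.value = value


original = Tracked(42)                         # __init__ runs once here
clone = pickle.loads(pickle.dumps(original))   # ...but not here
```

After the round trip, ``clone`` carries the saved attributes although the
counter has only been incremented by the explicit constructor call.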
Classes can alter the default behaviour by providing one or several special
methods:

.. method:: object.__getnewargs_ex__()

   In protocols 2 and newer, classes that implement the
   :meth:`__getnewargs_ex__` method can dictate the values passed to the
   :meth:`__new__` method upon unpickling. The method must return a pair
   ``(args, kwargs)`` where *args* is a tuple of positional arguments
   and *kwargs* a dictionary of named arguments for constructing the
   object. Those will be passed to the :meth:`__new__` method upon
   unpickling.

   You should implement this method if the :meth:`__new__` method of your
   class requires keyword-only arguments. Otherwise, it is recommended for
   compatibility to implement :meth:`__getnewargs__`.

   .. versionchanged:: 3.6
      :meth:`__getnewargs_ex__` is now used in protocols 2 and 3.

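A minimal sketch of :meth:`__getnewargs_ex__`, assuming a hypothetical
``Temperature`` class whose :meth:`__new__` takes a keyword-only argument.
Without the method, unpickling would call ``Temperature.__new__`` with no
arguments and fail.

```python
import pickle


class Temperature:
    def __new__(cls, *, kelvin):
        self = super().__new__(cls)
        self.kelvin = kelvin
        return self

    def __getnewargs_ex__(self):
        # (args, kwargs) handed to Temperature.__new__ when unpickling.
        return (), {"kelvin": self.kelvin}


t = pickle.loads(pickle.dumps(Temperature(kelvin=273.15)))
```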
.. method:: object.__getnewargs__()

   This method serves a similar purpose to :meth:`__getnewargs_ex__`, but
   supports only positional arguments. It must return a tuple of arguments
   ``args`` which will be passed to the :meth:`__new__` method upon unpickling.

   :meth:`__getnewargs__` will not be called if :meth:`__getnewargs_ex__` is
   defined.

   .. versionchanged:: 3.6
      Before Python 3.6, :meth:`__getnewargs__` was called instead of
      :meth:`__getnewargs_ex__` in protocols 2 and 3.

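The positional variant can be sketched with an immutable type. The ``Point``
class below is a hypothetical :class:`tuple` subclass whose :meth:`__new__`
expects two coordinates rather than a single iterable.

```python
import pickle


class Point(tuple):
    """An immutable 2-D point whose constructor takes two arguments."""

    def __new__(cls, x, y):
        return super().__new__(cls, (x, y))

    def __getnewargs__(self):
        # Positional arguments handed to Point.__new__ when unpickling.
        return (self[0], self[1])


p = Point(3, 4)
q = pickle.loads(pickle.dumps(p, protocol=2))
```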
.. method:: object.__getstate__()

   Classes can further influence how their instances are pickled; if the class
   defines the method :meth:`__getstate__`, it is called and the returned object
   is pickled as the contents for the instance, instead of the contents of the
   instance's dictionary. If the :meth:`__getstate__` method is absent, the
   instance's :attr:`~object.__dict__` is pickled as usual.


.. method:: object.__setstate__(state)

   Upon unpickling, if the class defines :meth:`__setstate__`, it is called with
   the unpickled state. In that case, there is no requirement for the state
   object to be a dictionary. Otherwise, the pickled state must be a dictionary
   and its items are assigned to the new instance's dictionary.

   .. note::

      If :meth:`__getstate__` returns a false value, the :meth:`__setstate__`
      method will not be called upon unpickling.


Refer to the section :ref:`pickle-state` for more information about how to use
the methods :meth:`__getstate__` and :meth:`__setstate__`.

.. note::

   At unpickling time, some methods like :meth:`__getattr__`,
   :meth:`__getattribute__`, or :meth:`__setattr__` may be called upon the
   instance. In case those methods rely on some internal invariant being
   true, the type should implement :meth:`__getnewargs__` or
   :meth:`__getnewargs_ex__` to establish such an invariant; otherwise,
   neither :meth:`__new__` nor :meth:`__init__` will be called.

.. index:: pair: copy; protocol

As we shall see, pickle does not directly use the methods described above. In
fact, these methods are part of the copy protocol which implements the
:meth:`__reduce__` special method. The copy protocol provides a unified
interface for retrieving the data necessary for pickling and copying
objects. [#]_

Although powerful, implementing :meth:`__reduce__` directly in your classes is
error-prone. For this reason, class designers should use the high-level
interface (i.e., :meth:`__getnewargs_ex__`, :meth:`__getstate__` and
:meth:`__setstate__`) whenever possible. We will show, however, cases where
using :meth:`__reduce__` is the only option or leads to more efficient pickling
or both.

.. method:: object.__reduce__()

   The interface is currently defined as follows. The :meth:`__reduce__` method
   takes no argument and shall return either a string or preferably a tuple (the
   returned object is often referred to as the "reduce value").

   If a string is returned, the string should be interpreted as the name of a
   global variable. It should be the object's local name relative to its
   module; the pickle module searches the module namespace to determine the
   object's module. This behaviour is typically useful for singletons.

   When a tuple is returned, it must be between two and six items long.
   Optional items can either be omitted, or ``None`` can be provided as their
   value. The semantics of each item are, in order:

   .. XXX Mention __newobj__ special-case?

   * A callable object that will be called to create the initial version of the
     object.

   * A tuple of arguments for the callable object. An empty tuple must be given
     if the callable does not accept any argument.

   * Optionally, the object's state, which will be passed to the object's
     :meth:`__setstate__` method as previously described. If the object has no
     such method, then the value must be a dictionary and it will be added to
     the object's :attr:`~object.__dict__` attribute.

   * Optionally, an iterator (and not a sequence) yielding successive items.
     These items will be appended to the object either using
     ``obj.append(item)`` or, in batch, using ``obj.extend(list_of_items)``.
     This is primarily used for list subclasses, but may be used by other
     classes as long as they have :meth:`append` and :meth:`extend` methods with
     the appropriate signature. (Whether :meth:`append` or :meth:`extend` is
     used depends on which pickle protocol version is used as well as the number
     of items to append, so both must be supported.)

   * Optionally, an iterator (not a sequence) yielding successive key-value
     pairs. These items will be stored to the object using ``obj[key] =
     value``. This is primarily used for dictionary subclasses, but may be used
     by other classes as long as they implement :meth:`__setitem__`.

   * Optionally, a callable with a ``(obj, state)`` signature. This
     callable allows the user to programmatically control the state-updating
     behavior of a specific object, instead of using ``obj``'s static
     :meth:`__setstate__` method. If not ``None``, this callable will have
     priority over ``obj``'s :meth:`__setstate__`.

     .. versionadded:: 3.8
        The optional sixth tuple item, ``(obj, state)``, was added.

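The simplest reduce value is the two-item form ``(callable, args)``. The
sketch below uses a hypothetical ``Interval`` class and returns the class
itself as the callable, so that unpickling goes through :meth:`__init__`
again and the derived ``length`` attribute is recomputed rather than stored.

```python
import pickle


class Interval:
    def __init__(self, start, stop):
        if start > stop:
            raise ValueError("start must not exceed stop")
        self.start = start
        self.stop = stop
        self.length = stop - start  # derived, not worth storing

    def __reduce__(self):
        # (callable, args): rebuild by calling Interval(start, stop).
        return (Interval, (self.start, self.stop))


restored = pickle.loads(pickle.dumps(Interval(2, 7)))
```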
.. method:: object.__reduce_ex__(protocol)

   Alternatively, a :meth:`__reduce_ex__` method may be defined. The only
   difference is this method should take a single integer argument, the protocol
   version. When defined, pickle will prefer it over the :meth:`__reduce__`
   method. In addition, :meth:`__reduce__` automatically becomes a synonym for
   the extended version. The main use for this method is to provide
   backwards-compatible reduce values for older Python releases.

.. currentmodule:: pickle

.. _pickle-persistent:

Persistence of External Objects
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. index::
   single: persistent_id (pickle protocol)
   single: persistent_load (pickle protocol)

For the benefit of object persistence, the :mod:`pickle` module supports the
notion of a reference to an object outside the pickled data stream. Such
objects are referenced by a persistent ID, which should be either a string of
alphanumeric characters (for protocol 0) [#]_ or just an arbitrary object (for
any newer protocol).

The resolution of such persistent IDs is not defined by the :mod:`pickle`
module; it will delegate this resolution to the user-defined methods on the
pickler and unpickler, :meth:`~Pickler.persistent_id` and
:meth:`~Unpickler.persistent_load` respectively.

To pickle objects that have an external persistent ID, the pickler must have a
custom :meth:`~Pickler.persistent_id` method that takes an object as an
argument and returns either ``None`` or the persistent ID for that object.
When ``None`` is returned, the pickler simply pickles the object as normal.
When a persistent ID is returned, the pickler will pickle that ID in place of
the object, along with a marker so that the unpickler will recognize it as a
persistent ID.

To unpickle external objects, the unpickler must have a custom
:meth:`~Unpickler.persistent_load` method that takes a persistent ID object and
returns the referenced object.

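The two hooks can be sketched with an in-memory registry standing in for an
external store. The ``User`` class, the ``REGISTRY`` dictionary and both
subclasses are hypothetical names; a real application would look objects up
in a database or similar.

```python
import io
import pickle


class User:
    def __init__(self, user_id, name):
        self.user_id = user_id
        self.name = name


# Stand-in for an external store that outlives any single pickle.
REGISTRY = {1: User(1, "alice")}


class RegistryPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Store only a reference for User objects; pickle the rest as usual.
        if isinstance(obj, User):
            return ("User", obj.user_id)
        return None


class RegistryUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        kind, key = pid
        if kind == "User":
            return REGISTRY[key]
        raise pickle.UnpicklingError("unsupported persistent object")


buf = io.BytesIO()
RegistryPickler(buf).dump({"owner": REGISTRY[1], "note": "hello"})
restored = RegistryUnpickler(io.BytesIO(buf.getvalue())).load()
```

Because only the ID travels in the stream, ``restored["owner"]`` is the
registry's own object rather than a copy.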
Here is a comprehensive example presenting how persistent IDs can be used to
pickle external objects by reference.

.. literalinclude:: ../includes/dbpickle.py

.. _pickle-dispatch:

Dispatch Tables
^^^^^^^^^^^^^^^

If one wants to customize pickling of some classes without disturbing
any other code which depends on pickling, then one can create a
pickler with a private dispatch table.

The global dispatch table managed by the :mod:`copyreg` module is
available as :data:`copyreg.dispatch_table`. Therefore, one may
choose to use a modified copy of :data:`copyreg.dispatch_table` as a
private dispatch table.

For example ::

   f = io.BytesIO()
   p = pickle.Pickler(f)
   p.dispatch_table = copyreg.dispatch_table.copy()
   p.dispatch_table[SomeClass] = reduce_SomeClass

creates an instance of :class:`pickle.Pickler` with a private dispatch
table which handles the ``SomeClass`` class specially. Alternatively,
the code ::

   class MyPickler(pickle.Pickler):
       dispatch_table = copyreg.dispatch_table.copy()
       dispatch_table[SomeClass] = reduce_SomeClass
   f = io.BytesIO()
   p = MyPickler(f)

does the same, but all instances of ``MyPickler`` will by default
share the same dispatch table. The equivalent code using the
:mod:`copyreg` module is ::

   copyreg.pickle(SomeClass, reduce_SomeClass)
   f = io.BytesIO()
   p = pickle.Pickler(f)

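The fragments above leave ``SomeClass`` and ``reduce_SomeClass`` undefined.
A self-contained variant of the first fragment might look as follows; the
``Celsius`` class, the ``reduce_celsius`` function and the ``reduced`` list
are hypothetical and serve only to show that the private table is consulted
while the global one stays untouched.

```python
import copyreg
import io
import pickle


class Celsius:
    def __init__(self, degrees):
        self.degrees = degrees


reduced = []  # records every object passed through the private reducer


def reduce_celsius(obj):
    reduced.append(obj)
    return Celsius, (obj.degrees,)


buf = io.BytesIO()
p = pickle.Pickler(buf)
p.dispatch_table = copyreg.dispatch_table.copy()
p.dispatch_table[Celsius] = reduce_celsius
p.dump(Celsius(21.5))

restored = pickle.loads(buf.getvalue())
```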
.. _pickle-state:

Handling Stateful Objects
^^^^^^^^^^^^^^^^^^^^^^^^^

.. index::
   single: __getstate__() (copy protocol)
   single: __setstate__() (copy protocol)

Here's an example that shows how to modify pickling behavior for a class.
The :class:`TextReader` class opens a text file, and returns the line number and
line contents each time its :meth:`!readline` method is called. If a
:class:`TextReader` instance is pickled, all attributes *except* the file object
member are saved. When the instance is unpickled, the file is reopened, and
reading resumes from the last location. The :meth:`__setstate__` and
:meth:`__getstate__` methods are used to implement this behavior. ::

   class TextReader:
       """Print and number lines in a text file."""

       def __init__(self, filename):
           self.filename = filename
           self.file = open(filename)
           self.lineno = 0

       def readline(self):
           self.lineno += 1
           line = self.file.readline()
           if not line:
               return None
           if line.endswith('\n'):
               line = line[:-1]
           return "%i: %s" % (self.lineno, line)

       def __getstate__(self):
           # Copy the object's state from self.__dict__ which contains
           # all our instance attributes. Always use the dict.copy()
           # method to avoid modifying the original state.
           state = self.__dict__.copy()
           # Remove the unpicklable entries.
           del state['file']
           return state

       def __setstate__(self, state):
           # Restore instance attributes (i.e., filename and lineno).
           self.__dict__.update(state)
           # Restore the previously opened file's state. To do so, we need to
           # reopen it and read from it until the line count is restored.
           file = open(self.filename)
           for _ in range(self.lineno):
               file.readline()
           # Finally, save the file.
           self.file = file


A sample usage might be something like this::

   >>> reader = TextReader("hello.txt")
   >>> reader.readline()
   '1: Hello world!'
   >>> reader.readline()
   '2: I am line number two.'
   >>> new_reader = pickle.loads(pickle.dumps(reader))
   >>> new_reader.readline()
   '3: Goodbye!'

.. _reducer_override:

Custom Reduction for Types, Functions, and Other Objects
--------------------------------------------------------

.. versionadded:: 3.8

Sometimes, :attr:`~Pickler.dispatch_table` may not be flexible enough.
In particular we may want to customize pickling based on a criterion other
than the object's type, or we may want to customize the pickling of
functions and classes.

For those cases, it is possible to subclass the :class:`Pickler` class and
implement a :meth:`~Pickler.reducer_override` method. This method can return an
arbitrary reduction tuple (see :meth:`__reduce__`). It can alternatively return
``NotImplemented`` to fall back to the traditional behavior.

If both the :attr:`~Pickler.dispatch_table` and
:meth:`~Pickler.reducer_override` are defined, then the
:meth:`~Pickler.reducer_override` method takes priority.

.. note::
   For performance reasons, :meth:`~Pickler.reducer_override` may not be
   called for the following objects: ``None``, ``True``, ``False``, and
   exact instances of :class:`int`, :class:`float`, :class:`bytes`,
   :class:`str`, :class:`dict`, :class:`set`, :class:`frozenset`, :class:`list`
   and :class:`tuple`.

Here is a simple example where we allow pickling and reconstructing
a given class::

   import io
   import pickle

   class MyClass:
       my_attribute = 1

   class MyPickler(pickle.Pickler):
       def reducer_override(self, obj):
           """Custom reducer for MyClass."""
           if getattr(obj, "__name__", None) == "MyClass":
               return type, (obj.__name__, obj.__bases__,
                             {'my_attribute': obj.my_attribute})
           else:
               # For any other object, fall back to the usual reduction.
               return NotImplemented

   f = io.BytesIO()
   p = MyPickler(f)
   p.dump(MyClass)

   del MyClass

   unpickled_class = pickle.loads(f.getvalue())

   assert isinstance(unpickled_class, type)
   assert unpickled_class.__name__ == "MyClass"
   assert unpickled_class.my_attribute == 1


.. _pickle-oob:

Out-of-band Buffers
-------------------

.. versionadded:: 3.8

In some contexts, the :mod:`pickle` module is used to transfer massive amounts
of data. Therefore, it can be important to minimize the number of memory
copies, to preserve performance and limit resource consumption. However, normal
operation of the :mod:`pickle` module, as it transforms a graph-like structure
of objects into a sequential stream of bytes, intrinsically involves copying
data to and from the pickle stream.

This constraint can be eschewed if both the *provider* (the implementation
of the object types to be transferred) and the *consumer* (the implementation
of the communications system) support the out-of-band transfer facilities
provided by pickle protocol 5 and higher.

Provider API
^^^^^^^^^^^^

The large data objects to be pickled must implement a :meth:`__reduce_ex__`
method specialized for protocol 5 and higher, which returns a
:class:`PickleBuffer` instance (instead of e.g. a :class:`bytes` object)
for any large data.

A :class:`PickleBuffer` object *signals* that the underlying buffer is
eligible for out-of-band data transfer. Those objects remain compatible
with normal usage of the :mod:`pickle` module. However, consumers can also
opt in to tell :mod:`pickle` that they will handle those buffers by
themselves.

Consumer API
^^^^^^^^^^^^

A communications system can enable custom handling of the :class:`PickleBuffer`
objects generated when serializing an object graph.

On the sending side, it needs to pass a *buffer_callback* argument to
:class:`Pickler` (or to the :func:`dump` or :func:`dumps` function), which
will be called with each :class:`PickleBuffer` generated while pickling
the object graph. Buffers accumulated by the *buffer_callback* will not
see their data copied into the pickle stream; only a cheap marker will be
inserted.

On the receiving side, it needs to pass a *buffers* argument to
:class:`Unpickler` (or to the :func:`load` or :func:`loads` function),
which is an iterable of the buffers which were passed to *buffer_callback*.
That iterable should produce buffers in the same order as they were passed
to *buffer_callback*. Those buffers will provide the data expected by the
reconstructors of the objects whose pickling produced the original
:class:`PickleBuffer` objects.

Between the sending side and the receiving side, the communications system
is free to implement its own transfer mechanism for out-of-band buffers.
Potential optimizations include the use of shared memory or datatype-dependent
compression.

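The sending and receiving sides can be sketched in a single process by
pickling a :class:`PickleBuffer` directly. This is only an illustration under
the assumption that the list ``buffers`` stands in for whatever transport a
real communications system would use; the ``payload`` name is hypothetical.

```python
import pickle

payload = bytearray(b"abc" * 1000)

# Sending side: the callback collects each PickleBuffer, so the pickle
# stream only receives a marker instead of the 3000 payload bytes.
buffers = []
data = pickle.dumps(pickle.PickleBuffer(payload), protocol=5,
                    buffer_callback=buffers.append)

# Receiving side: hand the buffers back in the order they were collected.
restored = pickle.loads(data, buffers=buffers)
```

The pickle stream ``data`` stays tiny regardless of the payload size, while
``restored`` exposes the same bytes as ``payload``.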
Example
^^^^^^^

Here is a trivial example where we implement a :class:`bytearray` subclass
able to participate in out-of-band buffer pickling::

   import pickle
   from pickle import PickleBuffer

   class ZeroCopyByteArray(bytearray):

       def __reduce_ex__(self, protocol):
           if protocol >= 5:
               return type(self)._reconstruct, (PickleBuffer(self),), None
           else:
               # PickleBuffer is forbidden with pickle protocols <= 4.
               return type(self)._reconstruct, (bytearray(self),)

       @classmethod
       def _reconstruct(cls, obj):
           with memoryview(obj) as m:
               # Get a handle over the original buffer object
               obj = m.obj
               if type(obj) is cls:
                   # Original buffer object is a ZeroCopyByteArray, return it
                   # as-is.
                   return obj
               else:
                   return cls(obj)

The reconstructor (the ``_reconstruct`` class method) returns the buffer's
providing object if it has the right type. This is an easy way to simulate
zero-copy behaviour on this toy example.

On the consumer side, we can pickle those objects the usual way, which
when unserialized will give us a copy of the original object::

   b = ZeroCopyByteArray(b"abc")
   data = pickle.dumps(b, protocol=5)
   new_b = pickle.loads(data)
   print(b == new_b)  # True
   print(b is new_b)  # False: a copy was made

But if we pass a *buffer_callback* and then give back the accumulated
buffers when unserializing, we are able to get back the original object::

   b = ZeroCopyByteArray(b"abc")
   buffers = []
   data = pickle.dumps(b, protocol=5, buffer_callback=buffers.append)
   new_b = pickle.loads(data, buffers=buffers)
   print(b == new_b)  # True
   print(b is new_b)  # True: no copy was made

This example is limited by the fact that :class:`bytearray` allocates its
own memory: you cannot create a :class:`bytearray` instance that is backed
by another object's memory. However, third-party datatypes such as NumPy
arrays do not have this limitation, and allow use of zero-copy pickling
(or making as few copies as possible) when transferring between distinct
processes or systems.

.. seealso:: :pep:`574` -- Pickle protocol 5 with out-of-band data


.. _pickle-restrict:

Restricting Globals
-------------------

.. index::
   single: find_class() (pickle protocol)

By default, unpickling will import any class or function that it finds in the
pickle data. For many applications, this behaviour is unacceptable as it
permits the unpickler to import and invoke arbitrary code. Just consider what
this hand-crafted pickle data stream does when loaded::

   >>> import pickle
   >>> pickle.loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
   hello world
   0

In this example, the unpickler imports the :func:`os.system` function and then
applies it to the string argument "echo hello world". Although this example is
inoffensive, it is not difficult to imagine one that could damage your system.

For this reason, you may want to control what gets unpickled by customizing
:meth:`Unpickler.find_class`. Contrary to what its name suggests,
:meth:`Unpickler.find_class` is called whenever a global (i.e., a class or
a function) is requested. Thus it is possible to either completely forbid
globals or restrict them to a safe subset.

Here is an example of an unpickler allowing only a few safe classes from the
:mod:`builtins` module to be loaded::

   import builtins
   import io
   import pickle

   safe_builtins = {
       'range',
       'complex',
       'set',
       'frozenset',
       'slice',
   }

   class RestrictedUnpickler(pickle.Unpickler):

       def find_class(self, module, name):
           # Only allow safe classes from builtins.
           if module == "builtins" and name in safe_builtins:
               return getattr(builtins, name)
           # Forbid everything else.
           raise pickle.UnpicklingError("global '%s.%s' is forbidden" %
                                        (module, name))

   def restricted_loads(s):
       """Helper function analogous to pickle.loads()."""
       return RestrictedUnpickler(io.BytesIO(s)).load()

A sample usage of our unpickler working as intended::

   >>> restricted_loads(pickle.dumps([1, 2, range(15)]))
   [1, 2, range(0, 15)]
   >>> restricted_loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
   Traceback (most recent call last):
     ...
   pickle.UnpicklingError: global 'os.system' is forbidden
   >>> restricted_loads(b'cbuiltins\neval\n'
   ...                  b'(S\'getattr(__import__("os"), "system")'
   ...                  b'("echo hello world")\'\ntR.')
   Traceback (most recent call last):
     ...
   pickle.UnpicklingError: global 'builtins.eval' is forbidden


.. XXX Add note about how extension codes could evade our protection
   mechanism (e.g. cached classes do not invoke find_class()).

As our examples show, you have to be careful with what you allow to be
unpickled. Therefore if security is a concern, you may want to consider
alternatives such as the marshalling API in :mod:`xmlrpc.client` or
third-party solutions.


Performance
-----------

Recent versions of the pickle protocol (from protocol 2 and upwards) feature
efficient binary encodings for several common features and built-in types.
Also, the :mod:`pickle` module has a transparent optimizer written in C.


.. _pickle-example:

Examples
--------

For the simplest code, use the :func:`dump` and :func:`load` functions. ::

   import pickle

   # An arbitrary collection of objects supported by pickle.
   data = {
       'a': [1, 2.0, 3, 4+6j],
       'b': ("character string", b"byte string"),
       'c': {None, True, False}
   }

   with open('data.pickle', 'wb') as f:
       # Pickle the 'data' dictionary using the highest protocol available.
       pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)


The following example reads the resulting pickled data. ::

   import pickle

   with open('data.pickle', 'rb') as f:
       # The protocol version used is detected automatically, so we do not
       # have to specify it.
       data = pickle.load(f)


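When the size of the pickle matters, one option is to post-process it with
:func:`pickletools.optimize` and, if desired, compress the result with the
:mod:`gzip` module. The following is a sketch with made-up sample data, not a
benchmark; the actual savings depend on the objects being pickled.

```python
import gzip
import pickle
import pickletools

# Hypothetical sample data, chosen only to have some repetition.
data = {"values": list(range(1000)), "label": "sample" * 50}

raw = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
optimized = pickletools.optimize(raw)    # drops unused memo opcodes
compressed = gzip.compress(optimized)

# Decompress before unpickling; the optimized pickle loads as usual.
restored = pickle.loads(gzip.decompress(compressed))
```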
.. XXX: Add examples showing how to optimize pickles for size (like using
.. pickletools.optimize() or the gzip module).


.. seealso::

   Module :mod:`copyreg`
      Pickle interface constructor registration for extension types.

   Module :mod:`pickletools`
      Tools for working with and analyzing pickled data.

   Module :mod:`shelve`
      Indexed databases of objects; uses :mod:`pickle`.

   Module :mod:`copy`
      Shallow and deep object copying.

   Module :mod:`marshal`
      High-performance serialization of built-in types.


.. rubric:: Footnotes

.. [#] Don't confuse this with the :mod:`marshal` module.

.. [#] This is why :keyword:`lambda` functions cannot be pickled: all
   :keyword:`!lambda` functions share the same name: ``<lambda>``.

.. [#] The exception raised will likely be an :exc:`ImportError` or an
   :exc:`AttributeError` but it could be something else.

.. [#] The :mod:`copy` module uses this protocol for shallow and deep copying
   operations.

.. [#] The limitation on alphanumeric characters is due to the fact
   that persistent IDs, in protocol 0, are delimited by the newline
   character. Therefore if any kind of newline character occurs in
   persistent IDs, the resulting pickle will become unreadable.