:mod:`pickle` --- Python object serialization
=============================================

.. module:: pickle
   :synopsis: Convert Python objects to streams of bytes and back.

.. sectionauthor:: Jim Kerr <jbkerr@sr.hp.com>.
.. sectionauthor:: Barry Warsaw <barry@python.org>

**Source code:** :source:`Lib/pickle.py`

.. index::
   single: persistence
   pair: persistent; objects
   pair: serializing; objects
   pair: marshalling; objects
   pair: flattening; objects
   pair: pickling; objects

--------------

The :mod:`pickle` module implements binary protocols for serializing and
de-serializing a Python object structure. *"Pickling"* is the process
whereby a Python object hierarchy is converted into a byte stream, and
*"unpickling"* is the inverse operation, whereby a byte stream
(from a :term:`binary file` or :term:`bytes-like object`) is converted
back into an object hierarchy. Pickling (and unpickling) is alternatively
known as "serialization", "marshalling," [#]_ or "flattening"; however, to
avoid confusion, the terms used here are "pickling" and "unpickling".

.. warning::

   The :mod:`pickle` module is not secure against erroneous or maliciously
   constructed data. Never unpickle data received from an untrusted or
   unauthenticated source.


Relationship to other Python modules
------------------------------------

Comparison with ``marshal``
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Python has a more primitive serialization module called :mod:`marshal`, but in
general :mod:`pickle` should always be the preferred way to serialize Python
objects. :mod:`marshal` exists primarily to support Python's :file:`.pyc`
files.

The :mod:`pickle` module differs from :mod:`marshal` in several significant ways:

* The :mod:`pickle` module keeps track of the objects it has already serialized,
  so that later references to the same object won't be serialized again.
  :mod:`marshal` doesn't do this.

  This has implications both for recursive objects and object sharing. Recursive
  objects are objects that contain references to themselves. These are not
  handled by marshal, and in fact, attempting to marshal recursive objects will
  crash your Python interpreter. Object sharing happens when there are multiple
  references to the same object in different places in the object hierarchy being
  serialized. :mod:`pickle` stores such objects only once, and ensures that all
  other references point to the master copy. Shared objects remain shared, which
  can be very important for mutable objects.

* :mod:`marshal` cannot be used to serialize user-defined classes and their
  instances. :mod:`pickle` can save and restore class instances transparently;
  however, the class definition must be importable and live in the same module as
  when the object was stored.

* The :mod:`marshal` serialization format is not guaranteed to be portable
  across Python versions. Because its primary job in life is to support
  :file:`.pyc` files, the Python implementers reserve the right to change the
  serialization format in non-backwards compatible ways should the need arise.
  The :mod:`pickle` serialization format is guaranteed to be backwards compatible
  across Python releases, provided a compatible pickle protocol is chosen and
  the pickling and unpickling code handles Python 2 to Python 3 type differences
  if your data crosses that language boundary.

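The object-sharing and recursion behaviour described in the first point can be
checked directly. The following is a minimal sketch; the variable names are
illustrative only and are not part of the :mod:`pickle` API.

```python
import pickle

# Two dictionary values that refer to the *same* list object.
shared = [1, 2, 3]
container = {"a": shared, "b": shared}

restored = pickle.loads(pickle.dumps(container))

# The list was stored once, so both values still refer to one object.
same_object = restored["a"] is restored["b"]

# A list that contains itself is a recursive object.
recursive = []
recursive.append(recursive)

clone = pickle.loads(pickle.dumps(recursive))

# The self-reference survives the round trip.
still_recursive = clone[0] is clone
```

Mutating ``restored["a"]`` after loading is therefore visible through
``restored["b"]`` as well, exactly as it was in the original structure.
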
Comparison with ``json``
^^^^^^^^^^^^^^^^^^^^^^^^

There are fundamental differences between the pickle protocols and
`JSON (JavaScript Object Notation) <http://json.org>`_:

* JSON is a text serialization format (it outputs unicode text, although
  most of the time it is then encoded to ``utf-8``), while pickle is
  a binary serialization format;

* JSON is human-readable, while pickle is not;

* JSON is interoperable and widely used outside of the Python ecosystem,
  while pickle is Python-specific;

* JSON, by default, can only represent a subset of the Python built-in
  types, and no custom classes; pickle can represent an extremely large
  number of Python types (many of them automatically, by clever usage
  of Python's introspection facilities; complex cases can be tackled by
  implementing :ref:`specific object APIs <pickle-inst>`).

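A short sketch of two of these differences, the text versus binary output and
the range of types supported by default. The sample data is made up for
illustration.

```python
import json
import pickle

value = {"tags": {"a", "b"}, "point": complex(1, 2)}

# json has no default encoding for sets or complex numbers.
try:
    json.dumps(value)
    json_failed = False
except TypeError:
    json_failed = True

# pickle round-trips the same structure without extra code.
restored = pickle.loads(pickle.dumps(value))

# JSON produces text (str); pickle produces binary data (bytes).
text = json.dumps({"n": 1})
binary = pickle.dumps({"n": 1})
```
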
.. seealso::
   The :mod:`json` module: a standard library module allowing JSON
   serialization and deserialization.


.. _pickle-protocols:

Data stream format
------------------

.. index::
   single: External Data Representation

The data format used by :mod:`pickle` is Python-specific. This has the
advantage that there are no restrictions imposed by external standards such as
JSON or XDR (which can't represent pointer sharing); however, it means that
non-Python programs may not be able to reconstruct pickled Python objects.

By default, the :mod:`pickle` data format uses a relatively compact binary
representation. If you need optimal size characteristics, you can efficiently
:doc:`compress <archiving>` pickled data.

The module :mod:`pickletools` contains tools for analyzing data streams
generated by :mod:`pickle`. :mod:`pickletools` source code has extensive
comments about opcodes used by pickle protocols.

There are currently 5 different protocols which can be used for pickling.
The higher the protocol used, the more recent the version of Python needed
to read the pickle produced.

* Protocol version 0 is the original "human-readable" protocol and is
  backwards compatible with earlier versions of Python.

* Protocol version 1 is an old binary format which is also compatible with
  earlier versions of Python.

* Protocol version 2 was introduced in Python 2.3. It provides much more
  efficient pickling of :term:`new-style class`\es. Refer to :pep:`307` for
  information about improvements brought by protocol 2.

* Protocol version 3 was added in Python 3.0. It has explicit support for
  :class:`bytes` objects and cannot be unpickled by Python 2.x. This was
  the default protocol in Python 3.0--3.7.

* Protocol version 4 was added in Python 3.4. It adds support for very large
  objects, pickling more kinds of objects, and some data format
  optimizations. It is the default protocol starting with Python 3.8.
  Refer to :pep:`3154` for information about improvements brought by
  protocol 4.

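As a rough illustration, the sketch below pickles one object with every
protocol the running interpreter supports and records the size of each
result. The sample data and the ``sizes`` dictionary are illustrative; the
actual sizes depend on the Python version and are not shown here.

```python
import pickle

data = {"name": "example", "values": [1, 2.5, b"raw"]}

sizes = {}
for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    payload = pickle.dumps(data, protocol=protocol)
    # The protocol is detected automatically when loading.
    assert pickle.loads(payload) == data
    sizes[protocol] = len(payload)

# A negative protocol number selects HIGHEST_PROTOCOL.
newest = pickle.dumps(data, protocol=-1)
same_as_highest = newest == pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
```

Choosing a lower protocol only matters when the pickle has to be read by an
older Python; otherwise the default is usually a reasonable choice.
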
.. note::
   Serialization is a more primitive notion than persistence; although
   :mod:`pickle` reads and writes file objects, it does not handle the issue of
   naming persistent objects, nor the (even more complicated) issue of concurrent
   access to persistent objects. The :mod:`pickle` module can transform a complex
   object into a byte stream and it can transform the byte stream into an object
   with the same internal structure. Perhaps the most obvious thing to do with
   these byte streams is to write them onto a file, but it is also conceivable to
   send them across a network or store them in a database. The :mod:`shelve`
   module provides a simple interface to pickle and unpickle objects on
   DBM-style database files.


Module Interface
----------------

To serialize an object hierarchy, you simply call the :func:`dumps` function.
Similarly, to de-serialize a data stream, you call the :func:`loads` function.
However, if you want more control over serialization and de-serialization,
you can create a :class:`Pickler` or an :class:`Unpickler` object, respectively.

The :mod:`pickle` module provides the following constants:


.. data:: HIGHEST_PROTOCOL

   An integer, the highest :ref:`protocol version <pickle-protocols>`
   available. This value can be passed as a *protocol* value to functions
   :func:`dump` and :func:`dumps` as well as the :class:`Pickler`
   constructor.

.. data:: DEFAULT_PROTOCOL

   An integer, the default :ref:`protocol version <pickle-protocols>` used
   for pickling. May be less than :data:`HIGHEST_PROTOCOL`. Currently the
   default protocol is 4, first introduced in Python 3.4 and incompatible
   with previous versions.

   .. versionchanged:: 3.0

      The default protocol is 3.

   .. versionchanged:: 3.8

      The default protocol is 4.

The :mod:`pickle` module provides the following functions to make the pickling
process more convenient:

.. function:: dump(obj, file, protocol=None, \*, fix_imports=True, buffer_callback=None)

   Write a pickled representation of *obj* to the open :term:`file object` *file*.
   This is equivalent to ``Pickler(file, protocol).dump(obj)``.

   Arguments *file*, *protocol*, *fix_imports* and *buffer_callback* have
   the same meaning as in the :class:`Pickler` constructor.

   .. versionchanged:: 3.8
      The *buffer_callback* argument was added.

.. function:: dumps(obj, protocol=None, \*, fix_imports=True, buffer_callback=None)

   Return the pickled representation of the object as a :class:`bytes` object,
   instead of writing it to a file.

   Arguments *protocol*, *fix_imports* and *buffer_callback* have the same
   meaning as in the :class:`Pickler` constructor.

   .. versionchanged:: 3.8
      The *buffer_callback* argument was added.

.. function:: load(file, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)

   Read a pickled object representation from the open :term:`file object`
   *file* and return the reconstituted object hierarchy specified therein.
   This is equivalent to ``Unpickler(file).load()``.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed. Bytes past the pickled object's
   representation are ignored.

   Arguments *file*, *fix_imports*, *encoding*, *errors* and *buffers*
   have the same meaning as in the :class:`Unpickler` constructor.

   .. versionchanged:: 3.8
      The *buffers* argument was added.

.. function:: loads(bytes_object, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)

   Read a pickled object hierarchy from a :class:`bytes` object and return the
   reconstituted object hierarchy specified therein.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed. Bytes past the pickled object's
   representation are ignored.

   Arguments *fix_imports*, *encoding*, *errors* and *buffers*
   have the same meaning as in the :class:`Unpickler` constructor.

   .. versionchanged:: 3.8
      The *buffers* argument was added.


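The four functions above can be exercised as follows. This is a minimal
sketch that uses an in-memory :class:`io.BytesIO` object in place of an
on-disk file; the sample records are made up for illustration.

```python
import io
import pickle

first = {"id": 1, "tags": ("a", "b")}
second = [3.14, None, "text"]

# dump() and load() work on a binary file object. Several pickles can be
# written to the same stream and read back one at a time, in order.
stream = io.BytesIO()
pickle.dump(first, stream)
pickle.dump(second, stream)

stream.seek(0)
loaded_first = pickle.load(stream)
loaded_second = pickle.load(stream)

# dumps() and loads() work on bytes objects instead of files.
payload = pickle.dumps(first)
from_bytes = pickle.loads(payload)

# Bytes past the end of the pickled object are ignored by loads().
with_trailing = pickle.loads(payload + b"ignored trailing bytes")
```

With a real file, open it in binary mode (``"wb"`` for writing, ``"rb"`` for
reading) before passing it to :func:`dump` or :func:`load`.
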
The :mod:`pickle` module defines three exceptions:

.. exception:: PickleError

   Common base class for the other pickling exceptions. It inherits
   :exc:`Exception`.

.. exception:: PicklingError

   Error raised when an unpicklable object is encountered by :class:`Pickler`.
   It inherits :exc:`PickleError`.

   Refer to :ref:`pickle-picklable` to learn what kinds of objects can be
   pickled.

.. exception:: UnpicklingError

   Error raised when there is a problem unpickling an object, such as data
   corruption or a security violation. It inherits :exc:`PickleError`.

   Note that other exceptions may also be raised during unpickling, including
   (but not necessarily limited to) :exc:`AttributeError`, :exc:`EOFError`,
   :exc:`ImportError`, and :exc:`IndexError`.


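Because unpickling can fail with exceptions other than
:exc:`UnpicklingError`, code that reads possibly corrupted (but trusted)
data usually catches a group of them. The sketch below assumes CPython's
default unpickler; ``try_loads`` and ``UNPICKLING_ERRORS`` are hypothetical
names, not part of :mod:`pickle`. Catching these exceptions does not make it
safe to unpickle untrusted data.

```python
import pickle

# The exceptions the text above lists as possible during unpickling.
UNPICKLING_ERRORS = (
    pickle.UnpicklingError,
    AttributeError,
    EOFError,
    ImportError,
    IndexError,
)


def try_loads(data, default=None):
    """Hypothetical helper: return the unpickled object, or *default* on failure."""
    try:
        return pickle.loads(data)
    except UNPICKLING_ERRORS:
        return default


good = try_loads(pickle.dumps({"answer": 42}))

# Not a valid pickle stream: the first byte is not a known opcode.
bad = try_loads(b"\x00 this is not a pickle")

# An empty input runs out of data immediately (EOFError).
empty = try_loads(b"")
```
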
The :mod:`pickle` module exports three classes, :class:`Pickler`,
:class:`Unpickler` and :class:`PickleBuffer`:

.. class:: Pickler(file, protocol=None, \*, fix_imports=True, buffer_callback=None)

   This takes a binary file for writing a pickle data stream.

   The optional *protocol* argument, an integer, tells the pickler to use
   the given protocol; supported protocols are 0 to :data:`HIGHEST_PROTOCOL`.
   If not specified, the default is :data:`DEFAULT_PROTOCOL`. If a negative
   number is specified, :data:`HIGHEST_PROTOCOL` is selected.

   The *file* argument must have a write() method that accepts a single bytes
   argument. It can thus be an on-disk file opened for binary writing, an
   :class:`io.BytesIO` instance, or any other custom object that meets this
   interface.

   If *fix_imports* is true and *protocol* is less than 3, pickle will try to
   map the new Python 3 names to the old module names used in Python 2, so
   that the pickle data stream is readable with Python 2.

   If *buffer_callback* is None (the default), buffer views are
   serialized into *file* as part of the pickle stream.

   If *buffer_callback* is not None, then it can be called any number
   of times with a buffer view. If the callback returns a false value
   (such as None), the given buffer is :ref:`out-of-band <pickle-oob>`;
   otherwise the buffer is serialized in-band, i.e. inside the pickle stream.

   It is an error if *buffer_callback* is not None and *protocol* is
   None or smaller than 5.

   .. versionchanged:: 3.8
      The *buffer_callback* argument was added.

   .. method:: dump(obj)

      Write a pickled representation of *obj* to the open file object given in
      the constructor.

   .. method:: persistent_id(obj)

      Do nothing by default. This exists so a subclass can override it.

      If :meth:`persistent_id` returns ``None``, *obj* is pickled as usual. Any
      other value causes :class:`Pickler` to emit the returned value as a
      persistent ID for *obj*. The meaning of this persistent ID should be
      defined by :meth:`Unpickler.persistent_load`. Note that the value
      returned by :meth:`persistent_id` cannot itself have a persistent ID.

      See :ref:`pickle-persistent` for details and examples of uses.

   .. attribute:: dispatch_table

      A pickler object's dispatch table is a registry of *reduction
      functions* of the kind which can be declared using
      :func:`copyreg.pickle`. It is a mapping whose keys are classes
      and whose values are reduction functions. A reduction function
      takes a single argument of the associated class and should
      conform to the same interface as a :meth:`__reduce__`
      method.

      By default, a pickler object will not have a
      :attr:`dispatch_table` attribute, and it will instead use the
      global dispatch table managed by the :mod:`copyreg` module.
      However, to customize the pickling for a specific pickler object
      one can set the :attr:`dispatch_table` attribute to a dict-like
      object. Alternatively, if a subclass of :class:`Pickler` has a
      :attr:`dispatch_table` attribute then this will be used as the
      default dispatch table for instances of that class.

      See :ref:`pickle-dispatch` for usage examples.

      .. versionadded:: 3.3

   .. method:: reducer_override(self, obj)

      Special reducer that can be defined in :class:`Pickler` subclasses. This
      method has priority over any reducer in the :attr:`dispatch_table`. It
      should conform to the same interface as a :meth:`__reduce__` method, and
      can optionally return ``NotImplemented`` to fall back on
      :attr:`dispatch_table`-registered reducers to pickle ``obj``.

      For a detailed example, see :ref:`reducer_override`.

      .. versionadded:: 3.8

   .. attribute:: fast

      Deprecated. Enable fast mode if set to a true value. The fast mode
      disables the usage of memo, therefore speeding the pickling process by not
      generating superfluous PUT opcodes. It should not be used with
      self-referential objects; doing otherwise will cause :class:`Pickler` to
      recurse infinitely.

      Use :func:`pickletools.optimize` if you need more compact pickles.


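As a sketch of a per-pickler :attr:`~Pickler.dispatch_table`, the example
below registers a reduction function for :class:`fractions.Fraction` on a
single :class:`Pickler` instance, leaving the global :mod:`copyreg` table
untouched. The function ``reduce_fraction`` and the ``calls`` list are
hypothetical names used only to show that the private table is consulted.

```python
import copyreg
import io
import pickle
from fractions import Fraction

calls = []


def reduce_fraction(value):
    """Illustrative reduction function with the __reduce__ return interface."""
    calls.append(value)
    # (callable, args): the object is rebuilt as Fraction(numerator, denominator).
    return Fraction, (value.numerator, value.denominator)


stream = io.BytesIO()
pickler = pickle.Pickler(stream)

# Start from a copy of the global table so the change stays private
# to this pickler object.
pickler.dispatch_table = copyreg.dispatch_table.copy()
pickler.dispatch_table[Fraction] = reduce_fraction

pickler.dump(Fraction(3, 4))
restored = pickle.loads(stream.getvalue())
```

Other :class:`Pickler` objects, and the module-level :func:`dump` and
:func:`dumps` functions, keep using the global dispatch table.
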
.. class:: Unpickler(file, \*, fix_imports=True, encoding="ASCII", errors="strict", buffers=None)

   This takes a binary file for reading a pickle data stream.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed.

   The argument *file* must have three methods, a read() method that takes an
   integer argument, a readinto() method that takes a buffer argument
   and a readline() method that requires no arguments, as in the
   :class:`io.BufferedIOBase` interface. Thus *file* can be an on-disk file
   opened for binary reading, an :class:`io.BytesIO` object, or any other
   custom object that meets this interface.

   The optional arguments *fix_imports*, *encoding* and *errors* are used
   to control compatibility support for pickle streams generated by Python 2.
   If *fix_imports* is true, pickle will try to map the old Python 2 names
   to the new names used in Python 3. The *encoding* and *errors* arguments tell
   pickle how to decode 8-bit string instances pickled by Python 2;
   these default to 'ASCII' and 'strict', respectively. The *encoding* can
   be 'bytes' to read these 8-bit string instances as bytes objects.
   Using ``encoding='latin1'`` is required for unpickling NumPy arrays and
   instances of :class:`~datetime.datetime`, :class:`~datetime.date` and
   :class:`~datetime.time` pickled by Python 2.

   If *buffers* is None (the default), then all data necessary for
   deserialization must be contained in the pickle stream. This means
   that the *buffer_callback* argument was None when a :class:`Pickler`
   was instantiated (or when :func:`dump` or :func:`dumps` was called).

   If *buffers* is not None, it should be an iterable of buffer-enabled
   objects that is consumed each time the pickle stream references
   an :ref:`out-of-band <pickle-oob>` buffer view. Such buffers have been
   given in order to the *buffer_callback* of a Pickler object.

   .. versionchanged:: 3.8
      The *buffers* argument was added.

   .. method:: load()

      Read a pickled object representation from the open file object given in
      the constructor, and return the reconstituted object hierarchy specified
      therein. Bytes past the pickled object's representation are ignored.

   .. method:: persistent_load(pid)

      Raise an :exc:`UnpicklingError` by default.

      If defined, :meth:`persistent_load` should return the object specified by
      the persistent ID *pid*. If an invalid persistent ID is encountered, an
      :exc:`UnpicklingError` should be raised.

      See :ref:`pickle-persistent` for details and examples of uses.

   .. method:: find_class(module, name)

      Import *module* if necessary and return the object called *name* from it,
      where the *module* and *name* arguments are :class:`str` objects. Note that,
      despite its name, :meth:`find_class` is also used for finding
      functions.

      Subclasses may override this to gain control over what type of objects and
      how they can be loaded, potentially reducing security risks. Refer to
      :ref:`pickle-restrict` for details.

      .. audit-event:: pickle.find_class "module name"

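One way to use :meth:`~Unpickler.find_class` is sketched below: a subclass
that only resolves a small allow-list of built-in names and rejects every
other global. The class name, the ``SAFE_BUILTINS`` set and the helper
``restricted_loads`` are illustrative choices, not part of the module, and
such a filter reduces but does not eliminate the risks of unpickling
untrusted data.

```python
import builtins
import io
import pickle

# Illustrative allow-list; adjust it to what the application really needs.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Only resolve names from the allow-list above.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Refuse everything else, including functions.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Analogous to pickle.loads(), but using the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


allowed = restricted_loads(pickle.dumps([1, 2, range(15)]))

# The built-in function print is pickled by reference and is not allowed.
try:
    restricted_loads(pickle.dumps(print))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```
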
.. class:: PickleBuffer(buffer)

   A wrapper for a buffer representing picklable data. *buffer* must be a
   :ref:`buffer-providing <bufferobjects>` object, such as a
   :term:`bytes-like object` or an N-dimensional array.

   :class:`PickleBuffer` is itself a buffer provider, therefore it is
   possible to pass it to other APIs expecting a buffer-providing object,
   such as :class:`memoryview`.

   :class:`PickleBuffer` objects can only be serialized using pickle
   protocol 5 or higher. They are eligible for
   :ref:`out-of-band serialization <pickle-oob>`.

   .. versionadded:: 3.8

   .. method:: raw()

      Return a :class:`memoryview` of the memory area underlying this buffer.
      The returned object is a one-dimensional, C-contiguous memoryview
      with format ``B`` (unsigned bytes). :exc:`BufferError` is raised if
      the buffer is neither C- nor Fortran-contiguous.

   .. method:: release()

      Release the underlying buffer exposed by the PickleBuffer object.


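The following sketch, which requires Python 3.8 or later, compares in-band
and out-of-band serialization of a :class:`PickleBuffer`. The variable names
are illustrative. Passing ``buffers.append`` as *buffer_callback* works
because :meth:`list.append` returns ``None``, a false value, so every buffer
is kept out of the pickle stream.

```python
import pickle

data = bytearray(b"abc" * 1000)

# In-band: the buffer contents are copied into the pickle stream.
in_band = pickle.dumps(pickle.PickleBuffer(data), protocol=5)

# Out-of-band: the callback collects the buffers and the stream only
# records that a buffer is expected at this position.
buffers = []
out_of_band = pickle.dumps(
    pickle.PickleBuffer(data), protocol=5, buffer_callback=buffers.append
)

# The same buffers must be supplied, in order, when loading.
restored = pickle.loads(out_of_band, buffers=buffers)
restored_bytes = bytes(memoryview(restored))

# raw() exposes the underlying memory as a flat view of unsigned bytes.
view = pickle.PickleBuffer(b"abc").raw()
```

The out-of-band stream is much smaller than the in-band one, since the
payload travels separately through ``buffers``.
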
.. _pickle-picklable:

What can be pickled and unpickled?
----------------------------------

The following types can be pickled:

* ``None``, ``True``, and ``False``

* integers, floating point numbers, complex numbers

* strings, bytes, bytearrays

* tuples, lists, sets, and dictionaries containing only picklable objects

* functions defined at the top level of a module (using :keyword:`def`, not
  :keyword:`lambda`)

* built-in functions defined at the top level of a module

* classes that are defined at the top level of a module

* instances of such classes whose :attr:`~object.__dict__` or the result of
  calling :meth:`__getstate__` is picklable (see section :ref:`pickle-inst` for
  details).

Attempts to pickle unpicklable objects will raise the :exc:`PicklingError`
exception; when this happens, an unspecified number of bytes may have already
been written to the underlying file. Trying to pickle a highly recursive data
structure may exceed the maximum recursion depth; a :exc:`RecursionError` will be
raised in this case. You can carefully raise this limit with
:func:`sys.setrecursionlimit`.

Note that functions (built-in and user-defined) are pickled by "fully qualified"
name reference, not by value. [#]_ This means that only the function name is
pickled, along with the name of the module the function is defined in. Neither
the function's code, nor any of its function attributes are pickled. Thus the
defining module must be importable in the unpickling environment, and the module
must contain the named object, otherwise an exception will be raised. [#]_

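Pickling by name reference can be observed with a built-in function. In
this small sketch the pickle of :func:`len` contains the module and function
names, and unpickling returns the very same function object rather than a
copy.

```python
import pickle

payload = pickle.dumps(len)

# The stream holds the names needed to look the function up again.
has_module_name = b"builtins" in payload
has_function_name = b"len" in payload

# Unpickling imports the module and fetches the named object from it.
same_function = pickle.loads(payload) is len
```
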
510Similarly, classes are pickled by named reference, so the same restrictions in
511the unpickling environment apply. Note that none of the class's code or data is
512pickled, so in the following example the class attribute ``attr`` is not
513restored in the unpickling environment::
514
515 class Foo:
Alexandre Vassalotti5f3b63a2008-10-18 20:47:58 +0000516 attr = 'A class attribute'
Georg Brandl116aa622007-08-15 14:28:22 +0000517
518 picklestring = pickle.dumps(Foo)
519
520These restrictions are why picklable functions and classes must be defined in
521the top level of a module.
522
523Similarly, when class instances are pickled, their class's code and data are not
524pickled along with them. Only the instance data are pickled. This is done on
525purpose, so you can fix bugs in a class or add methods to the class and still
526load objects that were created with an earlier version of the class. If you
527plan to have long-lived objects that will see many versions of a class, it may
528be worthwhile to put a version number in the objects so that suitable
529conversions can be made by the class's :meth:`__setstate__` method.


.. _pickle-inst:

Pickling Class Instances
------------------------

.. currentmodule:: None

In this section, we describe the general mechanisms available to you to define,
customize, and control how class instances are pickled and unpickled.

In most cases, no additional code is needed to make instances picklable. By
default, pickle will retrieve the class and the attributes of an instance via
introspection. When a class instance is unpickled, its :meth:`__init__` method
is usually *not* invoked. The default behaviour first creates an uninitialized
instance and then restores the saved attributes. The following code shows an
implementation of this behaviour::

   def save(obj):
       return (obj.__class__, obj.__dict__)

   def load(cls, attributes):
       obj = cls.__new__(cls)
       obj.__dict__.update(attributes)
       return obj

Classes can alter the default behaviour by providing one or several special
methods:

.. method:: object.__getnewargs_ex__()

   In protocols 2 and newer, classes that implement the
   :meth:`__getnewargs_ex__` method can dictate the values passed to the
   :meth:`__new__` method upon unpickling. The method must return a pair
   ``(args, kwargs)`` where *args* is a tuple of positional arguments
   and *kwargs* a dictionary of named arguments for constructing the
   object. Those will be passed to the :meth:`__new__` method upon
   unpickling.

   You should implement this method if the :meth:`__new__` method of your
   class requires keyword-only arguments. Otherwise, it is recommended for
   compatibility to implement :meth:`__getnewargs__`.

   .. versionchanged:: 3.6
      :meth:`__getnewargs_ex__` is now used in protocols 2 and 3.


.. method:: object.__getnewargs__()

   This method serves a similar purpose to :meth:`__getnewargs_ex__`, but
   supports only positional arguments. It must return a tuple of arguments
   ``args`` which will be passed to the :meth:`__new__` method upon unpickling.

   :meth:`__getnewargs__` will not be called if :meth:`__getnewargs_ex__` is
   defined.

   .. versionchanged:: 3.6
      Before Python 3.6, :meth:`__getnewargs__` was called instead of
      :meth:`__getnewargs_ex__` in protocols 2 and 3.
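
A minimal sketch of the keyword-only case follows. The ``Temperature`` class is
hypothetical; it only exists to show a :meth:`__new__` that cannot be called
without a keyword argument.

```python
import pickle

class Temperature:
    # Hypothetical class: __new__ insists on a keyword-only argument.
    def __new__(cls, *, celsius):
        self = super().__new__(cls)
        self.celsius = celsius
        return self

    def __getnewargs_ex__(self):
        # (args, kwargs) handed to __new__ when unpickling.
        return (), {"celsius": self.celsius}

original = Temperature(celsius=21.5)
restored = pickle.loads(pickle.dumps(original))
```

Without :meth:`__getnewargs_ex__`, unpickling would call :meth:`__new__` with
no arguments and fail with a :exc:`TypeError`.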


.. method:: object.__getstate__()

   Classes can further influence how their instances are pickled; if the class
   defines the method :meth:`__getstate__`, it is called and the returned object
   is pickled as the contents for the instance, instead of the contents of the
   instance's dictionary. If the :meth:`__getstate__` method is absent, the
   instance's :attr:`~object.__dict__` is pickled as usual.


.. method:: object.__setstate__(state)

   Upon unpickling, if the class defines :meth:`__setstate__`, it is called with
   the unpickled state. In that case, there is no requirement for the state
   object to be a dictionary. Otherwise, the pickled state must be a dictionary
   and its items are assigned to the new instance's dictionary.

   .. note::

      If :meth:`__getstate__` returns a false value, the :meth:`__setstate__`
      method will not be called upon unpickling.


Refer to the section :ref:`pickle-state` for more information about how to use
the methods :meth:`__getstate__` and :meth:`__setstate__`.

.. note::

   At unpickling time, some methods like :meth:`__getattr__`,
   :meth:`__getattribute__`, or :meth:`__setattr__` may be called upon the
   instance. In case those methods rely on some internal invariant being
   true, the type should implement :meth:`__getnewargs__` or
   :meth:`__getnewargs_ex__` to establish such an invariant; otherwise,
   neither :meth:`__new__` nor :meth:`__init__` will be called.

.. index:: pair: copy; protocol

As we shall see, pickle does not directly use the methods described above. In
fact, these methods are part of the copy protocol which implements the
:meth:`__reduce__` special method. The copy protocol provides a unified
interface for retrieving the data necessary for pickling and copying
objects. [#]_

Although powerful, implementing :meth:`__reduce__` directly in your classes is
error-prone. For this reason, class designers should use the high-level
interface (i.e., :meth:`__getnewargs_ex__`, :meth:`__getstate__` and
:meth:`__setstate__`) whenever possible. We will show, however, cases where
using :meth:`__reduce__` is the only option or leads to more efficient pickling
or both.

.. method:: object.__reduce__()

   The interface is currently defined as follows. The :meth:`__reduce__` method
   takes no argument and shall return either a string or preferably a tuple (the
   returned object is often referred to as the "reduce value").

   If a string is returned, the string should be interpreted as the name of a
   global variable. It should be the object's local name relative to its
   module; the pickle module searches the module namespace to determine the
   object's module. This behaviour is typically useful for singletons.

   When a tuple is returned, it must be between two and six items long.
   Optional items can either be omitted, or ``None`` can be provided as their
   value. The semantics of each item are in order:

   .. XXX Mention __newobj__ special-case?

   * A callable object that will be called to create the initial version of the
     object.

   * A tuple of arguments for the callable object. An empty tuple must be given
     if the callable does not accept any argument.

   * Optionally, the object's state, which will be passed to the object's
     :meth:`__setstate__` method as previously described. If the object has no
     such method, then the value must be a dictionary and it will be added to
     the object's :attr:`~object.__dict__` attribute.

   * Optionally, an iterator (and not a sequence) yielding successive items.
     These items will be appended to the object either using
     ``obj.append(item)`` or, in batch, using ``obj.extend(list_of_items)``.
     This is primarily used for list subclasses, but may be used by other
     classes as long as they have :meth:`append` and :meth:`extend` methods with
     the appropriate signature. (Whether :meth:`append` or :meth:`extend` is
     used depends on which pickle protocol version is used as well as the number
     of items to append, so both must be supported.)

   * Optionally, an iterator (not a sequence) yielding successive key-value
     pairs. These items will be stored to the object using ``obj[key] =
     value``. This is primarily used for dictionary subclasses, but may be used
     by other classes as long as they implement :meth:`__setitem__`.

   * Optionally, a callable with a ``(obj, state)`` signature. This
     callable allows the user to programmatically control the state-updating
     behavior of a specific object, instead of using ``obj``'s static
     :meth:`__setstate__` method. If not ``None``, this callable will have
     priority over ``obj``'s :meth:`__setstate__`.

     .. versionadded:: 3.8
        The optional sixth tuple item, ``(obj, state)``, was added.
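
The first three tuple items can be sketched as follows. ``Interval`` is a
hypothetical class invented for this illustration; it returns a callable, the
arguments for that callable, and a state dictionary.

```python
import pickle

class Interval:
    # Hypothetical class used only for illustration.
    def __init__(self, low, high):
        if low > high:
            raise ValueError("low must not exceed high")
        self.low = low
        self.high = high
        self.label = None

    def __reduce__(self):
        # Item 1: callable; item 2: its arguments; item 3: extra state that
        # is merged into __dict__ because no __setstate__ is defined.
        return (Interval, (self.low, self.high), {"label": self.label})

span = Interval(1, 5)
span.label = "working hours"
clone = pickle.loads(pickle.dumps(span))
```

Since the callable here is the class itself, :meth:`__init__` runs again when
unpickling, which differs from the default behaviour described earlier.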


.. method:: object.__reduce_ex__(protocol)

   Alternatively, a :meth:`__reduce_ex__` method may be defined. The only
   difference is this method should take a single integer argument, the protocol
   version. When defined, pickle will prefer it over the :meth:`__reduce__`
   method. In addition, :meth:`__reduce__` automatically becomes a synonym for
   the extended version. The main use for this method is to provide
   backwards-compatible reduce values for older Python releases.

.. currentmodule:: pickle

.. _pickle-persistent:

Persistence of External Objects
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. index::
   single: persistent_id (pickle protocol)
   single: persistent_load (pickle protocol)

For the benefit of object persistence, the :mod:`pickle` module supports the
notion of a reference to an object outside the pickled data stream. Such
objects are referenced by a persistent ID, which should be either a string of
alphanumeric characters (for protocol 0) [#]_ or just an arbitrary object (for
any newer protocol).

The resolution of such persistent IDs is not defined by the :mod:`pickle`
module; it will delegate this resolution to the user-defined methods on the
pickler and unpickler, :meth:`~Pickler.persistent_id` and
:meth:`~Unpickler.persistent_load` respectively.

To pickle objects that have an external persistent ID, the pickler must have a
custom :meth:`~Pickler.persistent_id` method that takes an object as an
argument and returns either ``None`` or the persistent ID for that object.
When ``None`` is returned, the pickler simply pickles the object as normal.
When a persistent ID is returned, the pickler will pickle that ID in place of
the object, along with a marker so that the unpickler will recognize it as a
persistent ID.

To unpickle external objects, the unpickler must have a custom
:meth:`~Unpickler.persistent_load` method that takes a persistent ID object and
returns the referenced object.
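
The two hooks can be sketched in a few lines. Everything named here
(``Connection``, ``REGISTRY``, ``RegistryPickler``, ``RegistryUnpickler``) is
hypothetical and stands in for whatever external store an application uses.

```python
import io
import pickle

class Connection:
    # Hypothetical stand-in for an object that lives outside the pickle.
    def __init__(self, name):
        self.name = name

# Hypothetical registry reachable from both the pickling and unpickling side.
REGISTRY = {"primary": Connection("primary")}

class RegistryPickler(pickle.Pickler):
    def persistent_id(self, obj):
        if isinstance(obj, Connection):
            # With protocol 1 and newer, any picklable object can be the ID.
            return ("Connection", obj.name)
        return None  # pickle everything else as usual

class RegistryUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        kind, name = pid
        if kind == "Connection":
            return REGISTRY[name]
        raise pickle.UnpicklingError("unsupported persistent ID: %r" % (pid,))

buffer = io.BytesIO()
RegistryPickler(buffer).dump({"conn": REGISTRY["primary"], "retries": 3})
buffer.seek(0)
loaded = RegistryUnpickler(buffer).load()
```

The ``Connection`` instance itself never enters the stream; only its ID does,
and the unpickler hands back the registered object.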

Here is a comprehensive example presenting how persistent IDs can be used to
pickle external objects by reference.

.. literalinclude:: ../includes/dbpickle.py

.. _pickle-dispatch:

Dispatch Tables
^^^^^^^^^^^^^^^

If one wants to customize pickling of some classes without disturbing
any other code which depends on pickling, then one can create a
pickler with a private dispatch table.

The global dispatch table managed by the :mod:`copyreg` module is
available as :data:`copyreg.dispatch_table`. Therefore, one may
choose to use a modified copy of :data:`copyreg.dispatch_table` as a
private dispatch table.

For example ::

   f = io.BytesIO()
   p = pickle.Pickler(f)
   p.dispatch_table = copyreg.dispatch_table.copy()
   p.dispatch_table[SomeClass] = reduce_SomeClass

creates an instance of :class:`pickle.Pickler` with a private dispatch
table which handles the ``SomeClass`` class specially. Alternatively,
the code ::

   class MyPickler(pickle.Pickler):
       dispatch_table = copyreg.dispatch_table.copy()
       dispatch_table[SomeClass] = reduce_SomeClass
   f = io.BytesIO()
   p = MyPickler(f)

does the same, but all instances of ``MyPickler`` will by default
share the same dispatch table. The equivalent code using the
:mod:`copyreg` module is ::

   copyreg.pickle(SomeClass, reduce_SomeClass)
   f = io.BytesIO()
   p = pickle.Pickler(f)
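
For a version of the first snippet that runs end to end, here is a sketch in
which a hypothetical ``Celsius`` class and ``reduce_celsius`` function take
the place of ``SomeClass`` and ``reduce_SomeClass``.

```python
import copyreg
import io
import pickle

class Celsius:
    # Hypothetical class standing in for SomeClass.
    def __init__(self, degrees):
        self.degrees = degrees

def reduce_celsius(obj):
    # A reduction function returns the same kind of value as __reduce__.
    return Celsius, (round(obj.degrees, 1),)

stream = io.BytesIO()
pickler = pickle.Pickler(stream)
# Private table: a copy of the global one plus our own entry.
pickler.dispatch_table = copyreg.dispatch_table.copy()
pickler.dispatch_table[Celsius] = reduce_celsius
pickler.dump(Celsius(21.4567))

restored = pickle.loads(stream.getvalue())
```

Only this pickler rounds the value; the global table in :mod:`copyreg` is left
untouched, so other code that pickles ``Celsius`` objects is unaffected.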

.. _pickle-state:

Handling Stateful Objects
^^^^^^^^^^^^^^^^^^^^^^^^^

.. index::
   single: __getstate__() (copy protocol)
   single: __setstate__() (copy protocol)

Here's an example that shows how to modify pickling behavior for a class.
The :class:`TextReader` class opens a text file, and returns the line number and
line contents each time its :meth:`!readline` method is called. If a
:class:`TextReader` instance is pickled, all attributes *except* the file object
member are saved. When the instance is unpickled, the file is reopened, and
reading resumes from the last location. The :meth:`__setstate__` and
:meth:`__getstate__` methods are used to implement this behavior. ::

   class TextReader:
       """Print and number lines in a text file."""

       def __init__(self, filename):
           self.filename = filename
           self.file = open(filename)
           self.lineno = 0

       def readline(self):
           self.lineno += 1
           line = self.file.readline()
           if not line:
               return None
           if line.endswith('\n'):
               line = line[:-1]
           return "%i: %s" % (self.lineno, line)

       def __getstate__(self):
           # Copy the object's state from self.__dict__ which contains
           # all our instance attributes. Always use the dict.copy()
           # method to avoid modifying the original state.
           state = self.__dict__.copy()
           # Remove the unpicklable entries.
           del state['file']
           return state

       def __setstate__(self, state):
           # Restore instance attributes (i.e., filename and lineno).
           self.__dict__.update(state)
           # Restore the previously opened file's state. To do so, we need to
           # reopen it and read from it until the line count is restored.
           file = open(self.filename)
           for _ in range(self.lineno):
               file.readline()
           # Finally, save the file.
           self.file = file


A sample usage might be something like this::

   >>> reader = TextReader("hello.txt")
   >>> reader.readline()
   '1: Hello world!'
   >>> reader.readline()
   '2: I am line number two.'
   >>> new_reader = pickle.loads(pickle.dumps(reader))
   >>> new_reader.readline()
   '3: Goodbye!'

.. _reducer_override:

Custom Reduction for Types, Functions, and Other Objects
--------------------------------------------------------

.. versionadded:: 3.8

Sometimes, :attr:`~Pickler.dispatch_table` may not be flexible enough.
In particular we may want to customize pickling based on another criterion
than the object's type, or we may want to customize the pickling of
functions and classes.

For those cases, it is possible to subclass from the :class:`Pickler` class and
implement a :meth:`~Pickler.reducer_override` method. This method can return an
arbitrary reduction tuple (see :meth:`__reduce__`). It can alternatively return
``NotImplemented`` to fall back to the traditional behavior.

If both the :attr:`~Pickler.dispatch_table` and
:meth:`~Pickler.reducer_override` are defined, then the
:meth:`~Pickler.reducer_override` method takes priority.

.. Note::
   For performance reasons, :meth:`~Pickler.reducer_override` may not be
   called for the following objects: ``None``, ``True``, ``False``, and
   exact instances of :class:`int`, :class:`float`, :class:`bytes`,
   :class:`str`, :class:`dict`, :class:`set`, :class:`frozenset`, :class:`list`
   and :class:`tuple`.

Here is a simple example where we allow pickling and reconstructing
a given class::

   import io
   import pickle

   class MyClass:
       my_attribute = 1

   class MyPickler(pickle.Pickler):
       def reducer_override(self, obj):
           """Custom reducer for MyClass."""
           if getattr(obj, "__name__", None) == "MyClass":
               return type, (obj.__name__, obj.__bases__,
                             {'my_attribute': obj.my_attribute})
           else:
               # For any other object, fall back to the usual reduction
               return NotImplemented

   f = io.BytesIO()
   p = MyPickler(f)
   p.dump(MyClass)

   del MyClass

   unpickled_class = pickle.loads(f.getvalue())

   assert isinstance(unpickled_class, type)
   assert unpickled_class.__name__ == "MyClass"
   assert unpickled_class.my_attribute == 1


.. _pickle-oob:

Out-of-band Buffers
-------------------

.. versionadded:: 3.8

In some contexts, the :mod:`pickle` module is used to transfer massive amounts
of data. Therefore, it can be important to minimize the number of memory
copies, to preserve performance and limit resource consumption. However, normal
operation of the :mod:`pickle` module, as it transforms a graph-like structure
of objects into a sequential stream of bytes, intrinsically involves copying
data to and from the pickle stream.

This constraint can be eschewed if both the *provider* (the implementation
of the object types to be transferred) and the *consumer* (the implementation
of the communications system) support the out-of-band transfer facilities
provided by pickle protocol 5 and higher.

Provider API
^^^^^^^^^^^^

The large data objects to be pickled must implement a :meth:`__reduce_ex__`
method specialized for protocol 5 and higher, which returns a
:class:`PickleBuffer` instance (instead of e.g. a :class:`bytes` object)
for any large data.

A :class:`PickleBuffer` object *signals* that the underlying buffer is
eligible for out-of-band data transfer. Those objects remain compatible
with normal usage of the :mod:`pickle` module. However, consumers can also
opt in to tell :mod:`pickle` that they will handle those buffers by
themselves.

Consumer API
^^^^^^^^^^^^

A communications system can enable custom handling of the :class:`PickleBuffer`
objects generated when serializing an object graph.

On the sending side, it needs to pass a *buffer_callback* argument to
:class:`Pickler` (or to the :func:`dump` or :func:`dumps` function), which
will be called with each :class:`PickleBuffer` generated while pickling
the object graph. Buffers accumulated by the *buffer_callback* will not
see their data copied into the pickle stream; only a cheap marker will be
inserted.

On the receiving side, it needs to pass a *buffers* argument to
:class:`Unpickler` (or to the :func:`load` or :func:`loads` function),
which is an iterable of the buffers which were passed to *buffer_callback*.
That iterable should produce buffers in the same order as they were passed
to *buffer_callback*. Those buffers will provide the data expected by the
reconstructors of the objects whose pickling produced the original
:class:`PickleBuffer` objects.

Between the sending side and the receiving side, the communications system
is free to implement its own transfer mechanism for out-of-band buffers.
Potential optimizations include the use of shared memory or datatype-dependent
compression.
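
To get a feel for the effect on the stream, the sketch below pickles a bare
:class:`PickleBuffer` wrapped around a one-megabyte :class:`bytearray`, once
in-band and once out-of-band. The variable names are made up, and a plain list
stands in for the transfer mechanism a real communications system would
provide.

```python
import pickle

payload = bytearray(b"x" * 1_000_000)

# In-band: the buffer's bytes are copied into the pickle stream.
in_band = pickle.dumps(pickle.PickleBuffer(payload), protocol=5)

# Out-of-band: list.append returns None (a false value), so each buffer is
# kept out of the stream and collected in the list instead.
buffers = []
out_of_band = pickle.dumps(pickle.PickleBuffer(payload), protocol=5,
                           buffer_callback=buffers.append)

# The receiver supplies the same buffers, in the same order.
received = pickle.loads(out_of_band, buffers=buffers)
```

Here the out-of-band stream holds only a few bytes of opcodes, while the
in-band stream carries the whole payload.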

Example
^^^^^^^

Here is a trivial example where we implement a :class:`bytearray` subclass
able to participate in out-of-band buffer pickling::

   import pickle
   from pickle import PickleBuffer

   class ZeroCopyByteArray(bytearray):

       def __reduce_ex__(self, protocol):
           if protocol >= 5:
               return type(self)._reconstruct, (PickleBuffer(self),), None
           else:
               # PickleBuffer is forbidden with pickle protocols <= 4.
               return type(self)._reconstruct, (bytearray(self),)

       @classmethod
       def _reconstruct(cls, obj):
           with memoryview(obj) as m:
               # Get a handle over the original buffer object
               obj = m.obj
               if type(obj) is cls:
                   # Original buffer object is a ZeroCopyByteArray, return it
                   # as-is.
                   return obj
               else:
                   return cls(obj)

The reconstructor (the ``_reconstruct`` class method) returns the buffer's
providing object if it has the right type. This is an easy way to simulate
zero-copy behaviour on this toy example.

On the consumer side, we can pickle those objects the usual way, which,
when unserialized, will give us a copy of the original object::

   b = ZeroCopyByteArray(b"abc")
   data = pickle.dumps(b, protocol=5)
   new_b = pickle.loads(data)
   print(b == new_b)  # True
   print(b is new_b)  # False: a copy was made

But if we pass a *buffer_callback* and then give back the accumulated
buffers when unserializing, we are able to get back the original object::

   b = ZeroCopyByteArray(b"abc")
   buffers = []
   data = pickle.dumps(b, protocol=5, buffer_callback=buffers.append)
   new_b = pickle.loads(data, buffers=buffers)
   print(b == new_b)  # True
   print(b is new_b)  # True: no copy was made

This example is limited by the fact that :class:`bytearray` allocates its
own memory: you cannot create a :class:`bytearray` instance that is backed
by another object's memory. However, third-party datatypes such as NumPy
arrays do not have this limitation, and allow use of zero-copy pickling
(or making as few copies as possible) when transferring between distinct
processes or systems.

.. seealso:: :pep:`574` -- Pickle protocol 5 with out-of-band data


.. _pickle-restrict:

Restricting Globals
-------------------

.. index::
   single: find_class() (pickle protocol)

By default, unpickling will import any class or function that it finds in the
pickle data. For many applications, this behaviour is unacceptable as it
permits the unpickler to import and invoke arbitrary code. Just consider what
this hand-crafted pickle data stream does when loaded::

   >>> import pickle
   >>> pickle.loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
   hello world
   0

In this example, the unpickler imports the :func:`os.system` function and then
applies it to the string argument "echo hello world". Although this example is
inoffensive, it is not difficult to imagine one that could damage your system.

For this reason, you may want to control what gets unpickled by customizing
:meth:`Unpickler.find_class`. Contrary to what its name suggests,
:meth:`Unpickler.find_class` is called whenever a global (i.e., a class or
a function) is requested. Thus it is possible to either completely forbid
globals or restrict them to a safe subset.

Here is an example of an unpickler allowing only a few safe classes from the
:mod:`builtins` module to be loaded::

   import builtins
   import io
   import pickle

   safe_builtins = {
       'range',
       'complex',
       'set',
       'frozenset',
       'slice',
   }

   class RestrictedUnpickler(pickle.Unpickler):

       def find_class(self, module, name):
           # Only allow safe classes from builtins.
           if module == "builtins" and name in safe_builtins:
               return getattr(builtins, name)
           # Forbid everything else.
           raise pickle.UnpicklingError("global '%s.%s' is forbidden" %
                                        (module, name))

   def restricted_loads(s):
       """Helper function analogous to pickle.loads()."""
       return RestrictedUnpickler(io.BytesIO(s)).load()

A sample usage of our unpickler working as intended::

   >>> restricted_loads(pickle.dumps([1, 2, range(15)]))
   [1, 2, range(0, 15)]
   >>> restricted_loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
   Traceback (most recent call last):
     ...
   pickle.UnpicklingError: global 'os.system' is forbidden
   >>> restricted_loads(b'cbuiltins\neval\n'
   ...                  b'(S\'getattr(__import__("os"), "system")'
   ...                  b'("echo hello world")\'\ntR.')
   Traceback (most recent call last):
     ...
   pickle.UnpicklingError: global 'builtins.eval' is forbidden


.. XXX Add note about how extension codes could evade our protection
   mechanism (e.g. cached classes do not invoke find_class()).

As our examples show, you have to be careful with what you allow to be
unpickled. Therefore if security is a concern, you may want to consider
alternatives such as the marshalling API in :mod:`xmlrpc.client` or
third-party solutions.


Performance
-----------

Recent versions of the pickle protocol (from protocol 2 and upwards) feature
efficient binary encodings for several common features and built-in types.
Also, the :mod:`pickle` module has a transparent optimizer written in C.
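
A quick, informal way to see the difference between the text-based protocol 0
and a binary protocol is to compare output sizes. The sample data below is
arbitrary, and :func:`pickletools.optimize` is a separate standard-library
helper, not the C optimizer mentioned above.

```python
import pickle
import pickletools

data = {"ids": list(range(1000)), "name": "example", "ratio": 0.25}

# Protocol 0 is text based; newer protocols use compact binary opcodes.
size_text = len(pickle.dumps(data, protocol=0))
size_binary = len(pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL))

# pickletools.optimize() drops unused memo opcodes from an existing pickle.
binary = pickle.dumps(data, protocol=2)
optimized = pickletools.optimize(binary)
```

Exact sizes depend on the data and the Python version, so treat the numbers as
indicative only.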


.. _pickle-example:

Examples
--------

For the simplest code, use the :func:`dump` and :func:`load` functions. ::

   import pickle

   # An arbitrary collection of objects supported by pickle.
   data = {
       'a': [1, 2.0, 3, 4+6j],
       'b': ("character string", b"byte string"),
       'c': {None, True, False}
   }

   with open('data.pickle', 'wb') as f:
       # Pickle the 'data' dictionary using the highest protocol available.
       pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)


The following example reads the resulting pickled data. ::

   import pickle

   with open('data.pickle', 'rb') as f:
       # The protocol version used is detected automatically, so we do not
       # have to specify it.
       data = pickle.load(f)


.. XXX: Add examples showing how to optimize pickles for size (like using
.. pickletools.optimize() or the gzip module).


.. seealso::

   Module :mod:`copyreg`
      Pickle interface constructor registration for extension types.

   Module :mod:`pickletools`
      Tools for working with and analyzing pickled data.

   Module :mod:`shelve`
      Indexed databases of objects; uses :mod:`pickle`.

   Module :mod:`copy`
      Shallow and deep object copying.

   Module :mod:`marshal`
      High-performance serialization of built-in types.


.. rubric:: Footnotes

.. [#] Don't confuse this with the :mod:`marshal` module.

.. [#] This is why :keyword:`lambda` functions cannot be pickled: all
   :keyword:`!lambda` functions share the same name: ``<lambda>``.

.. [#] The exception raised will likely be an :exc:`ImportError` or an
   :exc:`AttributeError` but it could be something else.

.. [#] The :mod:`copy` module uses this protocol for shallow and deep copying
   operations.

.. [#] The limitation on alphanumeric characters is due to the fact
   that persistent IDs, in protocol 0, are delimited by the newline
   character. Therefore if any kind of newline character occurs in
   persistent IDs, the resulting pickle will become unreadable.